This blog will be a good practice for big data developer/engineers to learn and execute common hive technical interview questions like the query to find friends of friends, transpose rows to columns.
Before moving to this blog user can refer our below link blogs to more on hive.
https://acadgild.com/blog/hive-beginners-guide
https://acadgild.com/blog/bucketing-in-hive
https://acadgild.com/blog/hive-real-life-use-cases
Then, let us begin with queries.
Scenario 1: write a query to find friends of friends i.e, mutual friends of a user.
DATASET:
Here we are using dataset freind_details to achieve the above objective which contains 2 columns user_name, and user_friend_name.
DOWNLOAD LINK:
You can download the friend_details dataset from the below link.
Friends_List_Dataset_Link
Once the input dataset is downloaded you use the below command to create table and load the dataset into the created table.
CREATE TABLE:
LOAD DATA:
You can view the table values using the select all command.
VIEW DATA:
Now, let us write the query to convert to find friend’s friends of a user.
QUERY:
OUTPUT:
Here in the above query, we are creating a alias table name as f1 and f2, applying where condition to get mutual friends of the user prateek.
As we can see from the above result Onkar is a friend of Prateek and ajit and sumit are friends of Onkar, thus mutual friends of Prateek are ajit and sumit.
Scenario 2: In the SALES table quantity of each product is stored in rows for every year. Write a query to transpose the quantity for each product and display it in columns?
DATASET:
Here we are using dataset prod and sales to achieve the above objective.
DOWNLOAD LINK:
You can download the product details and product sales datasets from the below link.
PRODUCT_DETAILS_LINK
PRODUCT_SALES_LINK
DATASET DESCRIPTION:
PROD:
- PROD_ID
- PROD_NAME
SALES:
- SALE_ID
- PRODUCT_ID
- YEAR
- QUANTITY
- PRICE
Once the input dataset is downloaded you use the below command to create table and load the dataset into the created table.
CREATE TABLE PROD:
LOAD DATA:
VIEW DATA:
CREATE TABLE SALES:
LOAD DATA:
VIEW DATA:
Now, let us write the query to convert to transpose the quantity for each product and display it in columns.
Here the result should contain the product name and the total product quantity sold in each year.
EXPECTED RESULT:
prod_name quantity_sold_in_2016    quantity_sold_in_2017 quantity_sold_in_2018
QUERY:
In the above query, we have used IF condition to create year columns and performed groupby operation based on the product name.
OUTPUT:
We can see from the output we have successfully shown the result of the total amount of products sold in yearly wise.
Scenario 3: Write a query to find the number of products sold in each year?
For, the above query we will be referring to the sales table which we have used in the above scenario.
QUERY:
OUTPUT:
As we can observe we have successfully achieved our object to find the number of products sold in each year.
We hope this post has been helpful in understanding how to perform data analysis using Hive. In the case of any queries, feel free to comment below and we will get back to you at the earliest.
Keep visiting our site ACADGILD for more trending blogs updates on Big Data and other technologies.
Related Popular Courses:
BIG DATA DEVELOPER
INSTALL NIFI ON LINUX
DATA SCIENCE COURSES ONLINE
BEST ANALYTICS COURSES