In this blog, we will learn how to execute a script file in Hive. Hive is a key component of the Hadoop ecosystem, and expertise in Hive is a valuable skill on the job market.
There are three ways to work with Hive:
1. Hive shell: Command line interface
2. Hive Web Interface (HWI): A simple graphical user interface (GUI) that serves as an alternative to the Hive command line interface.
3. Script file: Write all the queries inside a .sql file, and run them directly from the Linux terminal without logging into the Hive shell.
First, we will go through the normal process of querying data in the Hive shell, where the user runs each statement one at a time. Later, we will see how to put the same statements into a script file in Hive.
Here we deal with two different sample datasets as described below.
Cust: ID, NAME, AGE, ADDRESS, SALARY
Order: OID, DATE, CUSTOMER_ID, AMOUNT
Before we can load data from the Hive shell, we need to create the table cust. Refer to the following screenshot.
Now load the data with the following query:
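As a rough sketch of these two steps, the statements can be written to a small file from the Linux terminal and then typed at the hive> prompt, or run inside the Hive shell with `source cust.hql;`. The column types, the comma delimiter, the file name cust.hql, and the data path /tmp/cust.txt are assumptions made for illustration and may differ from the original setup.

```shell
#!/bin/sh
# Write the assumed CREATE TABLE and LOAD DATA statements for cust to a file.
# Column types, the ',' delimiter, and /tmp/cust.txt are illustrative
# assumptions; adjust them to match the real data file.
cat > cust.hql <<'EOF'
CREATE TABLE IF NOT EXISTS cust (
  id      INT,
  name    STRING,
  age     INT,
  address STRING,
  salary  DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/cust.txt' INTO TABLE cust;
EOF

# Print the file so the statements can be reviewed before running them.
cat cust.hql
```

LOAD DATA LOCAL INPATH reads from the local Linux file system; dropping the LOCAL keyword would make Hive look for the file in HDFS instead.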
Refer to the following screenshot for creating table order:
Refer to the following screenshot for loading data inside the table order:
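The same two steps for the order data might look like the sketch below. Here the table is named orders and the DATE column is named order_date; both names are assumptions, chosen because ORDER is a HiveQL keyword and DATE is a type name, so either can clash when used as an unquoted identifier. The column types, delimiter, and the path /tmp/order.txt are also assumed.

```shell
#!/bin/sh
# Write the assumed CREATE TABLE and LOAD DATA statements for the order data.
# "orders" and "order_date" are hypothetical names used to avoid keyword
# clashes; types, delimiter, and /tmp/order.txt are assumptions too.
cat > orders.hql <<'EOF'
CREATE TABLE IF NOT EXISTS orders (
  oid         INT,
  order_date  STRING,
  customer_id INT,
  amount      DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/order.txt' INTO TABLE orders;
EOF

# Print the file so the statements can be reviewed before running them.
cat orders.hql
```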
Finally, once we have all the data inside Hive tables, we can now run our query.
In this case, we join the two sample tables and query the NAME and AGE of the customer whose ID is 3.
In total, the user had to run 5 separate statements (two to create tables, two to load data, and the final query) to get the result.
We will now run the same queries a second time, this time from a script file in Hive. Note how many times the user has to run something by hand.
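A join of this kind could be written roughly as follows. The table name orders and the lowercase column names are assumptions; only the join on the customer ID and the filter on ID 3 come from the description above.

```shell
#!/bin/sh
# Write the assumed join query to a file. The "orders" table name is a
# hypothetical stand-in for the order table described in the text.
cat > join_query.hql <<'EOF'
SELECT c.name, c.age
FROM cust c
JOIN orders o ON (c.id = o.customer_id)
WHERE c.id = 3;
EOF

# Print the query so it can be reviewed or pasted at the hive> prompt.
cat join_query.hql
```

Because this is an inner join, the customer appears once per matching order row, and not at all if there are no orders for ID 3.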
Create a script file in Hive that contains all of the individual queries above in one place. In this case, we name it hive_script.sql.
*Note: Remember to save the file as plain text with a .sql extension.
Refer to the screenshot of the script file below, where you can see that the queries are exactly the same as the ones we ran above.
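A script file along these lines can be created from the terminal with a heredoc. This is only a sketch: the table name orders, the column order_date, the column types, the comma delimiter, and the /tmp paths are all assumptions, and OVERWRITE is used here (a deliberate change from a plain load) so that re-running the script does not append the same rows twice.

```shell
#!/bin/sh
# Collect all five statements into one Hive script file.
# Names, types, delimiter, and data paths are illustrative assumptions.
cat > hive_script.sql <<'EOF'
-- 1. Create the cust table
CREATE TABLE IF NOT EXISTS cust (
  id INT, name STRING, age INT, address STRING, salary DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- 2. Load the cust data (hypothetical local path)
LOAD DATA LOCAL INPATH '/tmp/cust.txt' OVERWRITE INTO TABLE cust;

-- 3. Create the orders table
CREATE TABLE IF NOT EXISTS orders (
  oid INT, order_date STRING, customer_id INT, amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- 4. Load the order data (hypothetical local path)
LOAD DATA LOCAL INPATH '/tmp/order.txt' OVERWRITE INTO TABLE orders;

-- 5. Join the tables and fetch NAME and AGE for customer 3
SELECT c.name, c.age
FROM cust c
JOIN orders o ON (c.id = o.customer_id)
WHERE c.id = 3;
EOF

# Print the script so it can be reviewed before it is run.
cat hive_script.sql
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the HiveQL text, so the file is written exactly as shown.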
On a terminal, run the following command to start the job.
$ hive -f <location of hive script>
In this case:
$ hive -f /home/prateek/Documents/HIVE/script_HIVE.sql
We can see in the screenshot above that all the different queries run as a single job, and we get the final results in just 2 steps.
This is how script files in Hive are run on any Linux machine.
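If the script is launched often, the hive -f call can be wrapped in a small shell function that checks the inputs first. The function name run_hive_script and its return codes are hypothetical and not part of Hive; the only Hive feature used is the -f option shown above.

```shell
#!/bin/sh
# run_hive_script is a hypothetical helper, not something Hive provides.
# It refuses to start if the script file is missing or hive is not installed.
run_hive_script() {
  script="${1:-}"

  # Return 1 when no readable script file was given.
  if [ -z "$script" ] || [ ! -f "$script" ]; then
    echo "script file not found: $script" >&2
    return 1
  fi

  # Return 2 when the hive command is not on the PATH.
  if ! command -v hive >/dev/null 2>&1; then
    echo "hive command not found on PATH" >&2
    return 2
  fi

  # -f tells the Hive CLI to execute the statements in the given file.
  hive -f "$script"
}
```

After loading this function into the current shell, it could be called as run_hive_script /home/prateek/Documents/HIVE/script_HIVE.sql.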
For more information on how to reduce query complexity in Hive, and for updates on the courses, keep visiting www.acadgild.com.