In this instructional post, we will see how to run Hive queries using the Hive Web Interface (HWI). HWI is a simple graphical user interface (GUI) for Hive and an alternative to the Hive command line interface (CLI).
Features of HWI
⦁ Schema Browsing
An alternative to running ‘show tables’ or ‘show table extended’ from the CLI is to use the web-based schema browser. The Hive metadata is presented hierarchically: you start at the database level and click through to get information about tables, including the SerDe, column names, and column types.
⦁ Detached Query Execution
A power user issuing multiple Hive queries simultaneously would have multiple CLI windows open. The Hive Web Interface manages the session on the web server, and not from inside the CLI window. This allows a user to start multiple queries and return to the web interface later to check the status.
⦁ No Local Installation
Any user with a web browser can work with Hive. This has the usual web interface benefits. In particular, a user interacting with Hadoop or Hive directly requires access to many ports, whereas a remote or VPN user requires access only to the Hive Web Interface, which listens by default on 0.0.0.0 tcp/9999.
Before proceeding further, I assume that you have your Hadoop and Hive setup ready.
Once your setup is ready, let us get started and see how to use HWI. The first thing to consider is the configuration. Depending on your needs, you may have to adjust some of the properties listed below.
NOTE: I have used the Cloudera VM; you can try this on your existing Hadoop system as well.
⦁ hive.hwi.listen.host: The host address the Hive Web Interface will listen on.
⦁ hive.hwi.listen.port: The port the Hive Web Interface will listen on.
⦁ hive.hwi.war.file: The WAR file with the JSP content for the Hive Web Interface.
*Note: I have used the default values. If you want to change these properties, you can change them in the hive-site.xml file under the $HIVE_HOME/conf directory.
Take a look at some default properties:
<description>This is the host address the Hive Web Interface will listen on</description>
<description>This is the port the Hive Web Interface will listen on</description>
<description>This is the WAR file with the jsp content for Hive Web Interface</description>
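To check which values your own installation actually uses, a small helper along these lines can read a property out of hive-site.xml. This is only a sketch: `get_hive_prop` is a hypothetical name, and it assumes the usual layout where each `<name>` and `<value>` element sits on a single line.

```shell
#!/bin/sh
# Print the value of one property from a Hadoop-style XML config file.
# Assumption: each <name>...</name> and <value>...</value> fits on one line,
# and the <value> follows its <name> inside the same <property> element.
get_hive_prop() {
    prop="$1"   # property name, e.g. hive.hwi.listen.port
    file="$2"   # path to the config file, e.g. $HIVE_HOME/conf/hive-site.xml
    awk -v want="$prop" '
        /<name>/ {
            line = $0
            gsub(/.*<name>|<\/name>.*/, "", line)   # strip the tags, keep the name
            name = line
        }
        /<value>/ && name == want {
            line = $0
            gsub(/.*<value>|<\/value>.*/, "", line) # strip the tags, keep the value
            print line
            exit
        }
    ' "$file"
}
```

For example, `get_hive_prop hive.hwi.listen.port "$HIVE_HOME/conf/hive-site.xml"` prints the configured port. It prints nothing if the property is not overridden in that file, in which case the default applies.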
*Tip: While working with HWI, I found that the version of Hive I was using had no .war file in the $HIVE_HOME/lib directory, so I had to work around this issue. I downloaded Hive 0.12, unzipped it, and copied the .war file from it to the lib directory of my current Hive installation.
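A quick way to find out whether you are affected is to look for the WAR file before starting the service. The sketch below uses a hypothetical helper, `find_hwi_war`, and assumes the file follows the hive-hwi-<version>.war naming used in stock Apache Hive releases; your distribution may name it differently.

```shell
#!/bin/sh
# List HWI WAR files in a Hive lib directory.
# Returns 0 if at least one is found, 1 otherwise.
# Assumption: the WAR file is named hive-hwi-<version>.war.
find_hwi_war() {
    lib_dir="$1"
    found=1
    for war in "$lib_dir"/hive-hwi-*.war; do
        # If the glob matches nothing, the pattern stays literal and -f fails.
        if [ -f "$war" ]; then
            echo "$war"
            found=0
        fi
    done
    return "$found"
}
```

You could call it as `find_hwi_war "$HIVE_HOME/lib" || echo "No HWI WAR file found"` to get a warning when the file is missing.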
To start HWI, type the following command:
bin/hive --service hwi
Once the HWI service is up, open your browser and go to the HWI address.
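The address is built from the listen host and port configured earlier. As a sketch, the hypothetical helper below assembles it, assuming the default values of 0.0.0.0 and 9999 and the /hwi context path that a stock HWI setup serves; adjust these if your setup differs.

```shell
#!/bin/sh
# Build the browser URL for HWI from the listen host and port.
# 0.0.0.0 means "listen on all interfaces" and is not a browsable address,
# so it is mapped to localhost here.
# Assumption: HWI is served under the /hwi path, as in a stock setup.
hwi_url() {
    host="${1:-0.0.0.0}"
    port="${2:-9999}"
    if [ "$host" = "0.0.0.0" ]; then
        host="localhost"
    fi
    echo "http://${host}:${port}/hwi"
}
```

With the defaults, `hwi_url` gives the address to use on the machine running HWI. From another machine, pass the server's hostname or IP address as the first argument.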
You need to first create a “Session” and only then can you perform your Hive query.
⦁ Click “Create Session” under “Sessions”
⦁ You can enter any name in the “Session Name” box and then click “Submit”
⦁ Once the session is created, you will see the session details page as shown below:
⦁ In the Result File box, enter the name of the file to which the query result will be written. This file is local to the web server. I have given the filename as result.txt.
⦁ In the Query box, you will have to type your Hive query.
⦁ In the Start Query dropdown, select “Yes.”
⦁ Finally, click “Submit” to run the query.
⦁ The above screenshot shows the data loading process.
I’m running a simple SELECT * ... query to cross-check the result. You can see the result by clicking “View File” next to the “Result File” box.
I hope this instructional post helped you run your first Hive query through HWI. Keep visiting www.acadgild.com for more updates on the courses.