
Getting started with Apache Phoenix – A SQL Interface for HBase

In this blog, we will discuss what Apache Phoenix is and how to integrate it with HBase. First of all, let's get started with Apache Phoenix.

What is Apache Phoenix?

Phoenix is an open source SQL skin for HBase. You use the standard JDBC APIs instead of the regular HBase client APIs to create tables, insert data, and query your HBase data.
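Because Phoenix speaks standard JDBC, connecting to it looks like connecting to any other SQL database. The sketch below is a minimal illustration, assuming the Phoenix client JAR is on the classpath, ZooKeeper is running on localhost, and an emp table with ID and NAME columns exists (as created later in this post):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixExample {
    public static void main(String[] args) throws Exception {
        // Phoenix JDBC URL format: jdbc:phoenix:<zookeeper quorum>
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM emp")) {
            while (rs.next()) {
                System.out.println(rs.getInt("ID") + " " + rs.getString("NAME"));
            }
        }
    }
}
```

Everything here is plain java.sql; only the connection URL is Phoenix-specific.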

100% Free Course On Big Data Essentials

Subscribe to our blog and get access to this course ABSOLUTELY FREE.

Doesn’t putting an extra layer between my application and HBase just slow things down?
Actually, no. Phoenix achieves as good or likely better performance than if you hand-coded it yourself (not to mention with a heck of a lot less code) by:

  • compiling your SQL queries into native HBase scans
  • determining the optimal start and stop keys for your scan
  • orchestrating the parallel execution of your scans
  • bringing the computation to the data by:
      • pushing the predicates in your WHERE clause to a server-side filter
      • executing aggregate queries through server-side hooks (called co-processors)
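As an illustration of those last two points, in a query like the one below (the table and columns are hypothetical), Phoenix pushes the WHERE predicate into a server-side filter and runs the COUNT aggregation inside co-processors, so only the small aggregated result travels back to the client:

```sql
-- Predicate is evaluated region-server-side; aggregation runs in co-processors
SELECT dept, COUNT(*) AS headcount
FROM employees
WHERE hire_date > TO_DATE('2015-01-01')
GROUP BY dept;
```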

In addition to these items, we’ve got some interesting enhancements in the works to further optimize performance:

  • secondary indexes to improve performance for queries on non row key columns
  • stats gathering to improve parallelization and guide choices between optimizations
  • skip scan filter to optimize IN, LIKE, and OR queries
  • optional salting of row keys to evenly distribute write load
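Of the items above, salting is already usable in Phoenix 4.x: it is enabled at table creation time with the SALT_BUCKETS property. The table name and bucket count below are illustrative:

```sql
-- Pre-splits the table into 4 salted buckets so writes spread across region servers
CREATE TABLE web_log (
    ts   TIMESTAMP NOT NULL,
    host VARCHAR,
    CONSTRAINT pk PRIMARY KEY (ts, host)
) SALT_BUCKETS = 4;
```

This avoids the hot-spotting that monotonically increasing row keys (like timestamps) would otherwise cause.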

This makes it clear that we can work with HBase through a SQL interface. Now let's move on to a quick installation of Phoenix.

Note: This guide assumes that Hadoop and HBase are already installed on your system before you proceed with the Phoenix installation.

Phoenix Installation

Download phoenix-4.7.0 from the link below:

https://drive.google.com/open?id=0ByJLBTmJojjzaVBRRXdLVU15RTg

After downloading, extract the archive using the command

tar -xvzf phoenix-4.7.0-HBase-1.1-bin.tar.gz


After extraction, you will see a directory named phoenix-4.7.0-HBase-1.1-bin.

This directory contains many JAR files. To integrate with HBase, you need to copy phoenix-4.7.0-HBase-1.1-server.jar into the classpath of HBase. Generally, the classpath is the lib directory of your HBase installation.
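For example, assuming HBASE_HOME points to your HBase installation and you are inside the extracted Phoenix directory, the copy looks like this:

```shell
# Copy the Phoenix server JAR into HBase's lib directory (its classpath)
cp phoenix-4.7.0-HBase-1.1-server.jar $HBASE_HOME/lib/
```

HBase must be restarted afterwards for the new JAR to be picked up.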

Now open your .bashrc file and add the Phoenix binary path by appending the lines below.

#set phoenix home

PHOENIX_HOME=/complete path of/phoenix-4.7.0-HBase-1.1-bin

export PATH=$PATH:$PHOENIX_HOME/bin

After adding the lines, save and close the file, then reload it using the command below:

source ~/.bashrc

Now start all the HBase daemons from the bin folder of your HBase installation using the command below:

./start-hbase.sh

Next, start the HBase shell using the command:

./hbase shell

You can now see the list of tables in HBase using the list command.

You can see that there are no tables in HBase yet. Now let's log in to Phoenix using the command below:

sqlline.py localhost

This opens the Phoenix JDBC console, where you can write SQL queries to create tables in HBase. You can refer to the screenshot below for the same.

After the connection between Phoenix and HBase is established, the tables below are created by default. In Phoenix, the list of tables can be checked using the command !tables

SYSTEM.CATALOG

SYSTEM.FUNCTION

SYSTEM.SEQUENCE

SYSTEM.STATS

This indicates that the connection has been established successfully. Now let's verify the same in the HBase shell.

So we have successfully integrated Phoenix with HBase.

If these tables are not created automatically and Phoenix throws an exception such as a connection loss, you need to clean HBase's ZooKeeper data using the command below.

Stop all the hbase daemons and then type

./hbase clean --cleanZk

After a successful clean, start your HBase daemons again and launch Phoenix's sqlline; you should now see a successful connection between HBase and Phoenix.

Let’s create a table in phoenix and query it in hbase.

Creating table in phoenix

create table emp(ID INTEGER NOT NULL, NAME VARCHAR(255), CONSTRAINT pk_emp PRIMARY KEY (ID, NAME));
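Once the table exists, you can insert and read data from the same sqlline console. Note that Phoenix uses UPSERT rather than INSERT; the rows below are just sample data:

```sql
-- Phoenix has no INSERT statement; UPSERT creates the row or overwrites it
UPSERT INTO emp VALUES (1, 'John');
UPSERT INTO emp VALUES (2, 'Mary');
SELECT * FROM emp;
```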

In the above screenshot, you can see that a table named EMP has been created successfully. Let's check for the same in the HBase shell.

In the screenshot below, you can see the EMP table listed in the HBase shell.

This is how we can integrate HBase with Phoenix. In our next blog, we will work on some basic queries in Phoenix.

We hope this blog helped you integrate HBase with Phoenix. Keep visiting our site www.acadgild.com for more updates on Big Data and other technologies.

Suggested Reading

Different Types of Filters in HBase shell
