
Loading Data Into HBase Using Pig Scripts

July 14

This post walks through loading data into HBase using Pig scripts.

Before going further, you may want to review the basic concepts of Pig and HBase in our beginner guides.

Links to the HBase and Pig guides:

Beginners-Guide-for-HBase

Beginners-Guide-for-Pig

To follow the steps in this post, you need a Hadoop cluster with Pig and HBase running on it.

Note: Download the following versions of Hadoop, HBase and Pig to follow the steps for loading data into HBase using Pig.

  • Hadoop version: hadoop-2.6.0
  • HBase version: hbase-0.98.4-hadoop2-bin
  • Pig version: pig-0.14.0

Let us now go step by step through transferring data into HBase using Pig.

We use a sample student data set, which will be loaded into HBase. A screenshot accompanies each step for better understanding.

You can download this sample data set for your own practice from the link below.

DATASET

The data set contains seven columns, named as follows:

StudentName, sector, DOB, qualification, score, state, randomName.

First, we copy the data set into HDFS, from where it will be loaded into HBase.
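The copy step can be sketched as follows. The local file name student_data.csv and the HDFS directory /pig_hbase are placeholders chosen for this example, not names taken from the original setup, so substitute your own.

```shell
# Hypothetical names: adjust to where you saved the downloaded data set.
LOCAL_FILE="student_data.csv"
HDFS_DIR="/pig_hbase"
HDFS_TARGET="$HDFS_DIR/$(basename "$LOCAL_FILE")"

if command -v hdfs >/dev/null 2>&1; then
  # Create the target directory, upload the file, and list the result.
  hdfs dfs -mkdir -p "$HDFS_DIR"
  hdfs dfs -put -f "$LOCAL_FILE" "$HDFS_DIR/"
  hdfs dfs -ls "$HDFS_DIR"
else
  echo "hdfs not found on PATH; would upload to $HDFS_TARGET"
fi
```

The -f flag overwrites an existing copy, which is convenient when repeating the exercise.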

Next, we add the HBase jar files to the Pig classpath. Adjust the paths to match your HBase installation directory.

export PIG_CLASSPATH="/home/hadoop/HADOOP/hbase-0.98.4-hadoop2/lib/hbase-server-0.98.4-hadoop2.jar:/home/hadoop/HADOOP/hbase-0.98.4-hadoop2/lib/*"

We now start the HBase shell and create a table.

We only need this table as a skeleton, so that Pig can store data in it by referring to the table name.

We can leave the HBase shell by typing exit and then switch to the Pig grunt shell.
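A minimal sketch of creating such a skeleton table is shown here. The table name student and the column family info are assumptions made for illustration. The commands are written to a file and fed to the HBase shell, but they can equally be typed at the interactive prompt.

```shell
# Hypothetical table name 'student' and column family 'info'.
cat > create_student_table.hbase <<'EOF'
create 'student', 'info'
describe 'student'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
  # Feed the commands to the HBase shell on standard input.
  hbase shell < create_student_table.hbase
else
  echo "hbase not found on PATH; commands saved to create_student_table.hbase"
fi
```

Only the column family has to exist up front. The individual column qualifiers are created when rows are written.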

Once inside the grunt shell, we can load the data from HDFS into a relation (alias).
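The load step might look like the sketch below. The HDFS path, the comma delimiter, and the field types are assumptions, so check them against your copy of the data set. The field names follow the seven columns listed earlier. The statements are saved to a script file here, but they can also be entered one by one at the grunt prompt.

```shell
# Hypothetical path and delimiter; the alias raw_data is an arbitrary name.
cat > load_students.pig <<'EOF'
raw_data = LOAD '/pig_hbase/student_data.csv' USING PigStorage(',') AS (
    studentName:chararray, sector:chararray, dob:chararray,
    qualification:chararray, score:int, state:chararray,
    randomName:chararray);
DESCRIBE raw_data;
first_rows = LIMIT raw_data 5;
DUMP first_rows;
EOF

if command -v pig >/dev/null 2>&1; then
  pig load_students.pig
else
  echo "pig not found on PATH; script saved to load_students.pig"
fi
```

DESCRIBE prints the schema and DUMP prints a few rows, which is a quick way to confirm that the delimiter and column order are right before writing anything to HBase.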

Now we can transfer the data into HBase with the STORE command.

Make sure the table name matches the table created in HBase, and check the parameters of the STORE statement carefully to avoid mistakes.
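As a sketch of the STORE step, the script below uses Pig's HBaseStorage class. It assumes an HBase table named student with a column family info already exists, and that the input is a comma-delimited file at the path shown. All of these names are illustrative. The script includes its own LOAD statement so that it can run standalone.

```shell
# Hypothetical table 'student', column family 'info', and input path.
cat > store_students.pig <<'EOF'
raw_data = LOAD '/pig_hbase/student_data.csv' USING PigStorage(',') AS (
    studentName:chararray, sector:chararray, dob:chararray,
    qualification:chararray, score:int, state:chararray,
    randomName:chararray);

-- The first field (studentName) becomes the HBase row key;
-- the remaining six fields map, in order, to the columns listed.
STORE raw_data INTO 'hbase://student' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'info:sector info:dob info:qualification info:score info:state info:randomName');
EOF

if command -v pig >/dev/null 2>&1; then
  pig store_students.pig
else
  echo "pig not found on PATH; script saved to store_students.pig"
fi
```

Note that the column list has one entry fewer than the relation has fields, because the row key is not listed. A mismatch between the two is a common cause of errors at this step.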

 

Once Pig reports success, the data is confirmed as loaded into HBase.

The result can be displayed in the HBase shell with the scan command followed by the table name in single quotes (' ').
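For example, assuming the table is named student (a placeholder used for illustration), the check could be sketched like this:

```shell
# Hypothetical table name 'student'.
cat > scan_student_table.hbase <<'EOF'
scan 'student', {LIMIT => 5}
count 'student'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
  hbase shell < scan_student_table.hbase
else
  echo "hbase not found on PATH; commands saved to scan_student_table.hbase"
fi
```

The LIMIT option keeps the output short on larger tables, and count gives a quick check that the number of rows matches the number of distinct row keys in the input.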
