In this blog we discuss how to load data into HBase using Pig scripts.
Before going further, it is worth revisiting the basic concepts of Pig and HBase through our beginner blogs on both.
Link for HBase and Pig blogs:
To follow the steps discussed in this blog, you need a Hadoop cluster with Pig and HBase running on it.
Note: You need to download the following versions of Hadoop, HBase and Pig to implement the steps for loading data into HBase using Pig.
Coming to the aim of this blog, let us go step by step through transferring data into HBase using Pig.
We use a sample student data set, which will be loaded into HBase. A screenshot accompanies every step for better understanding.
You can download this sample data set for your own practice from the link below.
The data set contains seven columns, named:
StudentName, sector, DOB, qualification, score, state, randomName.
First we copy the data set into HDFS, from where it will be loaded into HBase.
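As a rough sketch, the copy step could look like the commands below. The local file name `student_data.csv` and the HDFS directory `/user/hadoop/student` are our own placeholders, not names from the data set download, so replace them with your own.

```shell
# Hypothetical names: adjust the local file and HDFS directory to your setup.
LOCAL_FILE="student_data.csv"
HDFS_DIR="/user/hadoop/student"

if command -v hdfs >/dev/null 2>&1; then
  # Create the target directory (no error if it already exists).
  hdfs dfs -mkdir -p "$HDFS_DIR"
  # Copy the file into HDFS, overwriting an earlier copy if present.
  hdfs dfs -put -f "$LOCAL_FILE" "$HDFS_DIR/"
  # List the directory to confirm the file arrived.
  hdfs dfs -ls "$HDFS_DIR"
else
  echo "hdfs command not found; run this on a node with the Hadoop client installed"
fi
```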
Next we add a few HBase jar files to the Pig classpath so that Pig can find the HBase classes.
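One possible way to do this is through the `PIG_CLASSPATH` environment variable. The sketch below assumes `HBASE_HOME` points at your HBase installation (the fallback path is hypothetical) and, for brevity, adds the whole HBase `lib` directory rather than a hand-picked list of jars; listing only the jars your versions need may be the cleaner choice.

```shell
# Assumption: HBASE_HOME is set; the fallback below is only a placeholder path.
HBASE_HOME="${HBASE_HOME:-/usr/local/hbase}"

# Put the HBase config directory and jars in front of any existing PIG_CLASSPATH.
# The quoted "lib/*" is passed through unexpanded and resolved by the JVM.
export PIG_CLASSPATH="$HBASE_HOME/conf:$HBASE_HOME/lib/*${PIG_CLASSPATH:+:$PIG_CLASSPATH}"

echo "PIG_CLASSPATH=$PIG_CLASSPATH"
```

Run this in the same terminal session from which you start Pig, otherwise the variable will not be visible to it.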
We now start the HBase shell and create a table.
This table only serves as a skeleton: Pig does not create it, but stores the data inside it by referring to the table name.
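A minimal sketch of the table creation is given below. The table name `student` and the column family `details` are names we picked for illustration; the commands are written to a small script file so they can be passed to the HBase shell in one go, or typed interactively instead.

```shell
# Hypothetical table name 'student' and column family 'details'.
cat > create_student_table.hbase <<'EOF'
create 'student', 'details'
list
exit
EOF

if command -v hbase >/dev/null 2>&1; then
  # Run the commands in the file through the HBase shell.
  hbase shell create_student_table.hbase
else
  echo "hbase command not found; run this on a node with the HBase client installed"
fi
```

Only the column family is declared here. The individual columns are created when the rows are written.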
Once we are inside the Pig Grunt shell, we can load the data from HDFS into a relation (alias).
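The load step might look like the sketch below, which also prints the schema and a few rows as a sanity check. The HDFS path and the comma delimiter are assumptions on our side, as are the field types; the field names are the seven columns listed earlier.

```shell
# Write a small Pig script; path, delimiter and types are assumptions.
cat > preview_students.pig <<'EOF'
-- Load the file from HDFS into the relation 'students'.
students = LOAD '/user/hadoop/student/student_data.csv'
           USING PigStorage(',')
           AS (StudentName:chararray, sector:chararray, DOB:chararray,
               qualification:chararray, score:int, state:chararray,
               randomName:chararray);

-- Show the schema and the first few rows.
DESCRIBE students;
first_rows = LIMIT students 5;
DUMP first_rows;
EOF

if command -v pig >/dev/null 2>&1; then
  pig preview_students.pig
else
  echo "pig command not found; run this on a node with Pig installed"
fi
```

The same statements can be typed line by line at the `grunt>` prompt.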
Now we can transfer the data into HBase with the STORE command.
We need to make sure that the table name matches the table created in HBase exactly. The column mapping passed to the storage function also has to line up with the fields of the relation, otherwise values end up in the wrong columns.
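Here is a hedged sketch of the whole load-and-store script using Pig's `HBaseStorage`. The table `student`, the column family `details`, the HDFS path and the comma delimiter are all our own assumptions. Note that `HBaseStorage` takes the first field of each tuple (here `StudentName`) as the row key, so only the remaining six fields appear in the column list.

```shell
# Table name, column family, path and delimiter are hypothetical.
cat > store_students.pig <<'EOF'
students = LOAD '/user/hadoop/student/student_data.csv'
           USING PigStorage(',')
           AS (StudentName:chararray, sector:chararray, DOB:chararray,
               qualification:chararray, score:int, state:chararray,
               randomName:chararray);

-- First field (StudentName) becomes the row key;
-- the other six fields map to the listed columns, in order.
STORE students INTO 'hbase://student'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'details:sector details:DOB details:qualification details:score details:state details:randomName');
EOF

if command -v pig >/dev/null 2>&1; then
  pig store_students.pig
else
  echo "pig command not found; run this on a node with Pig installed"
fi
```

Because the row key is the student name in this sketch, two rows with the same name would overwrite each other in HBase; pick a different first field if names are not unique in your data.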
Once the success message appears, as shown below, we know that our data has been loaded into HBase.
The result can be displayed in the HBase shell with the scan command followed by the table name in straight single quotes (' ').
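For instance, assuming the table is called `student` (our placeholder name), the check could be sketched as follows. The `LIMIT` option keeps the output short, and `count` gives the number of rows stored.

```shell
# Hypothetical table name 'student'; use the name of the table you created.
cat > scan_student_table.hbase <<'EOF'
scan 'student', {LIMIT => 5}
count 'student'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
  hbase shell scan_student_table.hbase
else
  echo "hbase command not found; run this on a node with the HBase client installed"
fi
```

If you copy commands from a web page, check that the quotes are plain `'` characters; curly quotes are rejected by the HBase shell.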
Keep visiting our website Acadgild for more updates on Big Data and other technologies.