Performing CRUD Operations on HBase Using Java API

In this post, we walk through implementing CRUD operations in HBase using the Java API. Before moving forward, readers may want to brush up on how HBase works and on its basic operations.
Introduction to HBase
Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google’s Bigtable, as described in the paper “Bigtable: A Distributed Storage System for Structured Data”. HBase is written in Java and has a native Java API that gives programs direct access to its data manipulation operations.
Before beginning, please make sure that Apache HBase is installed on your Linux machine.
CRUD Operations
The four main operations performed in HBase are:

  • Create
  • Read (Get and Scan)
  • Update (Put)
  • Delete
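
To make the four operations concrete before touching a cluster, here is a minimal in-memory sketch of the logical layout they act on: row key, then column family and qualifier, then value. The class `ToyTable` and its methods are our own illustration built on `java.util.TreeMap`; they are not part of the HBase API and do not reflect how HBase stores data on disk.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of an HBase table: rowKey -> "family:qualifier" -> value.
// Illustration only; ToyTable is a hypothetical class, not an HBase class.
final class ToyTable {
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Create/Update: a put writes one cell, overwriting any existing value.
    void put(String row, String family, String qualifier, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(family + ":" + qualifier, value);
    }

    // Read (Get): fetch one cell from one row, or null if it is absent.
    String get(String row, String family, String qualifier) {
        Map<String, String> cells = rows.get(row);
        return cells == null ? null : cells.get(family + ":" + qualifier);
    }

    // Read (Scan): view rows in sorted row-key order, beginning at startRow.
    Map<String, TreeMap<String, String>> scan(String startRow) {
        return rows.tailMap(startRow, true);
    }

    // Delete: remove one cell and drop the row once no cells are left.
    void delete(String row, String family, String qualifier) {
        TreeMap<String, String> cells = rows.get(row);
        if (cells == null) {
            return;
        }
        cells.remove(family + ":" + qualifier);
        if (cells.isEmpty()) {
            rows.remove(row);
        }
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        table.put("row-1", "Emp_name", "Employee1", "Kiran");
        table.put("row-1", "sal", "sal_Employee1", "100000");
        System.out.println(table.get("row-1", "Emp_name", "Employee1"));
        table.delete("row-1", "Emp_name", "Employee1");
        System.out.println(table.scan("row-1"));
    }
}
```

The sorted map is a deliberate choice: HBase keeps rows ordered by row key, which is what makes range scans from a start row possible.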

Let’s start working through these operations by creating a new table in HBase’s default namespace.
Create Table Operation:
Before any operation can be performed on data, the table that holds it must exist.
To create a table, we create an instance of HBaseAdmin and ask it to create the table with at least one column family. All DDL operations go through an HBaseAdmin instance. Note that the listings in this post use the HBaseAdmin and HTable classes of the older client API; from HBase 1.0 onward, the equivalent Admin and Table objects are obtained from a Connection created by ConnectionFactory.
In the program below, we create a table named Acadgild with two column families named Emp_name and sal.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class CreateHbaseTable {
    // Loads hbase-default.xml and hbase-site.xml from the classpath
    Configuration config = HBaseConfiguration.create();

    public void createtable(String name, String[] colfamily)
            throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        HBaseAdmin admin = new HBaseAdmin(config);
        // Describe the table and register each column family
        HTableDescriptor des = new HTableDescriptor(Bytes.toBytes(name));
        for (int i = 0; i < colfamily.length; i++) {
            des.addFamily(new HColumnDescriptor(colfamily[i]));
        }
        // Create the table only if it does not already exist
        if (admin.tableExists(name)) {
            System.out.println("Table already exists");
        } else {
            admin.createTable(des);
            System.out.println("Table: " + name + " successfully created");
        }
        admin.close();
    }

    public static void main(String args[])
            throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        CreateHbaseTable op = new CreateHbaseTable();
        String tablename = "Acadgild";
        String[] familys = {"Emp_name", "sal"};
        op.createtable(tablename, familys);
    }
}

 
CreateHbaseTable Output:
We can observe in the image below that no table exists yet in HBase’s default namespace. So, let us execute the CreateHbaseTable program to create a table named Acadgild.

After executing the CreateHbaseTable program, we can see in the image below a message stating that the table named Acadgild has been successfully created.
This creates the table Acadgild with two column families: Emp_name and sal.

We can cross-check by going back to the HBase shell and running the ‘list’ command, as shown below.

From the above program, we learned how to create a table in HBase using Java API.
Put Operation:
We use the Put class to insert rows into an HBase table.
In the program below, we perform a put on the table created above, writing the value Kiran into column family Emp_name and the value 100000 into column family sal.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class putHbase {
    public static void main(String args[]) throws IOException {
        // Create an instance of the default HBase configuration
        Configuration config = HBaseConfiguration.create();
        // Get the table instance
        HTable table = new HTable(config, "Acadgild");
        // Create the Put object for row key "row-1"
        Put put = new Put(Bytes.toBytes("row-1"));
        // Add a cell to column family Emp_name with qualifier Employee1
        put.add(Bytes.toBytes("Emp_name"), Bytes.toBytes("Employee1"), Bytes.toBytes("Kiran"));
        // Add a cell to column family sal with qualifier sal_Employee1
        put.add(Bytes.toBytes("sal"), Bytes.toBytes("sal_Employee1"), Bytes.toBytes("100000"));
        // Write the Put to the table
        table.put(put);
        System.out.println("Values inserted : ");
        table.close();
    }
}

putHbase Table Output:
After executing the above putHbase program, we can observe in the image below that a success message is displayed stating that the values were inserted.

Running the scan command with the table name in the HBase shell then shows the values Kiran and 100000 in column families Emp_name and sal, respectively.
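
HBase stores row keys, qualifiers, and values as raw byte arrays, which is why every argument in the Put program is wrapped in Bytes.toBytes. The sketch below uses only the Java standard library to show what the two most common encodings look like: UTF-8 text, as produced by Bytes.toBytes(String), and an 8-byte big-endian long, as produced by Bytes.toBytes(long). The class `CellEncoding` and its method names are hypothetical helpers for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Stdlib-only illustration of two ways a salary could be turned into bytes.
// CellEncoding is a hypothetical helper, not an HBase class.
final class CellEncoding {
    // Same bytes as Bytes.toBytes(String): the UTF-8 encoding of the text.
    static byte[] asText(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Same layout as Bytes.toBytes(long): 8 bytes, most significant byte first.
    static byte[] asLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    public static void main(String[] args) {
        // The Put program above stores the salary as text: one byte per digit.
        System.out.println(Arrays.toString(asText("100000")));
        // Stored as a long, the same salary always occupies 8 bytes.
        System.out.println(Arrays.toString(asLong(100000L)));
        // Reading a value back requires the decoder that matches the encoder.
        System.out.println(new String(asText("100000"), StandardCharsets.UTF_8));
        System.out.println(ByteBuffer.wrap(asLong(100000L)).getLong());
    }
}
```

Whichever encoding is chosen on write, the reader has to use the matching decoder; the programs in this post write strings and read them back with Bytes.toString.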

Scan Operation:
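The Scan class reads multiple rows: either the entire table or a range of rows beginning at a given start row, optionally restricted to particular column families or columns.
In the program below, we use a Scan object to print the value stored in column family Emp_name under the qualifier Emp_name. Note that the Put program above wrote to the qualifier Employee1, so this scan returns a value only if a cell with the qualifier Emp_name has also been inserted; otherwise, change the qualifier to Employee1.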

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class scanHbase {
    public static void main(String args[]) throws IOException {
        Configuration config = HBaseConfiguration.create();
        HTable table = new HTable(config, "Acadgild");
        Scan scan = new Scan();
        // Restrict the scan to column family Emp_name, qualifier Emp_name
        scan.addColumn(Bytes.toBytes("Emp_name"), Bytes.toBytes("Emp_name"));
        // Begin at row-1; use a later key such as "row-4" to skip earlier rows
        scan.setStartRow(Bytes.toBytes("row-1"));
        ResultScanner result = table.getScanner(scan);
        for (Result res : result) {
            byte[] val = res.getValue(Bytes.toBytes("Emp_name"), Bytes.toBytes("Emp_name"));
            System.out.println("Row-value:" + Bytes.toString(val));
        }
        // Release the scanner's server-side resources before closing the table
        result.close();
        table.close();
    }
}

scanHbase Table Output:
After executing the above scanHbase program, we can observe in the image below that the value Kiran from column family Emp_name has been printed.
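
The start row in the scan program only makes sense once we remember that HBase keeps rows sorted by row key, compared byte by byte. A minimal stdlib-only sketch of that behaviour follows. The class `StartRowDemo` and the extra row keys are hypothetical, and Java's String ordering stands in for HBase's byte ordering, which matches it for plain ASCII keys like these.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrates which row keys a scan visits for a given start row.
// StartRowDemo is a hypothetical helper, not an HBase class.
final class StartRowDemo {
    // Returns every key >= startRow in sorted order, like a scan with a start row.
    static List<String> keysFrom(TreeSet<String> rowKeys, String startRow) {
        return new ArrayList<>(rowKeys.tailSet(startRow, true));
    }

    public static void main(String[] args) {
        TreeSet<String> keys =
                new TreeSet<>(Arrays.asList("row-1", "row-2", "row-10", "row-4"));
        // Keys sort as text, so "row-10" comes before "row-2".
        System.out.println(keysFrom(keys, "row-1"));
        // Starting at "row-2" skips both "row-1" and "row-10".
        System.out.println(keysFrom(keys, "row-2"));
    }
}
```

Because keys sort as text, numeric suffixes are often zero-padded (row-01, row-02, row-10) when rows are meant to be scanned in numeric order.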

Get Operation:
The Get class reads data from a single row, identified by its row key.
In the program below, we use a Get object to fetch the value stored in column family Emp_name under the qualifier Emp_name.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class getHbase {
    public static void main(String args[]) throws IOException {
        Configuration config = HBaseConfiguration.create();
        HTable table = new HTable(config, "Acadgild");
        // Request a single row by its row key
        Get get = new Get(Bytes.toBytes("row-1"));
        // Limit the result to column family Emp_name, qualifier Emp_name
        get.addColumn(Bytes.toBytes("Emp_name"), Bytes.toBytes("Emp_name"));
        Result result = table.get(get);
        byte[] name = result.getValue(Bytes.toBytes("Emp_name"), Bytes.toBytes("Emp_name"));
        System.out.println("Name: " + Bytes.toString(name));
        table.close();
    }
}

getHbase Table Output:
After executing the above getHbase program, we can observe in the image below that the value Kiran from column family Emp_name is printed.

Delete Operation:
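The Delete class performs delete operations on a single row. A Delete built from the row key alone removes the entire row; adding a column to it narrows the delete to that column.
In the program below, we delete one column from row-1: the cell whose column family and column qualifier are both Emp_name. The deleteColumn method removes only the latest version of that cell, and the other columns of the row are left untouched.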

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class deleteHbase {
    public static void main(String args[]) throws IOException {
        Configuration config = HBaseConfiguration.create();
        HTable table = new HTable(config, "Acadgild");
        // Target row-1; without the next line the whole row would be deleted
        Delete del = new Delete(Bytes.toBytes("row-1"));
        // Delete only the latest version of column Emp_name:Emp_name
        del.deleteColumn(Bytes.toBytes("Emp_name"), Bytes.toBytes("Emp_name"));
        table.delete(del);
        System.out.println("value-deleted");
        table.close();
    }
}

deleteHbase Table Output:
We can observe in the image below that the Acadgild table holds four column qualifiers (Emp_name, Employee1, sal, and sal_Employee1) across the column families Emp_name and sal.

In our program, we delete the cell whose column family and column qualifier are both Emp_name.

After executing the deleteHbase program, we can observe in the image below that a success message is displayed stating value-deleted.

Now, we can use the scan command in the HBase shell to see the contents of the Acadgild table after the delete performed by the Java program.
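
Since HBase cells are versioned, deleting "the latest version" and deleting "all versions" are different operations. In the older client API used in this post they correspond to Delete.deleteColumn and Delete.deleteColumns. The following stdlib-only sketch models a single cell with timestamped versions to show the difference. `VersionedCell` is a hypothetical class for illustration, and whether an older value actually becomes visible again in a real table depends on how many versions the column family retains.

```java
import java.util.TreeMap;

// Toy model of one cell that keeps several timestamped versions.
// VersionedCell is a hypothetical class, not part of the HBase API.
final class VersionedCell {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    // Each write is stored under its timestamp.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // A read returns the newest version, or null when none remain.
    String get() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    // Analogous to deleteColumn: drop only the newest version.
    void deleteLatest() {
        versions.pollLastEntry();
    }

    // Analogous to deleteColumns: drop every version.
    void deleteAll() {
        versions.clear();
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        cell.put(1L, "Kiran");
        cell.put(2L, "Kiran Kumar");
        cell.deleteLatest();
        System.out.println(cell.get()); // the older version is visible again
        cell.deleteAll();
        System.out.println(cell.get()); // nothing is left to read
    }
}
```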
We hope this post has been helpful in understanding how CRUD operations are performed on HBase using Java API programs. In case of any queries, feel free to comment below and we will get back to you at the earliest.
Keep visiting our site Acadgild for more updates on Big Data and other technologies.