Basic Interaction With Apache Zookeeper's Server

This post gives you a basic understanding of Zookeeper at the node level and covers the basic commands for interacting with a Zookeeper node, such as create, update, and delete.
We recommend going through our previous post to brush up on Zookeeper fundamentals and the step-by-step guide to installing Zookeeper.
Zookeeper Fundamentals and Applications
Whenever multiple tasks and jobs run within a single distributed environment, there is a need for configuration management and for synchronization of naming and coordination. Zookeeper is an open-source Apache Software Foundation project that provides a centralized service for maintaining configuration information, naming, distributed synchronization, and group services.
Zookeeper manages a naming registry and effectively implements a system for managing the various statically and dynamically named objects in a hierarchical manner, much like a file system. In addition, it enables coordination for exercising control over shared resources (such as distributed files or shared data tables) that may be touched by multiple application processes or tasks.
Let’s start with the basics of ZooKeeper.
ZooKeeper is a service to which clients connect. It provides a tree-like structure, or hierarchical namespace. But why do you need this structure? For storing data, of course! This is why it is called a “data tree”. On this tree, you can do the entire set of CRUD operations. On top of this, you can use GET and SET operations to manipulate data. ZooKeeper uses the standard UNIX notation for file system paths. For example, we use /A/B/C to denote the path to zNode C, where C has B as its parent and B has A as its parent. Here is a sample:

That looks very much like a UNIX file system, right? Here’s the explanation for the terminologies used here:
A zNode represents each node in the tree.
Every zNode in the tree is identified by a path.
There are two types of zNode: persistent and ephemeral.
Each zNode stores a value (data) and may have child nodes.
We cannot rename zNodes.
We can add a WATCH to a zNode or remove one from it.
Using ZooKeeper is very easy from a client's perspective. WATCHes are interesting because you can SET a WATCH on a zNode path to be notified when something changes there. It is something like subscribing to changes on a path.
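To make the data tree and WATCH ideas concrete, here is a minimal in-memory sketch in Python. It is only a toy model for illustration and not the ZooKeeper client API: the class name DataTree and its methods are made up for this example. It assumes watches fire once and are then dropped, which is how classic ZooKeeper watches behave.

```python
class DataTree:
    """Toy in-memory model of a znode tree (hypothetical, not the real client API)."""

    def __init__(self):
        self.data = {"/": b""}   # path -> stored bytes
        self.watches = {}        # path -> list of one-shot callbacks

    @staticmethod
    def parent(path):
        # "/A/B/C" -> "/A/B", and "/A" -> "/"
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, value=b""):
        if path in self.data:
            raise KeyError(f"node already exists: {path}")
        if self.parent(path) not in self.data:
            raise KeyError(f"parent does not exist: {self.parent(path)}")
        self.data[path] = value

    def get(self, path, watch=None):
        # Optionally register a callback that fires on the next change.
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data[path]

    def set(self, path, value):
        if path not in self.data:
            raise KeyError(f"no such node: {path}")
        self.data[path] = value
        # Watches are one-shot: remove them before notifying.
        for callback in self.watches.pop(path, []):
            callback(path)

    def delete(self, path):
        if any(p.startswith(path + "/") for p in self.data):
            raise ValueError(f"node has children: {path}")
        del self.data[path]


tree = DataTree()
tree.create("/A", b"a")
tree.create("/A/B", b"b")
tree.create("/A/B/C", b"c")

events = []
tree.get("/A/B/C", watch=events.append)  # subscribe to the next change
tree.set("/A/B/C", b"c2")                # fires the watch
tree.set("/A/B/C", b"c3")                # no watch left, nothing fires
print(events)
```

Note that the second SET does not produce an event, because the watch has to be registered again after it fires.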
Let’s go a little bit deeper.
Let’s consider another example where the goal is to store some configurations in a <K,V> format and make them available across a cluster of machines. The <K,V> store should be persistent (disk-based), highly available (replicated), and fault tolerant. ZooKeeper is a good fit for this use case.

Once zNodes are created with the desired path, you can use GET and SET to use it as a distributed <K,V> store or hashmap.
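As a rough sketch of how such a <K,V> configuration could be laid out on the data tree, the Python snippet below flattens a nested dictionary into zNode paths and byte values. The /config root, the flatten_config helper, and the sample settings are all assumptions made up for this illustration; the snippet only computes the paths and does not talk to a server.

```python
def flatten_config(config, root="/config"):
    """Return {znode_path: bytes} for a nested dict of settings.

    Parents are emitted before their children, which is the order
    in which the zNodes would have to be created.
    """
    nodes = {root: b""}
    for key, value in config.items():
        path = f"{root}/{key}"
        if isinstance(value, dict):
            nodes.update(flatten_config(value, path))
        else:
            nodes[path] = str(value).encode("utf-8")
    return nodes


# Hypothetical settings, for illustration only.
settings = {"db": {"host": "10.0.0.5", "port": 5432}, "feature_x": "on"}
nodes = flatten_config(settings)
for path, value in nodes.items():
    print(path, value)
```

Each leaf path then acts as a key, and a GET or SET on that path reads or updates the value.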
Next, let’s look at how Kafka uses ZooKeeper.
How Kafka Uses ZooKeeper
As of v0.8, Kafka uses ZooKeeper to store a variety of configurations as <K,V> pairs in the ZK data tree and uses them across the cluster in a distributed fashion. Let’s take two simple use cases for which Kafka maintains values in ZooKeeper. They are as follows:

  1. Topics under a broker – /brokers/topics/[topic]
  2. Next offset for a consumer/topic/partition combination – /consumers/[groupId]/offsets/[topic]/[partitionId]
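The two path templates above can be filled in with plain string formatting. The Python helpers below are a small sketch of that; the function names and the sample topic, group, and partition values are hypothetical and are not part of Kafka or ZooKeeper.

```python
def topic_path(topic):
    """zNode path for a topic, following /brokers/topics/[topic]."""
    return f"/brokers/topics/{topic}"


def consumer_offset_path(group_id, topic, partition_id):
    """zNode path for the next offset of a consumer group on one partition."""
    return f"/consumers/{group_id}/offsets/{topic}/{partition_id}"


# Sample values are made up for illustration.
print(topic_path("page-views"))
print(consumer_offset_path("billing", "page-views", 0))
```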

Zookeeper Server Basic Interaction

Starting the CLI:

Once the Zookeeper server is running, we can start the CLI (Command Line Interface) to interact with it. Use the following commands, where localhost:2181 assumes a server on the local machine listening on the default client port; substitute your own host and port if they differ.

cd zookeeper-3.4.6/
bin/zkCli.sh -server localhost:2181
With this command, the console enters the Zookeeper command line mode, where we can use Zookeeper-specific commands to interact with the server. Try the following commands to get a feel for it.
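If the CLI cannot connect, it can help to check that the server is reachable at all. The Python sketch below sends ZooKeeper's four-letter command ruok to the client port; a healthy server replies imok. The host and port defaults are assumptions (a local server on port 2181), and the function name is made up for this example. On the 3.4.x line used in this post the four-letter commands are available out of the box; newer releases may require them to be whitelisted in the server configuration first.

```python
import socket


def is_zookeeper_ok(host="localhost", port=2181, timeout=2.0):
    """Send the four-letter command 'ruok'; a healthy server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            return conn.recv(16) == b"imok"
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


print(is_zookeeper_ok())
```

A result of False only tells us that no imok reply came back; it does not say why.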

Zookeeper Command Line Interface

Creating the First Znode:

Every znode in the ZooKeeper data model maintains a stat structure. A stat simply provides the metadata of a znode. It consists of the version number, access control list (ACL), timestamp, and data length.
There are three types of znode. They are:

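  • Persistent znodes – They remain alive even after the client that created them disconnects.
  • Ephemeral znodes – They exist only as long as the client session that created them is alive.
  • Sequential znodes – They can be either persistent or ephemeral; ZooKeeper appends a monotonically increasing sequence number to the name.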
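The difference between the three types can be sketched with a tiny in-memory model in Python. This is only an illustration: TinyServer and its methods are hypothetical, and the single global counter is a simplification (a real server keeps the sequence per parent znode). The 10-digit zero-padded suffix matches how ZooKeeper formats sequence numbers.

```python
class TinyServer:
    """Toy model of persistent, ephemeral, and sequential znodes."""

    def __init__(self):
        self.nodes = {}    # path -> owning session id, or None if persistent
        self.counter = 0   # simplification: one counter for the whole tree

    def create(self, path, session, ephemeral=False, sequential=False):
        if sequential:
            # Append a zero-padded, monotonically increasing suffix.
            path = f"{path}{self.counter:010d}"
            self.counter += 1
        self.nodes[path] = session if ephemeral else None
        return path

    def close_session(self, session):
        # Ephemeral znodes disappear together with their session.
        self.nodes = {p: s for p, s in self.nodes.items() if s != session}


server = TinyServer()
server.create("/workers", session=1)  # persistent
first = server.create("/workers/w-", session=1, ephemeral=True, sequential=True)
second = server.create("/workers/w-", session=2, ephemeral=True, sequential=True)
print(first, second)

server.close_session(1)
print(sorted(server.nodes))
```

After session 1 closes, its ephemeral znode is gone, while the persistent /workers znode created by the same session survives.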

Let us start by creating a new node. The following Zookeeper command creates a new znode with some dummy data.
create /firstnode HelloThisIsACADGILDserverFirstnode
Here, firstnode is the name of the znode, which will be created under the root path as indicated by / , and HelloThisIsACADGILDserverFirstnode is the dummy text stored in the znode.

Create znode in Zookeeper

Retrieving Data from the First Znode:

Just as we created a new znode, we can get back the details and data of the znode using the CLI (Command Line Interface). The command below gets the data from the znode.
get /firstnode
In the screenshot below, you can see that along with the data we stored in the znode when creating it, the server also returned some metadata related to this particular znode.
Some of the important fields in the metadata are:

  • ctime – Time when this znode was created.
  • mtime – Last modified time.
  • dataVersion – Version of the data, which changes every time the data is modified.
  • dataLength – Length of the data stored in the znode. In this case, the data is HelloThisIsACADGILDserverFirstnode and the length is 34.
  • numChildren – Number of children of this particular znode.

Getting data from znode in Zookeeper

Modifying Data in Znode:

If we want to modify the data in a particular node, Zookeeper provides a command for that too. The command below modifies the data in an existing znode.
set /firstnode HelloPrateek
Here, firstnode is the existing znode and HelloPrateek is the new data to be written to the znode. The old data is overwritten when the new data is set.

Modifying data in an existing znode

In the above screenshot, the dataLength, mtime, and dataVersion are also updated when a new value is set.
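The way these stat fields move when data is set can be sketched with a small Python model. The Stat and Znode classes below are hypothetical stand-ins that only mirror the field names shown by the CLI; they assume that dataVersion starts at 0 when the znode is created and increases by one on every set, and that dataLength is the length of the data in bytes.

```python
import time
from dataclasses import dataclass


@dataclass
class Stat:
    ctime: int        # creation time in milliseconds
    mtime: int        # last modification time in milliseconds
    dataVersion: int  # number of changes to the data
    dataLength: int   # length of the data in bytes
    numChildren: int  # number of child znodes


class Znode:
    """Toy znode that keeps its stat in step with its data."""

    def __init__(self, data):
        now = int(time.time() * 1000)
        self.data = data
        self.stat = Stat(now, now, 0, len(data), 0)

    def set(self, data):
        self.data = data
        self.stat.mtime = int(time.time() * 1000)
        self.stat.dataVersion += 1
        self.stat.dataLength = len(data)


node = Znode(b"HelloThisIsACADGILDserverFirstnode")
created_length = node.stat.dataLength
node.set(b"HelloPrateek")
print(created_length, node.stat.dataLength, node.stat.dataVersion)
```

In this model ctime never changes after creation, while mtime, dataVersion, and dataLength all follow the latest set.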

Creating a Subnode:

Creating a subnode in an existing node is as easy as creating a new node. We just need to pass the full path for the new subnode. The commands below create a subnode and then read it back.
create /firstnode/subnode subnodedata
get /firstnode/subnode

Creating a subnode for an existing node

Getting data from a subnode

Removing a Node:

Removing a node is quite easy and can be done using the rmr command in the Zookeeper CLI. Removing a node this way also removes all its subnodes. (In ZooKeeper 3.5 and later, rmr is deprecated in favor of deleteall.) The command below removes firstnode, which we created for this example:
rmr /firstnode
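What a recursive removal has to do can be sketched in a few lines of Python. The remove_recursive function and the set of paths below are hypothetical and only model the idea: collect the node and everything beneath it, then remove the deepest paths first so that no node is removed while it still has children.

```python
def remove_recursive(nodes, path):
    """Remove path and all its descendants from a set of znode paths.

    Returns the removed paths, deepest first.
    """
    doomed = sorted(
        (p for p in nodes if p == path or p.startswith(path + "/")),
        key=lambda p: p.count("/"),
        reverse=True,
    )
    for p in doomed:
        nodes.discard(p)
    return doomed


nodes = {"/firstnode", "/firstnode/subnode", "/other"}
removed = remove_recursive(nodes, "/firstnode")
print(removed)
print(nodes)
```

Matching on path + "/" rather than on the bare prefix keeps a sibling such as /firstnode2 from being removed by accident.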

Removing a node from Zookeeper

This brings us to the conclusion of this topic – Interaction with Apache Zookeeper. In this post, we saw how to use the Zookeeper CLI to interact with the Zookeeper service. We hope this post has helped you understand the topic and made it easier to practice the commands needed for basic interactions.
If you have any queries, feel free to comment below and we will get back to you as soon as possible.
For more information and latest updates on Big Data and other technologies, visit our website www.acadgild.com.