
Install Nifi On Linux

In this blog, we will discuss how to install NiFi on the Linux operating system.

As we know, data is stored across different machines, databases, and other sources.

Typically, a user has to write APIs in different languages in order to collect data from a source or store it at a destination. A user who is not comfortable with coding can instead use NiFi, a drag-and-drop, GUI-based data flow framework that connects multiple sources and routes data to a destination with little or no programming.

Apache NiFi is a data flow management system that comes with a web UI, providing an easy way to handle data flows in real time. The most important concept to understand for a quick start with NiFi is flow-based programming. In plain terms, you create a series of nodes connected by edges to form a graph that data moves through. In NiFi, these nodes are processors, the edges are connectors, and the data travels in a packet of information known as a flow file. A flow file carries content, attributes, and age information.
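As a rough mental model, a NiFi flow resembles a Unix shell pipeline. This is only an analogy, not NiFi itself: each command plays the role of a processor, each pipe the role of a connector, and the text streaming between them the role of flow file content.

```shell
# Analogy only: "generate" processor -> connector -> "transform" processor.
# seq emits raw records; awk reshapes them, like a processor updating content.
seq 1 3 | awk '{print "record-" $1}'
```

Running this prints record-1 through record-3; in NiFi, the same generate-then-transform pattern is built visually on the canvas instead.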

Download and install NiFi on a Linux system.

Download:

We can visit the official website of Apache NiFi and download NiFi.

http://nifi.apache.org/

Now, select the Downloads button, then select nifi-1.3.0-bin.zip.

Select the link below and download the file (note that the file to download is the NiFi binary, nifi-1.3.0-bin.zip, not the separate NiFi Toolkit package):

http://apache.mirror.serversaustralia.com.au/nifi/1.3.0/nifi-1.3.0-bin.zip

Once the file is downloaded, unzip it:

unzip nifi-1.3.0-bin.zip

nifi.properties is the file where we can edit the web UI port number on which the user will create workflows.

Next, in the conf directory, create a copy of the nifi.properties file, so that we have a backup in case the original file is edited incorrectly and corrupted.

You can do this with the commands below:

cd nifi-1.3.0

cd conf

Copy the nifi.properties file:

cp nifi.properties nifi.properties.old
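To verify that the backup matches the original, you can compare the two files with diff. The sketch below demonstrates the idea on a scratch file (demo.properties is a made-up stand-in for conf/nifi.properties):

```shell
# Scratch demo of the backup step on a stand-in file.
printf 'nifi.web.http.port=8080\n' > demo.properties
cp demo.properties demo.properties.old
# diff prints nothing and exits with status 0 when the copy matches.
diff demo.properties demo.properties.old && echo "backup matches original"
```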

 

Now we can work on the original nifi.properties file; if it gets corrupted, we can use the copied file to view and restore the original settings.

Open the nifi.properties file using gedit (or any text editor of your choice):

gedit nifi.properties

 

We can see the web port number in the web properties block. This is the port on which the user will access and work with NiFi.

# web properties #

nifi.web.http.port=8080

 

We can change the above port number if we do not want to use the default.

Example:

nifi.web.http.port=9999

 

Save and close the nifi.properties file
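If you would rather not open an editor, the same change can be made with a single sed command. The sketch below runs against a scratch copy so it is safe to try anywhere (demo.properties is a hypothetical stand-in; in the real setup you would target conf/nifi.properties):

```shell
# Create a scratch file mimicking the web properties block.
printf '# web properties #\nnifi.web.http.port=8080\n' > demo.properties
# Rewrite the port line in place (GNU sed, as found on Linux).
sed -i 's/^nifi.web.http.port=.*/nifi.web.http.port=9999/' demo.properties
grep '^nifi.web.http.port' demo.properties
```

The grep at the end prints the updated line, confirming the edit took effect.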

Start NiFi

To start NiFi, the user should go to the NiFi bin directory and run the command below:

./nifi.sh start

 

NOTE: To check the status of the service, you can run ./nifi.sh status from the bin directory, or follow the log directory path below for detailed logs.

cd logs

tail -f nifi-app.log

 

This log is written by the web server and application that run NiFi in the background.

Now you can see a message stating that NiFi has started successfully, and the NiFi UI is available at the following URL:

localhost:8080/nifi

We can open the above URL in a browser on the Linux system to start using NiFi.

From the above image, we can see that NiFi is successfully installed, showing the empty canvas where we can add processors.

NiFi Working Example:

In the next example, we will create a workflow in which random files are generated at the source and stored in a specific location.

To create a workflow, the user should use the Add Processor component, which is available in the Components toolbar of NiFi.

Click the Add Processor component button, then drag and drop it onto the canvas.

Right-click the newly added processor component and select the type GenerateFlowFile.

GenerateFlowFile is a processor type which is used to create flow files with random data.

Once the GenerateFlowFile processor is created, create another processor by selecting the processor component again and dragging it onto the canvas.

Now, right-click the new processor, open its configuration, and select the PutFile processor type.

PutFile is a processor type that is used to write the contents of a flow file to the local file system.

Now, right-click the PutFile processor and select the Configure option.

In the Configure Processor window, select the failure and success checkboxes under Automatically Terminate Relationships.

Now, go to the Properties tab of the Configure Processor window:

– In the Directory property, enter the path where the randomly generated files will be stored:

/home/acadgid/Desktop/NiFi/StorageFolder

– Set the Conflict Resolution Strategy property to replace.

– Set the Create Missing Directories property to true.

– Now, click on the Apply button to save the changes made.

Now link the GenerateFlowFile and PutFile processors by dragging a connection between them to complete the workflow.

In the below image, we can see that the workflow to generate random files and store them in the local file system has been created successfully.

Now right-click the GenerateFlowFile processor and select the Start option to begin generating random flow files.

Then right-click the PutFile processor and select the Start option to collect and store the random flow files generated by the GenerateFlowFile processor.

After a few seconds, we can see random files are generated and stored in the specified path.

Right-click both processors and select the Stop option to stop the currently running processors.

Now go to StorageFolder to see the result, where you can see the files that were generated and stored in the specified destination folder /home/acadgid/Desktop/NiFi/StorageFolder:

cd StorageFolder

ls

To check the size of the destination folder, run the command below from its parent directory.

du -sh StorageFolder

To see the number of files generated in the destination directory, use the command below.

ls | wc -l
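One caveat: ls -l | wc -l reports one line too many, because ls -l prints a leading total line; a plain ls | wc -l gives the exact count. A scratch demonstration (demo_storage is a made-up folder, not the tutorial's StorageFolder):

```shell
# Scratch demo: three files in a fresh folder, counted exactly.
mkdir -p demo_storage
touch demo_storage/f1 demo_storage/f2 demo_storage/f3
ls demo_storage | wc -l
```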

From the above example, we can see that we have successfully created a workflow in which random files are generated and stored in the specified location.

We hope this post has been helpful in understanding the working of NiFi. You can expect more blogs on NiFi in the future; until then, keep visiting our website AcadGild for more updates on Big Data and other technologies.

Manjunath is working with AcadGild as a Big Data Engineer and is a Big Data enthusiast with 2+ years of experience in Hadoop development. He is passionate about coding in Hive, Spark, and Scala. Feel free to contact him at [email protected] for any further queries.
