As we know, data is stored on different machines, in different databases, and in other sources.
Typically, the user has to write APIs in different languages to collect data from a source and store it at a destination. A user who is not comfortable with coding can use NiFi, a drag-and-drop, GUI-based data flow framework that lets the user connect multiple sources and move data to the destination with little or no programming.
Apache NiFi is a data flow management system with a built-in web UI that provides an easy way to handle data flows in real time. The most important concept to understand for a quick start with NiFi is flow-based programming. In plain terms, you create a series of nodes connected by a series of edges, forming a graph that data moves through. In NiFi these nodes are processors and the edges are connections, and the data is carried in a packet of information known as a flow file. A flow file has content, attributes, and an age.
Download and install NiFi on a Linux system.
We can visit the official Apache NiFi website to download NiFi.
Select the Downloads button, then select nifi-1.3.0-bin.zip.
Select the download link offered for that file and download it.
Once the file is downloaded, unzip the NiFi zip file.
nifi.properties, located in the conf directory, is the file where we can edit the port number of the web URL at which the user will create workflows.
Next, create a copy of the nifi.properties file in the conf directory, in case the original file is edited incorrectly and becomes corrupted.
You can do this with the cp command.
Now we can work freely with the original nifi.properties file. If it does get corrupted, we can use the copy to review and restore the original settings.
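The backup step can be sketched as a small shell function. This is only a minimal sketch: the function name, the .bak suffix, and the NIFI_HOME location are assumptions made for illustration, since the post does not name the copy.

```shell
#!/usr/bin/env bash
# Sketch: back up nifi.properties before editing it.
# The ".bak" name is an assumption; the original post does not name the copy.
backup_nifi_properties() {
    local conf_dir="$1"
    local props="$conf_dir/nifi.properties"
    if [ ! -f "$props" ]; then
        echo "No nifi.properties found in $conf_dir" >&2
        return 1
    fi
    # -p preserves the timestamps and permissions of the original on the copy
    cp -p "$props" "$props.bak"
}

# Example call; NIFI_HOME is assumed to point at the unzipped NiFi folder.
backup_nifi_properties "${NIFI_HOME:-$HOME/nifi-1.3.0}/conf" || echo "Backup skipped"
```

If the original file is damaged later, copying nifi.properties.bak back over nifi.properties restores the saved settings.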
Open the nifi.properties file using gedit and, if required, change the web port number.
Save and close the nifi.properties file.
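For readers who prefer the terminal to gedit, the same port change can be sketched with sed. The web port is held in the nifi.web.http.port property; the function name and the port 9090 used in the example are hypothetical, and the -i option assumes GNU sed as shipped on most Linux systems.

```shell
#!/usr/bin/env bash
# Sketch: change the NiFi web UI port without opening an editor.
# Assumes GNU sed (for -i) and a nifi.properties that already contains
# a line starting with "nifi.web.http.port=".
set_nifi_http_port() {
    local props="$1" port="$2"
    # Rewrite only the port line; every other property is left untouched
    sed -i "s/^nifi\.web\.http\.port=.*/nifi.web.http.port=$port/" "$props"
    # Print the resulting line so the change can be checked by eye
    grep '^nifi\.web\.http\.port=' "$props"
}

# Example (hypothetical port, NIFI_HOME assumed to be the unzipped NiFi folder):
# set_nifi_http_port "$NIFI_HOME/conf/nifi.properties" 9090
```

NiFi reads nifi.properties at startup, so a change made this way takes effect the next time NiFi is started.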
To start NiFi, go to the NiFi bin directory and run the start command from there.
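The start step might look like the sketch below, which uses the nifi.sh launcher that ships in the bin directory. The function name and the default NIFI_HOME path are assumptions; adjust the path to wherever the zip file was extracted.

```shell
#!/usr/bin/env bash
# Sketch: start NiFi in the background and report its status.
# NIFI_HOME is an assumed location for the unzipped nifi-1.3.0 folder.
NIFI_HOME="${NIFI_HOME:-$HOME/nifi-1.3.0}"

start_nifi() {
    local nifi_home="$1"
    local launcher="$nifi_home/bin/nifi.sh"
    if [ ! -x "$launcher" ]; then
        echo "nifi.sh not found or not executable at $launcher" >&2
        return 1
    fi
    "$launcher" start     # launches NiFi as a background service
    "$launcher" status    # reports whether NiFi is currently running
}

start_nifi "$NIFI_HOME" || echo "NiFi was not started; check NIFI_HOME"
```

The same launcher also accepts stop, which shuts the background service down again.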
NOTE: To check the status of the service, look in the logs directory of the NiFi installation. The log files there record the activity of the web server and of the application that runs NiFi in the background.
Now you can see the message stating that NiFi has started successfully, and that the NiFi UI is available at the following URLs:
We can open any of these URLs in a browser on the Linux system to start using NiFi.
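If the startup message has scrolled past, the URLs can usually be recovered from the application log. The sketch below assumes the default log file logs/nifi-app.log under the NiFi folder; the function name and the NIFI_HOME default are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list the UI URLs that NiFi reported in its application log.
# Assumes the default log location logs/nifi-app.log under NIFI_HOME.
show_nifi_ui_urls() {
    local log_file="$1"
    # Extract every http(s) URL ending in /nifi and drop duplicates
    grep -oE 'https?://[^ ]+/nifi' "$log_file" | sort -u
}

NIFI_LOG="${NIFI_HOME:-$HOME/nifi-1.3.0}/logs/nifi-app.log"
if [ -f "$NIFI_LOG" ]; then
    show_nifi_ui_urls "$NIFI_LOG"
else
    echo "No log file at $NIFI_LOG yet; NiFi may still be starting"
fi
```

NiFi can take a minute or so to finish starting, so an empty result right after the start command does not necessarily mean something went wrong.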
From the above image, we can see that NiFi is successfully installed, along with the empty canvas where we can add processors.
NiFi Working Example:
In this example, we will create a workflow in which files with random content are generated at the source and stored in a specific location.
To create a workflow, use the Add Processor component, which is available in the Components Toolbar of NiFi.
Click the Add Processor component button, then drag and drop it onto the canvas.
Right click on the dropped Add Processor component and select the type GenerateFlowFile.
GenerateFlowFile is a processor type which is used to create flow files with random data.
Once the GenerateFlowFile processor is created, add a second processor by selecting the processor component again and dragging it onto the canvas.
Now, right click on the new processor, choose configure, and select the PutFile processor type.
PutFile is a processor type which is used to write the contents of a flow file into the local file system.
Now, right click on the PutFile processor and select the Configure option.
In the Configure Processor window, under Automatically Terminate Relationships, select the failure and success checkboxes.
Now, go to the Properties tab of the Configure Processor window.
– For the Directory property, enter as the value the path where the randomly generated files will be stored.
– For Conflict Resolution Strategy, set the value to replace.
– For Create Missing Directories, set the value to true.
– Now, click on the Apply button to save the changes.
Now connect the GenerateFlowFile processor to the PutFile processor to complete the workflow.
In the below image we can see that the workflow, which generates random files and stores them in the local file system, has been created successfully.
Now right click on the GenerateFlowFile processor and select Start to begin generating random flow files.
Then right click on the PutFile processor and select Start to collect and store the random flow files generated by the GenerateFlowFile processor.
After a few seconds, we can see that random files are generated and stored in the specified path.
Right click on each processor and select Stop to stop the currently running processors.
Now go to the StorageFolder to see the result: the generated files are stored in the specified destination folder /home/acadgid/Desktop/NiFi/StorageFolder
To check the size of the destination directory, run the below command from its parent directory.
du -sh StorageFolder
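To see the number of files generated in the destination directory, run the below command from inside StorageFolder. Plain ls is used here rather than ls -l, because ls -l prints an extra "total" line that would inflate the count by one.
ls | wc -l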
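The count can also be taken from any directory without relying on ls output at all. The following is a minimal sketch using find; the function name is hypothetical, and the relative StorageFolder path is an assumption about where the command is run.

```shell
#!/usr/bin/env bash
# Sketch: count the regular files directly inside the destination folder.
count_generated_files() {
    local dir="$1"
    # -maxdepth 1 stays in the top-level folder; -type f skips subdirectories
    find "$dir" -maxdepth 1 -type f | wc -l
}

# Example: assumes StorageFolder is in the current working directory.
if [ -d StorageFolder ]; then
    count_generated_files StorageFolder
fi
```

Because only regular files are counted, the result is not affected by header lines or by any subfolders that may exist under the destination.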
From the above example, we can see we have successfully created a workflow where random files are generated and stored in the specified location.
We hope this post has been helpful in understanding how NiFi works. You can expect more blogs on NiFi in the future; until then, keep visiting our website Acadgild for more updates on Big Data and other technologies.