
Java APIs for Copying a File from the Local File System to HDFS

In this blog, we will copy the contents of a file from the local file system to HDFS using the Hadoop Java API.
We start with the code snippet given below. Write it in Eclipse, build a jar file from it, and then run that jar in a Linux terminal on a machine with Hadoop installed to copy the file from the local file system to HDFS.
You can refer to the link below to understand how to write a MapReduce program in Eclipse and run it by building its jar file.
Link: https://drive.google.com/file/d/0Bxr27gVaXO5scXZjWklKdFYweTg/view?usp=sharing
Below is the code snippet for copying a file from the local file system to HDFS.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HdfsWriter extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        // String localInputPath = args[0];
        Path outputPath = new Path(args[0]); // args[0]: output location in HDFS
        Configuration conf = getConf();
        FileSystem fs = FileSystem.get(conf);
        // Create the file in HDFS, overwriting it if it already exists.
        OutputStream os = fs.create(outputPath);
        // Read the local data set through a buffered input stream.
        InputStream is = new BufferedInputStream(new FileInputStream("/home/acadgild/acadgild.txt"));
        // Copy the data set from the input stream to the output stream.
        // This overload of copyBytes closes both streams when it finishes.
        IOUtils.copyBytes(is, os, conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        int returnCode = ToolRunner.run(new HdfsWriter(), args);
        System.exit(returnCode);
    }
}

Explanation of the above code:
After obtaining a FileSystem instance, the HdfsWriter class calls the create() method to create the file in HDFS (or overwrite it if it already exists). The create() method returns an output stream (an FSDataOutputStream, which is a java.io.OutputStream), so it can be written to using normal Java I/O methods.
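To make the "normal Java I/O" point concrete, here is a minimal sketch of the kind of read/write loop that a stream copy such as IOUtils.copyBytes performs. It is not Hadoop's actual implementation: the class StreamCopySketch, the helper copy(), and the 4096-byte buffer are assumptions for illustration, and in-memory streams stand in for the local file and the HDFS output stream so that the example runs without a cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamCopySketch {

    // Hypothetical helper: copies every byte from 'in' to 'out' through a
    // fixed-size buffer and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 once the end of the input stream is reached.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        // try-with-resources closes both streams, even if the copy fails.
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out, 4096); // buffer size is an assumed value
            System.out.println(copied + " bytes copied: " + out.toString("UTF-8"));
        }
    }
}
```

In HdfsWriter, the same loop would read from the FileInputStream and write to the stream returned by fs.create().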
Steps to execute the HdfsWriter code:
Step 1: The file to be placed in HDFS must be present in the local file system. In this case, the file is /home/acadgild/acadgild.txt.

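Step 2: Build a jar file from the HdfsWriter code and run it in the Hadoop environment to which the file has to be copied. The argument is the output path in HDFS. The command below assumes that the jar's manifest names HdfsWriter as its main class; otherwise, add the class name after the jar path.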
hadoop jar /home/acadgild/hdfswrite.jar /my_dir_hdfs

Step 3: Run the command hadoop dfs -ls / to check whether the file is present in HDFS (on Hadoop 2 and later, hdfs dfs -ls / is the preferred form). The listing shows /my_dir_hdfs, which holds the data copied from /home/acadgild/acadgild.txt.
Refer to the screenshot below for the same.


We can see that the content of the file acadgild.txt was copied into the HDFS file my_dir_hdfs.
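The same check can also be done from Java instead of the shell. The following is a minimal sketch, assuming the Hadoop client libraries are on the classpath and that it is packaged and launched with hadoop jar in the same way as HdfsWriter. The class name HdfsFileCheck is hypothetical; exists() and getFileStatus() are methods of the Hadoop FileSystem class.

```java
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical helper class: reports whether a path exists in HDFS and its size.
public class HdfsFileCheck extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Path target = new Path(args[0]); // args[0]: HDFS path to check, e.g. /my_dir_hdfs
        FileSystem fs = FileSystem.get(getConf());
        if (!fs.exists(target)) {
            System.err.println(target + " was not found in HDFS");
            return 1;
        }
        // FileStatus carries metadata such as the full path and the length in bytes.
        FileStatus status = fs.getFileStatus(target);
        System.out.println(status.getPath() + " : " + status.getLen() + " bytes");
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new HdfsFileCheck(), args));
    }
}
```

If the reported length matches the size of /home/acadgild/acadgild.txt on the local file system, the copy wrote the whole file.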
We hope this blog helped you understand the Java API used for copying files from the local file system to HDFS.
Keep visiting our website acadgild.com for more blogs on Big Data and other technologies.


