Configuring Trash in HDFS

Apache Hadoop provides a trash feature that helps Hadoop administrators recover from accidental deletion of files and directories. When trash is enabled, a deleted file or directory is moved to the .Trash directory in the user's home directory instead of being removed immediately. Files in .Trash are permanently removed after a configurable time interval; until that interval expires, a deleted item can be restored to its original location.

Each user's trash directory is a sub-directory of their HDFS home directory named .Trash. Deleted files are first moved to the Current sub-directory of .Trash, and their original path is preserved beneath it. For example, when the user hadoop deletes /acadgild_india, it ends up at /user/hadoop/.Trash/Current/acadgild_india.
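The path mapping described above can be sketched as a small shell function. This is only an illustration: trash_path is a hypothetical helper, not a Hadoop command, and it assumes the user's home directory is /user/<name>, as it is in this article.

```shell
#!/bin/sh
# Sketch: compute where a deleted HDFS path lands in a user's trash.
# Assumption: the home directory is /user/<name>, as in this article.
trash_path() {
    user="$1"   # HDFS user name
    path="$2"   # absolute path that was deleted
    printf '/user/%s/.Trash/Current%s\n' "$user" "$path"
}

trash_path hadoop /acadgild_india
# prints: /user/hadoop/.Trash/Current/acadgild_india
```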

Following are the steps to configure Trash in Hadoop:

1. First, stop the cluster, then add a new property to core-site.xml.

Trash property

<property>
  <name>fs.trash.interval</name>
  <!-- Time interval, in minutes, after which trashed files are permanently deleted -->
  <value>30</value>
</property>

Add this property to core-site.xml, inside the <configuration> element.

With this property set, HDFS creates a .Trash directory in the user's HDFS home directory (here, /user/hadoop) when that user first deletes something.
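Before restarting the cluster, it can help to confirm that the value was saved as intended. The following is a minimal sketch that reads fs.trash.interval from a core-site.xml file using only grep and sed. The helper name get_trash_interval and the throwaway sample file are assumptions made for illustration, and the parsing assumes the <name> and <value> elements sit on separate lines close together, as in the property above.

```shell
#!/bin/sh
# Sketch: read fs.trash.interval from a core-site.xml file without starting Hadoop.
# Assumption: <value> appears within three lines after the matching <name> line.
get_trash_interval() {
    grep -A 3 '<name>fs.trash.interval</name>' "$1" |
        sed -n 's:.*<value>\([0-9][0-9]*\)</value>.*:\1:p' |
        head -n 1
}

# Demo against a throwaway file (hypothetical, not a real cluster config).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>fs.trash.interval</name>
    <value>30</value>
  </property>
</configuration>
EOF
echo "fs.trash.interval = $(get_trash_interval "$sample") minutes"
rm -f "$sample"
```

Once the cluster is running, hdfs getconf -confKey fs.trash.interval should report the same value.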

2. Start the Hadoop cluster:

start-dfs.sh

or

start-all.sh

Next, verify that the trash configuration works by creating a new directory and then deleting it.

hadoop fs -mkdir /acadgild_india

hadoop fs -ls /

The listing should now show the newly created directory.

hadoop fs -rm -r /acadgild_india

Here, you can see that the acadgild_india directory has not been deleted permanently; it has only been moved to the trash.

Now, where to find it?

List the .Trash directory inside /user/hadoop:

hadoop fs -ls /user/hadoop/.Trash

Next, list the Current directory:

hadoop fs -ls /user/hadoop/.Trash/Current

Here, you can see the acadgild_india directory that we deleted earlier. We still have up to 30 minutes to restore it, because we set fs.trash.interval to 30 minutes.
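Thirty minutes is a short recovery window, and fs.trash.interval is expressed in minutes, so a longer retention period has to be converted first. A small sketch of that arithmetic, using a hypothetical helper named days_to_minutes:

```shell
#!/bin/sh
# Sketch: convert a retention period in whole days to the minutes value
# that fs.trash.interval expects. days_to_minutes is an illustrative name.
days_to_minutes() {
    echo $(( $1 * 24 * 60 ))
}

days_to_minutes 1   # prints 1440
days_to_minutes 7   # prints 10080
```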

Command to move the directory from the trash back to its original location under /:

hadoop fs -mv /user/hadoop/.Trash/Current/acadgild_india /
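Because the original path is preserved under Current, the restore destination can be derived from the trash path itself. The sketch below only prints the hadoop fs -mv command rather than running it, so it can be reviewed first. restore_command is a hypothetical helper, and it assumes the entry lives under a .Trash/Current directory.

```shell
#!/bin/sh
# Sketch: derive the original location of a trashed path and print the
# command that would restore it. Nothing is executed against HDFS.
restore_command() {
    trash_entry="$1"
    # Strip everything up to and including ".Trash/Current" to recover the original path.
    original="${trash_entry#*/.Trash/Current}"
    printf 'hadoop fs -mv %s %s\n' "$trash_entry" "$original"
}

restore_command /user/hadoop/.Trash/Current/acadgild_india
# prints: hadoop fs -mv /user/hadoop/.Trash/Current/acadgild_india /acadgild_india
```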

Next, run hadoop fs -ls / to see the restored directory:

hadoop fs -ls /

Trash is now configured and working properly.
