Recovering deleted files
In this recipe, we will look at how we can recover deleted files from the Hadoop cluster. What if the user deletes a critical file with the -skipTrash option? Can it be recovered?
This recipe is more of a best-effort attempt to restore files after deletion. When the delete command is executed, the Namenode updates its metadata in the edits file and then issues the invalidate command to remove the blocks. If the cluster is very busy, the invalidation might take time, and we may be able to recover the files. On an idle cluster, however, the Namenode sends the invalidate command almost immediately, in response to the next Datanode heartbeat, and because the Datanode has no pending operations, it deletes the blocks right away.
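To get a feel for how short the recovery window can be, here is a rough back-of-the-envelope sketch in Python. It is not how the Namenode is implemented. It simply assumes that invalidate commands are handed to a Datanode in batches, one batch per heartbeat, and that the deleted file's blocks sit behind whatever backlog is already queued for that Datanode. The function names are hypothetical, and the defaults (a 3-second heartbeat and 1,000 blocks per batch) are assumed to match the stock dfs.heartbeat.interval and dfs.block.invalidate.limit settings, so check them against your own configuration.

```python
import math


def heartbeats_to_invalidate(pending_blocks: int, limit_per_heartbeat: int = 1000) -> int:
    """Number of heartbeats before every pending block has been sent for invalidation.

    pending_blocks: blocks queued for one Datanode, including the deleted file's blocks.
    limit_per_heartbeat: assumed batch size per heartbeat (dfs.block.invalidate.limit).
    """
    if pending_blocks < 0 or limit_per_heartbeat <= 0:
        raise ValueError("pending_blocks must be >= 0 and limit_per_heartbeat > 0")
    return math.ceil(pending_blocks / limit_per_heartbeat)


def recovery_window_seconds(
    pending_blocks: int,
    limit_per_heartbeat: int = 1000,
    heartbeat_interval: float = 3.0,
) -> float:
    """Rough upper bound, in seconds, before the last queued block is invalidated.

    heartbeat_interval: assumed seconds between heartbeats (dfs.heartbeat.interval).
    """
    return heartbeats_to_invalidate(pending_blocks, limit_per_heartbeat) * heartbeat_interval


if __name__ == "__main__":
    # Idle cluster: only the deleted file's 10 blocks are queued.
    print(recovery_window_seconds(10))       # 3.0
    # Busy cluster: 250,000 blocks are already queued for this Datanode.
    print(recovery_window_seconds(250_000))  # 750.0
```

Under these assumptions, an idle cluster leaves us only a single heartbeat to react, while a heavily loaded one may leave several minutes. Treat the numbers as an illustration of the idea rather than a guarantee.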
Getting ready
Make sure that you have a running cluster with at least HDFS configured and working properly.
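A quick way to confirm that HDFS is reachable before starting is to list the root directory from a client node. The following is a minimal sketch, assuming the hdfs client is on the PATH of the current user. The helper name hdfs_is_reachable is hypothetical and is used only for illustration.

```python
import shutil
import subprocess


def hdfs_is_reachable(hdfs_cmd: str = "hdfs", timeout_seconds: int = 60) -> bool:
    """Return True if `<hdfs_cmd> dfs -ls /` exits successfully, False otherwise."""
    # Bail out early if the client binary is not installed or not on the PATH.
    if shutil.which(hdfs_cmd) is None:
        return False
    try:
        result = subprocess.run(
            [hdfs_cmd, "dfs", "-ls", "/"],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or unlaunchable client is treated as "not reachable".
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if hdfs_is_reachable():
        print("HDFS is reachable")
    else:
        print("HDFS is not reachable; check the cluster before continuing")
```

This only shows that the Namenode answers a listing request. It does not verify that every Datanode is healthy.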
How to do it...
Connect to the master1.cyrus.com master node and switch to user hadoop.
Create any file and copy it to HDFS. Then, delete that file using...