Hadoop HDFS Commands
List Files
hdfs dfs -ls /
  List all the files/directories for the given HDFS destination path.
hdfs dfs -ls -d /hadoop
  Directories are listed as plain files. In this case, the command lists the details of the hadoop directory itself.
hdfs dfs -ls -h /data
  Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
hdfs dfs -ls -R /hadoop
  Recursively list all files in the hadoop directory and all of its subdirectories.
hdfs dfs -ls /hadoop/dat*
  List all the files matching the pattern. In this case, it lists all the files inside the hadoop directory whose names start with 'dat'.
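A minimal example session, assuming an HDFS directory /hadoop that contains files whose names start with 'dat' (all paths here are illustrative):
# List the root directory with human-readable sizes
hdfs dfs -ls -h /
# Show the entry for /hadoop itself rather than its contents
hdfs dfs -ls -d /hadoop
# Recursively list everything under /hadoop, then only the 'dat*' files
hdfs dfs -ls -R /hadoop
hdfs dfs -ls /hadoop/dat*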
Read/Write Files
hdfs dfs -text /hadoop/derby.log
  Takes a source file and outputs the file in text format on the terminal. The allowed formats are zip and TextRecordInputStream.
hdfs dfs -cat /hadoop/test
  Displays the content of the HDFS file test on your stdout.
hdfs dfs -appendToFile /home/ubuntu/test1 /hadoop/text2
  Appends the content of the local file test1 to the HDFS file text2.
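For example, assuming a local file /home/ubuntu/test1 and an HDFS file /hadoop/text2 already exist (illustrative paths), the read and append commands combine like this:
# Show the current content of the HDFS file
hdfs dfs -cat /hadoop/text2
# Append the local file to it and check the result
hdfs dfs -appendToFile /home/ubuntu/test1 /hadoop/text2
hdfs dfs -cat /hadoop/text2
# -text additionally decodes supported formats before printing
hdfs dfs -text /hadoop/derby.log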
Upload/Download Files
hdfs dfs -put /home/ubuntu/sample /hadoop
  Copies the file from the local file system to HDFS.
hdfs dfs -put -f /home/ubuntu/sample /hadoop
  Copies the file from the local file system to HDFS; if the file already exists at the given destination path, the -f option makes put overwrite it.
hdfs dfs -put -l /home/ubuntu/sample /hadoop
  Copies the file from the local file system to HDFS. Allows the DataNode to lazily persist the file to disk, and forces a replication factor of 1.
hdfs dfs -put -p /home/ubuntu/sample /hadoop
  Copies the file from the local file system to HDFS. Passing -p preserves access and modification times, ownership and the mode.
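A typical upload sequence, assuming a local file /home/ubuntu/sample and an existing HDFS directory /hadoop (illustrative paths):
# A plain put fails if /hadoop/sample already exists...
hdfs dfs -put /home/ubuntu/sample /hadoop
# ...so re-uploading an updated copy needs -f to overwrite
hdfs dfs -put -f /home/ubuntu/sample /hadoop
# Preserve the local timestamps, ownership and mode on upload
hdfs dfs -put -p -f /home/ubuntu/sample /hadoop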
hdfs dfs -get /newfile /home/ubuntu/
  Copies the file from HDFS to the local file system.
hdfs dfs -get -p /newfile /home/ubuntu/
  Copies the file from HDFS to the local file system. Passing -p preserves access and modification times, ownership and the mode.
hdfs dfs -get /hadoop/*.txt /home/ubuntu/
  Copies all the files matching the pattern from HDFS to the local file system.
hdfs dfs -copyFromLocal /home/ubuntu/sample /hadoop
  Works similarly to the put command, except that the source is restricted to a local file reference.
hdfs dfs -copyToLocal /newfile /home/ubuntu/
  Works similarly to the get command, except that the destination is restricted to a local file reference.
hdfs dfs -moveFromLocal /home/ubuntu/sample /hadoop
  Works similarly to the put command, except that the source is deleted after it is copied.
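A round trip between HDFS and the local file system might look like this (paths are illustrative):
# Download a single file, preserving times, ownership and mode
hdfs dfs -get -p /newfile /home/ubuntu/
# Download every .txt file under /hadoop
hdfs dfs -get /hadoop/*.txt /home/ubuntu/
# put/get equivalents whose local side must be a local path
hdfs dfs -copyFromLocal /home/ubuntu/sample /hadoop
hdfs dfs -copyToLocal /newfile /home/ubuntu/
# Upload and delete the local copy in one step
hdfs dfs -moveFromLocal /home/ubuntu/sample /hadoop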
File Management
hdfs dfs -cp /hadoop/file1 /hadoop1
  Copies a file from source to destination on HDFS. In this case, file1 is copied from the hadoop directory to the hadoop1 directory.
hdfs dfs -cp -p /hadoop/file1 /hadoop1
  Copies a file from source to destination on HDFS. Passing -p preserves access and modification times, ownership and the mode.
hdfs dfs -cp -f /hadoop/file1 /hadoop1
  Copies a file from source to destination on HDFS. Passing -f overwrites the destination if it already exists.
hdfs dfs -mv /hadoop/file1 /hadoop1
  Moves files that match the specified file pattern <src> to a destination <dst>. When moving multiple files, the destination must be a directory.
hdfs dfs -rm /hadoop/file1
  Deletes the file (sends it to the trash, if the trash feature is enabled).
hdfs dfs -rm -r /hadoop
hdfs dfs -rm -R /hadoop
hdfs dfs -rmr /hadoop
  Deletes the directory and any content under it recursively; the three forms are equivalent (-rmr is the older, deprecated spelling).
hdfs dfs -rm -skipTrash /hadoop
  The -skipTrash option bypasses the trash, if enabled, and deletes the specified file(s) immediately.
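A short clean-up sequence, assuming a file /hadoop/file1 and a directory /hadoop1 exist (illustrative paths); note that -skipTrash removes data immediately and cannot be undone:
# Copy file1 into /hadoop1, preserving times, ownership and mode
hdfs dfs -cp -p /hadoop/file1 /hadoop1
# Rename the original instead of copying it
hdfs dfs -mv /hadoop/file1 /hadoop1/file1.moved
# Delete a single file (goes to the trash if trash is enabled)
hdfs dfs -rm /hadoop1/file1
# Delete a whole directory tree, bypassing the trash
hdfs dfs -rm -r -skipTrash /hadoop1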
Filesystem
hdfs dfs -df /hadoop
  Shows the capacity, free and used space of the filesystem.
hdfs dfs -du /hadoop/file
  Shows the amount of space, in bytes, used by the files that match the specified file pattern.
hdfs dfs -du -s /hadoop/file
  Rather than showing the size of each individual file that matches the pattern, shows the total (summary) size.
hdfs dfs -du -h /hadoop/file
  Shows the amount of space used by the files that match the specified file pattern, with file sizes formatted in a human-readable fashion.
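For example, assuming files stored under an HDFS directory /hadoop (illustrative path):
# Capacity, used and free space, human-readable
hdfs dfs -df -h /hadoop
# Per-file sizes, then a single summarised total
hdfs dfs -du -h /hadoop
hdfs dfs -du -s -h /hadoop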
Administration
hdfs dfsadmin -refreshNodes
  Re-reads the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned.
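A sketch of a DataNode decommissioning workflow built around -refreshNodes, assuming the NameNode's dfs.hosts.exclude property points at /etc/hadoop/conf/dfs.exclude (the hostname, file name and location are illustrative and depend on your cluster configuration):
# Add the DataNode's hostname to the exclude file configured on the NameNode
# (the exclude-file path is an assumption; check dfs.hosts.exclude in hdfs-site.xml)
echo "datanode3.example.com" >> /etc/hadoop/conf/dfs.exclude
# Tell the NameNode to re-read the hosts/exclude files
hdfs dfsadmin -refreshNodes
# Watch the decommissioning progress in the cluster report
hdfs dfsadmin -report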