Steps To Install Hadoop 2.x Release (Yarn or Next-Gen) On Single Node Cluster Setup
The Hadoop 2.x release involves many changes to Hadoop and MapReduce. The centralized JobTracker
service is replaced by a ResourceManager, which manages the resources in the cluster, and a
per-application ApplicationMaster, which manages the application lifecycle. These architectural
changes enable Hadoop to scale to much larger clusters.
Prerequisites:
Java 6 installed
Dedicated user for hadoop
SSH configured
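One way to check and set up these prerequisites on a Debian/Ubuntu-style system is sketched below; the "hadoop" user name matches the assumption in the next step, and passwordless SSH to localhost is one common way to satisfy the SSH requirement:
$ java -version                                      # should report Java 1.6 or later
$ sudo adduser hadoop                                # dedicated user for Hadoop
$ su - hadoop
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa           # passwordless key
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
$ ssh localhost                                      # should log in without a password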
1. Download tarball
Download the tarball for Hadoop 2.x and extract it to a folder, say /home/hadoop/Work. We assume
the dedicated user for Hadoop is "hadoop".
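For example, assuming the hadoop-2.0.1-alpha tarball has already been downloaded from an Apache mirror into the hadoop user's home directory (adjust the file name to the release you actually downloaded):
$ mkdir -p $HOME/Work
$ tar -xzf hadoop-2.0.1-alpha.tar.gz -C $HOME/Work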
2. Set environment variables
$ export HADOOP_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_MAPRED_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_COMMON_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_HDFS_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_YARN_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_CONF_DIR=$HOME/Work/hadoop-2.0.1-alpha/etc/hadoop
This is very important: if you miss any one of these variables or set a value incorrectly, the
error will be very difficult to detect and jobs will fail.
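To avoid retyping these exports in every new shell, one option is to append the same lines to the hadoop user's ~/.bashrc and reload it, for example:
$ cat >> ~/.bashrc <<'EOF'
export HADOOP_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_COMMON_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_HDFS_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_YARN_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_CONF_DIR=$HOME/Work/hadoop-2.0.1-alpha/etc/hadoop
EOF
$ source ~/.bashrc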
3. Create directories
$ mkdir -p $HOME/Work/yarn_data/hdfs/namenode
$ mkdir -p $HOME/Work/yarn_data/hdfs/datanode
4. Edit configuration files
$ cd $HADOOP_HOME
Add the following properties under the <configuration> tag in the files mentioned below:
etc/hadoop/yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
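The aux-service name above matches the 2.0.x release used throughout this guide. If you are installing Hadoop 2.2.0 or newer, the shuffle service was renamed with an underscore, so the two properties become roughly:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>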
etc/hadoop/core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
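fs.default.name still works in Hadoop 2.x but is deprecated; the preferred key is fs.defaultFS, so you can equivalently use:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>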
etc/hadoop/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/Work/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/Work/yarn_data/hdfs/datanode</value>
</property>
etc/hadoop/mapred-site.xml:
If this file does not exist, create it and paste the content provided below:
<?xml version="1.0"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
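Hadoop 2.x binary distributions usually ship a template for this file, so instead of creating it by hand you can copy the template (assuming it is present in your tarball) and add the mapreduce.framework.name property to it:
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml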
5. Format namenode
This step is needed only the first time. Formatting again later will destroy the existing
contents of HDFS.
$ bin/hdfs namenode -format
6. Start HDFS daemons
Name node:
$ sbin/hadoop-daemon.sh start namenode
Data node:
$ sbin/hadoop-daemon.sh start datanode
7. Start YARN daemons
Resource Manager:
$ sbin/yarn-daemon.sh start resourcemanager
Node Manager:
$ sbin/yarn-daemon.sh start nodemanager
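To check that all four daemons came up, you can use jps from the JDK; its output should include NameNode, DataNode, ResourceManager and NodeManager entries (the PIDs will differ on your machine):
$ jps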
8. Run a sample job
Create a small local input file to test with (end the cat input with Ctrl-D):
$ mkdir in
$ cat > in/file
This is one line
This is another one
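A minimal sketch of pushing this input into HDFS and running the bundled wordcount example follows; the examples jar name assumes the 2.0.1-alpha release, so adjust the path to match your distribution:
$ bin/hdfs dfs -mkdir /in
$ bin/hdfs dfs -copyFromLocal in/file /in
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.1-alpha.jar wordcount /in /out
$ bin/hdfs dfs -cat /out/*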
9. Web interface
You can check the status of the applications running using the following URL:
http://localhost:8088
Note: