Steps To Install Hadoop 2.x Release (Yarn or Next-Gen) On Single Node Cluster Setup
The Hadoop 2.x release involves many changes to Hadoop and MapReduce. The centralized JobTracker
service is replaced by a ResourceManager, which manages the resources in the cluster, and a
per-application ApplicationMaster, which manages the application lifecycle. These architectural
changes enable Hadoop to scale to much larger clusters.
Prerequisites:
Java 6 installed
Dedicated user for hadoop
SSH configured
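One way to check and set up these prerequisites on a Debian/Ubuntu-style system is sketched below; the "hadoop" user name matches the assumption in the next step, and passwordless SSH to localhost is one common way to satisfy the SSH requirement:
$ java -version                                      # should report Java 1.6 or later
$ sudo adduser hadoop                                # dedicated user for Hadoop
$ su - hadoop
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa           # passwordless key
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
$ ssh localhost                                      # should log in without a password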
1. Download tarball
Download the tarball for Hadoop 2.x and extract it to a folder, say /home/hadoop/Work. We assume
the dedicated user for Hadoop is "hadoop".
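For example, assuming the hadoop-2.0.1-alpha tarball has already been downloaded from an Apache mirror into the hadoop user's home directory (adjust the file name to the release you actually downloaded):
$ mkdir -p $HOME/Work
$ tar -xzf hadoop-2.0.1-alpha.tar.gz -C $HOME/Work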
2. Set environment variables
$ export HADOOP_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_MAPRED_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_COMMON_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_HDFS_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_YARN_HOME=$HOME/Work/hadoop-2.0.1-alpha
$ export HADOOP_CONF_DIR=$HOME/Work/hadoop-2.0.1-alpha/etc/hadoop
This is very important: if you miss any one of these variables or set a value incorrectly, the
error will be very difficult to detect and jobs will fail.
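To avoid retyping these exports in every new shell, one option is to append the same lines to the hadoop user's ~/.bashrc and reload it, for example:
$ cat >> ~/.bashrc <<'EOF'
export HADOOP_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_COMMON_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_HDFS_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_YARN_HOME=$HOME/Work/hadoop-2.0.1-alpha
export HADOOP_CONF_DIR=$HOME/Work/hadoop-2.0.1-alpha/etc/hadoop
EOF
$ source ~/.bashrc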
3. Create directories
$ mkdir -p $HOME/Work/yarn_data/hdfs/namenode
$ mkdir -p $HOME/Work/yarn_data/hdfs/datanode
4. Edit configuration files
$ cd $HADOOP_HOME
Add the following properties under the <configuration> tag in the files mentioned below:
etc/hadoop/yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
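The aux-service name above matches the 2.0.x release used throughout this guide. If you are installing Hadoop 2.2.0 or newer, the shuffle service was renamed with an underscore, so the two properties become roughly:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>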
etc/hadoop/core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
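fs.default.name still works in Hadoop 2.x but is deprecated; the preferred key is fs.defaultFS, so you can equivalently use:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>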
etc/hadoop/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/Work/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/Work/yarn_data/hdfs/datanode</value>
</property>
etc/hadoop/mapred-site.xml:
If this file does not exist, create it and paste the content provided below:
<?xml version="1.0"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
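Hadoop 2.x binary distributions usually ship a template for this file, so instead of creating it by hand you can copy the template (assuming it is present in your tarball) and add the mapreduce.framework.name property to it:
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml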
5. Format namenode
This step is needed only the first time. Formatting again later will destroy the existing
contents of HDFS.
$ bin/hdfs namenode -format
6. Start HDFS daemons
Name node:
$ sbin/hadoop-daemon.sh start namenode
Data node:
$ sbin/hadoop-daemon.sh start datanode
7. Start YARN daemons
Resource Manager:
$ sbin/yarn-daemon.sh start resourcemanager
Node Manager:
$ sbin/yarn-daemon.sh start nodemanager
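To check that all four daemons came up, you can use jps from the JDK; its output should include NameNode, DataNode, ResourceManager and NodeManager entries (the PIDs will differ on your machine):
$ jps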
8. Run a sample job
Create a small local input file to test with (end the cat input with Ctrl-D):
$ mkdir in
$ cat > in/file
This is one line
This is another one
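A minimal sketch of pushing this input into HDFS and running the bundled wordcount example follows; the examples jar name assumes the 2.0.1-alpha release, so adjust the path to match your distribution:
$ bin/hdfs dfs -mkdir /in
$ bin/hdfs dfs -copyFromLocal in/file /in
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.1-alpha.jar wordcount /in /out
$ bin/hdfs dfs -cat /out/*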
9. Web interface
You can check the status of the applications running using the following URL:
http://localhost:8088
Note: