Hadoop Installation
First, verify that Java is installed on your system:
java -version
Then, switch to the hadoop user:
sudo su - hadoop
Next, install the OpenSSH server and client:
sudo apt install openssh-server openssh-client -y
Then, generate public and private SSH key pairs:
ssh-keygen -t rsa
Here, it will ask you:
Where to save the key (press Enter to save it inside your home directory)
Whether to create a passphrase for the key (leave blank for no passphrase)
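Hadoop's start scripts also need passwordless SSH to localhost. The steps above can be sketched non-interactively as follows, assuming a fresh hadoop user (the -N "" flag supplies an empty passphrase so ssh-keygen does not prompt):

```shell
# Create ~/.ssh if needed, generate a key only when one does not already
# exist, then authorize it for passwordless logins to this machine.
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$HOME/.ssh/id_rsa"
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 640 "$HOME/.ssh/authorized_keys"
```

After this, `ssh localhost` should log in without prompting for a password.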
Next, download the stable release of Hadoop (3.3.6 at the time of writing):
wget https://downloads.apache.org/hadoop/common/stable/hadoop-3.3.6.tar.gz
Once you are done with the download, extract the archive using the following command:
tar -xvzf hadoop-3.3.6.tar.gz
Then move the extracted directory to /usr/local/hadoop, the HADOOP_HOME path used below:
sudo mv hadoop-3.3.6 /usr/local/hadoop
Next, open the ~/.bashrc file and add the following Hadoop environment variables at the end:
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Save the file, then load the new variables into the current session:
source ~/.bashrc
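As a quick sanity check after sourcing, you can confirm the new directories actually landed on your PATH. A minimal sketch (the variables are re-exported here only so the snippet runs standalone; /usr/local/hadoop is the install path assumed throughout this guide):

```shell
# Re-create the PATH additions from ~/.bashrc, then check for the bin dir.
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "Hadoop bin is on PATH" ;;
  *) echo "Hadoop bin is missing from PATH" ;;
esac
```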
Next, define the Java environment variables in the hadoop-env.sh file (located at $HADOOP_HOME/etc/hadoop/hadoop-env.sh) by adding the following lines:
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_CLASSPATH+=" $HADOOP_HOME/lib/*.jar"
Hadoop 3 on Java 11 also needs the Javax Activation API JAR, which the JDK no longer bundles; switch to Hadoop's lib directory, where the HADOOP_CLASSPATH entry above expects it:
cd /usr/local/hadoop/lib
Now, verify the Hadoop installation by checking its version:
hadoop version
Next, edit the core-site.xml file ($HADOOP_HOME/etc/hadoop/core-site.xml) and add the following property to specify the NameNode URL (fs.default.name is the older alias of fs.defaultFS):
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
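For orientation, the property shown above must sit inside the file's <configuration> root element; a minimal core-site.xml sketch would look like the following (the same wrapper applies to hdfs-site.xml, mapred-site.xml, and yarn-site.xml below):

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9000</value>
  </property>
</configuration>
```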
Next, create directories to store node metadata (these paths match the values used in hdfs-site.xml below):
sudo mkdir -p /home/hadoop/hdfs/namenode /home/hadoop/hdfs/datanode
And change the ownership of the created directories to the hadoop user:
sudo chown -R hadoop:hadoop /home/hadoop/hdfs
Then, edit the hdfs-site.xml file to define the data replication factor and the NameNode and DataNode storage locations:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hdfs/datanode</value>
</property>
Next, define the MapReduce framework to use. First, open the mapred-site.xml configuration file using the following command:
sudo nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
And add the following property to set YARN as the MapReduce framework:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Save and exit from the nano text editor.
Next, edit the yarn-site.xml file ($HADOOP_HOME/etc/hadoop/yarn-site.xml) and add the following property to enable the MapReduce shuffle service:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Save changes and exit from the config file.
Finally, use the following command to validate the Hadoop configuration and format the HDFS NameNode:
hdfs namenode -format
To start the Hadoop cluster, you will have to start the previously configured nodes.
First, start the NameNode and DataNode services:
start-dfs.sh
Then, start the YARN resource manager and node manager:
start-yarn.sh
To verify whether the services are running as intended, use the following command:
jps
If everything started correctly, the output should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager processes alongside Jps itself.
You can now access the Hadoop NameNode web interface from your browser at:
http://server-IP:9870
OR
http://localhost:9870