Setup Hadoop 3.1.0 Single Node Cluster on Ubuntu 16.04

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

In this tutorial, we will go through the steps to set up a pseudo-distributed, single-node Hadoop 3.x cluster on Ubuntu.

Prerequisite

Install Java

Hadoop requires Java. The minimal version supported is Java 8. Let’s install it.

Add repository, update source-list and install Java.

krishna@hadoop-master:~$ sudo add-apt-repository ppa:webupd8team/java
krishna@hadoop-master:~$ sudo apt-get update
krishna@hadoop-master:~$ sudo apt-get install oracle-java8-installer

krishna@hadoop-master:~$ java -version
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
krishna@hadoop-master:~$ 
krishna@hadoop-master:~$ javac -version
javac 1.8.0_171
krishna@hadoop-master:~$

Full output:

krishna@hadoop-master:~$ sudo add-apt-repository ppa:webupd8team/java
[sudo] password for krishna: 
 Oracle Java (JDK) Installer (automatically downloads and installs Oracle JDK8). There are no actual Java files in this PPA.

Important -> Why Oracle Java 7 And 6 Installers No Longer Work: http://www.webupd8.org/2017/06/why-oracle-java-7-and-6-installers-no.html

Update: Oracle Java 9 has reached end of life: http://www.oracle.com/technetwork/java/javase/downloads/jdk9-downloads-3848520.html

The PPA supports Ubuntu 18.04, 17.10, 16.04, 14.04 and 12.04.

More info (and Ubuntu installation instructions):
- for Oracle Java 8: http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html

Debian installation instructions:
- Oracle Java 8: http://www.webupd8.org/2014/03/how-to-install-oracle-java-8-in-debian.html

For Oracle Java 10, see a different PPA: https://www.linuxuprising.com/2018/04/install-oracle-java-10-in-ubuntu-or.html
 More info: https://launchpad.net/~webupd8team/+archive/ubuntu/java
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmpxm1ynh73/secring.gpg' created
gpg: keyring `/tmp/tmpxm1ynh73/pubring.gpg' created
gpg: requesting key EEA14886 from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpxm1ynh73/trustdb.gpg: trustdb created
gpg: key EEA14886: public key "Launchpad VLC" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
OK
krishna@hadoop-master:~$ 
krishna@hadoop-master:~$ sudo apt-get update
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [107 kB]
Hit:2 http://in.archive.ubuntu.com/ubuntu xenial InRelease                     
Get:3 http://ppa.launchpad.net/webupd8team/java/ubuntu xenial InRelease [17.5 kB]
Get:4 http://in.archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]                                          
Get:5 http://ppa.launchpad.net/webupd8team/java/ubuntu xenial/main amd64 Packages [1,556 B]                                    
Get:6 http://in.archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]                              
Get:7 http://ppa.launchpad.net/webupd8team/java/ubuntu xenial/main i386 Packages [1,556 B]
Get:8 http://ppa.launchpad.net/webupd8team/java/ubuntu xenial/main Translation-en [928 B]
Fetched 344 kB in 1s (271 kB/s)
Reading package lists... Done
krishna@hadoop-master:~$ 
krishna@hadoop-master:~$ sudo apt-get install oracle-java8-installer
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  gsfonts gsfonts-x11 java-common libfontenc1 libxfont1 oracle-java8-set-default xfonts-encodings xfonts-utils
Suggested packages:
  binfmt-support visualvm ttf-baekmuk | ttf-unfonts | ttf-unfonts-core ttf-kochi-gothic | ttf-sazanami-gothic ttf-kochi-mincho | ttf-sazanami-mincho ttf-arphic-uming
  firefox | firefox-2 | iceweasel | mozilla-firefox | iceape-browser | mozilla-browser | epiphany-gecko | epiphany-webkit | epiphany-browser | galeon | midbrowser
  | moblin-web-browser | xulrunner | xulrunner-1.9 | konqueror | chromium-browser | midori | google-chrome
The following NEW packages will be installed:
  gsfonts gsfonts-x11 java-common libfontenc1 libxfont1 oracle-java8-installer oracle-java8-set-default xfonts-encodings xfonts-utils
0 upgraded, 9 newly installed, 0 to remove and 56 not upgraded.
Need to get 4,186 kB of archives.
After this operation, 6,497 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 java-common all 0.56ubuntu2 [7,742 B]
Get:2 http://ppa.launchpad.net/webupd8team/java/ubuntu xenial/main amd64 oracle-java8-installer all 8u171-1~webupd8~0 [33.3 kB]
Get:3 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 gsfonts all 1:8.11+urwcyr1.0.7~pre44-4.2ubuntu1 [3,374 kB]
Get:4 http://ppa.launchpad.net/webupd8team/java/ubuntu xenial/main amd64 oracle-java8-set-default all 8u171-1~webupd8~0 [6,846 B]
Get:5 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 libfontenc1 amd64 1:1.1.3-1 [13.9 kB]                                                                            
Get:6 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 libxfont1 amd64 1:1.5.1-1ubuntu0.16.04.4 [95.0 kB]                                                       
Get:7 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 xfonts-encodings all 1:1.0.4-2 [573 kB]                                                                          
Get:8 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 xfonts-utils amd64 1:7.7+3ubuntu0.16.04.2 [74.6 kB]                                                      
Get:9 http://in.archive.ubuntu.com/ubuntu xenial/universe amd64 gsfonts-x11 all 0.24 [7,314 B]                                                                               
Fetched 4,186 kB in 8s (502 kB/s)                                                                                                                                            
Preconfiguring packages ...
Selecting previously unselected package java-common.
(Reading database ... 75702 files and directories currently installed.)
Preparing to unpack .../java-common_0.56ubuntu2_all.deb ...
Unpacking java-common (0.56ubuntu2) ...
Selecting previously unselected package oracle-java8-installer.
Preparing to unpack .../oracle-java8-installer_8u171-1~webupd8~0_all.deb ...
Unpacking oracle-java8-installer (8u171-1~webupd8~0) ...
Processing triggers for man-db (2.7.5-1) ...
Processing triggers for mime-support (3.59ubuntu1) ...
Processing triggers for hicolor-icon-theme (0.15-0ubuntu1) ...
Processing triggers for shared-mime-info (1.5-2ubuntu0.1) ...
Setting up java-common (0.56ubuntu2) ...
Setting up oracle-java8-installer (8u171-1~webupd8~0) ...
No /var/cache/oracle-jdk8-installer/wgetrc file found.
Creating /var/cache/oracle-jdk8-installer/wgetrc and
using default oracle-java8-installer wgetrc settings for it.
Downloading Oracle Java 8...
--2018-04-30 16:22:39--  http://download.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz
Resolving download.oracle.com (download.oracle.com)... 104.85.141.68
Connecting to download.oracle.com (download.oracle.com)|104.85.141.68|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://edelivery.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz [following]
--2018-04-30 16:22:39--  https://edelivery.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz
Resolving edelivery.oracle.com (edelivery.oracle.com)... 49.44.96.135, 2405:200:1630:b8::2d3e, 2405:200:1630:b2::2d3e
Connecting to edelivery.oracle.com (edelivery.oracle.com)|49.44.96.135|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://download.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz?AuthParam=1525085679_af5dac0062f189c2dc83bca3dddd82c6 [following]
--2018-04-30 16:22:40--  http://download.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz?AuthParam=1525085679_af5dac0062f189c2dc83bca3dddd82c6
Connecting to download.oracle.com (download.oracle.com)|104.85.141.68|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 190890122 (182M) [application/x-gzip]
Saving to: ‘jdk-8u171-linux-x64.tar.gz’

2018-04-30 16:22:56 (10.9 MB/s) - ‘jdk-8u171-linux-x64.tar.gz’ saved [190890122/190890122]

Download done.
Removing outdated cached downloads...
update-alternatives: error: no alternatives for java
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/ControlPanel to provide /usr/bin/ControlPanel (ControlPanel) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/java to provide /usr/bin/java (java) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/javaws to provide /usr/bin/javaws (javaws) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/jcontrol to provide /usr/bin/jcontrol (jcontrol) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/jjs to provide /usr/bin/jjs (jjs) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/keytool to provide /usr/bin/keytool (keytool) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/orbd to provide /usr/bin/orbd (orbd) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/pack200 to provide /usr/bin/pack200 (pack200) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/policytool to provide /usr/bin/policytool (policytool) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/rmid to provide /usr/bin/rmid (rmid) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/rmiregistry to provide /usr/bin/rmiregistry (rmiregistry) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/servertool to provide /usr/bin/servertool (servertool) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/tnameserv to provide /usr/bin/tnameserv (tnameserv) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/bin/unpack200 to provide /usr/bin/unpack200 (unpack200) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/lib/jexec to provide /usr/bin/jexec (jexec) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/appletviewer to provide /usr/bin/appletviewer (appletviewer) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/extcheck to provide /usr/bin/extcheck (extcheck) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/idlj to provide /usr/bin/idlj (idlj) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jar to provide /usr/bin/jar (jar) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jarsigner to provide /usr/bin/jarsigner (jarsigner) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/javac to provide /usr/bin/javac (javac) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/javadoc to provide /usr/bin/javadoc (javadoc) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/javafxpackager to provide /usr/bin/javafxpackager (javafxpackager) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/javah to provide /usr/bin/javah (javah) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/javap to provide /usr/bin/javap (javap) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/javapackager to provide /usr/bin/javapackager (javapackager) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jcmd to provide /usr/bin/jcmd (jcmd) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jconsole to provide /usr/bin/jconsole (jconsole) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jdb to provide /usr/bin/jdb (jdb) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jdeps to provide /usr/bin/jdeps (jdeps) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jhat to provide /usr/bin/jhat (jhat) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jinfo to provide /usr/bin/jinfo (jinfo) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jmap to provide /usr/bin/jmap (jmap) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jmc to provide /usr/bin/jmc (jmc) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jps to provide /usr/bin/jps (jps) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jrunscript to provide /usr/bin/jrunscript (jrunscript) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jsadebugd to provide /usr/bin/jsadebugd (jsadebugd) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jstack to provide /usr/bin/jstack (jstack) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jstat to provide /usr/bin/jstat (jstat) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jstatd to provide /usr/bin/jstatd (jstatd) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/jvisualvm to provide /usr/bin/jvisualvm (jvisualvm) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/native2ascii to provide /usr/bin/native2ascii (native2ascii) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/rmic to provide /usr/bin/rmic (rmic) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/schemagen to provide /usr/bin/schemagen (schemagen) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/serialver to provide /usr/bin/serialver (serialver) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/wsgen to provide /usr/bin/wsgen (wsgen) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/wsimport to provide /usr/bin/wsimport (wsimport) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/bin/xjc to provide /usr/bin/xjc (xjc) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-oracle/jre/lib/amd64/libnpjp2.so to provide /usr/lib/mozilla/plugins/libjavaplugin.so (mozilla-javaplugin.so) in auto mode
Oracle JDK 8 installed

#####Important########
To set Oracle JDK8 as default, install the "oracle-java8-set-default" package.
E.g.: sudo apt install oracle-java8-set-default
On Ubuntu systems, oracle-java8-set-default is most probably installed
automatically with this package.
######################

Selecting previously unselected package oracle-java8-set-default.
(Reading database ... 75740 files and directories currently installed.)
Preparing to unpack .../oracle-java8-set-default_8u171-1~webupd8~0_all.deb ...
Unpacking oracle-java8-set-default (8u171-1~webupd8~0) ...
Selecting previously unselected package gsfonts.
Preparing to unpack .../gsfonts_1%3a8.11+urwcyr1.0.7~pre44-4.2ubuntu1_all.deb ...
Unpacking gsfonts (1:8.11+urwcyr1.0.7~pre44-4.2ubuntu1) ...
Selecting previously unselected package libfontenc1:amd64.
Preparing to unpack .../libfontenc1_1%3a1.1.3-1_amd64.deb ...
Unpacking libfontenc1:amd64 (1:1.1.3-1) ...
Selecting previously unselected package libxfont1:amd64.
Preparing to unpack .../libxfont1_1%3a1.5.1-1ubuntu0.16.04.4_amd64.deb ...
Unpacking libxfont1:amd64 (1:1.5.1-1ubuntu0.16.04.4) ...
Selecting previously unselected package xfonts-encodings.
Preparing to unpack .../xfonts-encodings_1%3a1.0.4-2_all.deb ...
Unpacking xfonts-encodings (1:1.0.4-2) ...
Selecting previously unselected package xfonts-utils.
Preparing to unpack .../xfonts-utils_1%3a7.7+3ubuntu0.16.04.2_amd64.deb ...
Unpacking xfonts-utils (1:7.7+3ubuntu0.16.04.2) ...
Selecting previously unselected package gsfonts-x11.
Preparing to unpack .../gsfonts-x11_0.24_all.deb ...
Unpacking gsfonts-x11 (0.24) ...
Processing triggers for fontconfig (2.11.94-0ubuntu1.1) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up oracle-java8-set-default (8u171-1~webupd8~0) ...
Setting up gsfonts (1:8.11+urwcyr1.0.7~pre44-4.2ubuntu1) ...
Setting up libfontenc1:amd64 (1:1.1.3-1) ...
Setting up libxfont1:amd64 (1:1.5.1-1ubuntu0.16.04.4) ...
Setting up xfonts-encodings (1:1.0.4-2) ...
Setting up xfonts-utils (1:7.7+3ubuntu0.16.04.2) ...
Setting up gsfonts-x11 (0.24) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
krishna@hadoop-master:~$ 
krishna@hadoop-master:~$ java -version
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
krishna@hadoop-master:~$ 
krishna@hadoop-master:~$ javac -version
javac 1.8.0_171
krishna@hadoop-master:~$ 
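
Because Hadoop needs at least Java 8, it can help to check the installed version in a script before continuing. The following is a minimal sketch, not part of the original setup: the function name java_major_version is hypothetical, and it assumes the version string looks like the ones printed by java -version (for example 1.8.0_171 or 11.0.2).

```shell
#!/bin/sh
# Hypothetical helper: print the major Java version for a version string.
# Legacy strings such as "1.8.0_171" map to 8; strings such as "11.0.2" map to 11.
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Usage: `java -version` prints to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
    installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    major=$(java_major_version "$installed")
    if [ "${major:-0}" -ge 8 ] 2>/dev/null; then
        echo "Java $installed meets the Java 8 minimum"
    else
        echo "Java $installed does not meet the Java 8 minimum" >&2
    fi
fi
```

The check only confirms the minimum stated above; it does not say whether a newer Java release is supported by a particular Hadoop version.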

Install python-software-properties

krishna@hadoop-master:~$ sudo apt-get install python-software-properties

Add User hadoop with Sudo privilege

root@hadoop-master:~# adduser hadoop
root@hadoop-master:~# usermod -aG sudo hadoop

Full output:

root@hadoop-master:~# adduser hadoop
Adding user `hadoop' ...
Adding new group `hadoop' (1001) ...
Adding new user `hadoop' (1001) with group `hadoop' ...
Creating home directory `/home/hadoop' ...
Copying files from `/etc/skel' ...
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
Changing the user information for hadoop
Enter the new value, or press ENTER for the default
	Full Name []: 
	Room Number []: 
	Work Phone []: 
	Home Phone []: 
	Other []: 
Is the information correct? [Y/n] y
root@hadoop-master:~# 
root@hadoop-master:~# usermod -aG sudo hadoop

Log in as the hadoop user (for example, with su - hadoop) and confirm the active user:

hadoop@hadoop-master:~$ whoami
hadoop

Configure ssh and setup password-less access

SSH is required to access Hadoop and its nodes remotely. PDSH issues commands to a group of hosts in parallel.

Add a host entry for hadoop-master in /etc/hosts, then verify that password-less access works.

hadoop@hadoop-master:~$ sudo apt-get install ssh pdsh
hadoop@hadoop-master:~$ ssh-keygen -t rsa
hadoop@hadoop-master:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@hadoop-master:~$ chmod 0600 ~/.ssh/authorized_keys

Full output:

hadoop@hadoop-master:~$ sudo apt-get install ssh pdsh
[sudo] password for hadoop: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  genders libgenders0
Suggested packages:
  rdist
The following NEW packages will be installed:
  genders libgenders0 pdsh ssh
0 upgraded, 4 newly installed, 0 to remove and 56 not upgraded.
Need to get 176 kB of archives.
After this operation, 601 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 ssh all 1:7.2p2-4ubuntu2.4 [7,076 B]
Get:2 http://in.archive.ubuntu.com/ubuntu xenial/universe amd64 libgenders0 amd64 1.21-1build2 [30.7 kB]
Get:3 http://in.archive.ubuntu.com/ubuntu xenial/universe amd64 genders amd64 1.21-1build2 [30.6 kB]
Get:4 http://in.archive.ubuntu.com/ubuntu xenial/universe amd64 pdsh amd64 2.31-3build1 [108 kB]
Fetched 176 kB in 1s (124 kB/s)
Preconfiguring packages ...
Selecting previously unselected package ssh.
(Reading database ... 76868 files and directories currently installed.)
Preparing to unpack .../ssh_1%3a7.2p2-4ubuntu2.4_all.deb ...
Unpacking ssh (1:7.2p2-4ubuntu2.4) ...
Selecting previously unselected package libgenders0:amd64.
Preparing to unpack .../libgenders0_1.21-1build2_amd64.deb ...
Unpacking libgenders0:amd64 (1.21-1build2) ...
Selecting previously unselected package genders.
Preparing to unpack .../genders_1.21-1build2_amd64.deb ...
Unpacking genders (1.21-1build2) ...
Selecting previously unselected package pdsh.
Preparing to unpack .../pdsh_2.31-3build1_amd64.deb ...
Unpacking pdsh (2.31-3build1) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up ssh (1:7.2p2-4ubuntu2.4) ...
Setting up libgenders0:amd64 (1.21-1build2) ...
Setting up genders (1.21-1build2) ...
Setting up pdsh (2.31-3build1) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
hadoop@hadoop-master:~$ 
hadoop@hadoop-master:~$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:ydjp80Kr++qnXY+C04cWh4rKnaIRzckAKqXZ1Kvg4Qg hadoop@hadoop-master
The key's randomart image is:
+---[RSA 2048]----+
|. o.             |
|o*  .            |
|* .  .           |
|E* ..  + o       |
|*.*.  . S.       |
|.+.    .+ .      |
|.    . =o=.      |
| o....+.O+.o     |
|..ooo.*Xo+o .    |
+----[SHA256]-----+
hadoop@hadoop-master:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@hadoop-master:~$ chmod 0600 ~/.ssh/authorized_keys

hadoop@hadoop-master:~$ ping hadoop-master
PING hadoop-master (192.168.19.4) 56(84) bytes of data.
64 bytes from hadoop-master (192.168.19.4): icmp_seq=1 ttl=64 time=0.018 ms
64 bytes from hadoop-master (192.168.19.4): icmp_seq=2 ttl=64 time=0.033 ms
^C
--- hadoop-master ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.018/0.025/0.033/0.009 ms
hadoop@hadoop-master:~$ 
hadoop@hadoop-master:~$ ssh hadoop-master
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-116-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage
60 packages can be updated.
17 updates are security updates.

Last login: Tue May  1 17:43:14 2018 from 192.168.19.4
hadoop@hadoop-master:~$
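
The host entry mentioned above can also be checked from a script rather than by pinging. This is a rough sketch under a few assumptions: the helper name has_host_entry is hypothetical, and it only looks at a hosts-format file, so it does not cover names resolved through DNS.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE (hosts format) maps some address to HOST.
# Usage: has_host_entry FILE HOST
has_host_entry() {
    awk -v host="$2" '
        { sub(/#.*/, "") }    # ignore comments
        { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# hadoop-master is the host name used throughout this tutorial.
if has_host_entry /etc/hosts hadoop-master; then
    echo "hadoop-master is present in /etc/hosts"
else
    echo "hadoop-master is missing from /etc/hosts" >&2
fi
```

To test the key-based login without risking an interactive password prompt, ssh -o BatchMode=yes hadoop-master true can be used; it fails instead of prompting when the key is not accepted.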

Install Hadoop

Download Hadoop

Download Hadoop 3.1.0 from an Apache mirror (older releases are also kept at https://archive.apache.org/dist/hadoop/common/). Then extract the archive and create a soft link.

$ wget http://www-us.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
$ tar -zxvf hadoop-3.1.0.tar.gz
$ ln -s hadoop-3.1.0 hadoop

Full output of the download:

hadoop@hadoop-master:~$ wget http://www-us.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
--2018-05-01 20:54:46--  http://www-us.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
Resolving www-us.apache.org (www-us.apache.org)... 140.211.11.105
Connecting to www-us.apache.org (www-us.apache.org)|140.211.11.105|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 325902823 (311M) [application/x-gzip]
Saving to: ‘hadoop-3.1.0.tar.gz’

hadoop-3.1.0.tar.gz              100%[=============================================================>] 310.80M   498KB/s    in 6m 11s  

2018-05-01 21:06:30 (457 KB/s) - ‘hadoop-3.1.0.tar.gz’ saved [325902823/325902823]

hadoop@hadoop-master:~$
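
Before extracting a 311 MB archive fetched over plain HTTP, it is reasonable to compare its checksum with the one Apache publishes. The sketch below is not from the original post: verify_sha512 is a hypothetical helper, and EXPECTED_SHA512 is a placeholder for a digest you copy from the Apache download page yourself.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
# The expected value is normalised (lower case, no spaces or newlines) because
# published digests are sometimes upper case and split into groups.
verify_sha512() {
    actual=$(sha512sum "$1" | awk '{print $1}')
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f' | tr -d ' \n')
    [ "$actual" = "$expected" ]
}

# Usage (EXPECTED_SHA512 is a placeholder, not a real digest):
# if verify_sha512 hadoop-3.1.0.tar.gz "$EXPECTED_SHA512"; then
#     echo "checksum OK"
# else
#     echo "checksum mismatch, do not extract" >&2
# fi
```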

Hadoop Configuration

Edit the .bashrc file in the hadoop user's home directory and add the following configs.

export HADOOP_HOME="/home/hadoop/hadoop-3.1.0"
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export HADOOP_YARN_HOME=${HADOOP_HOME}

After that, reload the file in the current shell:

source ~/.bashrc
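
If the setup is scripted, appending the exports blindly would duplicate them on every run. One possible approach is sketched below; append_once is a hypothetical helper, and the usage lines are commented out so that nothing is changed by accident.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if that exact line is not there yet.
# Usage: append_once LINE FILE
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage with two of the exports from this tutorial:
# append_once 'export HADOOP_HOME="/home/hadoop/hadoop-3.1.0"' "$HOME/.bashrc"
# append_once 'export PATH=$PATH:$HADOOP_HOME/bin' "$HOME/.bashrc"
```

The single quotes keep $PATH and $HADOOP_HOME unexpanded, so they are evaluated when .bashrc is read rather than when the line is written.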

Edit hadoop-env.sh (hadoop-3.1.0/etc/hadoop/hadoop-env.sh) under the Hadoop home directory and set JAVA_HOME.

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
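
The path above matches the Oracle JDK installed earlier. If Java lives somewhere else, the value can usually be derived from the resolved location of the java binary. A small sketch follows; java_home_from_path is a hypothetical helper, and it assumes the binary sits under either bin/ or jre/bin/ of the JDK directory.

```shell
#!/bin/sh
# Hypothetical helper: strip the trailing /jre/bin/java or /bin/java from a resolved path.
java_home_from_path() {
    printf '%s\n' "$1" | sed -e 's|/jre/bin/java$||' -e 's|/bin/java$||'
}

# Usage: follow the /usr/bin/java symlink chain and print the candidate JAVA_HOME.
if command -v java >/dev/null 2>&1; then
    java_home_from_path "$(readlink -f "$(command -v java)")"
fi
```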

Edit core-site.xml (hadoop-3.1.0/etc/hadoop/core-site.xml) and add the following configs.

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hdata</value>
  </property>
</configuration>

Edit hdfs-site.xml (hadoop-3.1.0/etc/hadoop/hdfs-site.xml) and add the following entries.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Edit mapred-site.xml (hadoop-3.1.0/etc/hadoop/mapred-site.xml) and add the following configs.

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Edit yarn-site.xml (hadoop-3.1.0/etc/hadoop/yarn-site.xml) and add the following configs.

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
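
After editing four XML files by hand, a quick read-back of the values can catch typos before any service is started. The sketch below is only an illustration: get_hadoop_property is a hypothetical helper, it is not a real XML parser, and it assumes each <name> and <value> element sits on its own line as in the files above.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows <name>KEY</name> in FILE.
# Usage: get_hadoop_property FILE KEY
get_hadoop_property() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { want = 1; next }
        want && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Usage: read fs.defaultFS back from core-site.xml if the file exists.
conf="${HADOOP_HOME:-$HOME/hadoop-3.1.0}/etc/hadoop/core-site.xml"
if [ -f "$conf" ]; then
    get_hadoop_property "$conf" fs.defaultFS
fi
```

Once the Hadoop commands are on the PATH, hdfs getconf -confKey fs.defaultFS reports the value that Hadoop itself resolves, which is the more reliable check.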

Start Hadoop services

Let’s start the Hadoop services.

The first step in starting up your Hadoop installation is formatting the Hadoop filesystem (HDFS), which is implemented on top of the local filesystem of your “cluster”. Use the following command to format it:

$ hdfs namenode -format

NOTE: This is a one-time activity after installing Hadoop. Formatting again erases the existing HDFS metadata.
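
Since the format step should run only once, a script can guard it by looking for the metadata that formatting creates. This is a sketch under stated assumptions: namenode_is_formatted is a hypothetical helper, and it assumes dfs.namenode.name.dir is left at its default, which places the NameNode data in dfs/name under hadoop.tmp.dir.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the NameNode directory under TMP_DIR looks formatted.
# Usage: namenode_is_formatted TMP_DIR
namenode_is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

# Usage with the hadoop.tmp.dir from core-site.xml (commented out so that
# nothing is formatted by accident):
# namenode_is_formatted /home/hadoop/hdata || hdfs namenode -format
```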

Start HDFS Services

$ start-dfs.sh

If starting the HDFS services fails with a pdsh error, set ssh as the default remote command for pdsh:

echo "ssh" | sudo tee /etc/pdsh/rcmd_default
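
pdsh also honours the PDSH_RCMD_TYPE environment variable, so the same setting can be applied per user without sudo. Adding the line to the hadoop user's .bashrc is an assumption made here so that it persists across logins; the original post uses only the system-wide file.

```shell
# Per-user alternative to /etc/pdsh/rcmd_default: make pdsh use ssh for remote commands.
export PDSH_RCMD_TYPE=ssh
```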

Then start the HDFS services again

$ start-dfs.sh 

hadoop@hadoop-master:~$ start-dfs.sh
Starting namenodes on [hadoop-master]
Starting datanodes
Starting secondary namenodes [hadoop-master]
hadoop@hadoop-master:~$

Start YARN Services

$ start-yarn.sh
 
hadoop@hadoop-master:~$ start-yarn.sh 
Starting resourcemanager
Starting nodemanagers
hadoop@hadoop-master:~$

Check daemons running

$ jps

hadoop@hadoop-master:~$ jps
6544 Jps
5602 DataNode
5796 SecondaryNameNode
5477 NameNode
6213 NodeManager
6091 ResourceManager
hadoop@hadoop-master:~$
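
Reading the jps list by eye works for a single node, but a script can report exactly which of the five daemons shown above is missing. A minimal sketch: missing_daemons is a hypothetical helper that reads jps output on standard input and prints every expected daemon name it did not see.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and print each expected daemon that is absent.
missing_daemons() {
    awk '
        BEGIN {
            n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
        }
        { seen[$2] = 1 }    # second column of jps output is the process name
        END {
            for (i = 1; i <= n; i++) if (!(want[i] in seen)) print want[i]
        }
    '
}

# Usage: only runs where the JDK's jps tool is installed.
if command -v jps >/dev/null 2>&1; then
    missing=$(jps | missing_daemons)
    if [ -z "$missing" ]; then
        echo "All Hadoop daemons are running"
    else
        echo "Not running: $missing" >&2
    fi
fi
```

When a daemon is reported as missing, its log file under the logs directory of the Hadoop installation is the first place to look.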

Stop Hadoop Services

Here are the steps to stop the Hadoop services.

Stop YARN services

$ stop-yarn.sh

Stop HDFS services

$ stop-dfs.sh

Hadoop Web Interface

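With the default Hadoop 3.x ports, the web interfaces are available at the following addresses:

Hadoop Dashboard (NameNode UI): http://hadoop-master:9870
Data Node: http://hadoop-master:9864
Hadoop Cluster (YARN ResourceManager UI): http://hadoop-master:8088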
