
Part II (c) – Desktop Installation

© Net Serpents LLC, USA

Desktop Installation

• Supported Platforms
• Required Software
• Releases & Mirror Sites
• Install
• Configure
• Format
• Start/Stop
• Verify


Supported Platforms

• GNU/Linux supported for
  • Development
  • Production
  • Demonstrated on a 2000-node cluster
• Win32
  • Development only
  • Not supported as a production platform


Required Software

The following must be installed first:

• Java 1.6.x or higher
• ssh:
  • Linux: ssh and sshd
  • Windows: openssh


Releases and Mirror Sites

Releases
• http://hadoop.apache.org/releases.html
• Stable release: 2.7.1 (July 2015)
• Stable release: 2.6.0 (Nov 2014)
• Earlier good releases:
  • 2.4.0 (April 2014)
  • 2.2.0 (GA release, Oct 2013)
  • 1.0.0 (Dec 2011)

Visit: http://hadoop.apache.org/releases.html


Mirror Sites

• Downloads available at several mirror sites
• Suggested by Apache:
  • http://www.apache.org/dyn/closer.cgi/hadoop/common
• Other mirror sites at:
  • http://www.apache.org/dyn/closer.cgi/hadoop/common

Install - Overview

Hadoop 2.6.0 on Ubuntu 14.04 (Ubuntu is the most popular Linux distribution)

Step 1 – Install Java
Step 2 – Create a dedicated hadoop user
Step 3 – Install ssh
Step 4 – Create ssh certificates
Step 5 – Install Hadoop
Step 6 – Set up configuration files
Step 7 – Format
Step 8 – Start/Stop
Step 9 – Verify

Step 1 - Install Java

Login as an admin user:

$ cd ~

# Update the source list

$ sudo apt-get update

$ sudo apt-get install default-jdk

# Verify version of Java is 1.6.0 or higher

$ java -version

java version "1.7.0_65"

OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-0ubuntu0.14.04.1)

OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
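The version check above can also be scripted. The following is a minimal sketch rather than part of the original slides: java_version_ok is a hypothetical helper name, and it assumes a dotted version string like the one in the sample output (for example 1.7.0_65).

```shell
# Hypothetical helper (not from the slides): succeeds when a Java version
# string such as "1.7.0_65" or "11.0.2" is 1.6 or higher.
java_version_ok() {
    version=$1
    major=${version%%.*}     # text before the first dot, e.g. "1"
    rest=${version#*.}       # text after the first dot, e.g. "7.0_65"
    minor=${rest%%.*}        # second numeric field, e.g. "7"
    if [ "$major" -gt 1 ]; then
        return 0             # newer numbering (9, 11, ...) is above 1.6
    fi
    [ "$major" -eq 1 ] && [ "$minor" -ge 6 ]
}

# Example (java -version writes to standard error, hence 2>&1):
#   v=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
#   java_version_ok "$v" && echo "Java $v is new enough"
```

The helper only compares the first two numeric fields, which is enough for the "1.6.x or higher" requirement stated earlier.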


Step 2 – Create a dedicated Hadoop user

# Create a hadoop group

$ sudo addgroup hadoop
Adding group `hadoop' (GID 1009) …
Done.

# Create hadoop user

$ sudo adduser --ingroup hadoop huser
Adding user `huser' ...
Adding new user `huser' (1001) with group `hadoop' ...
Creating home directory `/home/huser' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for huser
Enter the new value, or press ENTER for the default
        Full Name []:
        Room Number []:
        Work Phone []:
        Home Phone []:
        Other []:
Is the information correct? [Y/n] Y

# Add new user to sudoers

$ sudo adduser huser sudo
[sudo] password for admin:
Adding user `huser' to group `sudo' ...
Adding user huser to group sudo
Done.


Step 3 – Install SSH

$ sudo apt-get install ssh

# Verify SSH is installed

$ which ssh

/usr/bin/ssh

$ which sshd

/usr/sbin/sshd
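The two "which" checks above can be folded into one reusable test. This is a minimal sketch under stated assumptions: require_commands is a hypothetical helper name, and it relies on the POSIX "command -v" builtin instead of "which".

```shell
# Hypothetical helper (not from the slides): prints each named command that
# cannot be found on PATH and fails if any is missing.
require_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example:
#   require_commands ssh sshd java || echo "install the missing packages first"
```

On some systems /usr/sbin is not on an ordinary user's PATH, in which case sshd may be reported as missing even though it is installed.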

Step 4 – Create SSH Certificates

$ sudo su huser

# Generate a key pair

$ ssh-keygen -f ~/.ssh/id_rsa -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/huser/.ssh/id_rsa):
Created directory '/home/huser/.ssh'.
Your identification has been saved in /home/huser/.ssh/id_rsa.
Your public key has been saved in /home/huser/.ssh/id_rsa.pub.
The key fingerprint is:
20:6c:f3:ff:0f:33:bf:30:72:c3:22:70:24:cc:2d:d3 huser@laptop
The key's randomart image is:
+--[ RSA 2048]----+
| .oo.o |


# Create list of authorized keys to avoid being prompted for password

$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
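Running the append command above twice lists the same key twice. The sketch below is one possible idempotent variant; authorize_key is a hypothetical helper name, and the 700/600 modes are the conventional restrictive permissions for the .ssh directory and authorized_keys, an assumption the slides do not state.

```shell
# Hypothetical helper (not from the slides): appends the public key in file $1
# to $2/authorized_keys only if it is not already listed, and applies
# conventional restrictive permissions.
authorize_key() {
    pub_file=$1
    ssh_dir=$2
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
    # -x: match the whole line, -F: fixed string, -q: quiet
    if ! grep -qxF "$(cat "$pub_file")" "$ssh_dir/authorized_keys"; then
        cat "$pub_file" >> "$ssh_dir/authorized_keys"
    fi
}

# Example (assumes the key pair generated in this step):
#   authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh"
#   ssh localhost    # should now log in without a password prompt
```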


Step 5 – Install Hadoop

# Download distribution from a mirror site

$ wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

# Extract the files

$ tar xvzf hadoop-2.6.0.tar.gz

$ cd hadoop-2.6.0

# Move files to /usr/local/hadoop

$ sudo mkdir /usr/local/hadoop

$ sudo mv * /usr/local/hadoop

# Change ownership to hadoop user

$ sudo chown -R huser:hadoop /usr/local/hadoop
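The extract-and-move sequence above can be collapsed into a single tar call. This is a sketch, not the slides' method: extract_release is a hypothetical helper name, and it assumes a tar that supports --strip-components (GNU tar and bsdtar both do).

```shell
# Hypothetical helper (not from the slides): unpacks tarball $1 into
# directory $2, dropping the archive's top-level folder (e.g. hadoop-2.6.0/).
extract_release() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" &&
        tar -xzf "$tarball" -C "$dest" --strip-components=1
}

# Example (needs write access to /usr/local, e.g. via sudo):
#   extract_release hadoop-2.6.0.tar.gz /usr/local/hadoop
#   sudo chown -R huser:hadoop /usr/local/hadoop
```

Unlike "mv *", this also carries over any hidden files in the archive's top-level folder.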


Step 6 – Configure

# Update links to point to Java

$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.

# Note down JAVA_HOME variable value

$ which javac
/usr/bin/javac

$ readlink -f /usr/bin/javac
/usr/lib/jvm/java-7-openjdk-amd64/bin/javac

(Note: JAVA_HOME is everything before /bin/javac, here /usr/lib/jvm/java-7-openjdk-amd64)
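The "everything before /bin/javac" rule can be written as a one-line shell function. This is a minimal sketch; java_home_from_javac is a hypothetical helper name that is not part of the slides.

```shell
# Hypothetical helper (not from the slides): derives JAVA_HOME from the
# resolved path of javac by removing the trailing /bin/javac.
java_home_from_javac() {
    printf '%s\n' "${1%/bin/javac}"
}

# Example (assumes javac is on PATH and GNU readlink is available):
#   export JAVA_HOME=$(java_home_from_javac "$(readlink -f "$(command -v javac)")")
```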


# Add variables to the end of .bashrc

$ vi ~/.bashrc

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

# Execute the commands in .bashrc

$ source ~/.bashrc


# Configure hadoop-env.sh

Change variable JAVA_HOME in hadoop-env.sh

$ vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64


# Configure core-site.xml

# First create a tmp folder for hadoop

$ sudo mkdir -p /app/hadoop/tmp

$ sudo chown huser:hadoop /app/hadoop/tmp

# Modify core-site.xml

$ vi /usr/local/hadoop/etc/hadoop/core-site.xml


Add the following properties inside the <configuration> element:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
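To show where these properties sit in the complete file, here is a sketch that writes a minimal core-site.xml from the shell. write_core_site is a hypothetical helper name, and the values are inserted without XML escaping, so it is only suitable for plain paths and URIs like the ones on the slide. Hadoop 2.x treats fs.default.name as a deprecated alias of fs.defaultFS; the sketch keeps the name used on the slide.

```shell
# Hypothetical helper (not from the slides): writes a minimal core-site.xml
# to path $1 using tmp directory $2 and default file system URI $3.
write_core_site() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$2</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Example (overwrites the existing file, so back it up first):
#   write_core_site /usr/local/hadoop/etc/hadoop/core-site.xml \
#       /app/hadoop/tmp hdfs://localhost:54310
```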


# Configure mapred-site.xml

# First copy the file from the template provided

$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

# Modify mapred-site.xml

$ vi /usr/local/hadoop/etc/hadoop/mapred-site.xml


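Add the following property inside the <configuration> element:

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs at.</description>
</property>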


# Configure hdfs-site.xml

# First create the directories for data node and name node

$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode

$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode

$ sudo chown -R huser:hadoop /usr/local/hadoop_store

# Modify hdfs-site.xml

$ vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml


Add the following properties inside the <configuration> element:

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>The default block replication. The actual replication factor can be set when a file is created; this default is used if none is specified at file creation.</description>
</property>


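<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>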
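Before moving on, it can help to confirm that the directories named in these properties exist and are writable by the hadoop user. The following is a minimal sketch; check_writable_dirs is a hypothetical helper name that is not part of the slides.

```shell
# Hypothetical helper (not from the slides): prints every listed path that
# is not an existing directory writable by the current user; fails if any.
check_writable_dirs() {
    bad=0
    for dir in "$@"; do
        if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
            echo "not a writable directory: $dir"
            bad=1
        fi
    done
    [ "$bad" -eq 0 ]
}

# Example (run as huser):
#   check_writable_dirs /usr/local/hadoop_store/hdfs/namenode \
#       /usr/local/hadoop_store/hdfs/datanode /app/hadoop/tmp
```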


Step 7 – Format

$ hadoop namenode -format

DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it.

15/04/18 14:43:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = laptop/192.168.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop ...
STARTUP_MSG:   java = 1.7.0_65
************************************************************/
15/04/18 14:43:03 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/04/18 14:43:03 INFO namenode.NameNode: createNameNode [-format]
15/04/18 14:43:07 WARN util.NativeCode
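Formatting a NameNode directory that is already in use discards its existing metadata, so a guard before the format step can be useful. The sketch below checks for the current/VERSION file that a formatted NameNode directory contains. namenode_is_formatted is a hypothetical helper name, and the example uses the hdfs command that the deprecation notice above recommends.

```shell
# Hypothetical helper (not from the slides): succeeds when NameNode
# directory $1 already holds metadata (a current/VERSION file).
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example: format only when the directory has not been formatted yet.
#   namenode_is_formatted /usr/local/hadoop_store/hdfs/namenode ||
#       hdfs namenode -format
```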


Step 8 – Start/Stop

To view the available commands:

$ ls /usr/local/hadoop/sbin

To start hadoop (log in as the hadoop user):

# Deprecated: starts all daemons
$ start-all.sh

OR

$ start-dfs.sh
$ start-yarn.sh
$ mr-jobhistory-daemon.sh start historyserver


To stop hadoop (log in as the hadoop user):

$ stop-all.sh

OR

$ stop-dfs.sh
$ stop-yarn.sh
$ mr-jobhistory-daemon.sh stop historyserver
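The individual start and stop scripts can be wrapped in one small function. This is a sketch under stated assumptions: hadoop_ctl and run_cmd are hypothetical names, the DRY_RUN switch is an illustration device, and stopping in the reverse of the start order is a design choice of this sketch, not something the slides prescribe.

```shell
# Hypothetical wrapper (not from the slides). Set DRY_RUN=1 to print the
# commands instead of running them.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

hadoop_ctl() {
    case "$1" in
        start)
            run_cmd start-dfs.sh &&
            run_cmd start-yarn.sh &&
            run_cmd mr-jobhistory-daemon.sh start historyserver
            ;;
        stop)
            # Assumption: stop in the reverse of the start order.
            run_cmd mr-jobhistory-daemon.sh stop historyserver &&
            run_cmd stop-yarn.sh &&
            run_cmd stop-dfs.sh
            ;;
        *)
            echo "usage: hadoop_ctl start|stop" >&2
            return 2
            ;;
    esac
}

# Example (as the hadoop user, with $HADOOP_INSTALL/sbin on PATH):
#   hadoop_ctl start
#   hadoop_ctl stop
```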


Step 9 – Verify

To verify, list the running daemons:

$ jps
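The jps listing can be checked automatically. The sketch below assumes the usual Hadoop 2.x single-node daemon set (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager); missing_daemons is a hypothetical helper name, and the job history server is deliberately left out of the list.

```shell
# Hypothetical helper (not from the slides): reads `jps` output on standard
# input and prints each expected Hadoop 2.x daemon that is not listed.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # SecondaryNameNode is not mistaken for NameNode
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Example:
#   jps | missing_daemons
```

An empty result means every expected daemon appeared in the listing.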


Web UI: the NameNode web interface uses port 50070, e.g. http://ec2-54-86-169-214.compute-1.amazonaws.com:50070


Quiz

1 – Which two of the following must be installed for a hadoop installation to succeed?
  a- ssh
  b- java
  c- ftp
  d- ruby

2 – Which of the following Linux commands may be used to install software such as ssh or java?
  a- get
  b- put
  c- apt-get
  d- startall.sh

3 – Name the commands you would use to start the hadoop daemons and to stop them.


4 – Which of the following is NOT a configuration file for hadoop?
  a- hadoop-env.sh
  b- core-site.xml
  c- mapred-site.xml
  d- hadoop-configuration.xml

5 – True or false?
  a- The user owning hadoop files must belong to a group called hadoop
  b- Running a hadoop cluster on a single node is called pseudo-distributed mode
  c- A fully distributed hadoop configuration should have a minimum of three nodes
