Hadoop 2.6.4 Standalone Mode Installation on Ubuntu 14.04

Hadoop is an Apache open-source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models.

The Hadoop framework runs in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.

Prerequisites

1) A machine with the Ubuntu 14.04 LTS operating system installed.

2) Apache Hadoop 2.6.4 software (the hadoop-2.6.4.tar.gz archive).

Standalone Mode

By default, Hadoop is configured to run in a non-distributed or standalone mode, as a single Java process. There are no daemons running and everything runs in a single JVM instance. HDFS is not used.

Step 1 – Update. Open a terminal (CTRL + ALT + T) and refresh the package lists with sudo. It is advisable to do this before installing any package so that you get the latest available versions, even if you have not added or removed any Software Sources.
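The command for this step was not preserved here; on Ubuntu the usual way to refresh the package lists is apt-get, so a minimal sketch would be:

```shell
# Refresh the package index from the configured Software Sources.
sudo apt-get update
```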

Step 2 – Install Java 7.
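One way to get Java 7 on Ubuntu 14.04 is the OpenJDK package from the standard repositories. The choice of OpenJDK (rather than another Java 7 distribution) is an assumption of this sketch:

```shell
# Install the OpenJDK 7 development kit (assumed Java 7 distribution).
sudo apt-get install openjdk-7-jdk

# Confirm that Java is installed and on the PATH.
java -version
```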

Step 3 – Install the OpenSSH server. SSH is a cryptographic network protocol for operating network services securely over an unsecured network. Its best-known application is remote login to computer systems.
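Assuming the standard Ubuntu package name, the server can be installed like this:

```shell
# Install the OpenSSH server package.
sudo apt-get install openssh-server
```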

Step 4 – Create a group. We will create a group, configure the group's sudo permissions, and then add the user to the group. Here ‘hadoop’ is the group name and ‘hduser’ is a user in that group.
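A sketch of creating the group and the user with Ubuntu's addgroup and adduser tools. It assumes ‘hduser’ does not exist yet; adduser will prompt for a password and some optional details:

```shell
# Create the 'hadoop' group.
sudo addgroup hadoop

# Create the user 'hduser' with 'hadoop' as its primary group.
sudo adduser --ingroup hadoop hduser
```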

Step 5 – Configure the sudo permissions for ‘hduser’.
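The sudoers file is normally edited through visudo, which checks the syntax before saving. The entry shown in the comment is an assumed example that grants ‘hduser’ full sudo rights; adjust it to your own policy:

```shell
# Open the sudoers file in the default editor.
sudo visudo

# Entry to add inside the editor (assumed example, placed below the root entry):
#   hduser ALL=(ALL:ALL) ALL
```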

Ubuntu's default terminal text editor is nano, so use CTRL + O to write out (save) your edits.

Add the permissions for ‘hduser’ to the sudoers file.

Use the CTRL + X keyboard shortcut to exit. If prompted, enter Y to save the file.

Step 6 – Create the Hadoop directory.
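The directory used throughout this guide is /usr/local/hadoop, so it can be created with:

```shell
# Create the Hadoop installation directory (root privileges are needed under /usr/local).
sudo mkdir -p /usr/local/hadoop
```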

Step 7 – Change the ownership and permissions of the directory /usr/local/hadoop. Here ‘hduser’ is an Ubuntu username.
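A possible pair of commands for this step. The 755 mode and the use of the ‘hadoop’ group as group owner are assumptions of this sketch:

```shell
# Make 'hduser' (group 'hadoop') the owner of the directory tree.
sudo chown -R hduser:hadoop /usr/local/hadoop

# Owner may read, write and execute; everyone else may read and execute (assumed mode).
sudo chmod -R 755 /usr/local/hadoop
```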

Step 8 – Switch user. The su command lets a user execute commands with the privileges of another user account.
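To continue as ‘hduser’, something like the following should work; the dash, which starts a login shell, is a choice made for this sketch:

```shell
# Switch to 'hduser' and start a login shell (you will be asked for hduser's password).
su - hduser
```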

Step 9 – Change the directory to /home/hduser/Desktop. In my case the downloaded hadoop-2.6.4.tar.gz file is in the /home/hduser/Desktop folder. For you it might be in the Downloads folder, so check where it was saved.
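Using the location from this guide (replace the path if your download is elsewhere):

```shell
# Go to the folder that contains hadoop-2.6.4.tar.gz.
cd /home/hduser/Desktop
```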

Step 10 – Untar the hadoop-2.6.4.tar.gz file.
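Since the archive is a gzip-compressed tarball, it can be unpacked with tar:

```shell
# Extract the archive; this creates a hadoop-2.6.4 folder in the current directory.
tar -xzf hadoop-2.6.4.tar.gz
```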

Step 11 – Move the contents of the hadoop-2.6.4 folder to /usr/local/hadoop.
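A sketch of the move, assuming you are still in the folder where the archive was extracted and that ‘hduser’ owns /usr/local/hadoop as set up in Step 7:

```shell
# Move everything inside hadoop-2.6.4 into the installation directory.
mv hadoop-2.6.4/* /usr/local/hadoop
```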

Step 12 – Edit the $HOME/.bashrc file to add the Java and Hadoop paths.

Open $HOME/.bashrc in a text editor and append export lines for the Java and Hadoop locations at the end of the file.
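The exact lines were not preserved here. A minimal sketch is shown below; the JAVA_HOME path is an assumption that matches a 64-bit OpenJDK 7 install on Ubuntu, so check the actual folder under /usr/lib/jvm on your machine:

```shell
# Location of the Java installation (assumed path, verify under /usr/lib/jvm).
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

# Location of the Hadoop installation created in the earlier steps.
export HADOOP_HOME=/usr/local/hadoop

# Make the hadoop commands available from any directory.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

These lines can be added with any editor, for example by running nano $HOME/.bashrc.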

Step 13 – Reload your changed $HOME/.bashrc settings
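In bash, the file can be re-read in the current terminal with source:

```shell
# Apply the updated settings to the current shell session.
source "$HOME/.bashrc"
```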

Step 14 – Verify the Hadoop installation. This simply displays the Hadoop version in the terminal.
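Assuming $HADOOP_HOME/bin is on your PATH as set up in Step 12, the check is:

```shell
# Print the installed Hadoop version and build information.
hadoop version
```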

Execution of WordCount Example

The following example copies the .txt files of the /usr/local/hadoop/ directory to use as input and then counts how many times each word occurs in them. Output is written to the given output directory.

Step 1 – Create the input directory.
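Using the input location named in the next step:

```shell
# Create the folder that will hold the WordCount input files.
mkdir -p /home/hduser/Desktop/input
```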

Step 2 – Copy all text files from $HADOOP_HOME to /home/hduser/Desktop/input.
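A sketch of the copy, assuming $HADOOP_HOME is set as in the .bashrc step:

```shell
# Copy the top-level .txt files of the Hadoop distribution (such as README.txt) into the input folder.
cp "$HADOOP_HOME"/*.txt /home/hduser/Desktop/input
```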

Step 3 – Verify the copy.
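Listing the folder is enough to confirm that the files arrived:

```shell
# List the copied text files with their sizes.
ls -l /home/hduser/Desktop/input
```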

Step 4 – Submit the jar file to run. The sample WordCount example jar is in the $HADOOP_HOME/share/hadoop/mapreduce/ folder.
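A sketch of the job submission. The output path /home/hduser/Desktop/output is an assumption chosen for this example; the jar name follows the naming used by the Hadoop 2.6.4 distribution:

```shell
# Run the bundled 'wordcount' program: first argument is the input folder, second is the output folder.
hadoop jar "$HADOOP_HOME"/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar \
  wordcount /home/hduser/Desktop/input /home/hduser/Desktop/output
```

The output folder must not exist before the job starts; Hadoop refuses to overwrite an existing output directory.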

Step 5 – Verify the output.
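Assuming the job wrote to /home/hduser/Desktop/output (the path is this guide's example choice, not a fixed location), the word counts can be inspected like this:

```shell
# Show the files produced by the job.
ls /home/hduser/Desktop/output

# Print the result: one word per line followed by its count.
cat /home/hduser/Desktop/output/part-r-00000
```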
