Hadoop Installation (Single-Node) - 1/3

Hadoop 1.0.4 on Ubuntu Linux 12.04 (Single Node) - Part 1/3

There are a few ways to install Hadoop:
  • Cloudera Distribution
  • Hadoop PPA
  • Stable download from hadoop.apache.org
I prefer the download from hadoop.apache.org. The main reason is that whenever a new version of Hadoop is released, I do not have to wait for someone else to repackage it in their own distribution.

Shouldn't we be installing Hadoop on multiple nodes?

Yes. But my objective is to set up a Hadoop environment on my laptop so that I can play around and get a better understanding of MapReduce. For that, a single-node setup is sufficient.

Prerequisites

Make sure you have already installed Java 1.6 on your machine. If not, please follow this link:
Install Java 1.6
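
A quick way to confirm which Java is installed is to read the first line of `java -version`. The snippet below is only a sketch: `parse_java_version` is a helper name made up for this post, and it assumes the usual output format such as `java version "1.6.0_45"`.

```shell
# Sketch: extract the version number from the first line of `java -version`.
# parse_java_version is a hypothetical helper, not a standard tool.
parse_java_version() {
    # Expects text such as: java version "1.6.0_45"
    printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p'
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so merge it into stdout first
    first_line="$(java -version 2>&1 | head -n 1)"
    echo "Found Java $(parse_java_version "$first_line")"
else
    echo "java is not on the PATH"
fi
```

If the reported version does not start with 1.6, install Java 1.6 before going further.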

Create a separate user and user group for Hadoop

I like to keep a dedicated user for Hadoop. It makes granting permissions and other admin activities much easier, though it is not strictly necessary. Create a separate Linux user and user group for the Hadoop installation:
sudo addgroup hadoop_group
sudo adduser --ingroup hadoop_group hadoop_usr
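
To double-check that the user really landed in the group, something like the following could be used. It is a minimal sketch: `user_in_group` is a made-up helper name, and it relies only on `id -nG`, which prints the names of all groups a user belongs to.

```shell
# Sketch: confirm that a user is a member of a group.
# user_in_group is a hypothetical helper, not part of Hadoop or Ubuntu.
user_in_group() {
    local user="$1" group="$2"
    # id -nG prints the user's group names separated by spaces;
    # put one name per line so grep -x can match the whole name
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}

if user_in_group hadoop_usr hadoop_group; then
    echo "hadoop_usr is in hadoop_group"
else
    echo "hadoop_usr is missing or not in hadoop_group"
fi
```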

Add the Hadoop user to the sudoers list

The best practice is to give the Hadoop user only Hadoop-related permissions. But at this stage we are just trying to understand how Hadoop works, so let's give it ALL permissions.
Open a terminal as an admin user and run
sudo visudo
then add the line below
hadoop_usr ALL=(ALL:ALL) ALL

Configuring SSH


Hadoop internally uses SSH to manage its nodes, so we need to configure SSH access to localhost for our hadoop_usr.

Before configuring SSH for Hadoop, make sure an SSH server is installed on your machine. On most machines only the client is installed by default, so the server is likely missing.
sudo apt-get install openssh-server
Now configure SSH for Hadoop
su - hadoop_usr
ssh-keygen -t rsa
Enter a file name - (give nothing, just press enter)
For passphrase - (give no passphrase, just press enter)
cat /home/hadoop_usr/.ssh/id_rsa.pub >> /home/hadoop_usr/.ssh/authorized_keys
ssh localhost (the first time it will ask "do you want to continue..."; answer "yes")
ssh localhost (the second time it should not ask any question)
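
One thing to watch with the `cat ... >>` step: running it twice appends the same key twice. A possible way around that is sketched below. `authorize_key` is a helper name invented for this post, and the permission values are an assumption based on sshd's default StrictModes check, which can reject keys when these files are writable by other users.

```shell
# Sketch: append a public key to authorized_keys only if it is not already
# there, then tighten permissions. authorize_key is a hypothetical helper.
authorize_key() {
    local pub_key="$1" ssh_dir="$2"
    local auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
    if ! grep -qxFf "$pub_key" "$auth_file"; then
        cat "$pub_key" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Example call for the user created earlier (run as hadoop_usr):
# authorize_key /home/hadoop_usr/.ssh/id_rsa.pub /home/hadoop_usr/.ssh
```

The example call is left commented out so that pasting the snippet does not touch any real key files.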


Disable IPv6

cat /proc/sys/net/ipv6/conf/all/disable_ipv6
(if the above returns 0, IPv6 is enabled, so we have to disable it)
sudo vi /etc/sysctl.conf
Add the lines below to that file, restart the machine, and check again
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
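
The check before and after the change can be wrapped in a small function. This is only a sketch: `ipv6_status` is a made-up helper name, and it simply interprets the 0 or 1 stored in the flag file used above.

```shell
# Sketch: interpret the disable_ipv6 flag. ipv6_status is a hypothetical helper.
ipv6_status() {
    # Default to the flag file used in this post; a different path can be passed in
    local flag_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return
    fi
    # Strip the trailing newline before comparing
    case "$(tr -d '[:space:]' < "$flag_file")" in
        0) echo "enabled" ;;
        1) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

echo "IPv6 is $(ipv6_status)"
```

As an alternative to a full restart, `sudo sysctl -p` reloads /etc/sysctl.conf and may be enough to apply the new values; rerun the check afterwards to be sure.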

contd...
