Hadoop Installation (Single-Node) - 1/3
Hadoop 1.0.4 on Ubuntu Linux 12.04 (Single Node) - Part 1/3
There are a few ways to install Hadoop:
- Cloudera Distribution
- Hadoop PPA
- Stable download from hadoop.apache.org
I prefer the download from hadoop.apache.org. The main reason is that whenever a new version of Hadoop comes out, I need not wait for someone else to repackage it in their own release.
Should we not be installing Hadoop on multiple nodes?
Yes. But my objective is to set up a Hadoop environment on my laptop, so that I can play around and get a better understanding of MapReduce. For that, a single-node setup is sufficient.
Prerequisites
Make sure you have already installed Java 1.6 on your machine. If not, please follow this link:
Install Java 1.6
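To confirm the prerequisite quickly, here is a minimal sketch that reads the version reported by `java -version`. The helper name `java_major_minor` is my own invention for illustration, and the parsing assumes the usual `version "x.y..."` output format.

```shell
#!/bin/sh
# Hypothetical helper: read `java -version` output on stdin
# and print just the "major.minor" part, e.g. 1.6.
java_major_minor() {
    sed -n 's/.*version "\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it into the pipe.
    echo "Found Java $(java -version 2>&1 | java_major_minor)"
else
    echo "java not found on PATH; install Java 1.6 first"
fi
```

If this prints something other than the version you expect, check which `java` binary comes first on your PATH.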
Create a separate user & usergroup for hadoop.
I like to keep a dedicated user for Hadoop. It makes granting permissions and other admin activities much easier. It is not strictly necessary, though. Create a separate Linux user and user group for installing Hadoop:
sudo addgroup hadoop_group
sudo adduser --ingroup hadoop_group hadoop_usr
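After running the two commands, it can be worth checking that the user really ended up in the group. A minimal sketch, assuming the standard `id` utility; the helper name `user_in_group` is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when user $1 belongs to group $2.
# id -nG prints the user's group names separated by spaces.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -e "$2"
}

if user_in_group hadoop_usr hadoop_group; then
    echo "hadoop_usr is in hadoop_group"
else
    echo "hadoop_usr is missing or not in hadoop_group"
fi
```

The check fails both when the user does not exist and when it exists but is not in the group, so a failure only tells you that something in this step needs another look.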
Add the hadoop user to the sudoers list.
The best practice is to give the hadoop user only Hadoop-related permissions. But at this stage we are just trying to understand how Hadoop works, so let's give it ALL permissions.
Open a terminal and connect as admin:
sudo visudo
Then add the line below, as shown in the screenshot:
hadoop_usr ALL=(ALL:ALL) ALL
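To double-check that the rule was saved, one option is to search the sudoers text for it. This is only a sketch: `has_full_sudo_rule` is a hypothetical helper, it reads the sudoers content from stdin, and it treats the user name as a plain word (no regex special characters).

```shell
#!/bin/sh
# Hypothetical helper: read sudoers text on stdin and succeed when it
# contains a rule of the form "<user> ALL=(ALL:ALL) ALL" for user $1.
has_full_sudo_rule() {
    grep -Eq "^$1[[:space:]]+ALL=\(ALL(:ALL)?\)[[:space:]]+ALL[[:space:]]*\$"
}

# Usage on a real machine (reading /etc/sudoers needs root):
#   sudo cat /etc/sudoers | has_full_sudo_rule hadoop_usr && echo "rule present"
```

This only confirms the line is there. Editing through visudo remains the safer way to change the file, since it checks the syntax before saving.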
Configuring SSH
Hadoop internally uses SSH to manage its nodes, so we need to configure SSH access to localhost for our hadoop_usr.
Before configuring SSH for Hadoop, make sure an SSH server is installed on your machine. On most machines only the SSH client is installed by default, so the server is likely missing.
sudo apt-get install openssh-server
Now configure SSH for Hadoop:
su - hadoop_usr
ssh-keygen -t rsa
enter a file name - (give nothing, just press Enter)
for passphrase - (give no passphrase, just press Enter)
cat /home/hadoop_usr/.ssh/id_rsa.pub >> /home/hadoop_usr/.ssh/authorized_keys
ssh localhost
(the first time it will ask a question, "do you want to continue..."; answer "yes")
ssh localhost
(the second time it should not ask any question)
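One small catch with the `cat ... >>` step is that running it twice appends the key twice. Here is a minimal sketch of a repeat-safe variant. The helper name `authorize_key` is hypothetical. The permission tightening is my addition, on the assumption that sshd is running with its usual strict checks on `.ssh` and `authorized_keys`.

```shell
#!/bin/sh
# Hypothetical helper: append the public key in file $1 to the
# authorized_keys file $2 unless that exact line is already present,
# then restrict permissions on the file and its directory.
authorize_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    if ! grep -qxF -e "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 700 "$(dirname "$auth")"
    chmod 600 "$auth"
}

# Usage as hadoop_usr, with the paths from the steps above:
#   authorize_key /home/hadoop_usr/.ssh/id_rsa.pub /home/hadoop_usr/.ssh/authorized_keys
```

If `ssh localhost` still asks for a password afterwards, the permissions on the home directory and on `.ssh` are the first thing I would check.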
Disable IPv6
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
(if the above returns 0, IPv6 is enabled, so we have to disable it)
sudo vi /etc/sysctl.conf
Add the lines below to that file, restart the machine, and check again:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
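The check and the edit can also be scripted. The sketch below is one way to do it; `ipv6_state` and `add_sysctl_setting` are hypothetical helper names, and both take the file path as a parameter so they can be tried on a scratch file before touching /etc/sysctl.conf.

```shell
#!/bin/sh
# Hypothetical helper: report IPv6 state from a disable_ipv6 flag file
# (defaults to the path used in the check above).
ipv6_state() {
    flag=${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}
    [ -r "$flag" ] || { echo "unknown"; return 1; }
    case "$(cat "$flag")" in
        0) echo "enabled" ;;
        1) echo "disabled" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Hypothetical helper: append setting line $2 to config file $1
# only if that exact line is not already there.
add_sysctl_setting() {
    grep -qxF -e "$2" "$1" 2>/dev/null || echo "$2" >> "$1"
}

echo "IPv6 is currently: $(ipv6_state)"

# Usage as root, for each of the three lines:
#   add_sysctl_setting /etc/sysctl.conf "net.ipv6.conf.all.disable_ipv6 = 1"
```

As an alternative to restarting the machine, `sudo sysctl -p` reloads /etc/sysctl.conf; run the check again afterwards to confirm it now reports 1.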