Posts

Showing posts from April, 2013

Running a Hadoop Example - WordCount

"Word Count" is one of the most common examples you will find when searching for Hadoop examples, and the code for it comes along with the Hadoop installation. It is a very simple example that you can use to understand how Hadoop code works.

Steps:

1. Make some sample files. I have made 2 files, which you can download from this link - test sample files

2. Now load these files into Hadoop's HDFS:

   ./hadoop-1.0.4/bin/hadoop dfs -copyFromLocal /home/venkat/Documents/*.txt /source_data/

   You can also see the uploaded files using the Hadoop web portal.

3. As I said earlier, the Hadoop installation should contain an example jar that has "word count" as one of its examples. Here is where you can find that example jar:

   /hadoop-examples-1.0.4.jar

4. Use this command to see the classes related to "word count":

   jar -tvf ./hadoop-1.0.4/hadoop-examples-1.0.4.jar | grep 'wordcount.class' -i

If you would like to see the source
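Before running the job on Hadoop, it can help to see what a word count produces on a small input. The sketch below is a local stand-in built from standard Unix tools, not the code inside the example jar. It assumes the job splits text on whitespace and writes one tab-separated word and count per line; the function name word_count and the sample file names are made up for illustration.

```shell
#!/bin/sh
# Local stand-in for a word count (NOT the Hadoop implementation).
# Assumption: words are separated by whitespace, and the result is
# one "word<TAB>count" line per distinct word, sorted by word.
word_count() {
  cat "$@" \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

# Small demo with two throwaway sample files (hypothetical contents).
demo_dir=$(mktemp -d)
printf 'hello world hello\n' > "$demo_dir/file1.txt"
printf 'hello hadoop\n' > "$demo_dir/file2.txt"

word_count "$demo_dir"/*.txt
# prints (tab-separated):
# hadoop  1
# hello   3
# world   1

rm -r "$demo_dir"
```

Running the same pipeline over your own sample files gives you numbers to compare against once the Hadoop job has finished.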

Hadoop File System Commands

Hadoop shell commands are very similar to Linux shell commands. The link below gives the complete set of commands:

Hadoop Shell Commands - http://hadoop.apache.org/docs/r0.18.3/hdfs_shell.html

Below are some of the important ones:

./hadoop-1.0.4/bin/hadoop dfs -ls /
./hadoop-1.0.4/bin/hadoop dfs -mkdir /source_data
./hadoop-1.0.4/bin/hadoop dfs -lsr /
./hadoop-1.0.4/bin/hadoop dfs -mkdir /tmp/tmp1
./hadoop-1.0.4/bin/hadoop dfs -copyFromLocal /home/venkat/Documents/*.txt /source_data/
./hadoop-1.0.4/bin/hadoop dfs -rmr /source_data/*.txt
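The commands above can be strung together into a small script. What follows is only a sketch: HADOOP_BIN, DRY_RUN, hdfs and hdfs_tour are names invented here, not part of Hadoop. With DRY_RUN left at 1 the script only prints the commands it would run, so you can try it on a machine without Hadoop. The HDFS wildcard is quoted so that the local shell passes it through to Hadoop untouched, while the local wildcard is left unquoted so the local shell expands it.

```shell
#!/bin/sh
# Hypothetical wrapper around the "hadoop dfs" commands listed above.
# HADOOP_BIN: assumed location of the hadoop launcher (same path as in the post).
# DRY_RUN: when "1", commands are printed instead of executed.
HADOOP_BIN="${HADOOP_BIN:-./hadoop-1.0.4/bin/hadoop}"
DRY_RUN="${DRY_RUN:-1}"

hdfs() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$HADOOP_BIN dfs $*"
  else
    "$HADOOP_BIN" dfs "$@"
  fi
}

# Create a directory, upload local .txt files, list everything,
# then remove the uploaded files again.
hdfs_tour() {
  local_dir="$1"
  hdfs -mkdir /source_data &&
  hdfs -copyFromLocal "$local_dir"/*.txt /source_data/ &&
  hdfs -lsr / &&
  hdfs -rmr '/source_data/*.txt'
}
```

For example, calling hdfs_tour /home/venkat/Documents prints the four commands in order; set DRY_RUN=0 to actually run them against your cluster.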