Post on 20-May-2020

Page 1: Hadoop 1.2.1 installation - University of Michigan · Let’s run an example! There is an example jar file under ~/hadoop hadoop-examples-1.2.1.jar hadoop jar hadoop-examples-1.2.1.jar

Hadoop 1.2.1 installation

Chun-Chen Tu

[email protected]


Before installation

• Where to get Hadoop:
  – http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-1.2.1/
  – or my FTP server: ftp://hadoop:[email protected]
  – Please download: hadoop-1.2.1.tar.gz

• GUI mode may help when typing commands.

• A list of the commands is also on the FTP server:
  – the file cmd

• In these slides, commands are shown in italic and purple, for example:
  – mkdir hadoop


After login, install the required packages first:
    sudo apt-get install libssl-dev rsync g++

Type “y” when asked.


Download files:
    cd Downloads
    wget ftp://hadoop:[email protected]/hadoop-1.2.1.tar.gz
    wget ftp://hadoop:[email protected]/jdk-7u45-linux-x64.gz
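FTP transfers can be interrupted, so it may be worth confirming that each archive is readable before unpacking it. The following is a minimal sketch, not part of the original slides; the function name archive_ok is hypothetical. It relies only on tar -tzf, which lists the contents of a gzip-compressed archive and typically fails when the file is truncated or corrupt.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): succeed only if the
# gzip-compressed tar archive given as $1 can be listed end to end.
archive_ok() {
    tar -tzf "$1" > /dev/null 2>&1
}

# Report on the two downloaded archives if they are in the current directory.
for f in hadoop-1.2.1.tar.gz jdk-7u45-linux-x64.gz; do
    if [ -f "$f" ]; then
        if archive_ok "$f"; then
            echo "$f: OK"
        else
            echo "$f: unreadable, download it again"
        fi
    fi
done
```

Run it inside the Downloads folder. It prints nothing for an archive that has not been downloaded yet.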


Install Java: reference website (run these commands in the Downloads folder)
    tar -zxvf jdk-7u45-linux-x64.gz
    sudo mkdir /usr/lib/jdk
    sudo cp -r jdk1.7.0_45 /usr/lib/jdk/

Edit the profile (add these four lines at the end of /etc/profile):
    sudo vim /etc/profile
    export JAVA_HOME=/usr/lib/jdk/jdk1.7.0_45
    export JRE_HOME=/usr/lib/jdk/jdk1.7.0_45/jre
    export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
    export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

Configure Java:
    sudo update-alternatives --install /usr/bin/java java /usr/lib/jdk/jdk1.7.0_45/bin/java 300
    sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jdk/jdk1.7.0_45/bin/javac 300

    sudo update-alternatives --config java
    sudo update-alternatives --config javac

Test it by checking the version:
    java -version
You will see the version information if the installation succeeded.
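If the version check fails, a common cause is that the JDK directory was not copied where /etc/profile expects it. The sketch below is an illustration only, not part of the original slides; is_jdk_home is a hypothetical helper name. It tests whether a directory contains executable bin/java and bin/javac, using the path from the profile lines.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): succeed if directory $1
# looks like a JDK home, i.e. it contains executable bin/java and bin/javac.
is_jdk_home() {
    [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# Path taken from the JAVA_HOME line added to /etc/profile.
jdk=/usr/lib/jdk/jdk1.7.0_45
if is_jdk_home "$jdk"; then
    echo "JDK found at $jdk"
else
    echo "No JDK at $jdk; check the copy step and the path in /etc/profile"
fi
```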


SSH setting: this step is optional but recommended if you don’t want to enter a password every time.
Generate an RSA key:
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Copy the public key:
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
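Key-based login can still fall back to a password when the SSH directory or the authorized_keys file is writable by other users, because OpenSSH checks file modes by default (the StrictModes option). The sketch below is an assumption about what may help, not a step from the original slides; fix_ssh_perms is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): tighten permissions on an
# SSH directory ($1) and its authorized_keys file, if that file exists.
fix_ssh_perms() {
    chmod 700 "$1" || return 1
    if [ -f "$1/authorized_keys" ]; then
        chmod 600 "$1/authorized_keys" || return 1
    fi
}

# Apply to the current user's SSH directory when it exists.
if [ -d "$HOME/.ssh" ]; then
    fix_ssh_perms "$HOME/.ssh"
fi
```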


SSH test:
    ssh hadoop@localhost
Remember to exit afterwards:
    exit

The first time you connect, you will be asked to confirm the authenticity of the host. After this connection, you will not be asked again.

If the SSH setting failed, you will need to enter a password.


Install Hadoop:
    tar -zxvf hadoop-1.2.1.tar.gz
    mv hadoop-1.2.1 ~/hadoop    # move it under the home directory for convenience

    vim ~/hadoop/conf/hadoop-env.sh    # edit the Hadoop environment shell script
    export JAVA_HOME=/usr    # add this line


Set environment PATH:
    vim ~/.bashrc    # configure the bash settings (your own ~/.bashrc does not need sudo)
    export PATH=/home/hadoop/hadoop/bin:$PATH    # add this line so the hadoop command can be used everywhere

Log out and log in again for the setting to take effect.

Type “hadoop” to try it.
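Editing ~/.bashrc by hand more than once can leave duplicate export lines. As a minimal sketch, assuming you prefer to script the change, the hypothetical helper add_to_path_once below appends the export line only when it is not already present. It is not part of the original slides, and the demonstration writes to a scratch file instead of the real ~/.bashrc.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): append an export line that puts
# directory $2 at the front of PATH to rc file $1, unless it is already there.
add_to_path_once() {
    line="export PATH=$2:\$PATH"
    if ! grep -qxF "$line" "$1" 2>/dev/null; then
        printf '%s\n' "$line" >> "$1"
    fi
}

# Demonstration on a scratch file rather than the real ~/.bashrc.
rc=$(mktemp)
add_to_path_once "$rc" /home/hadoop/hadoop/bin
add_to_path_once "$rc" /home/hadoop/hadoop/bin   # second call adds nothing
cat "$rc"
rm -f "$rc"
```

To apply it for real, call add_to_path_once ~/.bashrc /home/hadoop/hadoop/bin and then log in again.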


Standalone mode: test whether Hadoop works (ref: website)
    cd ~/hadoop
    mkdir input
    cp conf/*.xml input
    hadoop jar hadoop-examples-1.2.1.jar grep input output 'dfs[a-z.]+'
    cat output/part-00000
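The grep example counts every string in the input files that matches the regular expression and writes one line per distinct match with its count. To cross-check the job output, the same counts can be computed locally with standard tools. This is a rough sketch, not part of the original slides: local_grep_count is a hypothetical helper, and the order of matches with equal counts may differ from Hadoop's output.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): count matches of the extended
# regex $1 in the files that follow, most frequent first, as "count<TAB>match".
local_grep_count() {
    pattern=$1
    shift
    grep -ohE "$pattern" "$@" | LC_ALL=C sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}

# Count locally if the input folder from the slide exists here.
if [ -d input ]; then
    local_grep_count 'dfs[a-z.]+' input/*.xml
fi
```

Run it from ~/hadoop and compare the result with output/part-00000.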


Pseudo-distributed configuration: reference website
Edit three .xml files under the conf folder: core-site.xml, hdfs-site.xml and mapred-site.xml. These files can also be downloaded from the FTP server.
    cd ~/hadoop

    vim conf/core-site.xml
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

    vim conf/hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

    vim conf/mapred-site.xml
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
    </configuration>
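Typing three XML files in vim is error-prone, so one option is to generate them from a script. The sketch below is an illustration only, not part of the original slides; write_pseudo_conf is a hypothetical helper. It writes the same three properties shown on this slide into a directory of your choice and overwrites any existing files with those names. The demonstration uses a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): write the three
# pseudo-distributed configuration files into directory $1.
# Existing files with the same names are overwritten.
write_pseudo_conf() {
    conf=$1

    cat > "$conf/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

    cat > "$conf/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

    cat > "$conf/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}

# Demonstration in a scratch directory rather than ~/hadoop/conf.
demo=$(mktemp -d)
write_pseudo_conf "$demo"
ls "$demo"
rm -rf "$demo"
```

To apply it for real, call write_pseudo_conf ~/hadoop/conf after backing up the existing files.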


HDFS format:
    hadoop namenode -format


Start Hadoop in pseudo-distributed mode:
    start-all.sh

Type jps to see which daemons are running.

If you are asked for a password, that is fine, just inconvenient. To avoid it, refer to the SSH setting in the previous slides.
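In Hadoop 1.x, start-all.sh is expected to start five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. Reading the jps list by eye is easy to get wrong, so the sketch below prints whichever of these is missing. It is an illustration, not part of the original slides, and missing_daemons is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): read `jps` output on stdin
# ("<pid> <name>" per line) and print each expected Hadoop 1.x daemon
# that is missing. Prints nothing when all five are present.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        if ! printf '%s\n' "$running" | grep -qx "$d"; then
            echo "$d"
        fi
    done
}

# Check the live process list when jps is available.
if command -v jps > /dev/null 2>&1; then
    jps | missing_daemons
fi
```

If a daemon is listed as missing, its log file under ~/hadoop/logs is usually the first place to look.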


Let’s run an example! There is an example jar file under ~/hadoop:
    hadoop-examples-1.2.1.jar

Run the jar without a program name to list the available example programs:
    hadoop jar hadoop-examples-1.2.1.jar

Now suppose we want to run the wordcount example.
First, put the input data on HDFS (you have to create your own input.txt first):
    hadoop dfs -put input.txt /input.txt

Next, execute the wordcount example:
    hadoop jar hadoop-examples-1.2.1.jar wordcount /input.txt /test_out

Finally, get the results:
    hadoop dfs -get /test_out test_out

The result file part-r-00000 shows up in the directory test_out.
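The wordcount example splits the input on whitespace and writes one line per distinct word, with the word and its count separated by a tab. For a small input.txt, the result can be cross-checked locally. This is a rough sketch, not part of the original slides: local_wordcount is a hypothetical helper, and the sample text is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): count whitespace-separated
# words in file $1 and print "word<TAB>count", sorted by word in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Create a small sample input (made-up text) unless input.txt already exists.
if [ ! -f input.txt ]; then
    printf 'hello world\nhello hadoop\n' > input.txt
fi
local_wordcount input.txt
```

With a single reducer, the lines printed here should correspond to those in test_out/part-r-00000.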


Run hadoop with C

• We need to use pipes provided by hadoop.

• Really slow!


Recompile library: website
Edit HadoopPipes.cc and add the #include line:
    vim ~/hadoop/src/c++/pipes/impl/HadoopPipes.cc
    #include <unistd.h>

    cd ~/hadoop/src/c++/utils
    chmod 755 configure
    ./configure
    make install

    cd ~/hadoop/src/c++/pipes
    export LIBS=-lcrypto
    chmod 755 configure
    ./configure
    make install

The new library will appear in ~/hadoop/src/c++/install
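The edit to HadoopPipes.cc can also be scripted so that it is safe to repeat. The sketch below is an illustration, not part of the original slides; add_unistd_include is a hypothetical helper. The slides do not say where in the file the line belongs, so placing it on the first line is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): put "#include <unistd.h>" on the
# first line of source file $1 unless the file already includes that header.
# Placing it on the first line is an assumption; the slides give no position.
add_unistd_include() {
    if grep -q '^#include <unistd.h>' "$1"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    { echo '#include <unistd.h>'; cat "$1"; } > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Patch the pipes source if it is at the location used on this slide.
src="$HOME/hadoop/src/c++/pipes/impl/HadoopPipes.cc"
if [ -f "$src" ]; then
    add_unistd_include "$src"
fi
```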


Compile wordcount example: website
    wget -r -np -nH ftp://hadoop:[email protected]/wordcount
    make wordcount
    hadoop dfs -mkdir test
    hadoop dfs -put wordcount test/wordcount
    hadoop dfs -put testdata.txt test/testdata.txt

    hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input test/testdata.txt -output test/output -program test/wordcount

    hadoop dfs -get test/output output
    cat output/part-00000

Congratulations!