Tutorial: To run the MapReduce EEMD code with Hadoop on FutureGrid
-by Rewati Ovalekar
● Step 1:
– Code is available on:
  http://code.google.com/p/cyberaide/
– Download the code from:
  http://code.google.com/p/cyberaide/source/browse/#svn%2Ftrunk%2Fproject%2Fspring2011%2FEEMDAnalysis%2FEEMDJava
● Step 2:
– Create a FutureGrid account
– For further details refer to:
  https://portal.futuregrid.org/tutorials (FutureGrid Tutorial)
● Step 3:
– Log in to FutureGrid:
  ssh [email protected]
– A message is displayed on successful login
● Step 4:
– Create a jar file
● Step 5:
– To transfer the jar file and the input file:
  sftp [email protected]
  put /../filepath
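Step 4 does not say how the jar file is created. One possible way is sketched below, assuming the downloaded Java sources sit in a directory called src and that the Hadoop 0.20.2 core jar is available locally. The names src, classes, eemd.jar, HADOOP_CORE_JAR, run and build_jar are assumptions made for this illustration, not part of the original project. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: compile the EEMD sources against Hadoop and package them into a jar.
# Assumed (hypothetical) layout: sources in src/, class files in classes/.
HADOOP_CORE_JAR="${HADOOP_CORE_JAR:-hadoop-0.20.2-core.jar}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

build_jar() {
    run mkdir -p classes &&
    # Compile against the Hadoop core jar so the MapReduce classes resolve.
    run javac -classpath "$HADOOP_CORE_JAR" -d classes src/*.java &&
    # Package everything under classes/ into eemd.jar.
    run jar cf eemd.jar -C classes .
}
```

Usage: call build_jar once to review the commands, then run DRY_RUN=0 build_jar to build eemd.jar for the transfer in Step 5.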
● Step 6:
– In order to run Hadoop on FutureGrid, create a Eucalyptus account
– For further details refer to:
  https://portal.futuregrid.org/tutorials/eucalyptus
● Step 7:
– Once the account is approved, load the Eucalyptus tools:
  module load euca2ools
● Step 8:
– Make sure that the jar file and the input file are in the same directory as the username.private key
– Run the image which has Hadoop on it:
  euca-run-instances -k rovaleka -t c1.xlarge emi-D778156D
  -k indicates the key name
  -t indicates the type of instance
  emi-D778156D indicates the image name
  -n (optional, not used above) indicates the number of instances to run
● Step 8 (continued):
– Check the status using:
  euca-describe-instances
– Keep checking until the status is "running"; once it is, one can log in to run Hadoop
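The repeated status check can be automated with a small polling loop. The sketch below assumes only that the output of euca-describe-instances contains the word running once the instance is up. The names wait_for_running and POLL_INTERVAL are hypothetical helpers introduced here. If several instances are listed, the loop stops as soon as any one of them is running.

```shell
#!/bin/sh
# wait_for_running CMD [ARGS...]
# Re-runs CMD until its output contains the whole word "running".
# POLL_INTERVAL (seconds between checks, default 10) is an assumed setting.
wait_for_running() {
    interval="${POLL_INTERVAL:-10}"
    until "$@" | grep -qw running; do
        sleep "$interval"
    done
}
```

Usage: wait_for_running euca-describe-instances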
● Step 9:
– Transfer the input file and the jar file to the required VM using:
  scp -i username.private filename [email protected]:/
  (Make sure that the address is the same as the address assigned to you, otherwise it will ask for a password)
– Log in using:
  ssh -i username.private [email protected]
  (Make sure the address is the same)
SINGLE NODE
● Step 10:
– A message is displayed on successful login
– Retrieve the transferred files and move them into the Hadoop folder:
  cd /..
  mv filename /opt/hadoop-0.20.2
  cd /opt/hadoop-0.20.2
● Step 11:
– To run Hadoop:
  cd /opt/hadoop-0.20.2
  bin/start-all.sh
– To check if everything is started:
  jps
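On a single-node Hadoop 0.20 setup, jps would normally list five daemons after start-all.sh: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below reads jps output and reports any of these that are missing. check_daemons is a hypothetical helper name, and the script assumes the usual jps format of one "<pid> <name>" pair per line.

```shell
#!/bin/sh
# check_daemons: read `jps` output on stdin and report missing Hadoop 0.20 daemons.
check_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second field exactly, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all daemons running"
}
```

Usage: jps | check_daemons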
● Step 12:
– Transfer the input file to HDFS:
  bin/hadoop dfs -copyFromLocal inputfile name_in_HDFS
– To check if it is present on HDFS:
  bin/hadoop dfs -ls
  NOTE: The input file must be transferred to HDFS again whenever Hadoop is started
● Step 13:
– To run the code:
  bin/hadoop jar [jarFile] EEMDHadoop [inputfilename] [required_output_file]
● Step 14:
– Retrieve the output:
  bin/hadoop dfs -copyToLocal [outputFileName] [outputfileNameToBeGiven]
  (output will be available in the part-00000 file)
– To check the logs and to debug the code, go to the folder logs/userlogs
● Step 15:
– Stop Hadoop:
  bin/stop-all.sh
  exit
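Steps 11 to 15 can be collected into one script. The sketch below is one possible arrangement, not the author's own script. run, run_eemd and DRY_RUN are hypothetical helpers, and the jar, input and output names are supplied by the caller. The "dfsadmin -safemode wait" line is an addition not found in the steps above: it waits for HDFS to leave safe mode after startup before the input is copied. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of Steps 11 to 15 as a single function (hypothetical helper names).
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop-0.20.2}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

run_eemd() {
    jar_file=$1                       # jar created in Step 4
    input=$2                          # local input file
    output=$3                         # output name, used in HDFS and locally
    hdfs_input=$(basename "$input")   # name of the input inside HDFS
    hadoop="$HADOOP_HOME/bin/hadoop"

    run "$HADOOP_HOME/bin/start-all.sh" &&
    # Added step (assumption): wait until HDFS has left safe mode.
    run "$hadoop" dfsadmin -safemode wait &&
    run "$hadoop" dfs -copyFromLocal "$input" "$hdfs_input" &&
    run "$hadoop" jar "$jar_file" EEMDHadoop "$hdfs_input" "$output" &&
    run "$hadoop" dfs -copyToLocal "$output" "$output"
    status=$?

    # Always stop the daemons, even if one of the steps above failed.
    run "$HADOOP_HOME/bin/stop-all.sh"
    return "$status"
}
```

Usage: run_eemd eemd.jar inputfile result prints the six commands in order; DRY_RUN=0 run_eemd eemd.jar inputfile result executes them on the VM.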
Thank you!!!