apache hadoophadoop.apache.org/docs/r1.0.4/cn/mapred_tutorial.pdf · 2018. 9. 7. · 28 !« eߦ¤...
TRANSCRIPT
-
Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
••
Page 2Copyright © 2007 The Apache Software Foundation. All rights reserved.
quickstart.htmlcluster_setup.htmlhdfs_design.html
-
•
•
Page 3Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/streaming/package-summary.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/pipes/package-summary.htmlhttp://www.swig.org/http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/io/Writable.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/io/WritableComparable.htmlquickstart.html#Standalone+Operationquickstart.html#SingleNodeSetupquickstart.html#Fully-Distributed+Operation
-
Page 4Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 5Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
•
Page 6Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
•
Page 7Copyright © 2007 The Apache Software Foundation. All rights reserved.
commands_manual.html
-
Page 8Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 9Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Mapper.html
-
Page 10Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConfigurable.html#configure(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Mapper.html#map(K1, V1, org.apache.hadoop.mapred.OutputCollector, org.apache.hadoop.mapred.Reporter)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Mapper.html#map(K1, V1, org.apache.hadoop.mapred.OutputCollector, org.apache.hadoop.mapred.Reporter)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/io/Closeable.html#close()http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputCollector.html#collect(K, V)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputCollector.html#collect(K, V)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setOutputKeyComparatorClass(java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setCombinerClass(java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/io/compress/CompressionCodec.html
-
Page 11Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setNumMapTasks(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setNumMapTasks(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reducer.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setNumReduceTasks(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConfigurable.html#configure(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reducer.html#reduce(K2, java.util.Iterator, org.apache.hadoop.mapred.OutputCollector, org.apache.hadoop.mapred.Reporter)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/io/Closeable.html#close()http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setOutputValueGroupingComparator(java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setOutputValueGroupingComparator(java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setOutputKeyComparatorClass(java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reducer.html#reduce(K2, java.util.Iterator, org.apache.hadoop.mapred.OutputCollector, org.apache.hadoop.mapred.Reporter)
-
Page 12Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputCollector.html#collect(K, V)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputCollector.html#collect(K, V)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/fs/FileSystem.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#setOutputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Partitioner.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/lib/HashPartitioner.html
-
••
Page 13Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reporter.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputCollector.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/lib/package-summary.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/conf/Configuration.html#FinalParamshttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setNumReduceTasks(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setNumMapTasks(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileInputFormat.html#setInputPaths(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path[])http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileInputFormat.html#addInputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileInputFormat.html#setInputPaths(org.apache.hadoop.mapred.JobConf,%20java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileInputFormat.html#addInputPath(org.apache.hadoop.mapred.JobConf,%20java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#setOutputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)
-
Page 14Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMapDebugScript(java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setReduceDebugScript(java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMapSpeculativeExecution(boolean)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setReduceSpeculativeExecution(boolean)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapAttempts(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceAttempts(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/conf/Configuration.html#set(java.lang.String, java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/conf/Configuration.html#get(java.lang.String, java.lang.String)
-
•
•
Page 15Copyright © 2007 The Apache Software Foundation. All rights reserved.
cluster_setup.html#??Hadoop?????????http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#getJobLocalDir()http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#getJar()
-
Page 16Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
1.2.3.4.5.
Page 17Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)native_libraries.html#??DistributedCache+?????http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobClient.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputLogFilter.html
-
••
•
1.2.
3.
Page 18Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobClient.html#runJob(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobClient.html#submitJob(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/RunningJob.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setJobEndNotificationURI(java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/InputFormat.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileInputFormat.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/TextInputFormat.html
-
1.2.
Page 19Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/InputSplit.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileSplit.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/RecordReader.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/OutputFormat.html
-
Page 20Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/RecordWriter.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reporter.html#incrCounter(java.lang.Enum, long)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reporter.html#incrCounter(java.lang.String, java.lang.String, long amount)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/Reporter.html#incrCounter(java.lang.String, java.lang.String, long amount)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html
-
Page 21Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#addCacheFile(java.net.URI,%20org.apache.hadoop.conf.Configuration)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#addCacheArchive(java.net.URI,%20org.apache.hadoop.conf.Configuration)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#setCacheFiles(java.net.URI[],%20org.apache.hadoop.conf.Configuration)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#setCacheArchives(java.net.URI[],%20org.apache.hadoop.conf.Configuration)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#addArchiveToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/filecache/DistributedCache.html#addFileToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)
-
Page 22Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/util/Tool.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/util/ToolRunner.html#run(org.apache.hadoop.util.Tool, java.lang.String[])http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/util/GenericOptionsParser.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/IsolationRunner.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setProfileEnabled(boolean)
-
Page 23Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setProfileTaskRange(boolean,%20java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setProfileParams(java.lang.String)mapred_tutorial.html#DistributedCachehttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMapDebugScript(java.lang.String)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setReduceDebugScript(java.lang.String)
-
Page 24Copyright © 2007 The Apache Software Foundation. All rights reserved.
http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/jobcontrol/package-summary.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/io/compress/CompressionCodec.htmlhttp://www.zlib.net/http://www.oberhumer.com/opensource/lzo/http://www.gzip.org/http://www.gzip.org/native_libraries.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setCompressMapOutput(boolean)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressorClass(java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/FileOutputFormat.html#setOutputCompressorClass(org.apache.hadoop.mapred.JobConf,%20java.lang.Class)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/SequenceFileOutputFormat.htmlhttp://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/SequenceFileOutputFormat.html#setOutputCompressionType(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.io.SequenceFile.CompressionType)http://hadoop.apache.org/core/docs/r0.18.2/api/org/apache/hadoop/mapred/SequenceFileOutputFormat.html#setOutputCompressionType(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.io.SequenceFile.CompressionType)
-
Page 25Copyright © 2007 The Apache Software Foundation. All rights reserved.
quickstart.html#SingleNodeSetupquickstart.html#Fully-Distributed+Operation
-
Page 26Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 27Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 28Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 29Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 30Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 31Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
Page 32Copyright © 2007 The Apache Software Foundation. All rights reserved.
-
•
•
•
•
Page 33Copyright © 2007 The Apache Software Foundation. All rights reserved.
1 目的2 先决条件3 概述4 输入与输出5 例子:WordCount v1.05.1 源代码5.2 用法5.3 解释
6 Map/Reduce - 用户界面6.1 核心功能描述6.1.1 Mapper6.1.1.1 需要多少个Map?
6.1.2 Reducer6.1.2.1 Shuffle6.1.2.2 Sort6.1.2.2.1 Secondary Sort
6.1.2.3 Reduce6.1.2.4 需要多少个Reduce?6.1.2.5 无Reducer
6.1.3 Partitioner6.1.4 Reporter6.1.5 OutputCollector
6.2 作业配置6.3 任务的执行和环境6.4 作业的提交与监控6.4.1 作业的控制
6.5 作业的输入6.5.1 InputSplit6.5.2 RecordReader
6.6 作业的输出6.6.1 任务的Side-Effect File6.6.2 RecordWriter
6.7 其他有用的特性6.7.1 Counters6.7.2 DistributedCache6.7.3 Tool6.7.4 IsolationRunner6.7.5 Profiling6.7.6 调试6.7.6.1 如何分发脚本文件:6.7.6.2 如何提交脚本:6.7.6.3 默认行为
6.7.7 JobControl6.7.8 数据压缩6.7.8.1 中间输出6.7.8.2 作业输出
7 例子:WordCount v2.07.1 源代码7.2 运行样例7.3 程序要点