Hadoop lessons learned

Upload: tcurdt

Post on 15-Jan-2015

Category: Technology

DESCRIPTION

Hadoop has proven to be an invaluable tool for many companies over the past few years. Yet it has its ways, and knowing them up front can save valuable time. This session is a rundown of the recurring lessons learned from running various Hadoop clusters in production since version 0.15. What to expect from Hadoop - and what not? How to integrate Hadoop into existing infrastructure? Which data formats to use? What compression? Small files vs. big files? To append or not? Essential configuration and operations tips. What about querying all the data? The project, the community, and pointers to interesting projects that complement the Hadoop experience.

TRANSCRIPT

Page 1: Hadoop - Lessons Learned

Hadoop lessons learned

Page 2: Hadoop - Lessons Learned

@tcurdt
github.com/tcurdt

yourdailygeekery.com

Page 3: Hadoop - Lessons Learned

Data

Page 4: Hadoop - Lessons Learned
Page 5: Hadoop - Lessons Learned

hiring

Page 6: Hadoop - Lessons Learned

Agenda

· hadoop? really? cloud?

· integration

· mapreduce

· operations

· community and outlook

Page 7: Hadoop - Lessons Learned

Why Hadoop?

Page 8: Hadoop - Lessons Learned

“It is a new and improved version of enterprise tape drive”

Page 9: Hadoop - Lessons Learned

20 machines, 20 files of 1.5 GB each

grep "needle" file

hadoop job grep.jar

[bar chart comparing runtimes of single-machine grep and the Map Reduce job - an unfair comparison]

Page 10: Hadoop - Lessons Learned
Page 11: Hadoop - Lessons Learned
Page 12: Hadoop - Lessons Learned

Run your own?

http://bit.ly/elastic-mr-pig

Page 13: Hadoop - Lessons Learned

Integration

Page 14: Hadoop - Lessons Learned

black box

Page 15: Hadoop - Lessons Learned

· hadoop-cat

· hadoop-grep

· hadoop-range --prefix /logs --from 2012-05-15 --until 2012-05-22 --postfix /*play*.seq | xargs hadoop jar

· streaming jobs

Engineers
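These helpers are thin wrappers around the HDFS client API. A minimal Java sketch of what a hadoop-cat-style tool can look like (the tools above are custom; this one assumes plain files and the stock FileSystem API):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// stream an HDFS file to stdout, like `hadoop fs -cat` as one small tool
public class HadoopCat {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    IOUtils.copyBytes(fs.open(new Path(args[0])), System.out, 4096, true);
  }
}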

Page 16: Hadoop - Lessons Learned

· mount hdfs

· pig / hive

· data dumps

Non-Engineering Folks

Page 17: Hadoop - Lessons Learned

Map Reduce

[pipeline diagram: HDFS files are read by the InputFormat and divided into Splits; each Split runs Map -> Combiner -> Sort, with the Partitioner deciding the target reducer; Copy and Merge ships the map output (optionally running the Combiner again) to the Reducers, which write via the OutputFormat]

Page 18: Hadoop - Lessons Learned

MAPREDUCE-346 (since 2009)

12/05/25 01:27:38 INFO mapred.JobClient: Map input records=112705844
12/05/25 01:27:38 INFO mapred.JobClient: Map output records=64841776
12/05/25 01:27:38 INFO mapred.JobClient: Combine input records=64842079
12/05/25 01:27:38 INFO mapred.JobClient: Combine output records=409
12/05/25 01:27:38 INFO mapred.JobClient: Reduce input records=106
12/05/25 01:27:38 INFO mapred.JobClient: Reduce output records=4

map in      : 112705844 *********************************
map out     :  64841776 *****************
combine in  :  64842079 *****************
combine out :       409 |
reduce in   :       106 |
reduce out  :         4 |

Job Counters

Page 19: Hadoop - Lessons Learned

map in      : 20000 **************
map out     : 40000 ******************************
combine in  : 40000 ******************************
combine out : 10001 ********
reduce in   : 10001 ********
reduce out  : 10001 ********

Job Counters
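The same numbers are available programmatically. A sketch for checking combiner effectiveness from the driver, assuming the newer mapreduce API where TaskCounter exposes these counters:

import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

// after job.waitForCompletion(true) in the driver:
Counters counters = job.getCounters();
long combineIn  = counters.findCounter(TaskCounter.COMBINE_INPUT_RECORDS).getValue();
long combineOut = counters.findCounter(TaskCounter.COMBINE_OUTPUT_RECORDS).getValue();
System.out.printf("combine: %d -> %d records%n", combineIn, combineOut);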

Page 20: Hadoop - Lessons Learned

mapred.reduce.tasks = 0

Map-only
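With the newer API the equivalent is one call on the Job. A sketch, assuming a 2.x-style driver:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// zero reducers: map output skips sort/shuffle and goes
// straight through the OutputFormat to HDFS
Job job = Job.getInstance(new Configuration(), "map-only");
job.setNumReduceTasks(0);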

Page 21: Hadoop - Lessons Learned

public class EofSafeSequenceFileInputFormat<K,V>
    extends SequenceFileInputFormat<K,V> {
  ...
}

public class EofSafeRecordReader<K,V> extends RecordReader<K,V> {
  ...
  public boolean nextKeyValue() throws IOException, InterruptedException {
    try {
      return this.delegate.nextKeyValue();
    } catch (EOFException e) {
      return false;
    }
  }
  ...
}

EOF on append
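Wiring the EOF-safe format into a job is then a single driver line (assuming the classes sketched above):

// in the driver, after creating the Job:
job.setInputFormatClass(EofSafeSequenceFileInputFormat.class);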

Page 22: Hadoop - Lessons Learned

Serialization

before: ASN.1, custom Java serialization, Thrift

now: protobuf

Page 23: Hadoop - Lessons Learned

public static class Play extends CustomWritable {

  public final LongWritable time = new LongWritable();
  public final LongWritable owner_id = new LongWritable();
  public final LongWritable track_id = new LongWritable();

  public Play() {
    fields = new WritableComparable[] { owner_id, track_id, time };
  }
}

Custom Writables

Page 24: Hadoop - Lessons Learned

BytesWritable bytes = new BytesWritable();
...
byte[] buffer = bytes.getBytes();

Fear the State
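Hadoop reuses Writable instances, so getBytes() hands back a recycled backing array that is usually longer than the current record. A defensive sketch continuing the snippet above:

import java.util.Arrays;

// copy only the valid region before holding on to the data;
// the backing array will be overwritten by the next record
byte[] valid = Arrays.copyOfRange(bytes.getBytes(), 0, bytes.getLength());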

Page 25: Hadoop - Lessons Learned

// values can only be traversed once - the second loop sees nothing:
public void reduce(LongTriple key, Iterable<LongWritable> values, Context ctx) {
  for (LongWritable v : values) { }
  for (LongWritable v : values) { }
}

// buffer the values if you need a second pass:
public void reduce(LongTriple key, Iterable<LongWritable> values, Context ctx) {
  buffer.clear();
  for (LongWritable v : values) {
    buffer.add(v);
  }
  for (LongWritable v : buffer.values()) { }
}

Re-Iterate

HADOOP-5266 (applied to 0.21.0)

Page 26: Hadoop - Lessons Learned

long min = 1;
long max = 10000000;

FastBitSet set = new FastBitSet(min, max);

for (long i = min; i < max; i++) {
  set.set(i);
}

BitSets

org.apache.lucene.util.*BitSet

Page 27: Hadoop - Lessons Learned

Data Structures

http://bit.ly/data-structures
http://bit.ly/bloom-filters
http://bit.ly/stream-lib

Page 28: Hadoop - Lessons Learned

General Tips

· test on small datasets, test on your machine

· many reducers

· always consider a combiner and partitioner (see the sketch below)

· pig / streaming for one-time jobs, java/scala for recurring jobs

http://bit.ly/map-reduce-book
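As referenced in the list above, a sketch of the combiner and partitioner wiring; SumReducer and OwnerPartitioner are made-up names for illustration:

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Partitioner;

// hypothetical partitioner: keep all records of one owner on the same reducer
public class OwnerPartitioner extends Partitioner<LongWritable, LongWritable> {
  @Override
  public int getPartition(LongWritable key, LongWritable value, int numPartitions) {
    return (int) ((key.get() & Long.MAX_VALUE) % numPartitions);
  }
}

// in the driver: an associative/commutative reducer can double as the combiner
job.setCombinerClass(SumReducer.class);
job.setPartitionerClass(OwnerPartitioner.class);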

Page 29: Hadoop - Lessons Learned

Operations

pdsh -w "hdd[001-019]" \
  "sudo sv restart /etc/sv/hadoop-tasktracker"

runit / init.d

pdsh / dsh

use chef / puppet

Page 30: Hadoop - Lessons Learned

Hardware

· 2x name nodes, raid 1:
  12 cores, 48GB RAM, xfs, 2x1TB

· n x data nodes, no raid:
  12 cores, 16GB RAM, xfs, 4x2TB

Page 31: Hadoop - Lessons Learned

Monitoring

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=...

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=...

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=...

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=...

# ignore
ugi.class=org.apache.hadoop.metrics.spi.NullContext

Page 32: Hadoop - Lessons Learned

Monitoring

[graph: total capacity vs. capacity used]

Page 33: Hadoop - Lessons Learned

Compression

[chart: # of 64MB blocks, # of bytes needed, # of bytes used, # of bytes reclaimed]

bzip2 / gzip / lzo / snappy

io.seqfile.compression.type = BLOCK
io.seqfile.compression.blocksize = 512000
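The same block compression can be requested per job through the output format. A sketch, assuming the new-API SequenceFileOutputFormat and an available codec such as Snappy:

import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// in the driver:
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
SequenceFileOutputFormat.setOutputCompressionType(job, CompressionType.BLOCK);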

Page 34: Hadoop - Lessons Learned

Janitor

hadoop-expire -url namenode.here -path /tmp -mtime 7d -delete

Page 35: Hadoop - Lessons Learned

The last block of an HDFS file only occupies the required space. So a 4k file only consumes 4k on disk. -- Owen

BUSTED

Page 36: Hadoop - Lessons Learned

find /var/log/hadoop \
  \( -wholename "/var/log/hadoop/hadoop-*" \
  -o -wholename "/var/log/hadoop/job_*.xml" \
  -o -wholename "/var/log/hadoop/history/*" \
  -o -wholename "/var/log/hadoop/history/.*.crc" \
  -o -wholename "/var/log/hadoop/history/done/*" \
  -o -wholename "/var/log/hadoop/history/done/.*.crc" \
  -o -wholename "/var/log/hadoop/userlogs/attempt_*" \) \
  -daystart -mtime +7 \
  -delete

Logfiles

Page 37: Hadoop - Lessons Learned

Limits

limits.conf:

hdfs   hard nofile 128000
hdfs   soft nofile  64000
mapred hard nofile 128000
mapred soft nofile  64000

sysctl.conf:

fs.file-max = 128000

Page 38: Hadoop - Lessons Learned

Localhost

before:

127.0.0.1 localhost localhost.localdomain
127.0.1.1 hdd01

hadoop:

127.0.0.1 localhost localhost.localdomain
127.0.1.1 hdd01.some.net hdd01

Page 39: Hadoop - Lessons Learned

Rackaware

site config:

<property>
  <name>topology.script.file.name</name>
  <value>/path/to/script/location-from-ip</value>
  <final>true</final>
</property>

topology script:

#!/usr/bin/ruby
location = {
  'hdd001.some.net' => '/ams/1',
  '10.20.2.1'       => '/ams/1',
  'hdd002.some.net' => '/ams/2',
  '10.20.2.2'       => '/ams/2',
}

puts ARGV.map { |ip| location[ip] || '/default-rack' }.join(' ')

Page 40: Hadoop - Lessons Learned

for f in `hadoop fsck / | grep "Replica placement policy is violated" \
    | awk -F: '{print $1}' | sort | uniq | head -n1000`; do
  hadoop fs -setrep -w 4 $f
  hadoop fs -setrep 3 $f
done

Fix  the  Policy

Page 41: Hadoop - Lessons Learned

hadoop fsck / -openforwrite -files \
  | grep -i "OPENFORWRITE: MISSING 1 blocks of total size" \
  | awk '{print $1}' \
  | xargs -L 1 -i hadoop dfs -mv {} /lost+notfound

Fsck

Page 42: Hadoop - Lessons Learned

Community

[chart: traffic on the hadoop mailing list, from markmail.org]

Page 43: Hadoop - Lessons Learned

Community

The Enterprise Effect

“The Community Effect” (in 2011)

Page 44: Hadoop - Lessons Learned
Page 45: Hadoop - Lessons Learned
Page 46: Hadoop - Lessons Learned

Community

[chart: traffic on the mapreduce and core mailing lists, from markmail.org]

Page 47: Hadoop - Lessons Learned

The Future

real time, incremental

flexible pipelines, refined API

refined implementation

Page 48: Hadoop - Lessons Learned

Real Time Datamining and Aggregation at Scale (Ted Dunning)

Eventually Consistent Data Structures (Sean Cribbs)

Real-time Analytics with HBase (Alex Baranau)

Profiling and performance-tuning your Hadoop pipelines (Aaron Beppu)

From Batch to Realtime with Hadoop (Lars George)

Event-Stream Processing with Kafka (Tim Lossen)

Real-/Neartime analysis with Hadoop & VoltDB (Ralf Neeb)

Page 49: Hadoop - Lessons Learned

Take Aways

· use hadoop only if you must

· really understand the pipeline

· unbox the black box

Page 50: Hadoop - Lessons Learned

@tcurdt
github.com/tcurdt

yourdailygeekery.com

That’s it folks!