Map Reduce Basics, Chapter 2


Upload: xiang

Post on 23-Feb-2016



TRANSCRIPT

Page 1: Map Reduce Basics  Chapter 2
Page 2: Map Reduce Basics  Chapter 2

MAP REDUCE BASICS CHAPTER 2

Page 3: Map Reduce Basics  Chapter 2

Basics
• Divide and conquer
– Partition a large problem into smaller subproblems
– Workers work on the subproblems in parallel
• Parallelism exists at every level: threads in a core, cores in a multi-core processor, multiple processors in a machine, machines in a cluster
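The divide-and-conquer pattern in the bullets above can be sketched in Python (an illustrative sketch, not from the slides; the names `parallel_sum` and `subproblem_sum` are ours): partition the problem into chunks, let workers solve chunks in parallel, then combine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

def subproblem_sum(chunk):
    # each worker solves one subproblem independently
    return sum(chunk)

def parallel_sum(data, n_workers=4):
    # partition the large problem into smaller subproblems
    size = max(1, len(data) // n_workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # workers run on subproblems in parallel; results are combined at the end
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(subproblem_sum, chunks))

print(parallel_sum(list(range(1, 101))))  # → 5050
```

The same shape recurs at every hardware level listed above; only the "worker" changes.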

Page 4: Map Reduce Basics  Chapter 2

History: CPUs

• Single CPU – inserted into a single CPU socket on the motherboard

• Scaling this up didn't work well:
– Adding more CPU sockets gives multiple CPUs
– But additional hardware is needed to connect them to RAM and other resources
– Lots of overhead

Page 5: Map Reduce Basics  Chapter 2

Hyper-threading

• Brought parallel computation to PCs – around 2000
• Pentium 4 – a single CPU core, but with hyper-threading (HT)
– Appears as 2 logical CPUs
– MIMD, but the logical CPUs share execution resources
– Needs OS support to take advantage of it

• The number of logical cores is the number of physical cores times the number of threads each core can run

Page 6: Map Reduce Basics  Chapter 2

Multi-core

• Additional cores added
• Single CPU socket, all cores on the same chip, so less latency
– If dual core, the CPU has 2 central processing units
– Can run 2 processes at the same time
– Since all cores are on the same chip:
• No extra hardware needed to connect to RAM, etc.
• Faster communication between cores

Page 7: Map Reduce Basics  Chapter 2

vCPU

• Virtual processor
• A VM is assigned a vCPU – a share of a physical CPU assigned to the VM
• A vCPU is a series of time slots on logical processors
• Adding more vCPUs can increase wait time

Page 8: Map Reduce Basics  Chapter 2

How to increase vCPUs

• If a vCPU is presented to the OS as a single-core CPU in a single socket, that limits the number of vCPUs

• Typically the OS restricts the number of physical CPUs, not logical CPUs

• Multicore vCPUs – VMware introduced virtual sockets and cores per socket, e.g. 2 virtual sockets and 4 virtual cores per socket allows 8 vCPUs

• Or hyper-threads can be used to increase vCPUs

Page 9: Map Reduce Basics  Chapter 2

Multiple threads on a VM
• With 2 hyper-threads on each of 4 cores, the VM thinks it has 8 cores
• What if it wants all 8 threads at once?

Page 10: Map Reduce Basics  Chapter 2

• Each thread runs at 50% utilization while each physical CPU runs at 100%

• Avoid creating a single VM with more vCPUs than physical cores

• If more are needed, split up the VM

Page 11: Map Reduce Basics  Chapter 2

Basics

• MR – an abstraction that hides system-level details from the programmer

• Move code to data
– Spread data across disks
– A distributed file system (DFS) manages storage

Page 12: Map Reduce Basics  Chapter 2

Topics

• Functional programming• MapReduce• Distributed file system

Page 13: Map Reduce Basics  Chapter 2

Functional Programming Roots
• MapReduce = functional programming plus distributed processing on steroids
– Not a new idea… dates back to the 50s (or even the 30s)
• What is functional programming?
– Computation as application of functions
– Computation is evaluation of mathematical functions
– Avoids state and mutable data
– Emphasizes application of functions instead of changes in state

Page 14: Map Reduce Basics  Chapter 2

Functional Programming Roots
• How is it different?
– Traditional notions of "data" and "instructions" are not applicable
– Data flows are implicit in the program
– Different orders of execution are possible
– Theoretical foundation provided by the lambda calculus
• A formal system for function definition
• Exemplified by LISP, Scheme

Page 15: Map Reduce Basics  Chapter 2

Overview of Lisp

• Functions are written in prefix notation, where operators precede operands

(+ 1 2) → 3
(* 3 4) → 12
(sqrt (+ (* 3 3) (* 4 4))) → 5
(define x 3) → x
(* x 5) → 15

Page 16: Map Reduce Basics  Chapter 2

Functions
• Functions = lambda (anonymous) expressions bound to variables
• Example expressed with lambda: (+ 1 2) → 3 is the application of λx.λy.(x + y) to 1 and 2

• The definition:

(define (foo x y) (sqrt (+ (* x x) (* y y))))

• is equivalent to binding a lambda expression to a variable:

(define foo (lambda (x y) (sqrt (+ (* x x) (* y y)))))

• Once defined, the function can be applied:

(foo 3 4) → 5

Page 17: Map Reduce Basics  Chapter 2

Functional Programming Roots

• Map and Fold
• Two important concepts in functional programming
– Map: do something to everything in a list
– Fold: combine the results of a list in some way

Page 18: Map Reduce Basics  Chapter 2

Functional Programming Map
• Higher-order functions – accept other functions as arguments
– Map
• Takes a function f and its argument, which is a list
• Applies f to all elements in the list
• Returns a list as the result

• Lists are primitive data types
– [1 2 3 4 5]
– [[a 1] [b 2] [c 3]]

Page 19: Map Reduce Basics  Chapter 2

Map/Fold in Action
• Simple map example:

(map (lambda (x) (* x x)) [1 2 3 4 5]) → [1 4 9 16 25]

Page 20: Map Reduce Basics  Chapter 2

Functional Programming Reduce
– Fold
• Takes a function g, which has 2 arguments: an initial value and a list
• g is applied to the initial value and the 1st item in the list
• The result is stored in an intermediate variable
• The intermediate variable and the next item in the list form the 2nd application of g, etc.
• Fold returns the final value of the intermediate variable
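The fold procedure described above can be sketched as a loop in Python (an illustrative definition, not from the slides; the name `fold` is ours):

```python
import operator

def fold(g, initial, items):
    # g is applied to the initial value and the 1st item; the result is
    # stored in an intermediate variable, which is then combined with the
    # next item (the 2nd application of g), and so on
    acc = initial
    for item in items:
        acc = g(acc, item)
    # fold returns the final value of the intermediate variable
    return acc

print(fold(operator.add, 0, [1, 2, 3, 4, 5]))  # → 15
print(fold(operator.mul, 1, [1, 2, 3, 4, 5]))  # → 120
```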

Page 21: Map Reduce Basics  Chapter 2
Page 22: Map Reduce Basics  Chapter 2

Map/Fold in Action
• Simple map example:

(map (lambda (x) (* x x)) [1 2 3 4 5]) → [1 4 9 16 25]

• Fold examples:

(fold + 0 [1 2 3 4 5]) → 15
(fold * 1 [1 2 3 4 5]) → 120

• Sum of squares, written with map and fold:

(define (sum-of-squares v)   ; where v is a list
  (fold + 0 (map (lambda (x) (* x x)) v)))
(sum-of-squares [1 2 3 4 5]) → 55
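For comparison, the same map-then-fold composition can be written in Python, using `functools.reduce` in place of fold (a sketch, not part of the original slides):

```python
from functools import reduce

def sum_of_squares(v):
    # map: square every element; reduce (fold): sum the squares,
    # starting from the initial value 0
    return reduce(lambda acc, x: acc + x, map(lambda x: x * x, v), 0)

print(sum_of_squares([1, 2, 3, 4, 5]))  # → 55
```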

Page 23: Map Reduce Basics  Chapter 2

Functional Programming Roots

• Map and fold are used in combination
• Map – transformation of a dataset
• Fold – aggregation operation
• Map can be applied in parallel
• Fold has more restrictions – elements must be brought together
– But many applications do not require g to be applied to all elements of the list at once, so fold aggregations can also run in parallel

Page 24: Map Reduce Basics  Chapter 2

MapReduce - Functional Programming Roots

• Input goes to a function; the function is applied
• The function emits output
– The output can be used as input to the next stage

Page 25: Map Reduce Basics  Chapter 2

MapReduce

• Map in MapReduce is the same as in functional programming

• Reduce corresponds to fold
• 2 stages:
– A user-specified computation is applied over all input; it can occur in parallel and returns intermediate output
– The output is aggregated by another user-specified computation

Page 26: Map Reduce Basics  Chapter 2

Mappers/Reducers

• Key-value pair (k,v) – the basic data structure in MR

• Keys, values – ints, strings, etc.; user defined
– e.g. (k – URLs, v – HTML content)
– e.g. (k – node ids, v – adjacency lists of nodes)

Map: (k1, v1) → [(k2, v2)]
Reduce: (k2, [v2]) → [(k3, v3)]

Where […] denotes a list
Notice the output of Map and the input to Reduce are different
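These signatures can be illustrated with the node-id/adjacency-list example above: the following Python sketch (our own; the names `map_fn` and `reduce_fn` are hypothetical) inverts a small graph's edges, and the reducer's output differs in type from the mapper's input.

```python
from collections import defaultdict

# Map: (k1, v1) -> [(k2, v2)] — k1 is a node id, v1 its adjacency list;
# each emitted pair inverts one edge
def map_fn(node, neighbors):
    return [(n, node) for n in neighbors]

# Reduce: (k2, [v2]) -> [(k3, v3)] — collect all in-neighbors of a node
def reduce_fn(node, sources):
    return [(node, sorted(sources))]

pairs = map_fn("a", ["b", "c"]) + map_fn("b", ["c"])
# grouping by key is the execution framework's job
groups = defaultdict(list)
for k, v in pairs:
    groups[k].append(v)
print([reduce_fn(k, vs)[0] for k, vs in sorted(groups.items())])
# → [('b', ['a']), ('c', ['a', 'b'])]
```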

Page 27: Map Reduce Basics  Chapter 2

General Flow
• Apply the mapper to every input key-value pair stored in the DFS
• Generate an arbitrary number of intermediate (k,v) pairs
• Group-by operation on intermediate keys within the mapper (really a sort, but called a shuffle)
• Distribute intermediate results by key – not just across reducers but across the network (really a shuffle, but called a sort)
• Aggregate intermediate results
• Generate final output to the DFS – one file per reducer

(Diagram: the Map and Reduce phases of this flow)

Page 28: Map Reduce Basics  Chapter 2

What function is implemented?


Page 29: Map Reduce Basics  Chapter 2

Another Example: unigram (word count)

• (docid, doc) pairs on the DFS; doc is text
• The mapper tokenizes (docid, doc) and emits a (k,v) pair for every word – (word, 1)
• The execution framework brings all identical keys together at a reducer
• The reducer sums all the counts (of 1) for each word
• Each reducer writes to one file
• Words within a file are sorted; each file holds about the same # of words
• The output can be used as input to another MR job
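The word-count flow described above can be sketched in Python as a toy, single-process simulation of what the execution framework does (all function names are ours, not Hadoop's API):

```python
from collections import defaultdict

def mapper(docid, doc):
    # tokenize and emit (word, 1) for every word
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # the execution framework brings all values for the same key together
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reducer(word, counts):
    # sum all the counts (of 1) for this word
    return (word, sum(counts))

docs = {"d1": "the cat sat", "d2": "the cat ran"}
intermediate = [p for docid, doc in docs.items() for p in mapper(docid, doc)]
result = dict(reducer(w, c) for w, c in shuffle(intermediate).items())
print(result)  # → {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}
```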

Page 30: Map Reduce Basics  Chapter 2

• Hadoop libraries for MapReduce

Page 31: Map Reduce Basics  Chapter 2
Page 32: Map Reduce Basics  Chapter 2

Combine - Bandwidth Optimization

• Issue: Can be a large number of key-value pairs– Example – word count (word, 1)– If copy across network intermediate data > input

Page 33: Map Reduce Basics  Chapter 2

Combine - Bandwidth Optimization

• Solution: use combiner functions
– Allow local aggregation (after the mapper) before the shuffle/sort

• Word count – aggregate locally (count each word on the mapper's machine)
– Intermediate pairs = # of unique words

– Executed on the same machine as the mapper – sees no output from other mappers

– Results in a "mini-reduce" right after the map phase
– The combiner's (k,v) types must be the same as its input/output types
– If the operation is associative and commutative, the reducer can be the same as the combiner
– Reduces key-value pairs to save bandwidth
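A combiner for word count can be sketched as follows (our own illustration; function names are hypothetical) — note that its input and output pairs have the same types as the mapper's output:

```python
from collections import Counter

def mapper(docid, doc):
    # emit (word, 1) for every word, as in word count
    return [(w, 1) for w in doc.split()]

def combiner(pairs):
    # a "mini-reduce" on the mapper's own machine, before the shuffle;
    # input and output (k, v) types match the mapper's output types
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return list(counts.items())

pairs = mapper("d1", "the cat saw the dog")
print(len(pairs))            # 5 pairs would cross the network without a combiner
print(len(combiner(pairs)))  # 4 pairs after local aggregation (# unique words)
```

Because addition is associative and commutative, the reducer could reuse this same aggregation logic.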

Page 34: Map Reduce Basics  Chapter 2

Partitioners – Load Balance

• Issue: intermediate results can all land on one reducer
• Solution: use partitioner functions
– Divide up the intermediate key space and assign (k,v) pairs to reducers
– Specifies the task to which each (k,v) pair is copied
– The reducer processes its keys in sorted order
– The partitioner applies a function to the key
– Hopefully each reducer gets about the same number of pairs

• But the key distribution may be Zipfian (skewed)

Page 35: Map Reduce Basics  Chapter 2

MapReduce
• Programmers specify two functions:

map (k, v) → <k', v'>*
reduce (k', v') → <k', v'>*
– All v' with the same k' are reduced together

• Usually, programmers also specify:

partition (k', number of partitions) → partition for k'
– Often a simple hash of the key, e.g. hash(k') mod n
• Where n is the number of reducers
– Allows reduce operations for different keys to run in parallel
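A minimal hash(k') mod n partitioner can be sketched in Python (our own illustration; `hashlib.md5` is our choice so the hash is stable across machines and runs — Python's built-in `hash` is salted per process):

```python
import hashlib

def partition(key, num_reducers):
    # hash(k') mod n — a stable hash, so the same key always
    # goes to the same reducer, no matter which mapper emits it
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_reducers

keys = ["apple", "banana", "cherry", "apple"]
assignments = [partition(k, 4) for k in keys]
# the same key always lands on the same reducer
print(assignments[0] == assignments[3])  # → True
```

With a Zipfian key distribution, even a good hash can overload one reducer, since a single hot key cannot be split.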

Page 36: Map Reduce Basics  Chapter 2
Page 37: Map Reduce Basics  Chapter 2

It's not just Map and Reduce
• Apply the mapper to every input key-value pair stored in the DFS (Map)
• Generate an arbitrary number of intermediate (k,v) pairs
• Aggregate locally (Combine)
• Assign pairs to reducers (Partition)
• Group-by operation on intermediate keys
• Distribute intermediate results by key across the network
• Aggregate intermediate results (Reduce)
• Generate final output to the DFS – one file per reducer

Page 38: Map Reduce Basics  Chapter 2

Execution Framework

• A MapReduce program (job) contains
• Code for mappers
• Combiners (optional)
• Partitioners (optional)
• Code for reducers
• Configuration parameters (where is the input, where to store the output)

– The execution framework takes care of everything else
– The developer submits the job to the submission node of the cluster (the jobtracker)

Page 39: Map Reduce Basics  Chapter 2

Recall these problems?• How do we assign work units to workers?• What if we have more work units than workers?• What if workers need to share partial results?• How do we aggregate partial results?• How do we know all the workers have finished?• What if workers die?

• MapReduce takes care of all this

Page 40: Map Reduce Basics  Chapter 2

Execution Framework of MapReduce

• Scheduling
– A job is divided into tasks (each covering a certain block of (k,v) pairs)
– 1000s of tasks may need to be assigned
– This may exceed the number that can run concurrently
– Waiting tasks sit in a task queue
– Requires coordination among tasks from different jobs

Page 41: Map Reduce Basics  Chapter 2

Execution Framework

• Synchronization
– Concurrently running processes must join up
– Intermediate (k,v) pairs are grouped by key; the intermediate data is copied over the network in the shuffle/sort

• Number of copy operations (M mappers, R reducers)
• Worst case?
– M × R copy operations
• Each mapper may send intermediate results to every reducer

– Reduce computation cannot start until all mappers have finished and the (k,v) pairs have been shuffled/sorted
• This differs from functional programming

– But intermediate (k,v) pairs can be copied over the network to a reducer as each mapper finishes

Page 42: Map Reduce Basics  Chapter 2

Execution Framework of MapReduce

• Speculative execution
• The map phase is only as fast as?
– the slowest map task
• Problem: stragglers, flaky hardware
• Solution: use speculative execution:
– Run an exact copy of the same task on a different machine
– Use the result of whichever task finishes first
– Better for map or reduce?
– Can improve running time by 44% (Google)
– Doesn't help if the distribution of values is skewed

Page 43: Map Reduce Basics  Chapter 2

Execution Framework

• Data/code co-location
– Execute code near its data
– If that is not possible, the data must be streamed to the code
• Try to keep it within the same rack

Page 44: Map Reduce Basics  Chapter 2

Execution Framework

• Error/fault handling
– Failures are the norm
– Disk failures, RAM errors, datacenter outages
– Software errors
– Corrupted data

Page 45: Map Reduce Basics  Chapter 2

Map Reduce
• Implementations:
– Google has a proprietary implementation in C++
– Hadoop is an open-source implementation in Java (led by Yahoo, now Apache)

• Hadoop is a framework for storage and large-scale processing of data sets on clusters of commodity hardware
• Walmart uses Hadoop for storage (although they originally broke it – it wouldn't scale as needed)
• P&G uses Hive for data analytics – built on top of Hadoop

Page 46: Map Reduce Basics  Chapter 2

Differences in MapReduce Implementations

• Hadoop (Apache) vs. Google
– Google – a program can specify a secondary sort; the key can't be changed in the reducer
– Hadoop – values are arbitrarily ordered; the key can be changed in the reducer

• Hadoop
– The programmer can specify the number of map tasks, but the framework makes the final decision
– For reduce, the programmer-specified number of tasks is used

Page 47: Map Reduce Basics  Chapter 2

Hadoop
• Be careful using external resources
– e.g. a bottleneck from querying a SQL DB

• Mappers can emit an arbitrary number of intermediate (k,v) pairs, which can be of a different type than the input
• Reducers can emit an arbitrary number of final (k,v) pairs, which can be of a different type than the intermediate (k,v) pairs
• Different from functional programming: can have side effects (internal state change – may cause problems; external – may write to files)

• MapReduce can have no reducer, but must have a mapper
– Can just pass the identity function to the reducer
– May not have any input (e.g. compute pi)

Page 48: Map Reduce Basics  Chapter 2

Other Sources

• Other sources can serve as the source/destination for MapReduce data
– Google – BigTable
– HBase – BigTable clone
– Hadoop – integrates RDBs with parallel processing; can write to DB tables

Page 49: Map Reduce Basics  Chapter 2

File Systems – GFS vs DFS

• Distributed File System (DFS)
– In HPC, storage is distinct from computation
– NAS (network-attached storage) and SAN are common

• Separate, dedicated nodes for storage
– Fetch, load, process, write
– The storage link becomes a bottleneck

• Higher-performance networks cost $$ (10G Ethernet); special-purpose interconnects cost $$$ (InfiniBand)
– Cost increases non-linearly

– In GFS, computation and storage are not distinct components

Page 50: Map Reduce Basics  Chapter 2

Hadoop Distributed File System - HDFS
• GFS supports Google's proprietary MapReduce
• HDFS supports Hadoop
• Hadoop doesn't have to run on HDFS, but it misses these advantages otherwise

• Differences of GFS and HDFS vs. a traditional DFS:
– Adapted to large-data processing
– Divide user data into chunks/blocks – LARGE ones
– Replicate these blocks across the local disks of nodes in the cluster
– Master-slave architecture

Page 51: Map Reduce Basics  Chapter 2

HDFS vs GFS (Google File System)

• Differences within HDFS:
– Master-slave architecture

• GFS: master (master), slave (chunkserver)
• HDFS: master (namenode), slave (datanode)

– Master – namespace (metadata, directory structure, file-to-block mapping, location of blocks, access permissions)

– Slaves – manage the actual data blocks
– A client contacts the namenode, then gets data from the slaves; 3 copies of each block, etc.
– A block is 64 MB
– Initially files were immutable – once closed, they cannot be modified

Page 52: Map Reduce Basics  Chapter 2
Page 53: Map Reduce Basics  Chapter 2

HDFS

• Namenode
– Namespace management
– Coordinates file operations

• Lazy garbage collection
– Maintains file system health

• Heartbeats, under-replication, balancing

• Supports a subset of the POSIX API, pushed to the application
• No security

Page 54: Map Reduce Basics  Chapter 2
Page 55: Map Reduce Basics  Chapter 2

Hadoop Cluster Architecture

• The HDFS namenode runs a daemon
• The job submission node runs the jobtracker – the point of contact for running MapReduce
– Monitors the progress of MapReduce jobs, coordinates mappers and reducers

• Slaves run a tasktracker
– Runs the user's code and the datanode daemon, serves HDFS data
– Sends heartbeat messages to the jobtracker

Page 56: Map Reduce Basics  Chapter 2

Hadoop Cluster Architecture

• The number of reduce tasks depends on the number of reducers specified by the programmer

• The number of map tasks depends on
– A hint from the programmer
– The number of input files
– The number of HDFS data blocks of the files

Page 57: Map Reduce Basics  Chapter 2

Hadoop Cluster Architecture

• Map tasks are assigned a set of (k,v) pairs called an input split
• Input splits are computed automatically
• Aligned on HDFS block boundaries so each split is associated with a single block, which simplifies scheduling
• Data locality: if the data isn't local, stream it across the network (same rack if possible)

Page 58: Map Reduce Basics  Chapter 2

• How can we use MapReduce to solve problems?

• Refresh your memory on Dijkstra’s algorithm

Page 60: Map Reduce Basics  Chapter 2

Hadoop Cluster Architecture

• Mappers in Hadoop
– Java objects with a MAP method
– A mapper object is instantiated for every map task by the tasktracker
– Life cycle – instantiation, then a hook in the API for program-specified initialization code

• Mappers can load state, static data sources, dictionaries, etc.

– After initialization: the MAP method is called by the framework on all (k,v) pairs in the input split

– Method calls happen within the same Java object, so state can be preserved across multiple (k,v) pairs in the same task

– Can run programmer-specified termination code

Page 61: Map Reduce Basics  Chapter 2

Hadoop Cluster Architecture

• Reducers in Hadoop
– Execution is similar to that of mappers
• Instantiation, initialization, then the framework calls the REDUCE method with an intermediate key and an iterator over all of that key's values
• Intermediate keys arrive in sorted order
• Can preserve state across multiple intermediate keys

Page 62: Map Reduce Basics  Chapter 2

CAP Theorem

• Consistency, availability, partition tolerance• Cannot satisfy all 3• Partitioning unavoidable in large data systems,

must trade off availability and consistency– If master fails, system is unavailable so consistent!– If multiple masters, more available, but inconsistent

• Workaround to single namenode– Warm standby namenode– Hadoop community working on it

Page 63: Map Reduce Basics  Chapter 2