cloud ppt.pptx

Upload: chetanjyotsna

Post on 02-Jun-2018

233 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/11/2019 cloud ppt.pptx

    1/30

    Resilient Distributed DatasetsA Fault-Tolerant Abstraction for

    In-Memory Cluster Computing

    Matei Zaharia, Mosharaf Chowdhury, Tathagata Das,Ankur Dave, Justin Ma, Murphy McCauley,Michael Franklin, Scott Shenker, Ion Stoica

  • 8/11/2019 cloud ppt.pptx

    2/30

    MotivationMapReduce greatly simplified big data analysison large, unreliable clusters

    But as soon as it got popular, users wanted more:

    More complex, multi-stage applications

    (e.g. iterative machine learning & graph processing)

    More interactivead-hoc queries

    Response: specializedframeworks for some of

    these apps (e.g. Pregel for graph processing)

  • 8/11/2019 cloud ppt.pptx

    3/30

    MotivationComplex apps and interactive queries both needone thing that MapReduce lacks:

    Efficient primitives for data sharing

    In MapReduce, the only way to share dataacross jobs is stable storageslow!

  • 8/11/2019 cloud ppt.pptx

    4/30

    Examples

    iter. 1 iter. 2 . . .

    Input

    HDFSread

    HDFSwrite

    HDFSread

    HDFSwrite

    Input

    query 1

    query 2

    query 3

    result 1

    result 2

    result 3

    . . .

    HDFS

    read

    Slow due to replication and disk I/O,

    but necessary for fault tolerance

  • 8/11/2019 cloud ppt.pptx

    5/30

    iter. 1 iter. 2 . . .

    Input

    Goal: In-Memory Data Sharing

    Input

    query 1

    query 2

    query 3

    . . .

    one-time

    processing

    10-100 faster than network/disk, but how to get FT?

  • 8/11/2019 cloud ppt.pptx

    6/30

    Challenge

    How to design a distributed memory abstractionthat is both fault-tolerantand efficient?

  • 8/11/2019 cloud ppt.pptx

    7/30

    Challenge

    Existing storage abstractions have interfacesbased onfine-grainedupdates to mutable state

    RAMCloud, databases, distributed mem, Piccolo

    Requires replicating data or logs across nodesfor fault tolerance

    Costly for data-intensive apps10-100x slower than memory write

  • 8/11/2019 cloud ppt.pptx

    8/30

    Solution: Resilient Distributed

    Datasets (RDDs)Restricted form of distributed shared memory

    Immutable, partitioned collections of recordsCan only be built through coarse-grained

    deterministic transformations (map, filter, join, )

    Efficient fault recovery using lineageLog one operation to apply to many elementsRecompute lost partitions on failure

    No cost if nothing fails

  • 8/11/2019 cloud ppt.pptx

    9/30

    Input

    query 1

    query 2

    query 3

    . . .

    RDD Recovery

    one-time

    processing

    iter. 1 iter. 2 . . .

    Input

  • 8/11/2019 cloud ppt.pptx

    10/30

    Generality of RDDs

    Despite their restrictions, RDDs can expresssurprisingly many parallel algorithms

    These naturally apply the same operation to many items

    Unify many current programming modelsData flow models:MapReduce, Dryad, SQL,

    Specialized modelsfor iterative apps: BSP (Pregel),iterative MapReduce (Haloop), bulk incremental,

    Support new apps that these models dont

  • 8/11/2019 cloud ppt.pptx

    11/30

    Memory

    bandwidth

    Network

    bandwidth

    Tradeoff Space

    Granularityof Updates

    Write Throughput

    Fine

    Coarse

    Low High

    K-V

    stores,databases,

    RAMCloudBest for batch

    workloads

    Best for

    transactionalworkloads

    HDFS RDDs

  • 8/11/2019 cloud ppt.pptx

    12/30

    Outline

    Spark programming interface

    Implementation

    Demo

    How people are using Spark

  • 8/11/2019 cloud ppt.pptx

    13/30

    Spark Programming Interface

    DryadLINQ-like API in the Scala language

    Usable interactively from Scala interpreter

    Provides:Resilient distributed datasets (RDDs)Operations on RDDs: transformations(build new RDDs),

    actions(compute and output results)Control of each RDDspartitioning(layout across nodes)

    andpersistence(storage in RAM, on disk, etc)

  • 8/11/2019 cloud ppt.pptx

    14/30

    Example: Log Mining

    Load error messages from a log into memory, theninteractively search for various patterns

    lines = spark.textFile(hdfs://...)

    errors = lines.filter(_.startsWith(ERROR))messages = errors.map(_.split(\t)(2))

    messages.persist()

    Block 1

    Block 2

    Block 3

    Worker

    Worker

    Worker

    Master

    messages.filter(_.contains(foo)).count

    messages.filter(_.contains(bar)).count

    tasks

    results

    Msgs. 1

    Msgs. 2

    Msgs. 3

    Base RDDTransformed RDD

    Action

    Result:full-text search of Wikipediain

  • 8/11/2019 cloud ppt.pptx

    15/30

    RDDs track the graph of transformations thatbuilt them (their lineage) to rebuild lost data

    E.g.: messages = textFile(...).filter(_.contains(error)).map(_.split(\t)(2))

    HadoopRDD

    path = hdfs://

    FilteredRDD

    func = _.contains(...)

    MappedRDD

    func = _.split()

    Fault Recovery

    HadoopRDD FilteredRDD MappedRDD

  • 8/11/2019 cloud ppt.pptx

    16/30

    Fault Recovery Results

    119

    57 56 58 58

    81

    57 59 57 59

    020

    40

    60

    80100

    120

    140

    1 2 3 4 5 6 7 8 9 10

    I

    teratriontime

    (s)

    Iteration

    Failure happens

  • 8/11/2019 cloud ppt.pptx

    17/30

    Example: PageRank

    1. Start each page with a rank of 1

    2. On each iteration, update each pages rank to

    ineighborsranki/ |neighborsi|

    links = // RDD of (url, neighbors) pairsranks = // RDD of (url, rank) pairs

    for (i links.map(dest => (dest, rank/links.size))

    }.reduceByKey(_ + _)

    }

  • 8/11/2019 cloud ppt.pptx

    18/30

    Optimizing Placement

    links& ranksrepeatedly joined

    Can co-partitionthem (e.g. hash

    both on URL) to avoid shuffles

    Can also use app knowledge,e.g., hash on DNS name

    links = links.partitionBy(new URLPartitioner())

    reduce

    Contribs0

    join

    join

    Contribs2

    Ranks0(url, rank)

    Links(url, neighbors)

    . . .

    Ranks2

    reduce

    Ranks1

  • 8/11/2019 cloud ppt.pptx

    19/30

    PageRank Performance

    171

    72

    23

    0

    50

    100

    150

    200

    Tim

    eperiteration

    (s)

    Hadoop

    Basic Spark

    Spark + Controlled

    Partitioning

  • 8/11/2019 cloud ppt.pptx

    20/30

    Implementation

    Runs on Mesos [NSDI 11]to share clusters w/ Hadoop

    Can read from any Hadoopinput source (HDFS, S3, )

    Spark Hadoop MPI

    Mesos

    Node Node Node Node

    No changes to Scala language or compiler Reflection + bytecode analysis to correctly ship code

    www.spark-project.org

    http://www.spark-project.org/http://www.spark-project.org/http://www.spark-project.org/http://www.spark-project.org/
  • 8/11/2019 cloud ppt.pptx

    21/30

  • 8/11/2019 cloud ppt.pptx

    22/30

    Demo

  • 8/11/2019 cloud ppt.pptx

    23/30

    Open Source Community

    15contributors, 5+companies using Spark,3+applications projects at Berkeley

    User applications:Data mining 40x faster than Hadoop (Conviva)Exploratory log analysis (Foursquare)

    Traffic prediction via EM (Mobile Millennium)Twitter spam classification (Monarch)DNA sequence analysis (SNAP) . . .

  • 8/11/2019 cloud ppt.pptx

    24/30

    Related WorkRAMCloud, Piccolo, GraphLab, parallel DBs

    Fine-grained writes requiring replication for resilience

    Pregel, iterative MapReduce

    Specialized models; cant run arbitrary / ad-hoc queriesDryadLINQ, FlumeJava

    Language-integrated distributed dataset API, but cannotshare datasets efficiently acrossqueries

    Nectar [OSDI 10] Automatic expression caching, but over distributed FS

    PacMan [NSDI 12] Memory cache for HDFS, but writes still go to network/disk

  • 8/11/2019 cloud ppt.pptx

    25/30

  • 8/11/2019 cloud ppt.pptx

    26/30

    Behavior with Insufficient RAM

    68.

    8

    58.1

    40.7

    29.7

    11.5

    0

    20

    40

    60

    80

    100

    0% 25% 50% 75% 100%

    Iterationtim

    e(s)

    Percent of working set in memory

  • 8/11/2019 cloud ppt.pptx

    27/30

    Scalability

    184

    111

    76

    116

    80

    62

    15

    6

    3

    0

    50

    100

    150

    200

    250

    25 50 100

    Iterationtime(s

    )

    Number of machines

    Hadoop

    HadoopBinMem

    Spark

    274

    157

    106

    197

    121

    87

    143

    61

    33

    0

    50

    100

    150

    200250

    300

    25 50 100

    Iterationtime(s)

    Number of machines

    Hadoop

    HadoopBinMem

    Spark

    Logistic Regression K-Means

  • 8/11/2019 cloud ppt.pptx

    28/30

    Breaking Down the Speedup

    15.4

    13

    .1

    2

    .9

    8.4

    6.9

    2

    .9

    0

    5

    10

    15

    20

    In-mem HDFS In-mem local file Spark RDD

    Iterationtime(s)

    Text Input

    Binary Input

  • 8/11/2019 cloud ppt.pptx

    29/30

    Spark Operations

    Transformations

    (define a new RDD)

    mapfilter

    sample

    groupByKeyreduceByKey

    sortByKey

    flatMapunionjoin

    cogroupcross

    mapValues

    Actions(return a result todriver program)

    collect

    reducecountsave

    lookupKey

  • 8/11/2019 cloud ppt.pptx

    30/30

    Task SchedulerDryad-like DAGs

    Pipelines functions

    within a stage

    Locality & datareuse aware

    Partitioning-awareto avoid shuffles

    join

    union

    groupBy

    map

    Stage 3

    Stage 1

    Stage 2

    A: B:

    C: D:

    E:

    F:

    G:

    = cached data partition