ddb slides

Post on 25-Feb-2018


  • 7/25/2019 DDB Slides

    1/67

    Distributed Data Management

    Distributed Systems

    Department of Computer Science

    UC Irvine

  • 7/25/2019 DDB Slides

    2/67

    Centralized DB systems

    Software:

    Application
    SQL Front End
    Query Processor
    Transaction Proc.
    File Access

    [Figure: hardware diagram (label: M)]

    • Simplifications:
      - single front end
      - one place to keep data, locks
      - if processor fails, system fails, ...

  • 7/25/2019 DDB Slides

    3/67

    Distributed Database Systems

    • Multiple processors (+ memories)

    • Heterogeneity and autonomy of "components"

  • 7/25/2019 DDB Slides

    4/67

    Why do we need Distributed Databases?

    • Example: IBM has offices in London, New York, and Hong Kong.

    • Employee data:
      - EMP(ENO)

    • Where should the employee data table reside?

  • 7/25/2019 DDB Slides

    5/67

    IBM Data Access Pattern

    • Mostly, employee data is managed at the office where the employee works
      - E.g., payroll, benefits, hire and fire

    • Periodically, IBM needs consolidated access to employee data
      - E.g., IBM changes benefit plans and that affects all employees.
      - E.g., annual bonus depends on global net profit.

  • 7/25/2019 DDB Slides

    6/67

    [Figure: payroll apps at the London, New York, and Hong Kong sites reach a single EMP table over the Internet]

    Problem: NY and HK payroll apps run very slowly

  • 7/25/2019 DDB Slides

    7/67

    [Figure: the London, New York, and Hong Kong sites, connected over the Internet, each run a payroll app next to their own table: London Emp, NY Emp, HK Emp]

    Much better

  • 7/25/2019 DDB Slides

    8/67

    [Figure: the London, New York, and Hong Kong sites with their payroll apps and the London Emp, NY Emp, and HK Emp tables; an Annual Bonus app is added]

    Distribution provides opportunities for parallel execution

  • 7/25/2019 DDB Slides

    9/67

    [Figure: the three sites with their payroll apps, the Annual Bonus app, and the London Emp, NY Emp, and HK Emp tables]

  • 7/25/2019 DDB Slides

    10/67

    [Figure: the three sites with their payroll apps and the Annual Bonus app; each site now stores two tables: Lon, NY Emp; NY, HK Emp; HK, Lon Emp]

    Replication improves availability

  • 7/25/2019 DDB Slides

    11/67

    Homogeneous Distributed Databases

    • In a homogeneous distributed database
      - All sites have identical software
      - Are aware of each other and agree to cooperate in processing user requests.
      - Each site surrenders part of its autonomy in terms of right to change schemas or software
      - Appears to user as a single system

    • In a heterogeneous distributed database
      - Different sites may use different schemas and software
        Difference in schema is a major problem for query processing
        Difference in software is a major problem for transaction processing
      - Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing

  • 7/25/2019 DDB Slides

    12/67

    DB architectures

    (1) Shared memory

    [Figure: several processors sharing one memory (M)]

  • 7/25/2019 DDB Slides

    13/67

    DB architectures

    (2) Shared disk

    [Figure: several processors, each with its own memory (M), sharing the disks]

  • 7/25/2019 DDB Slides

    14/67

    DB architectures

    (3) Shared nothing

    [Figure: several nodes, each with its own processor, memory (M), and disk]

  • 7/25/2019 DDB Slides

    15/67

    DB architectures

    (4) Hybrid example: Hierarchical or Clustered

    [Figure: clusters of processors, each cluster sharing a memory (M)]

  • 7/25/2019 DDB Slides

    16/67

    Issues for selecting architecture

    • Reliability
    • Scalability
    • Geographic distribution of data
    • Performance
    • Cost

  • 7/25/2019 DDB Slides

    17/67

    Parallel or distributed DB system?

    • More similarities than differences

  • 7/25/2019 DDB Slides

    18/67

    • Typically, parallel DBs:
      - Fast interconnect
      - Homogeneous software
      - High performance is goal
      - Transparency is goal

  • 7/25/2019 DDB Slides

    19/67

    • Typically, distributed DBs:
      - Geographically distributed
      - Data sharing is goal (may run into heterogeneity, autonomy)
      - Disconnected operation possible

  • 7/25/2019 DDB Slides

    20/67

    Distributed Database Challenges

    • Distributed Database Design
      - Deciding what data goes where
      - Depends on data access patterns of major applications
      - Two subproblems:
        Fragmentation: partition tables into fragments
        Allocation: allocate fragments to nodes

  • 7/25/2019 DDB Slides

    21/67

    Distributed Data Storage

    • Assume relational data model

    • Replication
      - System maintains multiple copies of data, stored in different sites, for faster retrieval and fault tolerance.

    • Fragmentation
      - Relation is partitioned into several fragments stored in distinct sites

    • Replication and fragmentation can be combined
      - Relation is partitioned into several fragments: system maintains several identical replicas of each such fragment.

  • 7/25/2019 DDB Slides

    22/67

    Data Replication

    • A relation or fragment of a relation is replicated if it is stored redundantly in two or more sites.

    • Full replication of a relation is the case where the relation is stored at all sites.

    • Fully redundant databases are those in which every site contains a copy of the entire database.

  • 7/25/2019 DDB Slides

    23/67

    Data Replication (Cont.)

    • Advantages of Replication
      - Availability: failure of a site containing relation r does not result in unavailability of r if replicas exist.
      - Parallelism: queries on r may be processed by several nodes in parallel.
      - Reduced data transfer: relation r is available locally at each site containing a replica of r.

    • Disadvantages of Replication
      - Increased cost of updates: each replica of relation r must be updated.
      - Increased complexity of concurrency control: concurrent updates to distinct replicas may lead to inconsistent data unless special concurrency control mechanisms are implemented.
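    The trade-off listed above can be illustrated with a minimal sketch. It assumes a toy read-one/write-all scheme over in-memory dictionaries; the class name `ReplicatedRelation` and the site names are hypothetical and not part of the slides.

```python
class ReplicatedRelation:
    """Toy read-one/write-all replication of a key-value 'relation'."""

    def __init__(self, sites):
        # One full copy of the relation per site.
        self.replicas = {site: {} for site in sites}
        self.failed = set()

    def write(self, key, value):
        # Increased cost of updates: every replica must be written.
        for store in self.replicas.values():
            store[key] = value
        return len(self.replicas)  # number of replica updates performed

    def read(self, key):
        # Availability: any replica at a live site can answer the read.
        for site, store in self.replicas.items():
            if site not in self.failed:
                return site, store[key]
        raise RuntimeError("no replica available")


relation = ReplicatedRelation(["London", "NY", "HK"])
updates = relation.write("A-305", 500)   # one logical write, three physical writes
relation.failed.add("London")            # simulate a site failure
site, balance = relation.read("A-305")   # still answered by a surviving replica
```

    A single logical update costs one write per replica, while a read survives the failure of any one site. Concurrency control between competing writers is deliberately left out of this sketch.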

  • 7/25/2019 DDB Slides

    24/67

    Data Fragmentation

    • Division of relation r into fragments r1, r2, ..., rn which contain sufficient information to reconstruct relation r.

    • Horizontal fragmentation: each tuple of r is assigned to one or more fragments

    • Vertical fragmentation: the schema for relation r is split into several smaller schemas
      - All schemas must contain a common candidate key (or superkey) to ensure lossless join property.
      - A special attribute, the tuple-id attribute, may be added to each schema to serve as a candidate key.

    • Example: relation account with following schema

    • Account (branch_name, account_number, balance)

  • 7/25/2019 DDB Slides

    25/67

    Horizontal Fragmentation of account Relation

    branch_name   account_number   balance
    Hillside      A-305            500
    Hillside      A-226            336
    Hillside      A-155            62

    account1 = σ_{branch_name = "Hillside"}(account)

    branch_name   account_number   balance
    Valleyview    A-177            205
    Valleyview    A-402            10000
    Valleyview    A-408            1123
    Valleyview    A-639            750

    account2 = σ_{branch_name = "Valleyview"}(account)
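    A minimal sketch of this horizontal fragmentation, assuming the account relation is held as a list of Python tuples (the helper name `select_branch` is hypothetical). It shows that selecting by branch produces the two fragments and that their union reconstructs the original relation.

```python
# Rows of the account relation: (branch_name, account_number, balance).
account = [
    ("Hillside", "A-305", 500),
    ("Hillside", "A-226", 336),
    ("Hillside", "A-155", 62),
    ("Valleyview", "A-177", 205),
    ("Valleyview", "A-402", 10000),
    ("Valleyview", "A-408", 1123),
    ("Valleyview", "A-639", 750),
]


def select_branch(relation, branch):
    """Selection: keep the tuples whose branch_name equals `branch`."""
    return [row for row in relation if row[0] == branch]


# Each fragment would be stored at the site of its branch.
account1 = select_branch(account, "Hillside")
account2 = select_branch(account, "Valleyview")

# Reconstruction is the union of the fragments.
reconstructed = set(account1) | set(account2)
```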

  • 7/25/2019 DDB Slides

    26/67

    Vertical Fragmentation of employee_info Relation

    branch_name   customer_name   tuple_id
    Hillside      Lowman          1
    Hillside      Camp            2
    Valleyview    Camp            3
    Valleyview    Kahn            4
    Hillside      Kahn            5
    Valleyview    Kahn            6
    Valleyview    Green           7

    deposit1 = Π_{branch_name, customer_name, tuple_id}(employee_info)

    account_number   balance   tuple_id
    A-305            500       1
    A-226            336       2
    A-177            205       3
    A-402            10000     4
    A-155            62        5
    A-408            1123      6
    A-639            750       7

    deposit2 = Π_{account_number, balance, tuple_id}(employee_info)
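    A minimal sketch of this vertical fragmentation, assuming employee_info is a list of Python tuples and that the tuple-id is simply the row position starting at 1 (an illustrative choice). It shows that joining the two projections on tuple_id gives back the original rows, which is the lossless join property.

```python
# employee_info rows: (branch_name, customer_name, account_number, balance).
employee_info = [
    ("Hillside", "Lowman", "A-305", 500),
    ("Hillside", "Camp", "A-226", 336),
    ("Valleyview", "Camp", "A-177", 205),
    ("Valleyview", "Kahn", "A-402", 10000),
    ("Hillside", "Kahn", "A-155", 62),
    ("Valleyview", "Kahn", "A-408", 1123),
    ("Valleyview", "Green", "A-639", 750),
]

# Add the special tuple_id attribute shared by both fragments.
with_id = [(tid, *row) for tid, row in enumerate(employee_info, start=1)]

# Projection onto the two smaller schemas.
deposit1 = [(branch, customer, tid) for tid, branch, customer, _, _ in with_id]
deposit2 = [(number, balance, tid) for tid, _, _, number, balance in with_id]

# Reconstruction: join the fragments on tuple_id.
by_tuple_id = {tid: (number, balance) for number, balance, tid in deposit2}
rejoined = [(branch, customer, *by_tuple_id[tid]) for branch, customer, tid in deposit1]
```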

  • 7/25/2019 DDB Slides

    27/67

    Advantages of Fragmentation

    • Horizontal:
      - allows parallel processing on fragments of a relation
      - allows a relation to be split so that tuples are located where they are most frequently accessed

    • Vertical:
      - allows tuples to be split so that each part of the tuple is stored where it is most frequently accessed
      - tuple-id attribute allows efficient joining of vertical fragments
      - allows parallel processing on a relation

    • Vertical and horizontal fragmentation can be mixed.
      - Fragments may be successively fragmented to an arbitrary depth.

  • 7/25/2019 DDB Slides

    28/67

    Data Transparency

    • Data transparency: Degree to which system user may remain unaware of the details of how and where the data items are stored in a distributed system

    • Consider transparency issues in relation to:
      - Fragmentation transparency
      - Replication transparency
      - Location transparency

  • 7/25/2019 DDB Slides

    29/67

    Naming of Data Items - Criteria

    1. Every data item must have a system-wide unique name.
    2. It should be possible to find the location of data items efficiently.
    3. It should be possible to change the location of data items transparently.
    4. Each site should be able to create new data items autonomously.

  • 7/25/2019 DDB Slides

    30/67

    Centralized Scheme - Name Server

    • Structure:
      - name server assigns all names
      - each site maintains a record of local data items
      - sites ask name server to locate non-local data items

    • Advantages:
      - satisfies naming criteria 1-3

    • Disadvantages:
      - does not satisfy naming criterion 4
      - name server is a potential performance bottleneck
      - name server is a single point of failure

  • 7/25/2019 DDB Slides

    31/67

    Use of Aliases

    • Alternative to centralized scheme: each site prefixes its own site identifier to any name that it generates, e.g., site 17.account
      - Fulfills having a unique identifier, and avoids problems associated with central control.
      - However, fails to achieve network transparency.
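    A minimal sketch of site-prefixed naming. The `Site` class, the `site_of` helper, and the site identifiers are hypothetical. The alias table at the end is an assumption suggested by the slide title: a mapping from short names to prefixed names would let users refer to data without knowing its location.

```python
class Site:
    """A site that names its own data items without asking a name server."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.items = {}

    def create(self, local_name, value):
        # Prefixing the site identifier makes the name unique system-wide.
        global_name = f"{self.site_id}.{local_name}"
        self.items[global_name] = value
        return global_name


def site_of(global_name):
    """The location is readable from the name: no network transparency."""
    return global_name.split(".", 1)[0]


site17 = Site("site17")
site3 = Site("site3")
name_a = site17.create("account", [("Hillside", "A-305", 500)])
name_b = site3.create("account", [("Valleyview", "A-177", 205)])

# Hypothetical alias table: a short name mapped to the real, prefixed name.
aliases = {"account": name_a}
```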

  • 7/25/2019 DDB Slides

    32/67

    What is a Transaction?

    • A set of steps completed by a DBMS to accomplish a single user task.

    • Must be either entirely completed or aborted

    • No intermediate states are acceptable

  • 7/25/2019 DDB Slides

    33/67

    Distributed Transactions

    • Transaction may access data at several sites.

    • Each site has a local transaction manager responsible for:
      - Maintaining a log for recovery purposes
      - Participating in coordinating the concurrent execution of the transactions executing at that site.

    • Each site has a transaction coordinator, which is responsible for:
      - Starting the execution of transactions that originate at the site.
      - Distributing subtransactions at appropriate sites for execution.
      - Coordinating the termination of each transaction that originates at the site, which may result in the transaction being committed at all sites or aborted at all sites.

  • 7/25/2019 DDB Slides

    34/67

    Transaction System Architecture

  • 7/25/2019 DDB Slides

    35/67

    System Failure Modes

    • Failures unique to distributed systems:
      - Failure of a site.
      - Loss of messages
        Handled by network transmission control protocols such as TCP/IP
      - Failure of a communication link
        Handled by network protocols, by routing messages via alternative links
      - Network partition
        A network is said to be partitioned when it has been split into two or more subsystems that lack any connection between them
        Note: a subsystem may consist of a single node

    • Network partitioning and site failures are generally indistinguishable.

  • 7/25/2019 DDB Slides

    36/67

    Distributed commit problem

    [Figure: Transaction T runs actions at three sites: a1, a2 at one site; a3 at another; a4, a5 at a third]

    Commit must be atomic

  • 7/25/2019 DDB Slides

    37/67

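    Example of a Distributed Transaction

    [Figure: transaction T0 at Headquarters starts subtransactions T1, T2, T5, ..., Tn at the Stores. "Store 2 should ship 500 toothbrushes to store 5": one store runs N := N - 500 and the other runs N := N + 500. One subtransaction is marked "FAILs".]

    Each Ti is executed atomically, but T0 itself is not atomic.

    How can a distributed transaction that has components at several sites execute atomically?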

  • 7/25/2019 DDB Slides

    38/67

    Distributed commit problem

    • Commit must be atomic

    • Solution: Two-phase commit (2PC)
      - Centralized 2PC
      - Distributed 2PC
      - Linear 2PC
      - Many other variants...

  • 7/25/2019 DDB Slides

    39/67

    TWO PHASE COMMIT

  • 7/25/2019 DDB Slides

    40/67

    TWO PHASE COMMIT

  • 7/25/2019 DDB Slides

    41/67

    Terminology

    • Resource Managers (RMs)
      - Usually databases

    • Participants
      - RMs that did work on behalf of transaction

    • Coordinator
      - Component that runs two-phase commit on behalf of transaction

  • 7/25/2019 DDB Slides

    42/67

    [Figure: Coordinator and Participant, with a message labeled REQUEST-T]

  • 7/25/2019 DDB Slides

    43/67

    [Figure: Coordinator and Participant, with a message labeled REQUEST-T]

  • 7/25/2019 DDB Slides

    44/67

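    Centralized two-phase commit

    [Figure: state diagrams for Coordinator and Participant, each with states I, C, and A (the Coordinator also has F). Transition labels:
      Coordinator: commit-request / request-prepare; no / abort; prepared / commit; done
      Participant: request-prepare / prepared; request-prepare / no; commit / done; abort / done]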
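    The message exchange in these diagrams can be sketched as follows. This is a single-process simulation without failures or logging; the class and function names are hypothetical, and the intermediate participant state name "W" is an assumption, since only I, C, and A survive in the slide text.

```python
class Participant:
    """A resource manager that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "I"  # initial

    def on_request_prepare(self):
        # request-prepare / prepared   or   request-prepare / no
        if self.can_commit:
            self.state = "W"  # assumed name for the prepared, waiting state
            return "prepared"
        self.state = "A"
        return "no"

    def on_decision(self, decision):
        # commit / done   or   abort / done
        self.state = "C" if decision == "commit" else "A"
        return "done"


def two_phase_commit(participants):
    # Phase 1: the coordinator sends request-prepare and collects the votes.
    votes = [p.on_request_prepare() for p in participants]
    decision = "commit" if all(v == "prepared" for v in votes) else "abort"
    # Phase 2: the coordinator broadcasts the decision and waits for every done.
    acks = [p.on_decision(decision) for p in participants]
    if not all(ack == "done" for ack in acks):
        raise RuntimeError("missing acknowledgement")
    return decision


ok_run = [Participant("store2"), Participant("store5")]
ok_decision = two_phase_commit(ok_run)

bad_run = [Participant("store2"), Participant("store5", can_commit=False)]
bad_decision = two_phase_commit(bad_run)
```

    A single "no" vote forces every participant to abort, which is what makes the outcome atomic across sites.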

  • 7/25/2019 DDB Slides

    45/67

    • Notation: Incoming message

  • 7/25/2019 DDB Slides

    46/67

    • After coordinator receives D

  • 7/25/2019 DDB Slides

    47/67

    • Add timeouts to cope with messages lost during crashes

    Next

  • 7/25/2019 DDB Slides

    48/67

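    Coordinator

    [Figure: Coordinator state diagram with states I, C, A, and F, extended with timeout transitions. Transition labels: commit-request / request-prepare; no / abort; prepared / commit; done / -; any / abort; any / commit; _t_ / abort; _t_ / commit]

    t = timeout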

  • 7/25/2019 DDB Slides

    49/67

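    Participant

    [Figure: Participant state diagram with states I, C, and A, extended with a timeout transition. Transition labels: request-prepare / prepared; request-prepare / no; commit / done; abort / done; _t_ / ping. One state is annotated "equivalent to finish state".]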
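    A minimal sketch of how the timeout transitions might change the decision logic. It assumes a missing vote is represented as `None`, and that the `_t_ / ping` transition fires while a participant is prepared and waiting for the decision; both are assumptions, as are the function names.

```python
def coordinator_decision(votes):
    """votes maps participant -> "prepared", "no", or None (timed out)."""
    if any(vote is None for vote in votes.values()):
        # Timeout while collecting votes: the coordinator aborts.
        return "abort"
    if all(vote == "prepared" for vote in votes.values()):
        return "commit"
    return "abort"


def participant_on_timeout(state):
    """A prepared participant cannot decide alone, so it pings the coordinator."""
    if state == "prepared":
        return "ping"
    return None


all_prepared = coordinator_decision({"store2": "prepared", "store5": "prepared"})
lost_vote = coordinator_decision({"store2": "prepared", "store5": None})
waiting_action = participant_on_timeout("prepared")
```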

  • 7/25/2019 DDB Slides

    50/67

    Distributed Query Processing

    • For centralized systems, the primary criterion for measuring the cost of a particular strategy is the number of disk accesses.

    • In a distributed system, other issues must be taken into account:
      - The cost of a data transmission over the network.
      - The potential gain in performance from having several sites process parts of the query in parallel.

  • 7/25/2019 DDB Slides

    51/67

    Problem Statement

    • Input: Query
      How many times has the moon circled around the earth in the last twenty years?

  • 7/25/2019 DDB Slides

    52/67

    Query Processing

    • Input: Declarative Query
      - SQL, ...

  • 7/25/2019 DDB Slides

    53/67

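    Algebra

      - relational algebra for SQL very well understood
      - algebra for Query (work in progress)

    SELECT A.d
    FROM A, B
    WHERE A.a = B.b AND A.c = 35

    [Figure: operator tree for the query: A.d at the top, the predicates A.a = B.b, A.c = 35 below it, and A and B as inputs, i.e. Π_{A.d}(σ_{A.a = B.b ∧ A.c = 35}(A × B))]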
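    The algebra expression for this query can be evaluated literally, as a sketch, by taking the cross product, filtering, and projecting. The sample rows below are invented for illustration.

```python
from itertools import product

# Hypothetical sample relations; only the attribute names come from the query.
A = [
    {"a": 1, "c": 35, "d": "x"},
    {"a": 2, "c": 35, "d": "y"},
    {"a": 1, "c": 20, "d": "z"},
]
B = [{"b": 1}, {"b": 3}]

# A x B: every pairing of a row from A with a row from B.
cross = product(A, B)

# Selection: A.a = B.b AND A.c = 35.
selected = [(ra, rb) for ra, rb in cross if ra["a"] == rb["b"] and ra["c"] == 35]

# Projection onto A.d.
result = [ra["d"] for ra, _ in selected]
```

    Evaluating the cross product first is the most naive plan; a real optimizer would rewrite the same expression into a join with the selection on A.c applied early.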

  • 7/25/2019 DDB Slides

    54/67

    Query

  • 7/25/2019 DDB Slides

    55/67

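    Query Execution

      - library of operators (hash join, merge join, ...)
      - pipelining (iterator model)
      - lazy evaluation
      - exploit indexes and clustering in database

    [Figure: execution plan with A.d at the top, a hashjoin on B.b below it, and an index on A.c and relation B as inputs. Sample tuples: (John, 35, CS) and (Mary, 35, EE) from A; (Edinburgh, CS, ...) and (Edinburgh, AS, ...) from B, hashed as (CS) and (AS); the join passes up (John, 35, CS) and the plan outputs John.]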
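    The iterator model and lazy evaluation can be sketched with Python generators, where each operator pulls rows from its child only on demand. The operator names and the dictionary row layout are hypothetical; the sample rows echo the tuples in the figure, the truncated third attribute of B is omitted, and the index on A.c is stood in for by a plain filter.

```python
def scan(rows):
    """Leaf operator: yield stored rows one at a time."""
    for row in rows:
        yield row


def select(child, predicate):
    """Pass through only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def hash_join(build, probe, build_key, probe_key):
    """Build a hash table on one input, then stream the other input past it."""
    table = {}
    for row in build:
        table.setdefault(row[build_key], []).append(row)
    for row in probe:
        for match in table.get(row[probe_key], []):
            yield {**match, **row}


def project(child, column):
    """Keep a single column of each row."""
    for row in child:
        yield row[column]


A = [{"d": "John", "c": 35, "a": "CS"}, {"d": "Mary", "c": 35, "a": "EE"}]
B = [{"city": "Edinburgh", "b": "CS"}, {"city": "Edinburgh", "b": "AS"}]

plan = project(
    hash_join(
        build=scan(B),
        probe=select(scan(A), lambda row: row["c"] == 35),
        build_key="b",
        probe_key="a",
    ),
    "d",
)
answer = list(plan)  # nothing runs until the result is consumed
```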

  • 7/25/2019 DDB Slides

    56/67

    Distributed Query Processing

    • Idea: This is just an extension of centralized query processing (System R* et al. in the early 80s)

    • What is different?
      - extend physical algebra: send/receive operators
      - resource vectors, network interconnect matrix
      - caching and replication
      - optimize for response time
      - less predictability in cost model (adaptive algos)
      - heterogeneity in data formats and data models

  • 7/25/2019 DDB Slides

    57/67

    Distributed Query Plan

    [Figure: the plan with A.d, the hashjoin on B.b, the index on A.c, and relation B, now with two send/receive operator pairs added]

  • 7/25/2019 DDB Slides

    58/67

    Cost

    [Figure: query plan annotated with per-operator costs]

    Total Cost = Sum of Cost of ...

  • 7/25/2019 DDB Slides

    59/67

    Query Transformation

    • Translating algebraic queries on fragments.
      - It must be possible to construct relation r from its fragments
      - Replace relation r by the expression to construct relation r from its fragments

    • Consider the horizontal fragmentation of the account relation into
      account1 = σ_{branch_name = "Hillside"}(account)
      account2 = σ_{branch_name = "Valleyview"}(account)

    • The query σ_{branch_name = "Hillside"}(account) becomes
      σ_{branch_name = "Hillside"}(account1 ∪ account2)
      which is optimized into
      σ_{branch_name = "Hillside"}(account1) ∪ σ_{branch_name = "Hillside"}(account2)
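    A minimal sketch of this rewrite, assuming the two fragments are lists of tuples (the helper name `sigma_branch` is hypothetical). It checks that pushing the selection below the union gives the same answer, and that the selection on account2 is empty, so only the Hillside site needs to be contacted.

```python
# Horizontal fragments of account: (branch_name, account_number, balance).
account1 = [
    ("Hillside", "A-305", 500),
    ("Hillside", "A-226", 336),
    ("Hillside", "A-155", 62),
]
account2 = [
    ("Valleyview", "A-177", 205),
    ("Valleyview", "A-402", 10000),
    ("Valleyview", "A-408", 1123),
    ("Valleyview", "A-639", 750),
]


def sigma_branch(branch, relation):
    """Selection on branch_name."""
    return [row for row in relation if row[0] == branch]


# Query on the reconstructed relation: select over the union of the fragments.
original = sigma_branch("Hillside", account1 + account2)

# Optimized form: select on each fragment, then take the union.
pushed = sigma_branch("Hillside", account1) + sigma_branch("Hillside", account2)

# The second term is empty by the definition of account2.
empty_part = sigma_branch("Hillside", account2)
```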

  • 7/25/2019 DDB Slides

    60/67

    Simple Join Processing

    • Consider the following relational algebra expression in which the three relations are neither replicated nor fragmented
      account ⋈ depositor ⋈ branch

    • account is stored at site S1

    • depositor at S2

    • branch at S3

    • For a query issued at site SI, the system needs to produce the result at site SI

  • 7/25/2019 DDB Slides

    61/67

    Possible Query Processing Strategies

    • Ship copies of all three relations to site SI and choose a strategy for processing the entire query locally at site SI.

    • Ship a copy of the account relation to site S2 and compute temp1 = account ⋈ depositor at S2. Ship temp1 from S2 to S3, and compute temp2 = temp1 ⋈ branch at S3. Ship the result temp2 to SI.

    • Devise similar strategies, exchanging the roles of S1, S2, S3

    • Must consider following factors:
      - amount of data being shipped
      - cost of transmitting a data block between sites
      - relative processing speed at each site
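    The first factor, the amount of data shipped, can be compared across the two strategies with a small sketch. All sizes below are invented for illustration, and transmission cost per block and site speed are ignored.

```python
# Hypothetical sizes in tuples; temp1 and temp2 are the intermediate join results.
sizes = {"account": 1000, "depositor": 1200, "branch": 50, "temp1": 1200, "temp2": 1200}


def shipped(transfers):
    """Total tuples sent over the network for a list of (relation, source, destination)."""
    return sum(sizes[name] for name, _source, _destination in transfers)


# Strategy 1: ship all three relations to the query site SI.
ship_everything = [
    ("account", "S1", "SI"),
    ("depositor", "S2", "SI"),
    ("branch", "S3", "SI"),
]

# Strategy 2: join along the way, shipping intermediate results.
join_along_the_way = [
    ("account", "S1", "S2"),
    ("temp1", "S2", "S3"),
    ("temp2", "S3", "SI"),
]

cost_everything = shipped(ship_everything)
cost_along_the_way = shipped(join_along_the_way)
```

    With these made-up sizes the first strategy ships less data, but a more selective join would reverse the outcome, which is why the choice depends on the factors listed above.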

  • 7/25/2019 DDB Slides

    62/67

    Semijoin Strategy

    • Let r1 be a relation with schema R1 stored at site S1
      Let r2 be a relation with schema R2 stored at site S2

    • Evaluate the expression r1 ⋈ r2 and obtain the result at S1.
      1. Compute temp1 ← Π_{R1 ∩ R2}(r1) at S1.
      2. Ship temp1 from S1 to S2.
      3. Compute temp2 ← r2 ⋈ temp1 at S2
      4. Ship temp2 from S2 to S1.
      5. Compute r1 ⋈ temp2 at S1. This is the same as r1 ⋈ r2.

  • 7/25/2019 DDB Slides

    63/67

    Formal Definition

    • The semijoin of r1 with r2 is denoted by:
      r1 ⋉ r2

    • It is defined by:
      Π_{R1}(r1 ⋈ r2)

    • Thus, r1 ⋉ r2 selects those tuples of r1 that contributed to r1 ⋈ r2.

    • In step 3 above, temp2 = r2 ⋉ r1.

    • For joins of several relations, the above strategy can be extended to a series of semijoin steps.
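    The five steps of the semijoin strategy can be sketched as follows, assuming both relations are lists of dictionaries that share the single attribute `id` (sample data and helper names are hypothetical). The final check confirms that joining r1 with the shipped-back temp2 equals the full join.

```python
# r1 lives at S1 with schema R1 = {id, x}; r2 lives at S2 with schema R2 = {id, y}.
r1 = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}, {"id": 3, "x": "c"}]
r2 = [{"id": 2, "y": "p"}, {"id": 3, "y": "q"}, {"id": 4, "y": "r"}]
common = ["id"]  # R1 ∩ R2


def natural_join(left, right, attrs):
    """Natural join on the shared attributes."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in attrs)
    ]


def key(row):
    return tuple(row[a] for a in common)


# Step 1 (at S1): project r1 onto the shared attributes.
temp1 = {key(row) for row in r1}
# Step 2: ship temp1 to S2 (only join-column values travel).
# Step 3 (at S2): keep the r2 tuples that match temp1, i.e. r2 ⋉ r1.
temp2 = [row for row in r2 if key(row) in temp1]
# Step 4: ship temp2 back to S1.
# Step 5 (at S1): join r1 with temp2.
result = natural_join(r1, temp2, common)

full_join = natural_join(r1, r2, common)
```

    Only the tuples of r2 that actually participate in the join are shipped, which pays off when many r2 tuples have no partner in r1.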

  • 7/25/2019 DDB Slides

    64/67

    Join Strategies that Exploit Parallelism

    • Consider r1 ⋈ r2 ⋈ r3 ⋈ r4 where relation ri is stored at site Si. The result must be presented at site S1.

    • r1 is shipped to S2 and r1 ⋈ r2 is computed at S2; simultaneously r3 is shipped to S4 and r3 ⋈ r4 is computed at S4

    • S2 ships tuples of (r1 ⋈ r2) to S1 as they are produced; S4 ships tuples of (r3 ⋈ r4) to S1
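    A minimal sketch of the parallel strategy, using two worker threads to stand in for sites S2 and S4. The sample relations and the `join` helper are hypothetical, and the sketch waits for each partial join to finish instead of streaming tuples as they are produced.

```python
from concurrent.futures import ThreadPoolExecutor


def join(left, right, attr):
    """Nested-loop natural join on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]


r1 = [{"k": 1, "a": "a1"}, {"k": 2, "a": "a2"}]
r2 = [{"k": 1, "b": "b1"}, {"k": 2, "b": "b2"}]
r3 = [{"k": 1, "c": "c1"}]
r4 = [{"k": 1, "d": "d1"}]

with ThreadPoolExecutor(max_workers=2) as pool:
    at_s2 = pool.submit(join, r1, r2, "k")  # r1 ⋈ r2, computed "at S2"
    at_s4 = pool.submit(join, r3, r4, "k")  # r3 ⋈ r4, computed "at S4" at the same time
    # "At S1": combine the two partial results once both have arrived.
    result = join(at_s2.result(), at_s4.result(), "k")
```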

  • 7/25/2019 DDB Slides

    65/67

    Conclusion - Advantages of DDBMSs

    • Reflects organizational structure
    • Improved shareability and local autonomy
    • Improved availability
    • Improved reliability
    • Improved performance
    • Economics
    • Modular growth

  • 7/25/2019 DDB Slides

    66/67

    Conclusion - Disadvantages of DDBMSs

    • Architectural complexity
    • Cost
    • Security
    • Integrity control more difficult
    • Lack of standards
    • Lack of experience
    • Database design more complex

  • 7/25/2019 DDB Slides

    67/67

    References

    • Chapter 22 of Database System Concepts (Silberschatz book)

    • ICS courses on DBMS: CS222, CS223