ddb slides
TRANSCRIPT
-
7/25/2019 DDB Slides
1/67
Distributed Data Management
Distributed Systems
Department of Computer Science
UC Irvine
-
Centralized DB systems
Software:
  Application
  SQL Front End
  Query Processor
  Transaction Proc.
  File Access
[Figure: processor (P) with memory (M), ...]
• Simplifications:
  - single front end
  - one place to keep data, locks
  - if processor fails, system fails, ...
-
Distributed Database Systems
• Multiple processors (+ memories)
• Heterogeneity and autonomy of "components"
-
Why do we need Distributed Databases?
• Example: IBM has offices in London, New York, and Hong Kong.
• Employee data:
  - EMP(ENO)
• Where should the employee data table reside?
-
IBM Data Access Pattern
• Mostly, employee data is managed at the office where the employee works
  - E.g., payroll, benefits, hire and fire
• Periodically, IBM needs consolidated access to employee data
  - E.g., IBM changes benefit plans and that affects all employees.
  - E.g., annual bonus depends on global net profit.
-
[Figure: a single EMP table accessed over the Internet by the payroll apps in London, New York, and Hong Kong]
Problem: NY and HK payroll apps run very slowly
-
[Figure: EMP split into LondonEmp, NYEmp, and HKEmp, each stored with the payroll app of its own site (London, New York, Hong Kong); the sites are connected over the Internet]
Much better
-
[Figure: LondonEmp, NYEmp, and HKEmp stored at London, New York, and Hong Kong, each with a local payroll app; an Annual Bonus app in London reaches all three over the Internet]
Distribution provides opportunities for parallel execution
-
[Figure: LondonEmp, NYEmp, and HKEmp stored at London, New York, and Hong Kong, each with a local payroll app; an Annual Bonus app in London reaches all three over the Internet]
-
[Figure: same three sites with payroll apps and the Annual Bonus app, but each site now stores two Emp fragments: (Lon, NY), (NY, HK), (HK, Lon)]
Replication improves availability
-
Homogeneous Distributed Databases
• In a homogeneous distributed database
  - All sites have identical software
  - Sites are aware of each other and agree to cooperate in processing user requests.
  - Each site surrenders part of its autonomy in terms of right to change schemas or software
  - Appears to user as a single system
• In a heterogeneous distributed database
  - Different sites may use different schemas and software
    • Difference in schema is a major problem for query processing
    • Difference in software is a major problem for transaction processing
  - Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing
-
DB architectures
(1) Shared memory
[Figure: processors (P) sharing a common memory (M)]
-
DB architectures
(2) Shared disk
[Figure: processors (P), each with its own memory (M), sharing the disks]
-
DB architectures
(3) Shared nothing
[Figure: nodes, each with its own processor (P) and memory (M), connected by a network]
-
DB architectures
(4) Hybrid example - Hierarchical or Clustered
[Figure: clusters of processors (P) with memory (M), connected to each other]
-
Issues for selecting architecture
• Reliability
• Scalability
• Geographic distribution of data
• Performance
• Cost
-
Parallel or distributed DB system?
• More similarities than differences
-
• Typically, parallel DBs:
  - Fast interconnect
  - Homogeneous software
  - High performance is goal
  - Transparency is goal
-
• Typically, distributed DBs:
  - Geographically distributed
  - Data sharing is goal (may run into heterogeneity, autonomy)
  - Disconnected operation possible
-
Distributed Database Challenges
• Distributed Database Design
  - Deciding what data goes where
  - Depends on data access patterns of major applications
  - Two subproblems:
    • Fragmentation: partition tables into fragments
    • Allocation: allocate fragments to nodes
-
Distributed Data Storage
• Assume relational data model
• Replication
  - System maintains multiple copies of data, stored in different sites, for faster retrieval and fault tolerance.
• Fragmentation
  - Relation is partitioned into several fragments stored in distinct sites
• Replication and fragmentation can be combined
  - Relation is partitioned into several fragments; system maintains several identical replicas of each such fragment.
-
Data Replication
• A relation or fragment of a relation is replicated if it is stored redundantly in two or more sites.
• Full replication of a relation is the case where the relation is stored at all sites.
• Fully redundant databases are those in which every site contains a copy of the entire database.
-
Data Replication (Cont.)
• Advantages of Replication
  - Availability: failure of a site containing relation r does not result in unavailability of r if replicas exist.
  - Parallelism: queries on r may be processed by several nodes in parallel.
  - Reduced data transfer: relation r is available locally at each site containing a replica of r.
• Disadvantages of Replication
  - Increased cost of updates: each replica of relation r must be updated.
  - Increased complexity of concurrency control: concurrent updates to distinct replicas may lead to inconsistent data unless special concurrency control mechanisms are implemented.
-
Data Fragmentation
• Division of relation $r$ into fragments $r_1, r_2, \ldots, r_n$ which contain sufficient information to reconstruct relation $r$.
• Horizontal fragmentation: each tuple of $r$ is assigned to one or more fragments
• Vertical fragmentation: the schema for relation $r$ is split into several smaller schemas
  - All schemas must contain a common candidate key (or superkey) to ensure lossless join property.
  - A special attribute, the tuple-id attribute, may be added to each schema to serve as a candidate key.
• Example: relation account with the following schema
• Account (branch_name, account_number, balance)
-
Horizontal Fragmentation of account Relation

branch_name   account_number   balance
Hillside      A-305            500
Hillside      A-226            336
Hillside      A-155            62
$account_1 = \sigma_{branch\_name = \text{``Hillside''}}(account)$

branch_name   account_number   balance
Valleyview    A-177            205
Valleyview    A-402            10000
Valleyview    A-408            1123
Valleyview    A-639            750
$account_2 = \sigma_{branch\_name = \text{``Valleyview''}}(account)$
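The horizontal split and its reconstruction can be sketched in a few lines of Python. This is only an illustration: the names `Account`, `select_branch`, and `union` are made up for the sketch, and the rows follow the account example on this slide.

```python
from typing import NamedTuple


class Account(NamedTuple):
    branch_name: str
    account_number: str
    balance: int


# Illustrative rows for the account relation.
ACCOUNT = [
    Account("Hillside", "A-305", 500),
    Account("Hillside", "A-226", 336),
    Account("Valleyview", "A-177", 205),
    Account("Valleyview", "A-402", 10000),
    Account("Hillside", "A-155", 62),
    Account("Valleyview", "A-408", 1123),
    Account("Valleyview", "A-639", 750),
]


def select_branch(relation, branch):
    """Selection on branch_name: keep only the tuples of one branch."""
    return [t for t in relation if t.branch_name == branch]


def union(*fragments):
    """Set union of the fragments, returned in sorted order."""
    return sorted(set().union(*fragments))


# Each fragment would live at the site of its branch.
account1 = select_branch(ACCOUNT, "Hillside")
account2 = select_branch(ACCOUNT, "Valleyview")

# The original relation is recovered as the union of its fragments.
reconstructed = union(account1, account2)
```

Because every tuple lands in at least one fragment, the union loses nothing; that is the "sufficient information to reconstruct" requirement for horizontal fragments.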
-
Vertical Fragmentation of employee_info Relation

branch_name   customer_name   tuple_id
Hillside      Lowman          1
Hillside      Camp            2
Valleyview    Camp            3
Valleyview    Kahn            4
Hillside      Kahn            5
Valleyview    Kahn            6
Valleyview    Green           7
$deposit_1 = \Pi_{branch\_name,\ customer\_name,\ tuple\_id}(employee\_info)$

account_number   balance   tuple_id
A-305            500       1
A-226            336       2
A-177            205       3
A-402            10000     4
A-155            62        5
A-408            1123      6
A-639            750       7
$deposit_2 = \Pi_{account\_number,\ balance,\ tuple\_id}(employee\_info)$
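A minimal sketch of the vertical split, assuming tuples are plain Python tuples and that tuple ids are assigned by position. The helper names (`with_tuple_id`, `project`, `join_on_tuple_id`) are hypothetical; the point is that joining the two fragments on tuple_id rebuilds the original relation.

```python
# Columns: branch_name, customer_name, account_number, balance
EMPLOYEE_INFO = [
    ("Hillside", "Lowman", "A-305", 500),
    ("Hillside", "Camp", "A-226", 336),
    ("Valleyview", "Camp", "A-177", 205),
    ("Valleyview", "Kahn", "A-402", 10000),
    ("Hillside", "Kahn", "A-155", 62),
    ("Valleyview", "Kahn", "A-408", 1123),
    ("Valleyview", "Green", "A-639", 750),
]


def with_tuple_id(relation):
    """Append a unique tuple_id (1, 2, 3, ...) as the last column."""
    return [row + (i,) for i, row in enumerate(relation, start=1)]


def project(relation, columns):
    """Projection: keep only the listed column positions."""
    return [tuple(row[c] for c in columns) for row in relation]


def join_on_tuple_id(left, right):
    """Join two fragments whose last column is tuple_id, dropping the id."""
    right_by_id = {row[-1]: row for row in right}
    return [
        l[:-1] + right_by_id[l[-1]][:-1]
        for l in left
        if l[-1] in right_by_id
    ]


tagged = with_tuple_id(EMPLOYEE_INFO)      # tuple_id is column 4
deposit1 = project(tagged, [0, 1, 4])      # branch_name, customer_name, tuple_id
deposit2 = project(tagged, [2, 3, 4])      # account_number, balance, tuple_id

rebuilt = join_on_tuple_id(deposit1, deposit2)
```

Without the shared tuple_id the join would be lossy here, since customer names and branch names repeat across rows.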
-
Advantages of Fragmentation
• Horizontal:
  - allows parallel processing on fragments of a relation
  - allows a relation to be split so that tuples are located where they are most frequently accessed
• Vertical:
  - allows tuples to be split so that each part of the tuple is stored where it is most frequently accessed
  - tuple-id attribute allows efficient joining of vertical fragments
  - allows parallel processing on a relation
• Vertical and horizontal fragmentation can be mixed.
  - Fragments may be successively fragmented to an arbitrary depth.
-
Data Transparency
• Data transparency: degree to which a system user may remain unaware of the details of how and where the data items are stored in a distributed system
• Consider transparency issues in relation to:
  - Fragmentation transparency
  - Replication transparency
  - Location transparency
-
Naming of Data Items - Criteria
1. Every data item must have a system-wide unique name.
2. It should be possible to find the location of data items efficiently.
3. It should be possible to change the location of data items transparently.
4. Each site should be able to create new data items autonomously.
-
Centralized Scheme - Name Server
• Structure:
  - name server assigns all names
  - each site maintains a record of local data items
  - sites ask name server to locate non-local data items
• Advantages:
  - satisfies naming criteria 1-3
• Disadvantages:
  - does not satisfy naming criterion 4
  - name server is a potential performance bottleneck
  - name server is a single point of failure
-
Use of Aliases
• Alternative to centralized scheme: each site prefixes its own site identifier to any name that it generates, e.g., site 17.account
  - Fulfills the requirement of a unique identifier, and avoids problems associated with central control.
  - However, fails to achieve network transparency.
-
What is a Transaction?
• A set of steps completed by a DBMS to accomplish a single user task.
• Must be either entirely completed or aborted
• No intermediate states are acceptable
-
Distributed Transactions
• Transaction may access data at several sites.
• Each site has a local transaction manager responsible for:
  - Maintaining a log for recovery purposes
  - Participating in coordinating the concurrent execution of the transactions executing at that site.
• Each site has a transaction coordinator, which is responsible for:
  - Starting the execution of transactions that originate at the site.
  - Distributing subtransactions at appropriate sites for execution.
  - Coordinating the termination of each transaction that originates at the site, which may result in the transaction being committed at all sites or aborted at all sites.
-
Transaction System Architecture
-
System Failure Modes
• Failures unique to distributed systems:
  - Failure of a site.
  - Loss of messages
    • Handled by network transmission control protocols such as TCP/IP
  - Failure of a communication link
    • Handled by network protocols, by routing messages via alternative links
  - Network partition
    • A network is said to be partitioned when it has been split into two or more subsystems that lack any connection between them
    • Note: a subsystem may consist of a single node
• Network partitioning and site failures are generally indistinguishable.
-
Distributed commit problem
[Figure: Transaction T performs actions at three sites: a1, a2 at the first; a3 at the second; a4, a5 at the third]
Commit must be atomic
-
Example of a Distributed Transaction
[Figure: transaction T0 at Headquarters with subtransactions T1, T2, T5, ..., Tn at the Stores; one site is marked "FAILs"]
Store 2 should ship 500 toothbrushes to store 5
  T2: N := N - 500
  T5: N := N + 500
Each Ti is executed atomically, but T0 itself is not atomic.
How can a distributed transaction that has components at several sites execute atomically?
-
Distributed commit problem
• Commit must be atomic
• Solution: Two-phase commit (2PC)
  - Centralized 2PC
  - Distributed 2PC
  - Linear 2PC
  - Many other variants...
-
TWO PHASE COMMIT
-
TWO PHASE COMMIT
-
Terminology
• Resource Managers (RMs)
  - Usually databases
• Participants
  - RMs that did work on behalf of transaction
• Coordinator
  - Component that runs two-phase commit on behalf of transaction
-
[Figure: message exchange between Coordinator and Participant; surviving label: REQUEST-T]
-
[Figure: message exchange between Coordinator and Participant; surviving label: REQUEST-T]
-
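Centralized two-phase commit
[Figure: state diagrams for Coordinator and Participant; states I, W, C, A (plus F for the coordinator); transitions labeled incoming / outgoing message]
Coordinator:
  I -> W: commit-request / request-prepare
  W -> A: no / abort
  W -> C: prepared / commit
  C -> F: done / -
  A -> F: done / -
Participant:
  I -> W: request-prepare / prepared
  I -> A: request-prepare / no
  W -> C: commit / done
  W -> A: abort / done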
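The message flow of centralized 2PC can be sketched as a small in-memory simulation. This is a simplification under stated assumptions: the class and function names (`Participant`, `can_commit`, `two_phase_commit`) are invented for the sketch, messages are plain method calls, and there is no logging, crash handling, or timeout.

```python
class Participant:
    """A resource manager with the states I (initial), W (wait), C, A."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # hypothetical flag: local vote
        self.state = "I"

    def on_request_prepare(self):
        # I -> W with reply "prepared", or I -> A with reply "no".
        if self.can_commit:
            self.state = "W"
            return "prepared"
        self.state = "A"
        return "no"

    def on_decision(self, decision):
        # W -> C on "commit", W -> A on "abort"; reply "done" either way.
        if self.state == "W":
            self.state = "C" if decision == "commit" else "A"
        return "done"


def two_phase_commit(participants):
    """Coordinator side: returns the global decision."""
    # Phase 1: send request-prepare to everyone and collect the votes.
    votes = [p.on_request_prepare() for p in participants]
    decision = "commit" if all(v == "prepared" for v in votes) else "abort"
    # Phase 2: broadcast the decision and wait for every "done".
    acks = [p.on_decision(decision) for p in participants]
    assert all(a == "done" for a in acks)
    return decision


ok = [Participant("store2"), Participant("store5")]
bad = [Participant("store2"), Participant("store5", can_commit=False)]
result_ok = two_phase_commit(ok)
result_bad = two_phase_commit(bad)
```

A single "no" vote forces every participant into A, which is what makes the commit atomic across sites.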
-
• Notation: Incoming message
-
• After coordinator receives D
-
• Add timeouts to cope with messages lost during crashes
Next
-
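Coordinator
[Figure: coordinator state diagram with timeouts; states I, W, C, A, F]
Transition labels (incoming / outgoing):
  commit-request / request-prepare
  no / abort
  prepared / commit
  _t_ / abort (appears twice)
  _t_ / commit
  any / abort
  any / commit
  done / - (appears twice)
t = timeout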
-
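Participant
[Figure: participant state diagram with timeouts; states I, W, C, A; one state is annotated "equivalent to finish state"]
Transition labels (incoming / outgoing):
  request-prepare / prepared
  request-prepare / no
  commit / done (appears twice)
  abort / done (appears twice)
  _t_ / ping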
-
Distributed Query Processing
• For centralized systems, the primary criterion for measuring the cost of a particular strategy is the number of disk accesses.
• In a distributed system, other issues must be taken into account:
  - The cost of a data transmission over the network.
  - The potential gain in performance from having several sites process parts of the query in parallel.
-
Problem Statement
• Input: Query
  "How many times has the moon circled around the earth in the last twenty years?"
-
Query Processing
• Input: Declarative Query
  - SQL,
-
Algebra
  - relational algebra for SQL very well understood
  - algebra for Query (work in progress)
SELECT A.d
FROM A, B
WHERE A.a = B.b
  AND A.c = 35
[Figure: operator tree: projection on A.d, over a join with predicates A.a = B.b, A.c = 35, over inputs A and B]
-
Query
-
Query Execution
  - library of operators (hash join, merge join, ...)
  - pipelining (iterator model)
  - lazy evaluation
  - exploit indexes and clustering in database
[Figure: execution plan: projection on A.d over a hash join on B.b, with inputs an index on A.c and B; sample tuples (John, 35, CS), (Mary, 35, EE), (Edinburgh, CS, ...), (Edinburgh, AS, ...) flow through the plan; result: John]
-
Distributed Query Processing
• Idea: This is just an extension of centralized query processing. (System R* et al. in the early 80s)
• What is different?
  - extend physical algebra: send/receive operators
  - resource vectors, network interconnect matrix
  - caching and replication
  - optimize for response time
  - less predictability in cost model (adaptive algos)
  - heterogeneity in data formats and data models
-
Distributed Query Plan
[Figure: the same plan (projection on A.d, hash join on B.b, index on A.c, B) with send/receive operator pairs inserted below the join]
-
Cost
[Figure: query plan annotated with per-operator costs]
Total Cost = Sum of Cost of
-
Query Transformation
• Translating algebraic queries on fragments.
  - It must be possible to construct relation $r$ from its fragments
  - Replace relation $r$ by the expression to construct relation $r$ from its fragments
• Consider the horizontal fragmentation of the account relation into
  $account_1 = \sigma_{branch\_name = \text{``Hillside''}}(account)$
  $account_2 = \sigma_{branch\_name = \text{``Valleyview''}}(account)$
• The query $\sigma_{branch\_name = \text{``Hillside''}}(account)$ becomes
  $\sigma_{branch\_name = \text{``Hillside''}}(account_1 \cup account_2)$
  which is optimized into
  $\sigma_{branch\_name = \text{``Hillside''}}(account_1) \cup \sigma_{branch\_name = \text{``Hillside''}}(account_2)$
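The rewrite can be checked on a small example. The sketch below assumes fragments are Python sets of tuples with illustrative contents, and `select_branch` is a hypothetical helper; it compares the selection applied to the union against the union of per-fragment selections.

```python
# Illustrative fragment contents: (branch_name, account_number, balance)
account1 = {
    ("Hillside", "A-305", 500),
    ("Hillside", "A-226", 336),
    ("Hillside", "A-155", 62),
}
account2 = {
    ("Valleyview", "A-177", 205),
    ("Valleyview", "A-402", 10000),
    ("Valleyview", "A-408", 1123),
    ("Valleyview", "A-639", 750),
}


def select_branch(relation, branch):
    """Selection on branch_name."""
    return {t for t in relation if t[0] == branch}


# Selection applied after rebuilding the relation from its fragments.
naive = select_branch(account1 | account2, "Hillside")

# Selection pushed down to each fragment, then unioned.
pushed = select_branch(account1, "Hillside") | select_branch(account2, "Hillside")

# The Valleyview fragment contributes nothing to a Hillside query.
from_account2 = select_branch(account2, "Hillside")
```

Since `from_account2` is empty and `account1` already contains only Hillside tuples, an optimizer that knows the fragment definitions could answer this query from `account1` alone, without contacting the site that stores `account2`.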
-
Simple Join Processing
• Consider the following relational algebra expression in which the three relations are neither replicated nor fragmented
  $account \bowtie depositor \bowtie branch$
• account is stored at site $S_1$
• depositor at $S_2$
• branch at $S_3$
• For a query issued at site $S_I$, the system needs to produce the result at site $S_I$
-
Possible Query Processing Strategies
• Ship copies of all three relations to site $S_I$ and choose a strategy for processing the entire query locally at site $S_I$.
• Ship a copy of the account relation to site $S_2$ and compute $temp_1 = account \bowtie depositor$ at $S_2$. Ship $temp_1$ from $S_2$ to $S_3$, and compute $temp_2 = temp_1 \bowtie branch$ at $S_3$. Ship the result $temp_2$ to $S_I$.
• Devise similar strategies, exchanging the roles of $S_1$, $S_2$, $S_3$
• Must consider the following factors:
  - amount of data being shipped
  - cost of transmitting a data block between sites
  - relative processing speed at each site
-
Semijoin Strategy
• Let $r_1$ be a relation with schema $R_1$ stored at site $S_1$
• Let $r_2$ be a relation with schema $R_2$ stored at site $S_2$
• Evaluate the expression $r_1 \bowtie r_2$ and obtain the result at $S_1$.
  1. Compute $temp_1 \leftarrow \Pi_{R_1 \cap R_2}(r_1)$ at $S_1$.
  2. Ship $temp_1$ from $S_1$ to $S_2$.
  3. Compute $temp_2 \leftarrow r_2 \bowtie temp_1$ at $S_2$.
  4. Ship $temp_2$ from $S_2$ to $S_1$.
  5. Compute $r_1 \bowtie temp_2$ at $S_1$. This is the same as $r_1 \bowtie r_2$.
-
Formal Definition
• The semijoin of $r_1$ with $r_2$ is denoted by:
  $r_1 \ltimes r_2$
• It is defined by:
  $\Pi_{R_1}(r_1 \bowtie r_2)$
• Thus, $r_1 \ltimes r_2$ selects those tuples of $r_1$ that contributed to $r_1 \bowtie r_2$.
• In step 3 above, $temp_2 = r_2 \ltimes r_1$.
• For joins of several relations, the above strategy can be extended to a series of semijoin steps.
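The five semijoin steps can be traced with a small simulation. This is a sketch under assumptions: relations are non-empty lists of dicts, "shipping" is just passing a list between functions, and the names `natural_join`, `semijoin_strategy`, `R1`, and `R2` are illustrative.

```python
def natural_join(r, s):
    """Nested-loop natural join on the attributes the tuples share."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


def semijoin_strategy(r1, r2):
    """Evaluate r1 JOIN r2 with the result at S1; r1 is at S1, r2 at S2."""
    common = sorted(set(r1[0]) & set(r2[0]))  # attributes in R1 and R2
    # Step 1 (at S1): project r1 onto the common attributes, no duplicates.
    temp1 = [dict(p) for p in {tuple((a, t[a]) for a in common) for t in r1}]
    # Step 2: ship temp1 from S1 to S2 (simulated by using it below).
    # Step 3 (at S2): keep only the r2 tuples that will find a partner.
    temp2 = natural_join(r2, temp1)
    # Step 4: ship temp2 from S2 to S1.
    # Step 5 (at S1): the final join.
    return natural_join(r1, temp2), len(temp1), len(temp2)


R1 = [
    {"customer": "Lowman", "account": "A-305"},
    {"customer": "Camp", "account": "A-226"},
]
R2 = [
    {"account": "A-305", "balance": 500},
    {"account": "A-226", "balance": 336},
    {"account": "A-177", "balance": 205},
    {"account": "A-402", "balance": 10000},
]

result, shipped_to_s2, shipped_to_s1 = semijoin_strategy(R1, R2)
```

In this example only two of the four `R2` tuples cross the network in step 4. The strategy pays off when few tuples of $r_2$ contribute to the join; otherwise the extra round trip for `temp1` may cost more than it saves.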
-
Join Strategies that Exploit Parallelism
• Consider $r_1 \bowtie r_2 \bowtie r_3 \bowtie r_4$ where relation $r_i$ is stored at site $S_i$. The result must be presented at site $S_1$.
• $r_1$ is shipped to $S_2$ and $r_1 \bowtie r_2$ is computed at $S_2$; simultaneously $r_3$ is shipped to $S_4$ and $r_3 \bowtie r_4$ is computed at $S_4$
• $S_2$ ships tuples of $(r_1 \bowtie r_2)$ to $S_1$ as they are produced; $S_4$ ships tuples of $(r_3 \bowtie r_4)$ to $S_1$
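A rough sketch of the parallel strategy, using two worker threads to stand in for sites $S_2$ and $S_4$. The function names and sample relations are assumptions made for illustration, and the sketch waits for both partial joins to finish rather than streaming tuples to $S_1$ as they are produced.

```python
from concurrent.futures import ThreadPoolExecutor


def natural_join(r, s):
    """Nested-loop natural join on the attributes the tuples share."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


def parallel_join(r1, r2, r3, r4):
    """Compute r1 JOIN r2 and r3 JOIN r4 concurrently, then combine them."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(natural_join, r1, r2)    # work done "at S2"
        right = pool.submit(natural_join, r3, r4)   # work done "at S4"
        # Final join "at S1" once both partial results have arrived.
        return natural_join(left.result(), right.result())


R1 = [{"a": 1, "b": 1}, {"a": 2, "b": 2}]
R2 = [{"b": 1, "c": 1}, {"b": 2, "c": 2}]
R3 = [{"c": 1, "d": 1}, {"c": 3, "d": 3}]
R4 = [{"d": 1, "e": "x"}]

final = parallel_join(R1, R2, R3, R4)
```

The two partial joins are independent of each other, which is what lets them run at different sites at the same time.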
-
Conclusion - Advantages of DDBMSs
• Reflects organizational structure
• Improved shareability and local autonomy
• Improved availability
• Improved reliability
• Improved performance
• Economics
• Modular growth
-
Conclusion - Disadvantages of DDBMSs
• Architectural complexity
• Cost
• Security
• Integrity control more difficult
• Lack of standards
• Lack of experience
• Database design more complex
-
References
• Chapter 22 of Database System Concepts (Silberschatz book)
• ICS courses on DBMS: CS222, CS223