dangers of replication

Dangers of Replication

Gray, Helland, O’Neil, Shasha

Data Replication

Motivation:
• Increased availability.
• Faster query evaluation.

Synchronous (eager) replication:
• All copies are updated as part of the updating transaction, so execution is serializable.
• Not an option for mobile apps, which are often disconnected.

Asynchronous (lazy) replication:
• Copies are updated after the updating xact commits.
• Replicas may hold stale data versions, and behavior can be non-serializable.

[Figure: replicas R1, R2, and R3 held at Site A and Site B]

Checkbook Example

Joe and Jill share a joint checking account holding $1000. The balance is replicated three times: Joe’s checkbook, Jill’s checkbook, and the bank’s ledger.

Eager replication:
• They cannot write checks totaling more than $1000.
• All three books always show the same balance.

Lazy replication:
• Joe and Jill can each write checks for $1000.
• Differences between the books must be reconciled eventually.

Master-copy implementation of lazy replication:
• Only the bank’s ledger counts.
• The second check bounces.
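
The three outcomes above can be mimicked with a toy simulation. This is only an illustrative sketch: the function names and the single-process model of the three "books" are assumptions made for the example, not part of the original analysis.

```python
# Toy model of the joint-account example; all names here are hypothetical.
OPENING_BALANCE = 1000


def eager_write_checks(checks):
    """Every check updates all three books atomically, so overdrafts are refused."""
    balance = OPENING_BALANCE  # one logical balance: the copies never diverge
    accepted = []
    for writer, amount in checks:
        if amount <= balance:
            balance -= amount
            accepted.append(writer)
    return accepted, balance


def lazy_write_checks(checks):
    """Each writer consults only a private checkbook; the ledger is updated later."""
    books = {writer: OPENING_BALANCE for writer, _ in checks}
    for writer, amount in checks:
        if amount <= books[writer]:  # looks fine locally
            books[writer] -= amount
    return books


def master_copy_clear(checks):
    """Lazy master: only the bank's ledger counts when checks are presented."""
    ledger = OPENING_BALANCE
    cleared, bounced = [], []
    for writer, amount in checks:
        if amount <= ledger:
            ledger -= amount
            cleared.append(writer)
        else:
            bounced.append(writer)
    return cleared, bounced, ledger


checks = [("Joe", 1000), ("Jill", 1000)]
print(eager_write_checks(checks))   # (['Joe'], 0)
print(lazy_write_checks(checks))    # {'Joe': 0, 'Jill': 0}
print(master_copy_clear(checks))    # (['Joe'], ['Jill'], 0)
```

In the lazy case both private checkbooks look consistent on their own, yet $2000 has been written against a $1000 account; the discrepancy only surfaces when the books are compared.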

Group vs. Master Ownership

Who is allowed to update an object?
• Group: Any node with a copy of the object can update it; changes must be propagated to all copies. (“Update anywhere”)
• Master: Each object has a master node. Only the master can update the object; the remaining copies are read-only and change only through updates propagated from the master.

Ownership policy is orthogonal to how updates are propagated (eager or lazy).

Lazy Group Replication

After the original xact commits, a replica-update xact is generated for every other node to propagate the updates to the replicas.

Two xacts could make simultaneous, conflicting updates; timestamps are used to reconcile them:
• Each object carries the ts of its most recent update.
• Replica update = <update ts, new value, old object ts>
• If the local replica’s ts = the old ts in the replica update, the update is safe: apply it and set the local replica’s ts to the update ts.
• Otherwise, the replica update must be reconciled.
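
The timestamp test above can be sketched in a few lines. The class and function names below are hypothetical, timestamps are modeled as plain integers, and what "reconcile" actually does is left out, since the slides do not specify it.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    value: int
    ts: int  # timestamp of the most recent update applied locally


@dataclass
class ReplicaUpdate:
    update_ts: int  # timestamp of the originating update
    new_value: int
    old_ts: int     # timestamp the object had at the origin before the update


def apply_replica_update(replica: Replica, upd: ReplicaUpdate) -> bool:
    """Apply the update if it is safe; return False if reconciliation is needed."""
    if replica.ts == upd.old_ts:
        replica.value = upd.new_value
        replica.ts = upd.update_ts
        return True
    return False  # the local copy changed concurrently: must reconcile


# Two nodes update the same object (ts 1) concurrently.
local = Replica(value=1000, ts=1)
from_a = ReplicaUpdate(update_ts=2, new_value=900, old_ts=1)
from_b = ReplicaUpdate(update_ts=3, new_value=500, old_ts=1)

print(apply_replica_update(local, from_a))  # True: timestamps match
print(apply_replica_update(local, from_b))  # False: local ts is now 2, not 1
```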

Lazy Master Replication

• Each object has an owner node.
• To update an object, a user xact must ask the owner node to perform the update.
• After the update commits at the owner node, the changes must be propagated to the replicas.
• If a replica receives an update with a timestamp older than its local ts, it knows the update is “stale” and discards it. (Alternatively, replica updates can simply be propagated sequentially, in commit order at the master.)
• No reconciliation failures; conflicts are resolved by deadlocks, so the main issue is how often master update xacts deadlock.
• Performance is similar to a single-node system with higher transaction rates.
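
The stale-update rule at a replica might look like the following minimal sketch. The `ReplicaStore` class and its `receive` method are hypothetical names, and integer timestamps with an initial value of -1 are assumptions made for the example.

```python
class ReplicaStore:
    """Read-only copy at a non-master node; changed only by propagated updates."""

    def __init__(self):
        self.data = {}  # key -> (value, ts)

    def receive(self, key, value, ts):
        """Apply a propagated master update unless it is stale."""
        _, local_ts = self.data.get(key, (None, -1))
        if ts < local_ts:
            return False  # older than what we already hold: discard
        self.data[key] = (value, ts)
        return True


replica = ReplicaStore()
# Updates committed at the master with ts 1 and ts 2 arrive out of order.
print(replica.receive("balance", 700, ts=2))  # True
print(replica.receive("balance", 900, ts=1))  # False: stale, discarded
print(replica.data["balance"])                # (700, 2)
```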

Problem

As the number of checkbooks goes up by a factor of $n$, the deadlock rate (in eager and lazy master replication) or the reconciliation rate (in lazy group replication) goes up by a factor of $n^2$ or more.

Disconnected operation and msg delays further increase reconciliation rates in lazy replication.

Update anywhere-anytime-anyway replication is unstable as workload scales up.

Simple Analysis

Model: fixed database, with all objects replicated at all nodes.
• $x$: # of original xacts per sec per node
• $a$: # of updates per xact
• $t$: time per update
• $n$: # of nodes

Node update rate: With $n$ nodes, each update is replicated at $n-1$ nodes, and the # of original xacts/sec grows to $x \cdot n$.
• Eager replication: xact size grows by a factor of $n$, and the node update rate grows to $(x \cdot n) \cdot a \cdot (n-1) \approx x \cdot a \cdot n^2$.
• Lazy replication: each original xact generates $n-1$ replica xacts, and the node update rate grows to $(x \cdot n) \cdot (n-1) \cdot a \approx x \cdot a \cdot n^2$.

Time per xact $= a \cdot t$
# of concurrent original xacts at a node $= x \cdot a \cdot t$
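
To get a feel for the quadratic growth, the formulas above can be evaluated directly. The workload numbers in this sketch are invented for illustration, and the function names are hypothetical.

```python
def node_update_rate(x: float, a: float, n: int) -> float:
    """Rate from the model: (x*n) original xacts/sec, a updates each, n-1 replicas."""
    return (x * n) * a * (n - 1)


def approx_node_update_rate(x: float, a: float, n: int) -> float:
    """Large-n approximation x * a * n^2."""
    return x * a * n ** 2


def concurrent_original_xacts(x: float, a: float, t: float) -> float:
    """Each xact lasts a*t seconds, so x*a*t are in flight at a node on average."""
    return x * a * t


# Hypothetical workload: 10 xacts/sec/node, 4 updates per xact, 10 ms per update.
x, a, t = 10, 4, 0.01
for n in (1, 10, 100):
    print(n, node_update_rate(x, a, n), approx_node_update_rate(x, a, n))
print(concurrent_original_xacts(x, a, t))
```

With these numbers, going from 10 to 100 nodes raises the rate from 3600 to 396000 updates/sec, roughly the 100-fold increase the $n^2$ approximation predicts.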

General Analysis Summary

• Eager deadlock rate $\approx t^2 \cdot a^5 \cdot n^3$
• Lazy group reconciliation rate $\approx t^2 \cdot a^3 \cdot n^3$
• Lazy master deadlock rate $\approx t^2 \cdot a^4 \cdot n^2$
  • Slightly better than lazy group, but requires contact with object masters; not suitable for mobile apps.

Bottom line: None of these schemes scales as the number of nodes (and replicas) increases.
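
A quick way to read the summary is to ask what a ten-fold increase in nodes does to each rate. The sketch below takes the exponents exactly as listed above and ignores constant factors; the names are hypothetical.

```python
# Exponents (t, a, n) copied from the summary; constant factors are ignored.
RATE_EXPONENTS = {
    "eager deadlock": (2, 5, 3),
    "lazy group reconciliation": (2, 3, 3),
    "lazy master deadlock": (2, 4, 2),
}


def relative_rate(scheme: str, t: float, a: float, n: float) -> float:
    """Rate up to a constant factor: t^et * a^ea * n^en."""
    et, ea, en = RATE_EXPONENTS[scheme]
    return t ** et * a ** ea * n ** en


def growth_factor(scheme: str, node_multiplier: float) -> float:
    """How much the rate grows when n is multiplied, with t and a held fixed."""
    base = relative_rate(scheme, 1, 1, 1)
    return relative_rate(scheme, 1, 1, node_multiplier) / base


for scheme in RATE_EXPONENTS:
    print(scheme, growth_factor(scheme, 10))
# eager deadlock 1000.0
# lazy group reconciliation 1000.0
# lazy master deadlock 100.0
```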

Non-transactional Replication

Convergence (an alternative to serializability): If no new xacts arrive and all nodes are connected, all replicas must converge to the same state. This permits non-serializable behavior, including lost updates.

Example: Lotus Notes
• Append: The update adds a note with a ts.
• Replace: The value is replaced with the newer value if the replace ts > the value’s current ts.
  • If there are two concurrent updates to a checkbook, the one with the higher ts “wins” and the other is “lost”.
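
The lost-update effect of timestamped replace is easy to reproduce. This sketch is not Lotus Notes code: the `replace` function, the (value, ts) tuples, and the amounts are all assumptions chosen to illustrate the rule.

```python
def replace(current, incoming):
    """Timestamped replace: keep whichever (value, ts) pair has the larger ts."""
    return incoming if incoming[1] > current[1] else current


balance = (1000, 0)  # (value, ts)
joe = (900, 5)       # Joe debits $100 at ts 5
jill = (800, 7)      # Jill debits $200 at ts 7, unaware of Joe's update

# The two replicas see the updates in different orders...
replica_1 = replace(replace(balance, joe), jill)
replica_2 = replace(replace(balance, jill), joe)

print(replica_1, replica_2)  # (800, 7) (800, 7): the replicas converge
# ...but Joe's $100 debit is lost: a serial execution would leave $700.
```

Both replicas end in the same state regardless of arrival order, which is exactly the convergence property, yet the final balance does not reflect both debits.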

Non-transactional Replication (contd)

MS Access: Update-anywhere for records.
• Each node has a version vector with a ts for each replicated record.
• Version vectors are periodically exchanged between nodes; for each record, the most recent version “wins” in each exchange.

Oracle: Like Lotus Notes, uses a lazy group scheme with a choice of twelve reconciliation rules (e.g., site-priority, time-priority, value-priority, merge commutative updates), plus user-defined reconciliation.
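
The pairwise exchange described for MS Access can be approximated as follows. This is a simplified sketch, not the product's actual mechanism: each node is modeled as a dictionary from record id to (value, ts), and the `exchange` function is a hypothetical name.

```python
def exchange(node_a: dict, node_b: dict) -> None:
    """Pairwise exchange: for each record, both nodes keep the version with the newer ts."""
    for key in set(node_a) | set(node_b):
        a_ver = node_a.get(key)
        b_ver = node_b.get(key)
        if a_ver is None or (b_ver is not None and b_ver[1] > a_ver[1]):
            winner = b_ver
        else:
            winner = a_ver
        node_a[key] = winner
        node_b[key] = winner


# Each node maps record id -> (value, ts); the contents are made up for illustration.
a = {"r1": ("alpha", 3), "r2": ("beta", 1)}
b = {"r1": ("ALPHA", 2), "r2": ("BETA", 4), "r3": ("gamma", 1)}
exchange(a, b)
print(a == b)    # True
print(a["r1"])   # ('alpha', 3)
print(a["r2"])   # ('BETA', 4)
```

As with timestamped replace, the losing version of each record is silently overwritten.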

Two-Tier Replication

Goals:
• Availability and scalability
• Mobility: read and update while disconnected
• Serializability
• Convergence

Either eager replication or a lazy master scheme is required to ensure serializability and durability, but some adaptation is required to ensure scalability. The two-tier scheme achieves this by introducing “tentative” transactions.

Two-Tier: Types of Nodes

Mobile nodes: Often disconnected.
• Originate tentative xacts.
• May be master for some items.
• Every item at a mobile node has 2 versions:
  • Master version: most recent value received from the master
  • Tentative version: most recent local value

Base nodes: Always connected.
• Most items are mastered at base nodes.

Both types of nodes have a replica of the DB.

Two-Tier: Types of Xacts

Base xacts:
• Work only on master data and produce new master data.
• Involve at most 1 connected mobile node and possibly several base nodes.

Tentative xacts:
• Work on local tentative data.
• May also involve objects mastered at base nodes and on the originating mobile node, but not on other mobile nodes (scope rule).
• Produce new tentative versions, plus a base xact to be run later on base nodes.
  • This base xact may fail or produce different results; if so, the parent tentative xact fails unless the base xact meets some acceptance criteria (e.g., balance cannot go negative).

Mobile Node Connects to Base Node:

• Discards tentative versions (these are only used for queries while disconnected and will be refreshed).
• Sends replica updates for items mastered at the mobile node.
• Sends all tentative base xacts, in the order they committed.
• Accepts replica updates from the base node (standard lazy-master step).
• Accepts notice of the success or failure of each tentative xact.

The “Host” Base Node Does This:

• Sends delayed replica update xacts to the mobile node.
• Accepts delayed replica update xacts for mobile-mastered items.
• Accepts the list of tentative base xacts and re-runs them.
• After a tentative base xact is committed, the base node propagates replica updates to other nodes (standard lazy-master step).
• After all tentative base xacts have been (re-)processed, the mobile node’s state is converged with the base node’s state.
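
The reconnect exchange between a mobile node and its host base node can be sketched for a single account balance. Everything here is a simplifying assumption: one item, debit-only xacts, a non-negative-balance acceptance criterion, and hypothetical names (`MobileNode`, `debit`, `reconnect`). Propagation to other nodes and mobile-mastered items are omitted.

```python
def debit(amount):
    """Build a base xact that debits `amount`; accepted only if the balance stays >= 0."""
    def base_xact(master_balance):
        new_balance = master_balance - amount
        accepted = new_balance >= 0
        return (new_balance if accepted else master_balance), accepted
    return base_xact


class MobileNode:
    def __init__(self, master_balance):
        self.master_version = master_balance     # last value received from the master
        self.tentative_version = master_balance  # local, tentative value
        self.pending = []                        # base xacts, in tentative commit order

    def tentative_debit(self, amount):
        """Run a tentative xact locally and queue its base xact for later."""
        self.tentative_version -= amount
        self.pending.append(debit(amount))


def reconnect(mobile, base_balance):
    """Re-run the pending base xacts at the base node and refresh the mobile node."""
    outcomes = []
    for xact in mobile.pending:                  # same order as they committed
        base_balance, accepted = xact(base_balance)
        outcomes.append(accepted)
    mobile.pending.clear()
    mobile.master_version = base_balance         # replica update from the base node
    mobile.tentative_version = base_balance      # tentative versions are discarded
    return base_balance, outcomes


joe = MobileNode(master_balance=1000)
joe.tentative_debit(300)
joe.tentative_debit(400)
# While Joe was disconnected, another base xact reduced the master balance to 500.
print(reconnect(joe, base_balance=500))  # (200, [True, False])
```

The second debit succeeded tentatively on the mobile node but fails the acceptance criterion when re-run against the master data, so the mobile node is notified of the failure and its state converges to the base node's balance.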

Two-Tier vs Lazy Group

• The master DB is always converged in two-tier.
• The originating node need only contact a base node to check whether a tentative xact is acceptable.
• Base xacts run with single-copy serializability.
• An xact is durable when the corresponding base xact commits.
• Replicas at connected nodes converge.
• If all xacts commute, there are no “acceptance failures”.
