update queries deep dive

Update QueriesDeep DiveConor Cunningham, Principal Software Architect, SQL QP Team, Microsoft

Who Is Conor?•I’ve been at Microsoft for 13+ years building

database engines (mostly Query Processors)•Spent 1-2 years outside the company

▫Startups + Consulting•I like to talk to customers to help improve our

future product offerings•I wrote the Optimizer chapter of the SQL

2008 Internals book•I blog at “Conor vs. SQL” on all things query

What You Will Learn In This Talk• How to read Insert/Update/Delete plans • Why the Optimizer picks various Update plan shapes• How the Architecture of the System support Update

Queries

• This is a deep-dive, white-box discussion on the SQL QP

• Note: There is far more on this subject than one can learn in an hour. We’ll cover a lot, but don’t expect to be an expert after the talk on everything.

• Note2: Most of this talk is beyond what CSS will support if you call them

Agenda• Basic Updates + Architecture Overview• Halloween Protection• Narrow vs. Wide Plans• Split, Sort, Collapse• FK validation• Locking Considerations• Indexed views• Updating views (not necessarily indexed)• Table/Index Partitioning

Vocabulary•Inserts, Updates, Deletes, and Merge are

all related •I will say “Update” but usually mean all of

the various data change commands (I/U/D/M)▫The internal operators are all the same, so I

will often just call it “Update”

Basic Updates (I/U/D/M)1. Read rows2. Compute new

values3. Update rows

• Every Update plan has this basic shape

• Everything in Updates starts with this template

• We do optimize a few cases down to a single operator (“Simple” Updates)

Logical Engine Architecture• Queries are executed across

multiple code components ▫ Query Execution (QE), Storage

Engine (SE), Buffer Pool (BP), Lock Manager (LM)

• When queries read rows from the SE, it locks rows

• Pages are read into the Buffer Pool and cached

• The SQL QP does not _directly_ control locking▫ The plan tells the SE the locking

mode▫ The SE does lock management,

escalation• Uniqueness enforced in SE

(UNIQUE, PK constraints) when a row is written

Index Physical Details• Indexes have extra

columns▫ Heaps have

RIDs/Bookmarks (8 bytes)▫ Non-Unique Clustered

Indexes have uniqueifiers (4b)

▫ Unique CIs have no extra columns

• Indexes link to the Heap/Cls Idx using these

• Update plans maintain these extra columns for you.

• The QP uses extra columns to do bookmark lookup/fetch

• This means that when you update a clustering index key, the secondary indexes need to be updated as well.

• It means that rebuilding heaps have to rebuild secondary indexes since RIDs are physical locators

• Uniquifiers are assigned on row creation and are not changed during reorg/rebuild operations.

Halloween Problem + Protection• (Originally found on Halloween)• UPDATE SalaryTable SET Salary = Salary * 1.1

WHERE Salary < 25000;• Expectation: Each row updated once• Actual (in this case): Every salary was multiplied by

1.1 until all were over 25000• Problem – while scanning and updating the rows,

rows were moving ahead in the scan order and being seen again (and again)

• Solution – “Phase separation”. In SQL Server, this means Spooling. This is also called “Halloween Protection”

• We have fancy logic to determine when we need phase separation (actually when we can skip it)

HP Example• On an empty table, the plan was a table scan with no

HP (why is this legal?)• When I added enough rows, it the plan was a Seek +

Spool (why did it do this?)• Bottom line – some spools are needed for correctness• Bonus question – does one need this spool in

snapshot isolation?

Narrow Plans, Wide Plans•Per-Row (narrow) vs. Per-Index (wide)•Updates that touch lots of rows tend to

use wide update plans▫Sequential IO is cheaper than Random IO▫but there is a greater cost to

batch/sort/spool•Some functional logic requires wide

updates▫Indexed Views, Query Notifications

Narrow (Row at a Time) Plans• Narrow plans take 1 row,

update all indexes, THEN go to the next row

• You can see this in the SSMS Properties page▫ Look at the object list

Wide Update Example• Accesses an index at a time• Common Subexpression Spools let us save off the set of

changes and re-read them• Split, Spool, then Sort/Collapse per Index• Pattern: Update Heap/Clustered Idx, then other access

paths• Engineering Limitation – some schema constructs ONLY

work with wide update plans (Indexed Views, Query Notification)

Note: I cut out ComputeScalars

Write to SpoolRead Spool

Split/Sort/Collapse• It is possible to get phantom(false) UNIQUE/PRIMARY

key violations• If the Storage Engine enforces uniqueness, the order in

which we apply changes can cause the SE to error if the plan updates row 1 from v1 to v2 before updating row 2 from v2 to v3.

• Example: Update T set col = col+1

Split/Sort/Collapse Contintued•The “Action” column controls Insert,

Update, Delete•Sort is on (index key, operation)

Update 12

Update 23

Update 34

…

Delete1Insert2Delete2Insert3Delete3Insert4

Delete1Delete2Insert2Delete3Insert3Insert4

Delete 1Update 2

2Update 3

3Insert 4

Split Sort Collapse

Referential Integrity (Foreign Keys)• Implemented as Semi-Join + Assert• Located AFTER the Update Change• We impose restrictions on the FK topology to be a

“tree”• Note: Updates surface the “after” image of a row• Bonus: WITH OUTPUTS uses this stream as well

Locking Considerations• Locks happen implicitly by the plan shape. We

have heuristics on locking within a plan• We can still have deadlocks, however. Solutions:

▫Read committed snapshot (optimistic concurrency)▫Query hints (examine both shapes)

Indexed View Updates – Delta Algebra• Update plans maintain IVs

like secondary indexes• IVs have restrictions

▫ Only plan shapes with efficient maintenance

• This is an “Update” example with a Group By▫ SUM is commutative!

(other agg fns() may not be)• Key ideas:

▫ Start with View Definition tree

▫ Replace table in view graph with “delta” table

▫ Modify other operators in view to maintain

Group By (A := SUM(col1))

MyTable

Group By’Expr:=F(…)∆MyTable

Original View

Delta View

Maintenance Plan

Update view

Collapse

SortCompute Scalar

(Compute New Agg )

Left Outer Join (Orig.Gbcols = Delta.Gbcols)

Delta View

Orig. View

Indexed View Example• Every IV will have similar logic in Showplan

Original View

Delta View

Insert

Delete

Update

Update <view>• UPDATE <view> gets translated

into an update against a base table

• This then becomes an almost regular update case

• There is logic in some views to guarantee changes fit the view restrictions (with check option)

• Same algebra as in indexed views

Table/Index Partitioning• Plan shapes differ in 2005 vs. 2008+

▫ We continued to improve on the initial implementation to make it faster/use less memory

• Partitioning adds the notion of finding the target partition and moving rows between partitions

• Conceptually, the partition id is part of the key for Updates▫ Split/Sort/Collapse works to understand deletes and inserts

across partition boundaries• Most plans will look the same, but there are a few plan

shapes that are “per-partition” plans• Note: Storage Engine rowsets are not re-entrant for

updates. Partitions are each separate rowsets

Partitioning Example (non-collocated)• Partitioning causes different update plan behavior

▫ Usually, just add the partitioning keys to the base plan shapes▫ Updates compute the partitioning function to find/update the

rows• Differences

▫ ComputeScalar computes the target partition id▫ Split/Sort/Collapse uses ptn function (to move rows to new ptn)▫ Update operator is marked as “partitioned” (and switches

partitions)

THANK YOU•Thanks for attending the session•Questions?

update queries deep dive

Documents