sql performance 2011/12 joe chang, solidq jchang6@yahoo.com

Post on 14-Dec-2015

220 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

SQL Performance 2011/12

Joe Chang, SolidQwww.qdpma.com http://sqlblog.com/blogs/joe_chang/default.aspxjchang6@yahoo.com

2011

• Hardware is Powerful & Cheap– CPU (cores), memory, and now IO too!• Quad-core since 2006, 1GHz since 2000

• Is Performance still a concern?– Yes, along with fundamentals

• Modern Performance Strategy– Can handle minor inefficiencies– Identify and circumvent the really bad things

Modern Hardware

• 4-12 cores per processor socket• 16GB DIMM at less than $1K• SSD– Enterprise grade SSD still moderately expensive• Both SLC and MLC

– Consumer SSD – really cheap, $3-4K per TB• Uneven performance characteristics over time

• Desired IO performance: 1-2GB/s, 20-50K IOPS

Hardware Baseline 2011

Entry• 1 Xeon E3 quad-core• 16GB memory

– 4 x 4GB unbuffered ECC

• SSD options:– 2-4 SATA SSDs– 1-2 PCI-E SSDs

Mid-range • 2 Xeon 5600 6-core• 48 – 192GB

– 6 x 8GB to 12 x 16GB

• SSD options– 16+ SATA SSDs– 4-5 PCI-E SSDs

Performance Fundamentals

• Network round-trips– Owner qualified, case correct

• Log write latencies– Sufficiently low to support transaction volume• Not necessarily separate data and log disks

• Normalization – correct data trumps all!• Indexes – a few good ones, and not too many!• SQL – that the optimizer

What can go wrong?

• With immense hardware resources• And a great database engine– What can go wrong?

• Following a fixed set of rules and procedures– Basic transactions processing should work well– If your process does something unanticipated• Some things can go horribly wrong

Performance Concepts

• Query Optimizer• Execution plan operators– Formula for component operation

• Data distribution statistics – to estimate rows & pages, automatically updated– Rules when estimate not possible

• Stored procedure compile rules– Parameters and variables

Stored Procedure

Basics

Parameters and Variables

• On compile– Parameter values used to for row estimate– Variables – assume unknown value

• Consider effect of skewed distribution

Parameter & Variable

6 rows for value 1

4 rows for value unknown

Consider impact for skewed data distributions

Stored Procedure Compile Options

• WITH RECOMPILE• OPTIMIZE FOR• Plan Guide

• Temp table– KEEP PLAN, KEEPFIXED PLAN

Compile & Execute Time

• Plan reuse desired when– Compile cost is high relative to execute cost

• Recompile desired when– Execute cost is high relative to compile cost

Statistics

Basics

Statistics

• No statistics – table variables• Temp table – statistics auto recompute– 6 row modified, 500 rows, every 20% thereafter

• Statistics sampling– Random page, how to handle skewed distribution?

• Upper and lower bounds– Problems caused by incrementing columns

• Propagation errors

Statistics Recompute

• Scenario: start with accurate statistics• Update column with new values– That did not previously exist

• If fewer than 20% of rows updated– Auto-recompute is not triggered

Sampling

• Default sampling percentage is usually good• Caution: not a random row sample!– Random sampling of page• From nonclustered index if available

• If there is correlation between pages & values– Then serious over estimation possible

Out of range

• Statistics sampling tries to identify lower and upper bound

Bad Execution Plan Examples

Not comprehensive

Scenarios

• Or condition• Multiple optional search arguments• Skewed distributions– 1 business logic for Small and large data sets– Reports for 1 day, 1 week, 1 month, 1 year

• Statistics related problems– Resulting in horrible execution plan

Not comprehensive

When the Query Optimizer

Does not understand you

Simple OR Conditions

Simple Index Seek

OR Condition in Join

Alternative – Union

UNION and UNION ALL

• UNION– Only distinct rows– Sort to eliminate duplicates• Can be expensive for high row counts

• UNION ALL– All rows– No sort to eliminate duplicates

Multiple Optional SARGs

This was suppose to work, but does not

Parameterized SQL

Skew and Range Variation

Statistics Out-of Range

Statistics Out of Range (cont.)

Table Variable

No Statisticsassumes 1 row, 1page Why? No recompiles

Loop Join – Scan Inner Source

Really Bad News

Estimate 1 row

The Correct Plan

Hash Join forcedwith hint

Estimate 1 row

Loop Join vs. Hash Join

Alternative Plan with Index

Temp Table versus CTE

• Consider options– SELECT xxx INTO #Temp– FROM SqlMain Expression

– WITH tmp AS (SELECT xxx FROM Sql)Main Expression

top related