self-tuning database systems: the autoadmin...

36
5/10/2002 (c) Microsoft Corporation 1 Self-Tuning Database Systems: The AutoAdmin Experience Surajit Chaudhuri Data Management and Exploration Group Microsoft Research http://research.microsoft.com/users/surajitc [email protected]

Upload: others

Post on 12-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 1

Self-Tuning Database Systems: The AutoAdmin Experience

Surajit ChaudhuriData Management and Exploration Group

Microsoft Researchhttp://research.microsoft.com/users/surajitc

[email protected]

Page 2: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 2

Research Group Overview

Page 3: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 3

Data Management, Exploration and Mining Group

Formed in 1999 by fusing two projects -AutoAdmin and DB support for DM Research with technology transfer

Project-orientedClose partnership with SQL Server

6 researchers, 5 developersA junior-heavy team Strong internship program

Page 4: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 4

Current Projects

AutoAdmin: Self Tuning Database Systems Data Cleaning Exploratory Projects

Approximate Query ProcessingDocuments + Structured DataXML2SQL

Past project: SQL-aware Data Mining

Page 5: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 5

Self-Tuning Database Systems: The AutoAdmin Experience

Page 6: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

The Black Art of Database Tuning. . .

Applications

DBS

Workload

Performance

TuningGuru

SystemParameters

Page 7: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 7

AutoAdmin: Motivation

Started in summer 1996 at Microsoft Research – team of 2Our Goal:

Make database systems self-tuning and self administering

Analogy: Cars

Reduce TCO

Page 8: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 8

Vision of a Self Tuning System

Manager Sets goals, policy, and the budgetSystem does the rest

Everyone is a CIOBuild a system

Used by millions of people each dayAdministered and managed by a ½ time person

On hardware fault, order replacement partOn overload, order additional equipmentUpgrade hardware and software automatically

“What Next?A dozen remaining IT problems”

Turing Award Lecture,FCRC,

May 1999Jim GrayMicrosoft

Page 9: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation
Page 10: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 10

Physical Design ImpactsQuery Execution

SELECT NameFROM EmployeesWHERE Age < 40 AND Salary > 200K

Execution Plan A: Filter (Age < 40 AND Salary > 200K)Table Scan (Employees)

Execution Plan B:Filter (Age < 40)Table Lookup (Employees) by Salary

Page 11: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 11

Effect of Workload on Physical Design

Which column(s) should we index?Right answer may be:

SalaryAgeBothNeither!

Depends on the workload, and requires knowledge of statistics

SELECT NameFROM EmployeesWHERE Age < 40 AND Salary > 200K

SELECT NameFROM EmployeesWHERE Age < 20 AND Salary > 50K

Page 12: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 12

AutoAdmin: Key ContributionsA What-if architecture for exploring the space of hypothetical designs (SIGMOD 98)

Workload drivenIntegrated physical database design tool(VLDB 97, VLDB 00)

Recommends indexes and Materialized ViewsPart of Microsoft SQL Server product since 1998

Statistics selection (ICDE 00, SIGMOD 02)Execution feedback driven statistics building(SIGMOD 99, SIGMOD 01)

Page 13: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 13

“What-If” Architectures

Page 14: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 14

“What-If” Architecture Overview

Query

Optimizer(Extended)

Database Server

Workload

AutoAdmin

Recommendation

“What-if”

Application

Page 15: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 15

“What-If” Analysis of Physical Design

Estimate quantitatively the impact of physical design on workload

e.g., if we add an index on T.c, which queries benefit and by how much?

Without making actual changes to physical design

Time consuming Resource intensive

Search efficiently the space of hypothetical designs

Page 16: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 16

Workload-driven Physical Design for Databases

Page 17: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 17

Physical Database Design:Problem Statement

Workloadqueries and updates

ConfigurationA set of indexes, materialized views from a search spaceCost obtained by “what-if” realization of the configuration

ConstraintsUpper bound on storage space for indexes

Search: Pick a configuration that is of “lowest” cost for the given database and workload (VLDB 1997)

Page 18: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 18

Architecture of Tuning Wizard in Microsoft SQL Server

Candidate Selection

Workload

Recommendation

ConfigurationEnumeration

Microsoft

SQL

Server

ServerExtensions

Page 19: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 19

Search Space

Large Search Space for indexesMany columns to choose fromKinds of indexes

Explosive search space for materialized viewsQuery optimizers use physical design in novel waysPhysical design choices interact

Page 20: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 20

AutoAdmin Milestones

Started in late summer 1996SQL Server 7.0: Ships index tuning wizard (1998)SQL Server 2000: Integrated recommendations for indexes and materialized Views Shared research results widely

Page 21: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 21

Workload Driven Statistics Management

Page 22: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 22

ExampleSELECT * FROM lineitem, ordersWHERE l_orderkey = o_orderkey ANDl_shipdate = '01-02-99' AND o_orderdate = '01-01-99'

orders lineitem

Index Nested Loop Join

Result

orders lineitem

Merge Join

Result

With stats Cost = 25

Without stats Cost = 112

Page 23: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 23

Essential Set of Statistics

“Chicken-and-egg” problemCannot tell if additional statistics are necessary until we actually build them!Need a test for equivalence without having to build any statistics in (C – S)

S

C

Page 24: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 24

ExampleSELECT E.EmployeeName, D.DeptName FROM Employees E, Department D WHERE E.DeptId = D.DeptID AND E.Age < 40 AND E.Salary > 200KStatistics on E.Age are missingMay not need statistics on E.Age if predicate E.Salary > 200K is very selective

Page 25: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5

/

1

0

/

2

0

0

2

(

c

)

M

i

c

r

o

s

o

f

t

C

o

r

p

o

r

a

t

i

o

n

2

5

F

o

r

m

a

l

i

z

i

n

g

E

s

s

e

n

t

i

a

l

S

t

a

t

i

s

t

i

c

s

O

u

r

G

o

a

l

:

F

i

n

d

a

s

u

b

s

e

t

t

h

a

t

i

s

a

s

g

o

o

d

a

s

h

a

v

i

n

g

a

l

l

s

t

a

t

i

s

t

i

c

s

b

u

t

a

v

o

i

d

p

r

i

c

e

o

f

m

a

i

n

t

a

i

n

i

n

g

a

l

l

F

o

r

g

i

v

e

n

w

o

r

k

l

o

a

d

W

h

a

t

i

s

a

s

g

o

o

d

?

t

-

O

p

t

i

m

i

z

e

r

-

C

o

s

t

e

q

u

i

v

a

l

e

n

c

e

C

o

s

t

(

Q

,

C

)

a

n

d

C

o

s

t

(

Q

,

S

)

a

r

e

w

i

t

h

i

n

t

%

o

f

e

a

c

h

o

t

h

e

r

P

u

b

l

i

c

a

t

i

o

n

:

I

E

E

E

I

C

D

E

2

0

0

Page 26: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 26

Essential Statistics(IEEE ICDE 2000)

In the absence of statistics:Query Optimizers use “magic numbers” for selectivity of predicates

For Age < 40, assume selectivity = 0.30Data distribution independent

MNSA (Magic Number Sensitivity Analysis)Set magic numbers to a few different valuesIf varying selectivity does not affect plan

⇒⇒⇒⇒ additional statistics will not help Else⇒⇒⇒⇒ Select a “promising” statistics to build

Page 27: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 27

Statistics on Queries

Reduce optimizer error by building statistics on query expressions (SIT)A very promising ideaLike materialized views – a manageability challenge Recent work from AutoAdmin (SIGMOD 2002)

Page 28: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 28

Execution Feedback Driven Statistics Building

Page 29: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 29

Self-Tuning Statistics

Think Maps Why care about maps for Greenland? Need detailed maps for areas you visitMake maps more detailed each time you visit

Idea: Start with “uniformity” assumptionProgressively refine with execution feedbackSingle and multidimensional histograms SIGMOD 99, SIGMOD 2001

Page 30: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 30

More on Self-Tuning Database Systems

More at MicrosoftSQL Server 7.0 introduced several auto-tuning features

IBM AlmadenWork by Mario and Shel LEO at IBM ARC has similar goals as AutoAdmin

Page 31: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 31

Rethinking Database Systems

Page 32: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 32

Featurism hurts Self-TuningFeaturism has turned into a curse

Yet another indexing smart /join method/optimizer transformation added

Abusing ExtensibilityEliminate all second-order optimizations

Turning into black magicHard to abstract principlesCannot educate next generation of engineersPerformance is unpredictable

Self-Tuning is difficult

Page 33: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 33

Role ModelsEx. 1: Aircraft with many subsystems (engine, fuselage, electrical control, etc.)Ex. 2: RISC hardwareNo single engineer understands entire system

Local theories for individual subsystems andreasonable understanding of interactions

Few points of interaction with stable and narrow interfacesBuilt-in system support for debugging subcomponents (incl. Performance tuning)

Page 34: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 34

RISC Philosophy for DBMS

Details in VLDB 2000 vision paperPackage as components with simplified functionalityEnforce

Layered approachStrong limits on interaction (narrow APIs)Multiple consumers for a component

Components must have manageable complexityEncapsulation must include predictable performance and self-tuningNot a new idea – but an idea worth revisiting

Page 35: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 35

Final WordsDBMS has to be self-tuning to be a good software componentAutoAdmin

Exploit workload and execution feedback richly for enabling self-tuningDemonstrated through technology incorporated in Microsoft SQL Server

Despite advances, self-tuning remains a very formidable challenge

Need to think “self-tuning” globally by paying attention “locally”RISC DBMS architectures – worth revisiting?

Page 36: Self-Tuning Database Systems: The AutoAdmin Experienceinfolab.stanford.edu/infoseminar/archive/SpringY2002/speakers/suraj… · Research Group Overview. 5/10/2002 (c) Microsoft Corporation

5/10/2002 (c) Microsoft Corporation 36

More Information

Data Management, Exploration and Mining Group Homepage

http://research.microsoft.com/dmx

Microsoft SQL Server White papers on Self-Tuning technologyMy contacts

http://research.microsoft.com/users/[email protected]