Database Performance Topics: - DB Design - Optimization & Indexing - Monitoring and Tuning Joe Carola Siemens Medical Solutions, Health Services


Post on 27-Dec-2015



Database Performance Topics:
- DB Design
- Optimization & Indexing
- Monitoring and Tuning

Joe Carola
Siemens Medical Solutions, Health Services


Joe Carola, Siemens HS Bio

• 30+ years in Information Technology
  – 26 of them dedicated to Relational Database, covering all areas of database design, implementation, support, performance, etc.
• Full History:
  – Development: Prog Trainee, Prog/Analyst, Sys Analyst
  – DBA trainee, DBA, Mgr-DBA (“Actor on the Scene”)
  – Lead DB Consultant for Codd and Date Consulting
  – Director-DBA
  – Technical Database Architect (Currently)
  – DB2, Microsoft SQL Server, Oracle, Sybase SQL Server
  – Mainframe, Unix, Wintel
  – 1993 recipient of an International Database User Group Award for Information Excellence, based on his contributions in the area of Relational Database technology, and has presented locally and internationally on a variety of Relational Database topics.
  – Chairman, Delaware Valley DB2 User Group


Agenda – “Practical Stuff”

DB Design
• In the simplest terms - Logical to Physical
Optimization and Indexing
• Optimizer
• Index Types and how they are used
Monitoring and Tuning
• Monitoring Process and what to monitor
• Tuning Steps

The above is where I see that the rubber meets the road……


DB Design


DB Design

Logical Design
• Provides complete understanding of data and its usage
• Defining data entities and their attributes
• Provides primary and foreign key definitions

Physical Design
• Based on information gathered during Logical design
  – Data must be understood to do this correctly and efficiently
• Provides physical aspects to enhance data usage
  – Data types, data lengths, row sizes
• Provides precise access paths (Indexes) to rows of data
  – Support primary and foreign keys
  – Secondary indexes

Poor Logical and Physical Database design can be the largest reason for performance issues
• The price for poor DB Design must be paid at execution time


DB Design

Normalization: A synthesis of data design
1st Normal Form – Data is dependent on the Key
2nd Normal Form – Data is dependent on the whole Key
3rd Normal Form – Data is dependent on the Key, the whole Key, and nothing but the Key, “so help me Codd”
• Edgar F. (Ted) Codd – Developed the Relational Model
  “A Relational Model of Data for Large Shared Data Banks” (1970)
  – Solid, yet complex, mathematical foundation
    • Relational Algebra
    • Domains, Attributes, Tuples, and Relations (OMG!)
  – Re-stated to simpler terms…..
    • Simple to understand tables, rows, and columns
    • The Simplicity is partially the reason for the performance issues being addressed every day
      – Too many shortcuts are taken
      – Too many non-experienced data designers are designing and implementing database applications


DB Design

3rd Normal Form is basically 1st cut physical
• Next step after 3rd NF in Physical DB Design is a very important step for Performance, Concurrency, Operations, etc.
  – De-Normalization takes place here
    • Storing of data in summary or derived format
    • If it doesn’t happen, it takes place at execution time
      Result:
      High processing costs – Materialization of the result data
      Administration costs – Maintenance of the data
      Low Concurrency – Concurrent Access to the data
• However…….
  – Anomalies are created as a result of De-normalization
    • Insert, Delete, Update
    • They all cost extra processing also
  – Must strike a balance based on requirements on performance, availability, storage, administration
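
The trade-off on this slide can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the tables order_line and order_total and their columns are hypothetical and do not come from the presentation. It shows a derived total read from a summary table versus materialized at execution time, and the update anomaly that appears when the summary is not maintained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (order_id INTEGER, amount INTEGER);
    -- Denormalized summary: one derived row per order (hypothetical table).
    CREATE TABLE order_total (order_id INTEGER PRIMARY KEY, total INTEGER);
    INSERT INTO order_line VALUES (1, 10), (1, 15), (2, 7);
    INSERT INTO order_total
        SELECT order_id, SUM(amount) FROM order_line GROUP BY order_id;
""")

def total_at_execution_time(order_id):
    # Normalized path: the sum is materialized on every request.
    return conn.execute(
        "SELECT SUM(amount) FROM order_line WHERE order_id = ?", (order_id,)
    ).fetchone()[0]

def total_from_summary(order_id):
    # Denormalized path: a single-row lookup of the stored, derived value.
    return conn.execute(
        "SELECT total FROM order_total WHERE order_id = ?", (order_id,)
    ).fetchone()[0]

# Both paths agree while the summary is maintained.
in_sync = total_at_execution_time(1) == total_from_summary(1)

# Insert anomaly: a new detail row arrives without the matching summary update.
conn.execute("INSERT INTO order_line VALUES (1, 5)")
stale = total_from_summary(1)        # still the old total
fresh = total_at_execution_time(1)   # reflects the new row

# The extra processing: every insert must also maintain the summary row.
conn.execute("UPDATE order_total SET total = total + 5 WHERE order_id = 1")
repaired = total_from_summary(1)
```

The summary lookup is cheaper to read, but each insert, update, or delete on the detail rows now carries a second write, which is the balance the slide asks for.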


Optimization & Indexing


Optimization and Indexing

Must understand the basics of indexes and performance statistics.
• As a general rule, indexes should be kept as narrow as possible, most likely following a business use requirement, to reduce the amount of processing overhead associated with each query.
Being familiar with how optimization works will improve the accuracy of your decision making when designing indexes
• Understanding how the optimizer works is the first step toward the establishment of a truly optimized database environment
As the sophistication of your database implementation increases…….
• The need to optimize performance will also increase.


Optimization and Indexing

SQL Query tuning is one of the most important tasks to improve application performance
Biggest bang for the performance dollar over everything else (“IMO”):
• Network, Storage, Memory, Processor
Should be done in the design and testing phases
• However, no amount of Database tuning or SQL statement tuning can make up for inefficient application design/coding
  – 60% to 80% of Application Problems come from poorly written SQL or the code around it

i.e. Prog101 abuse can wreck an application too!!


Optimizer

Responsible for choosing the least costly way to execute SQL (DML).
Creates an access path with its decision
• Performed at plan compilation time
• Determines Access Methods
  – Index Usage
  – Table Scan
  – Join Method
  – Sort
• Determines if Data and/or Index pages can be read in advance
  – Asynchronous Pre-fetch


The Importance of Statistics

Statistics provide the optimizer with the information to make decisions

[Figure: Statistics Generation, with labels Table, Tablespace, Indexspace, and RDBMS Catalog or Dictionary]

As the data in a column changes, index and column statistics can become out-of-date and cause the query optimizer to make less than optimal decisions on how to process a query.
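
A small sketch of how stored statistics go stale, using SQLite through Python's sqlite3 module as a stand-in for the catalog of a larger RDBMS. The patient table and ix_patient_ward index are hypothetical names. SQLite's ANALYZE command writes its statistics to the sqlite_stat1 table, and the first number in the stat column is the row count seen when ANALYZE last ran.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, ward INTEGER)")
conn.execute("CREATE INDEX ix_patient_ward ON patient (ward)")

def load_rows(count):
    # Hypothetical data: rows spread evenly over ten wards.
    conn.executemany("INSERT INTO patient (ward) VALUES (?)",
                     [(i % 10,) for i in range(count)])

def stored_row_count():
    # Read what the optimizer would see in the statistics table.
    stat = conn.execute(
        "SELECT stat FROM sqlite_stat1 WHERE idx = 'ix_patient_ward'"
    ).fetchone()[0]
    return int(stat.split()[0])

load_rows(1000)
conn.execute("ANALYZE")                   # generate statistics
after_first_analyze = stored_row_count()

load_rows(1000)                           # the data changes...
stale = stored_row_count()                # ...but the statistics do not

conn.execute("ANALYZE")                   # regenerate statistics
refreshed = stored_row_count()
```

Between the second load and the second ANALYZE the optimizer is working from a row count that is half the real one, which is the out-of-date situation the slide warns about.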


Statistical Terms/Concepts

Cardinality
• Measures how many unique values exist in the table
Density
• Measures the uniqueness of values within a table.
• Helps the optimizer determine how many rows will be returned for a given key value
• Indexes with high densities will likely be ignored by the optimizer
  – i.e. the index is highly non-unique
Selectivity
• Measures the number of rows that will be returned by a particular query.
• Needed by Optimizer to calculate the relative cost of a query plan
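
The three terms can be made concrete with a little arithmetic. The sketch below assumes one common definition of density (1 divided by the number of distinct values) and expresses selectivity as the fraction of rows matching an equality predicate; individual products define these slightly differently. The column names and data are hypothetical.

```python
from collections import Counter

def column_statistics(values):
    """Compute simple optimizer-style statistics for one column."""
    counts = Counter(values)
    rows = len(values)
    cardinality = len(counts)                 # number of distinct values
    # Assumed definition: density = 1 / distinct values.
    density = 1.0 / cardinality if cardinality else 0.0
    return {
        "rows": rows,
        "cardinality": cardinality,
        "density": density,
        "avg_rows_per_key": density * rows,   # expected rows for one key value
        "counts": counts,
    }

def selectivity(stats, key):
    """Fraction of rows an equality predicate on `key` would return."""
    return stats["counts"][key] / stats["rows"] if stats["rows"] else 0.0

# A highly non-unique column: high density, poor index candidate.
gender = column_statistics(["F"] * 600 + ["M"] * 400)

# A unique column: low density, strong index candidate.
patient_id = column_statistics(list(range(1000)))

share_of_f = selectivity(gender, "F")
```

With these numbers an index on the two-valued column is expected to return about 500 rows per key, while the unique column returns one, which is why the first index is likely to be ignored.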


From Request to Response

[Figure: a REQUEST passes through RELATIONAL DATA SERVICES, the DATA MANAGER, and the BUFFER MANAGER (I/O), and the REQUESTED DATA returns as the RESPONSE. Labels on the Data Manager: predicate with index(es), any other index key, non-indexed predicate applies.]

STAGE 1 PREDICATES - Evaluated at the time the data rows are retrieved (SARGABLE). There is a performance advantage in using Stage 1 predicates because this stage eliminates rows before they are passed to Stage 2 via the Data Manager.

STAGE 2 PREDICATES - Evaluated after data retrieval via the relational data services (NON-SARGABLE, Residual), which is more expensive than the Data Manager.
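
The sargable idea can be observed in any optimizer that exposes its plan. The sketch below uses SQLite via Python's sqlite3 module purely as an illustration; SQLite has no Stage 1 / Stage 2 terminology, and the table, index, and column names are assumptions. A predicate on the bare indexed column can be applied through the index, while wrapping the column in an expression forces every row to be read and evaluated afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, indexed_col1 INTEGER, payload TEXT)"
)
conn.execute("CREATE INDEX ix_col1 ON table1 (indexed_col1)")

def plan(sql):
    # The fourth column of EXPLAIN QUERY PLAN output is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Sargable: the predicate names the indexed column directly.
sargable = plan("SELECT * FROM table1 WHERE indexed_col1 = 12345")

# Non-sargable: the expression on the column hides it from the index.
non_sargable = plan("SELECT * FROM table1 WHERE indexed_col1 + 0 = 12345")

print("sargable:    ", sargable)
print("non-sargable:", non_sargable)
```

The first plan references ix_col1; the second falls back to scanning the table, so every row is fetched before the predicate is evaluated.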


Indexing

A very necessary part of Successful Database Implementation

“Why do some of my queries run so slow!”

I wonder what queries will be run?
What indexes will be needed?
What columns will be used as predicates?
What ORDER BY will be used most often?


Indexes are a good thing to add, however there is something to avoid…..

“Thanks for fixing my query, what did you do?”
“I added an index to one of the columns”
“Great! Then add indexes to all the columns in my table”
#!*#!!!


Types of Indexes

There are two types of indexes: clustered and non-clustered, each with unique advantages depending on the data set.

Clustered index
• Dictates the storage order of the data in a table. Because the data is sorted, clustered indexes are more efficient on columns of data that are most often searched for ranges of values. This index type also excels at finding a specific row when the indexed value is unique.

Non-clustered index
• Similar to an index in a textbook, where the data is stored in one place and the indexed value in another. A query searches for the data value by first searching the non-clustered index to find the location of the data value in the table and then retrieves the data directly from that location. The non-clustered index is useful for queries resulting in exact matches.


Basic Index Usage

Matching Index Scan

[Figure: index tree with Root Page, Non-Leaf Page, Leaf Page, and Data Pages; access steps numbered 1 to 3.]

Select * From TABLE1 Where INDEXED_COL1 = 12345


Basic Index Usage

Non-Matching Index Scan

[Figure: index tree with Root Page, Non-Leaf Page, Leaf Page, and Data Pages; access steps numbered 1 and 2.]

Select * From TABLE1 Where INDEXED_COL1 > 00001


Basic Index Usage

Index Only

[Figure: index tree with Root Page, Non-Leaf Page, and Leaf Page only, no Data Pages; access step numbered 1.]

Select COL1 From TABLE1 Where INDEXED_COL1 > 00001
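
Index-only access can be seen in a query plan when every column the query needs is present in the index. The sketch below again uses SQLite through Python's sqlite3 module as a stand-in; it assumes an index on (indexed_col1, col1), which the slide does not state, and a hypothetical payload column that lives only in the data pages. SQLite calls this access a "covering index".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 ("
    "id INTEGER PRIMARY KEY, indexed_col1 INTEGER, col1 INTEGER, payload TEXT)"
)
# Assumption: col1 is carried in the index so the query never needs the data pages.
conn.execute("CREATE INDEX ix_col1 ON table1 (indexed_col1, col1)")

def plan(sql):
    # The fourth column of EXPLAIN QUERY PLAN output is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Every referenced column is in the index: index-only access.
index_only = plan("SELECT col1 FROM table1 WHERE indexed_col1 > 1")

# SELECT * also needs payload, so the data pages must be visited.
with_data_pages = plan("SELECT * FROM table1 WHERE indexed_col1 > 1")

print("index only:     ", index_only)
print("with data pages:", with_data_pages)
```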


Join Methods

Nested Loop Join

SELECT A,B,X,Y FROM OUTER, INNER WHERE A=10 AND B=X

OUTER        INNER        COMPOSITE
A   B        X   Y        A   B   X   Y
10  3        5   A        10  3   3   B
10  1        3   B        10  1   1   D
10  2        2   C        10  2   2   C
10  6        1   D        10  2   2   E
10  1        2   E        10  1   1   D
             9   F
             7   G

1.) Scan the outer table. For each qualifying row...
2.) Find all matching rows in the inner table, via table space scan or index access.

The nested loop join produces this result
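
The two steps can be written out as a minimal sketch over the OUTER and INNER rows from the slide. The function name is illustrative, and the inner table is scanned in full for each outer row; a real engine might use index access for step 2 instead.

```python
# Rows copied from the slide: OUTER is (A, B), INNER is (X, Y).
outer_rows = [(10, 3), (10, 1), (10, 2), (10, 6), (10, 1)]
inner_rows = [(5, "A"), (3, "B"), (2, "C"), (1, "D"), (2, "E"), (9, "F"), (7, "G")]

def nested_loop_join(outer, inner):
    composite = []
    # 1.) Scan the outer table; keep rows that pass the local predicate A = 10.
    for a, b in outer:
        if a != 10:
            continue
        # 2.) For each qualifying row, find all matching inner rows (B = X).
        for x, y in inner:
            if b == x:
                composite.append((a, b, x, y))
    return composite

composite = nested_loop_join(outer_rows, inner_rows)
for row in composite:
    print(row)
```

The result comes back in outer-table order, and the outer row with B = 6 drops out because no inner row matches it.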


Join Methods

Merge Scan Join

SELECT A,B,X,Y FROM OUTER, INNER WHERE A=10 AND B=X

OUTER        INNER        COMPOSITE
A   B        X   Y        A   B   X   Y
10  1        1   D        10  1   1   D
10  1        2   C        10  1   1   D
10  2        2   E        10  2   2   C
10  3        3   B        10  2   2   E
10  6        5   A        10  3   3   B
             7   G
             9   F

1.) Condense and sort the outer table, or access it through an index on column B. Condense and sort the inner table.
2.) Scan the outer table. For each qualifying row...
3.) Scan a group of matching rows in the inner table.

The merge scan join produces this result
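
A minimal sketch of the merge scan steps, using the same OUTER and INNER rows in their unsorted form. The function name is illustrative; sorting stands in for "condense and sort", and the inner cursor only moves forward, returning to the start of the current group when the outer table repeats a join value.

```python
# Rows in their original, unsorted order: OUTER is (A, B), INNER is (X, Y).
outer_rows = [(10, 3), (10, 1), (10, 2), (10, 6), (10, 1)]
inner_rows = [(5, "A"), (3, "B"), (2, "C"), (1, "D"), (2, "E"), (9, "F"), (7, "G")]

def merge_scan_join(outer, inner):
    # 1.) Condense (apply A = 10) and sort both inputs on the join columns.
    outer_sorted = sorted((r for r in outer if r[0] == 10), key=lambda r: r[1])
    inner_sorted = sorted(inner, key=lambda r: r[0])

    composite = []
    group_start = 0  # inner position where the current join value begins
    # 2.) Scan the outer table in join-column order.
    for a, b in outer_sorted:
        # Skip inner rows whose key is smaller than the current outer key.
        while group_start < len(inner_sorted) and inner_sorted[group_start][0] < b:
            group_start += 1
        # 3.) Scan the group of inner rows that match this outer row.
        j = group_start
        while j < len(inner_sorted) and inner_sorted[j][0] == b:
            composite.append((a, b) + inner_sorted[j])
            j += 1
    return composite

composite = merge_scan_join(outer_rows, inner_rows)
for row in composite:
    print(row)
```

Unlike the nested loop version, the output arrives in join-column order and the inner table is read once apart from re-reading a group for duplicate outer values.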


Join Methods

Hybrid / Hash Join

SELECT C2,C33 FROM OUTER, INNER WHERE C1 = A AND C2 = C22

OUTER              INNER
ROW  C1  C2        RID  C22  C33
1    A   1         P1   1    D
2    A   1         P2   2    C
3    A   2         P3   2    E
4    A   3         P4   3    B
5    A   6         P5   5    A
6    ..            P6   7    G

PARTIAL ROWS       RID LIST       RESULT
C2   RID                          C2   C33
1    P1            P1             1    D
1    P1            P1             1    D
2    P2            P2             2    C
2    P3            P3             2    E
3    P4            P4             3    B

1.) Apply local predicates and organize qualifying rows in join column sequence by either sorting or accessing via join column index.
2.) Obtain only inner table RIDs via index access using sequenced join column key values.
3.) Create partial rows, and sort in RID sequence.
4.) List Prefetch inner table rows and complete partial rows
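
The four steps can be traced with a small sketch over the slide's data. It assumes that A in the query is the literal value 'A', that RIDs P1 to P6 identify the inner rows in the order shown, and that a dictionary can stand in for the index on C22; the function and variable names are illustrative.

```python
# OUTER rows are (C1, C2); INNER rows are keyed by RID and hold (C22, C33).
outer_rows = [("A", 1), ("A", 1), ("A", 2), ("A", 3), ("A", 6)]
inner_table = {
    "P1": (1, "D"), "P2": (2, "C"), "P3": (2, "E"),
    "P4": (3, "B"), "P5": (5, "A"), "P6": (7, "G"),
}

def hybrid_join(outer, inner):
    # Stand-in for the index on C22: join key -> list of RIDs.
    index_on_c22 = {}
    for rid, (c22, _c33) in inner.items():
        index_on_c22.setdefault(c22, []).append(rid)

    # 1.) Apply the local predicate (C1 = 'A') and order by the join column C2.
    qualifying = sorted((r for r in outer if r[0] == "A"), key=lambda r: r[1])

    # 2.) Probe the index only, collecting inner RIDs for each join value.
    partial_rows = [(c2, rid)
                    for _c1, c2 in qualifying
                    for rid in index_on_c22.get(c2, [])]

    # 3.) Sort the partial rows into RID sequence.
    partial_rows.sort(key=lambda p: p[1])
    rid_list = [rid for _c2, rid in partial_rows]

    # 4.) Fetch the inner rows in RID order and complete the partial rows.
    result = [(c2, inner[rid][1]) for c2, rid in partial_rows]
    return rid_list, result

rid_list, result = hybrid_join(outer_rows, inner_table)
print(rid_list)
print(result)
```

The point of sorting by RID before touching the inner table is that the data pages are then read in physical order, which is what makes list prefetch effective.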


An Ounce of Prevention….

Make your queries simple and efficient, ensuring the least costly access path available.
• Try not to overload your tables with indexes
• Try not to overload your indexes
• Try not to overload your queries
Keep the Database healthy
• Reorganization
  – Eliminates empty space and fragmentation
  – Reduces I/O
Generate Statistics (if they are not automatic)
• The Optimizer is very smart, but data attributes are always changing
  – DB Size/Volume, Data Skewness, Data Content
Analyze SQL Query and access path selection prior to implementing into a production environment.
• Execute the Explain Plan periodically to determine what method the Optimizer is selecting for an access path.


“Explain” Plan / SHOWPLAN

Phase of the optimizer that captures information used in selecting the query access plan
Why use an Explain Plan?
• Gives clues as to why the optimizer made access decisions
• Can be used in advance of execution
• Can be used to maintain a history of problem query access
  – Before/After new index additions
  – Before/After Statistics are Generated/Re-Generated
  – Before/After Data additions/changes/deletions
• Problem determination is easier by comparing reference plans
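
Keeping a before/after history of plans can be as simple as capturing the explain output around each change. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in for the Explain or SHOWPLAN facility of other products; the table, index, and history labels are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 INTEGER, payload TEXT)")

def explain(sql):
    # Capture the plan without executing the query itself.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

problem_query = "SELECT * FROM table1 WHERE col1 = 12345"
plan_history = {}  # label -> plan text, kept as reference plans

plan_history["before index"] = explain(problem_query)

conn.execute("CREATE INDEX ix_table1_col1 ON table1 (col1)")
plan_history["after index"] = explain(problem_query)

for label, plan_text in plan_history.items():
    print(f"{label}: {plan_text}")

plan_changed = plan_history["before index"] != plan_history["after index"]
```

Comparing the stored entries shows at a glance whether a change such as a new index or regenerated statistics actually altered the access path.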


Example Graphical SHOWPLAN


Monitoring and Tuning


Monitoring and Tuning

A Constant Process
• A very necessary part of successful database implementation
• Must be there to guarantee ongoing, optimal Database Performance

1.) Collect Data: Design Data, Object Data, Activity Data
2.) Analyze Data: Real time, Periodic, Historical
3.) Consider Fixes: Redesign, Tune
4.) Apply Fixes
Repeat


Monitoring and Tuning

What to monitor
• Healthiness of Database Objects
  – Growth
  – Fragmentation
    • Exists when TS and/or indexes have pages in which the logical ordering, based on key or link value, does not match the physical ordering of the pages inside the file
    • Causes additional I/O and additional storage
    • Causes of Fragmentation
      – DML (Insert, Delete, Update)
      – Inserts/Updates cause Page Splits
      – Delete/Updates cause holes


Monitoring and Tuning

What to monitor
• Fragmentation illustrated

Uniform pages in order      Non-uniform pages, out of order
Index 1 Page 1              Index 1 Page 1
Index 1 Page 2              Index 1 Page 2
Index 1 Page 3              Index 2 Page 1
Index 1 Page 4              Index 2 Page 2
Index 1 Page 5              Index 1 Page 4
Index 1 Page 6              Index 1 Page 5
Index 1 Page 7              Index 3 Page 1
Index 1 Page 8              Index 1 Page 8

• Reorganization
  – Reorders pages, compresses entries on a page
• Always be sure to run new Statistics collection (for the Optimizer)


Monitoring and Tuning

What to monitor
• Object Usage
  – Access Patterns (Random, Sequential, Indexed, Non-Indexed)
  – I/O (Volume, Latency)
  – They tend to change over time as users learn the application
• Memory Usage
  – Buffer Hit Ratio
  – Data/Index pages in the Buffer will avoid an I/O
• Processing Activity
  – CPU utilization
    • Will indicate excessive searching and/or sorting
  – Parallel, Non-Parallel
    • Can speed up large searches
    • Can also monopolize all the processors
• Locking
  – Timeouts
  – Deadlocks
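
The buffer hit ratio listed under Memory Usage is usually computed as the share of page requests satisfied from the buffer without a physical read. The sketch below assumes that general formula and hypothetical counter values; the exact counter names differ by product.

```python
def buffer_hit_ratio(logical_reads, physical_reads):
    """Fraction of page requests served from the buffer (no I/O needed)."""
    if logical_reads == 0:
        return None  # no activity, so the ratio is undefined
    return (logical_reads - physical_reads) / logical_reads

# Hypothetical sample: 10,000 page requests, 500 of which went to disk.
ratio = buffer_hit_ratio(logical_reads=10_000, physical_reads=500)
print(f"buffer hit ratio: {ratio:.1%}")
```

A ratio that drifts downward over time suggests the working set no longer fits in the buffer or that access patterns have changed.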


Monitoring and Tuning

How to monitor – Tool usage

[Figure: SQL Request and Result flow through the DBMS; Statistical Generation feeds a Performance DB; a Tool to Collect & Interpret produces Alerts and Reports.]


Monitoring and Tuning

Steps
• Find the statements that consume the most resources
  – “Heavy Hitters”
    • Physical Reads will indicate SQL requiring disk access to satisfy queries
      – Most expensive part of a Query!!!
    • Buffer Gets indicate the amount of searching going on within a query
      High Buffer Gets = Lots of Searching = Lots of Processing
    • Sorts information will indicate if SQL is doing an excessive amount of sorting
• Find the offending statements without adding to the performance problem
  – Use simple top down approach
    • Avoid heavy tracing
    • Know the Database Design and Usage
    • Run Explain Plan on SQL
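
A top-down search for heavy hitters amounts to ranking statements by each resource counter in turn. The sketch below assumes the per-statement counters have already been collected by a monitoring tool; the StatementStats record, the sample statements, and their numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StatementStats:
    sql: str
    physical_reads: int   # disk accesses
    buffer_gets: int      # pages searched in the buffer
    sorts: int            # sort operations

def heavy_hitters(stats, metric, top_n=2):
    """Return the top_n statements ranked by the given counter, highest first."""
    return sorted(stats, key=lambda s: getattr(s, metric), reverse=True)[:top_n]

# Hypothetical counters, as a monitoring tool might report them.
collected = [
    StatementStats("SELECT ... FROM orders", 12_000, 90_000, 3),
    StatementStats("SELECT ... FROM patient", 150, 400_000, 0),
    StatementStats("SELECT ... ORDER BY name", 900, 20_000, 250),
    StatementStats("UPDATE visit SET ...", 40, 1_200, 0),
]

top_by_reads = [s.sql for s in heavy_hitters(collected, "physical_reads")]
top_by_gets = [s.sql for s in heavy_hitters(collected, "buffer_gets")]
top_by_sorts = [s.sql for s in heavy_hitters(collected, "sorts", top_n=1)]
```

Each ranking points at a different kind of problem: the first at disk access, the second at excessive searching, and the third at sorting. The statements that surface are the ones worth running through Explain.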


Additional Questions?