Post on 01-Jan-2016

1

Wenguang Wang, Richard B. Bunt

Department of Computer Science

University of Saskatchewan

November 14, 2000

Simulating DB2 Buffer Pool Management

2

Outline

• Background

• Problem

• Methodology

• Simulation results

• Future work

3

Background

• What is a buffer pool?

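[Figure: Applications send requests to the upper layer of the DBMS; inside the DBMS, the Buffer Pool sits between the upper layer and the Database on Disks, which it accesses through Reads and Writes.]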

4

Problem

• Buffer pool management is important to the performance of any DBMS

• Configuring and tuning the buffer pool is not an easy problem for the database administrator

• The buffer pool management of a DBMS is very complex

• It is hard to study and test the buffer pool management algorithm directly

5

Methodology

• Trace-driven simulation provides an effective approach

• Compared to the real DBMS:

– The simulator is easier to control

– The simulator requires far fewer computing resources (CPU, memory, disk, running time)

– A new algorithm is easier to implement and test in the simulator

– Changes that are impossible or difficult to make in the real system can be simulated

6

Methodology

• Create trace-driven simulation tools

– Collect the trace

– Process the trace

– Develop the simulator

– Verify the simulator

• Perform experiments with the simulator

– Understand the effect of buffer pool parameters

– Suggest how to tune the buffer pool

– Design and test alternative buffer pool algorithms
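The collect, process, and replay steps above can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration: the trace line format (`<page_id> <R|W>`), the function names, and the use of plain LRU replacement. The slides do not describe the DB2 trace format, and DB2 itself uses a clock-based algorithm.

```python
from collections import OrderedDict

def parse_trace(lines):
    """Process step: turn raw trace lines into (page_id, is_write) tuples.

    Assumed (hypothetical) line format: "<page_id> <R|W>".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        page_id, op = line.split()
        yield int(page_id), op == "W"

def simulate(requests, pool_size):
    """Replay requests against an LRU buffer pool; return (hits, misses)."""
    pool = OrderedDict()  # page_id -> dirty flag, least recently used first
    hits = misses = 0
    for page_id, is_write in requests:
        if page_id in pool:
            hits += 1
            pool.move_to_end(page_id)
            pool[page_id] = pool[page_id] or is_write
        else:
            misses += 1
            if len(pool) >= pool_size:
                pool.popitem(last=False)  # evict the least recently used page
            pool[page_id] = is_write
    return hits, misses

# A tiny hand-written trace standing in for a collected one.
trace = ["1 R", "2 W", "1 R", "3 R", "2 R", "4 W", "1 R"]
print(simulate(parse_trace(trace), pool_size=2))
print(simulate(parse_trace(trace), pool_size=4))
```

Replaying the same trace with different `pool_size` values is the kind of experiment that is cheap in a simulator and expensive on the real system.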

7

System Configuration

• DBMS: IBM DB2

– Relational DBMS

– Distributed DBMS that supports multiple nodes. Because buffer pools on different nodes are independent, only single-node DB2 is studied

• Workload: the TPC-C benchmark

– An On-Line Transaction Processing benchmark

– Many clients send simple queries simultaneously to the DBMS on the server side

– The queries update a large amount of data

8

System Configuration (cont.)

• DB2 version 6.1 running on Windows NT Server 4.0

• TPC-C database

– Small application: 50 warehouses (5 GB of data) spanning 9 physical disks

9

Trace Collection

• Trace tools of DB2

• Suspend the TPC-C benchmark periodically to record a sufficiently large trace

[Figure: Applications, the upper layer of the DBMS, the Buffer Pool, and the Database on Disks (Reads and Writes), with the trace point marked inside the DBMS.]

10

Trace Volume

• 60M buffer pool requests

• 200K TPC-C transactions

• Equivalent to a 30-minute TPC-C run when no traces are recorded
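The three figures above imply some useful per-transaction and per-second rates. The short calculation below derives them; since the slide values are rounded ("60M", "200K"), the results should be read as approximate.

```python
requests = 60_000_000      # buffer pool requests in the trace
transactions = 200_000     # TPC-C transactions in the trace
run_seconds = 30 * 60      # equivalent untraced run time, in seconds

requests_per_transaction = requests // transactions
transactions_per_second = round(transactions / run_seconds)
requests_per_second = round(requests / run_seconds)

print(requests_per_transaction)   # 300
print(transactions_per_second)    # 111
print(requests_per_second)        # 33333
```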

11

Buffer Pool Simulator

• Simulates the buffer pool management algorithm and the disk activities

• About 8000 lines of C++ code

12

Architecture of the Simulator

13

DB2 Buffer Pool Algorithm

[Figure: Clock Algorithm. A clock pointer moves over the buffer pool; page cleaners clean pages by writing them to the Database on Disks.]

• Threshold: triggers the page cleaning activity
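For readers unfamiliar with clock replacement, here is a minimal sketch of the textbook clock (second-chance) algorithm. It is not DB2's implementation, which the slides do not detail; the class name `ClockBuffer` and the choice to set the reference bit when a page is first loaded are assumptions of this sketch.

```python
class ClockBuffer:
    """Textbook clock (second-chance) replacement over a fixed set of frames."""

    def __init__(self, num_frames):
        self.pages = [None] * num_frames   # page held by each frame
        self.ref = [False] * num_frames    # reference bit per frame
        self.where = {}                    # page_id -> frame index
        self.hand = 0                      # the clock pointer

    def access(self, page_id):
        """Return True on a hit, False on a miss (the page is then loaded)."""
        if page_id in self.where:
            self.ref[self.where[page_id]] = True
            return True
        # Sweep: clear reference bits until a frame with a cleared bit is found.
        while self.ref[self.hand]:
            self.ref[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        victim = self.pages[self.hand]
        if victim is not None:
            del self.where[victim]
        self.pages[self.hand] = page_id
        self.ref[self.hand] = True
        self.where[page_id] = self.hand
        self.hand = (self.hand + 1) % len(self.pages)
        return False

buf = ClockBuffer(3)
results = [buf.access(p) for p in [1, 2, 3, 1, 4, 1, 2]]
print(results)
```

In this run the request for page 4 finds every reference bit set, so the pointer makes a full sweep clearing them and then evicts page 1 from the first frame.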

14

Buffer Pool Algorithm (cont.)

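[Figure: the Buffer Pool is divided into a Clean Region and a Dirty Region. Dirty pages expand the Dirty Region toward the Threshold. TPC-C issues Reads and Synchronous writes against the Database on Disks; Asynchronous writes are performed by the page cleaners.]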
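The interplay between the dirty page threshold, the page cleaners, and synchronous writes can be illustrated with a toy model. The cleaning policy is an assumption of this sketch, not something the slides specify: victims are chosen by LRU, and once the dirty fraction exceeds `threshold`, each cleaner writes `pages_per_cleaner` dirty pages per request. The function and parameter names are hypothetical, and the numbers it produces do not correspond to any result in the study.

```python
from collections import OrderedDict

def run(requests, pool_size, threshold, cleaners, pages_per_cleaner=1):
    """Replay (page_id, is_write) requests.

    Returns (synchronous writes, asynchronous writes, final dirty fraction).
    """
    pool = OrderedDict()   # page_id -> dirty flag, least recently used first
    sync_writes = async_writes = 0
    for page_id, is_write in requests:
        if page_id in pool:
            pool.move_to_end(page_id)
            pool[page_id] = pool[page_id] or is_write
        else:
            if len(pool) >= pool_size:
                _, dirty = pool.popitem(last=False)
                if dirty:
                    sync_writes += 1   # a dirty victim is written before reuse
            pool[page_id] = is_write
        # Threshold check: wake the page cleaners if too many pages are dirty.
        dirty_pages = [p for p, d in pool.items() if d]
        if len(dirty_pages) > threshold * pool_size:
            for p in dirty_pages[: cleaners * pages_per_cleaner]:
                pool[p] = False        # asynchronous write by a page cleaner
                async_writes += 1
    return sync_writes, async_writes, sum(pool.values()) / pool_size

# An update-only toy workload touching ten distinct pages.
workload = [(page, True) for page in range(10)]
print(run(workload, pool_size=4, threshold=0.5, cleaners=0))
print(run(workload, pool_size=4, threshold=0.5, cleaners=1))
```

With no cleaners, every eviction in this toy run becomes a synchronous write and the pool ends up fully dirty; with one cleaner, the writes move to the asynchronous path.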

15

Simulator Verification

• Compare the throughput curve

• Compare the run-time statistics

– Hit ratio

– Dirty page percentage

• Test the effect of parameters

– Dirty page threshold

– Number of page cleaners
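The run-time statistics named above are simple ratios, and verification amounts to checking that the simulated and measured values agree within some tolerance. The sketch below shows one way to express that check; the 5% tolerance, the function names, and the counts in the example are placeholders, not values from the study.

```python
def hit_ratio(hits, misses):
    """Fraction of buffer pool requests served without a disk read."""
    total = hits + misses
    return hits / total if total else 0.0

def dirty_percentage(dirty_pages, pool_size):
    """Share of buffer pool pages that are dirty, in percent."""
    return 100.0 * dirty_pages / pool_size

def close_enough(simulated, measured, rel_tol=0.05):
    """True if the simulated value is within rel_tol of the measured one."""
    if measured == 0:
        return simulated == 0
    return abs(simulated - measured) / abs(measured) <= rel_tol

# Placeholder counts for illustration only.
simulated = hit_ratio(hits=940, misses=60)
measured = hit_ratio(hits=950, misses=50)
print(close_enough(simulated, measured))
```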

16

Simulator Verification: Similar Throughput Curve

17

Page Distribution of Buffer Pool

18

I/O Activities of the Buffer Pool

19

Simulation Results Under Default Configuration

• Page cleaners cannot clean out pages fast enough under the default configuration (2 page cleaners)

• Too many dirty pages (87%) in the buffer pool under the default configuration

• The existence of too many dirty pages lowers the buffer pool hit ratio and performance

20

Effect of More Page Cleaners

21

I/O Activities Under More Page Cleaners

22

Effect of Number of Page Cleaners

23

Effect of Buffer Pool Parameters

• The threshold cannot affect performance when the number of page cleaners is small

• Setting an appropriate number of page cleaners is important to performance

• The appropriate number of page cleaners differs across workloads

24

Future work

• Gain more understanding of the buffer pool algorithm from the simulator and DB2

• Extend the work to a much larger TPC-C database

• Investigate alternative buffer pool management algorithms that are easier to manage and tune

• Test the alternative algorithms first in the simulator and then in the real system

25

Questions?
