Apache Derby Performance

Olav Sandstå, Dyre Tjeldvoll, Knut Anders Hatlen
Database Technology Group
Sun Microsystems

Overview

• Derby Architecture
• Performance Evaluation of Derby
• Performance Tips
• Comparing Derby, MySQL and PostgreSQL

Derby Architecture: Embedded

[Diagram: the application and the Derby engine run in the same Java Virtual Machine. The application calls the engine through JDBC. The engine consists of the SQL, Access and Storage layers, with the database buffer, on top of the log and data files.]

• SQL layer:
  > Parsing of SQL
  > Compiling
  > Optimization
  > Byte-code gen.
  > Execution
• Access layer:
  > Access path
  > Indexes
• Storage layer:
  > Database buffer
  > Log generation
  > Log execution
• Log:
  > Synchronous write of log records
• Data:
  > Asynchronous write of data (checkpoint)
  > Synchronous read of data

Derby Architecture: Client-Server

[Diagram: the application and the JDBC client run in one Java Virtual Machine. The Network Server runs in a second Java Virtual Machine together with the engine: JDBC, SQL, Access and Storage layers, with the database buffer, on top of the log and data files.]
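As a rough illustration of the two deployments, the sketch below builds the JDBC connection URLs commonly used with Derby: `jdbc:derby:<database>` for the embedded engine and `jdbc:derby://<host>:<port>/<database>` for the network client. The class name, database name and host are placeholders chosen for this example; 1527 is the Network Server's default port. Actually opening a connection also requires the Derby jars on the classpath.

```java
class DerbyUrls {
    /** URL for the embedded driver: the engine runs inside the application's JVM. */
    public static String embedded(String database) {
        return "jdbc:derby:" + database;
    }

    /** URL for the client driver: requests travel over the network to a Network Server. */
    public static String client(String host, int port, String database) {
        return "jdbc:derby://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // "testdb" and "localhost" are placeholder values
        System.out.println(embedded("testdb"));
        System.out.println(client("localhost", 1527, "testdb"));
    }
}
```

Because only the URL differs, an application can usually switch between the two architectures without changing its SQL or JDBC calls.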

Performance Evaluation of Derby

Overview:
• Evaluation of disk and file system configurations:
  > Disk write cache
  > Database and transaction log on separate disks
• Database configurations:
  > Size of database buffer
• Comparing Embedded and Client-Server

What Is “Performance”?

• How to measure database performance?
  > Throughput
  > Response time
  > Scalability
• Which configuration?
  > Out-of-the-box configuration
  > Carefully tuned
• How to compare database systems with different properties and different tuning possibilities?
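To make the first two metrics concrete, here is a small sketch of how a load client might derive them from its own measurements. Throughput is taken as completed transactions per second of wall-clock time, and response time is summarized as the mean of the per-transaction latencies. The class and method names are illustrative and are not taken from the test client used in this evaluation.

```java
import java.util.Arrays;
import java.util.List;

class LoadMetrics {
    /** Throughput in transactions per second over a measurement interval. */
    public static double throughput(long completedTransactions, double elapsedSeconds) {
        return completedTransactions / elapsedSeconds;
    }

    /** Mean response time in milliseconds over the recorded latencies. */
    public static double meanResponseTimeMs(List<Double> latenciesMs) {
        return latenciesMs.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0); // no samples recorded
    }

    public static void main(String[] args) {
        // Made-up sample values: 12,000 transactions in a 60-second run
        System.out.println(throughput(12_000, 60.0) + " tx/s");
        System.out.println(meanResponseTimeMs(Arrays.asList(4.0, 6.0, 5.0)) + " ms");
    }
}
```

Scalability can then be observed by repeating the same measurement while increasing the number of concurrent clients.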

Test Configuration

Load clients:
1. TPC-B like load:
  > 3 UPDATEs
  > 1 SELECT
  > 1 INSERT
2. Single-record SELECT:
  > Select of one record on primary key

Test platform:
• Sun JVM 1.5.0
• Solaris 10 x86
• 2 x 2.4 GHz AMD Opteron
• 2 GB RAM
• 2 SCSI disks
• UFS file system

Effect of Write Cache on Disk

TPC-B like load:

WARNING: The write cache reduces the probability of successful recovery after a power failure

Separate Data and Log Devices

Log device:
  > Sequential write of transaction log
  > Synchronous as part of commit
  > Group commit

Data device:
  > Data in database buffer regularly written to disk as part of checkpoint
  > Data read from disk on demand

Tip: use separate disks for data and log device
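One way to follow this tip in Derby is the `logDevice` connection URL attribute, which is supplied when the database is created and places the transaction log in a directory of your choice. The sketch below only builds such a URL. The class name, database name and paths are hypothetical, and opening the connection requires the Derby jars on the classpath.

```java
class SeparateLogDevice {
    /**
     * Builds a creation URL that keeps the data files under dataPath
     * and the transaction log under logPath (ideally on another disk).
     */
    public static String creationUrl(String dataPath, String logPath) {
        return "jdbc:derby:" + dataPath + ";create=true;logDevice=" + logPath;
    }

    public static void main(String[] args) {
        // Hypothetical mount points for two physical disks
        System.out.println(creationUrl("/disk1/db/bankdb", "/disk2/derbylog"));
    }
}
```

The resulting string would be passed to `DriverManager.getConnection` once, at database creation time.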

Database Buffer

• Cache of frequently used data pages in memory
• Cache-miss leads to a read from disk (or disk cache)
• Size:
  > default 4 MB
  > derby.storage.pageCacheSize

Performance tip:
• increase the size of the database buffer to get frequently accessed data in memory
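The `derby.storage.pageCacheSize` property is expressed as a number of pages rather than bytes, so a target buffer size has to be converted. The following sketch assumes 4 KB pages (an assumption, though it matches the 4 MB default above corresponding to roughly 1000 pages) and sets the value as a JVM system property. The class and method names are made up for this example, and the property has to be set before the Derby engine boots to take effect.

```java
class PageCacheConfig {
    // Assumed page size of 4 KB; tables created with another page size differ
    static final int PAGE_SIZE_BYTES = 4096;

    /** Number of pages needed for a buffer of the given size in megabytes. */
    public static int pagesFor(int bufferMegabytes) {
        return (int) ((long) bufferMegabytes * 1024 * 1024 / PAGE_SIZE_BYTES);
    }

    /** Sets the page cache size; call this before the Derby engine is booted. */
    public static void configure(int bufferMegabytes) {
        System.setProperty("derby.storage.pageCacheSize",
                Integer.toString(pagesFor(bufferMegabytes)));
    }

    public static void main(String[] args) {
        configure(64); // request a buffer of about 64 MB
        System.out.println(System.getProperty("derby.storage.pageCacheSize"));
    }
}
```

The same setting can instead be placed in a `derby.properties` file as `derby.storage.pageCacheSize=<pages>`. The JVM heap must be large enough to hold the larger cache.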

Example: single-record select [chart not reproduced in this transcript]

Comparing Embedded and Client-Server

• TPC-B like load:

Comparing Embedded and Client-Server

• Single-record SELECT:

Comparing Embedded and Client-Server: CPU Usage

[Chart: CPU usage per transaction [ms], split into user CPU and system CPU, for TPC-B Embedded, TPC-B Client-Server, Select Embedded and Select Client-Server]

• Network Server adds 30%-50% to the CPU usage
• System CPU usage:
  > Message sending and receiving
• User CPU usage:
  > Message parsing
  > Character set conversions

Performance Tips

Performance Tips 1: Use Prepared Statements

• Compilation of SQL statements is expensive:
  > Derby generates Java byte code and loads generated classes
• Prepared statements eliminate this cost

Tip:
• USE prepared statements
• and REUSE them

[Chart: CPU usage (ms/tx), split into system time and user time, for PreparedStatement and Statement]
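A minimal sketch of the "use and reuse" pattern with plain JDBC follows. The statement is prepared once in the constructor and executed many times with different parameter values, so the compilation cost is paid only on the first use. The `accounts` table, its columns and the `AccountUpdater` class are hypothetical names invented for this example.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class AccountUpdater implements AutoCloseable {
    // Hypothetical table and columns, used only for illustration
    static final String UPDATE_SQL =
            "UPDATE accounts SET balance = balance + ? WHERE account_id = ?";

    private final PreparedStatement update;

    /** Prepares (compiles) the statement once, when the updater is created. */
    public AccountUpdater(Connection conn) throws SQLException {
        this.update = conn.prepareStatement(UPDATE_SQL);
    }

    /** Reuses the prepared statement; only the parameter values change. */
    public int addToBalance(int accountId, BigDecimal delta) throws SQLException {
        update.setBigDecimal(1, delta);
        update.setInt(2, accountId);
        return update.executeUpdate(); // number of rows updated
    }

    @Override
    public void close() throws SQLException {
        update.close();
    }
}
```

A caller would create one `AccountUpdater` per connection and call `addToBalance` in its transaction loop, rather than building a new SQL string with `Statement.executeUpdate` for every transaction.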

Performance Tips 2: Avoid Table Scans

• Use indexes to optimize frequently used access paths:
  > CREATE INDEX....
• Check the query plan:
  > derby.language.logQueryPlan=true
• Use RunTimeStatistics:
  > SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1)
  > SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS()

Tip:
• Use indexes
• Use Derby's tools to understand the query execution
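The sketch below shows one way the two RunTimeStatistics calls listed above might be combined over JDBC to check whether a query uses an index. Statistics are switched on for the connection, the query is run and its result set closed, and the plan text is then fetched on the same connection. The `orders` table, the index name and the `QueryPlanInspector` class are hypothetical. The system procedure and function names are the ones given on the slide.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class QueryPlanInspector {
    // Hypothetical schema: an index on the column used in the WHERE clause
    static final String CREATE_INDEX =
            "CREATE INDEX orders_customer_idx ON orders (customer_id)";
    static final String ENABLE_STATS =
            "CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1)";
    static final String FETCH_STATS =
            "VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS()";

    /** Runs a query with statistics enabled and returns Derby's textual plan. */
    public static String explain(Connection conn, String query) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(ENABLE_STATS);
            // The statistics describe the most recently completed statement,
            // so the query's result set is read to the end and closed first.
            try (ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    // drain rows; the values are not needed here
                }
            }
            try (ResultSet rs = stmt.executeQuery(FETCH_STATS)) {
                return rs.next() ? rs.getString(1) : "";
            }
        }
    }
}
```

Comparing the returned text before and after executing `CREATE_INDEX` should show whether the optimizer switched from a table scan to an index scan for the query in question.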

Comparing the Performance of MySQL, PostgreSQL and Derby

Performance Evaluation: MySQL, PostgreSQL and Derby

Evaluated performance of:
• MySQL/InnoDB (version 5.0.10)
• PostgreSQL (version 8.0.3)
• Derby Embedded (version 10.1.1.0)
• Derby Client-Server

Database Configurations

Configurations:
• “Out-of-box” performance
• No tuning, except:
  > size of database buffer
  > database and transaction log on separate disks
➔ No Benchmark

Load:
  > 1-100 concurrent clients

Databases:
1. Main-memory database:
  > 10 MB user data
  > 64 MB database buffer
2. Disk database:
  > 10 GB user data
  > 64 MB database buffer

Throughput: TPC-B like load

[Two charts: main-memory database (10 MB) and disk-based database (10 GB)]

Throughput: Single-record Select

[Two charts: main-memory database (10 MB) and disk-based database (10 GB)]

Observations

• Derby outperforms MySQL on disk-based databases
  > Derby has 100% higher throughput than MySQL
• MySQL performs better on small main-memory databases
  > Update-intensive load: Derby has 20-50% lower throughput
  > Read-intensive load: Derby has 50% lower throughput
• PostgreSQL performs best on read-only databases, and has lowest throughput on update-intensive databases

Why?

CPU Usage Main-Memory Database

[Two charts, TPC-B and SELECT: CPU usage in ms/transaction, split into system time and user time, for Derby embedded, Derby client/server, MySQL and PostgreSQL]

CPU Usage Disk-Based Database

[Two charts, TPC-B and SELECT: CPU usage per transaction in ms/transaction, split into system time and user time, for Derby embedded, Derby client/server, MySQL and PostgreSQL]

Network I/O

[Two charts, TPC-B and SELECT: packets per transaction, split into incoming and outgoing packets, for Derby client/server, MySQL and PostgreSQL]

Observation:
• Derby has two extra round-trips for SELECT operations

Disk I/O: Main-Memory Database (TPC-B)

[Two charts for Derby, MySQL and PostgreSQL: Write Operations (write operations per transaction, split into data and log) and Write Volume (KB per transaction)]

Disk I/O: Disk-Based Database (TPC-B)

[Two charts for Derby, MySQL and PostgreSQL: Data Volume (KB per transaction) and I/O Operations (operations per transaction, split into log, data write and data read)]

Disk I/O: Disk-Based Database (SELECT)

[Two charts for Derby, MySQL and PostgreSQL: Data Volume (KB per transaction) and I/O Operations (operations per transaction, split into data write and data read)]

Conclusions: Resource Usage

• MySQL performs better than Derby when
  > The database is small and fits in the database buffer
  > Throughput becomes CPU-bound
  > Derby uses more CPU and sends more messages over the net
• Derby performs better than MySQL when
  > The database is large and does not fit in the database buffer
  > Throughput becomes I/O-bound
• PostgreSQL performs best with read-only load
  > Update-intensive load results in much disk I/O

Performance Improvement Activities

CPU usage:
• High CPU usage during initialization of database buffer (DONE)
• Reduce CPU usage in network server (message parsing and generation)
• Improve use of hash tables and reduce lock contention

Network IO:
• Reduce number of messages sent and received

Disk IO:
• Improve fairness of scheduling of I/O requests (DONE)
• Allow concurrent read operations on data files

Performance tip:
• Next version of Derby will have even better performance

Summary

• Disk Write Cache:
  > Improves the performance, but.....
  > WARNING: durability and the probability of successful recovery after a power failure are reduced
• Data and log devices:
  > Keep the transaction log and database on separate devices
• Database buffer:
  > Keep frequently accessed data in the database buffer
• Client-Server versus Embedded:
  > Throughput: down by 5% (TPC-B) to 30% (SELECT)
• Use:
  > Prepared statements and indexes