Making the most of Solid State Disk in Oracle 11g

Guy Harrison, Director, R&D Melbourne
Twitter: @guyharrison
Web: http://www.guyharrison.net

©2011 Quest Software, Inc. All rights reserved.

Upload: insync2011

Post on 17-Jul-2015


Introductions

[Chart: Star Trek shirt fatality analysis. Pct (axis 0 to 80) by shirt colour: Blue, Yellow, Red.]

Agenda

• Brief History of Magnetic Disk

• Solid State Disk (SSD) technologies

• SSD internals

• Oracle DB flash cache architecture

• Performance comparisons

• Recommendations and Suggestions

A brief history of disk

5MB HDD circa 1956

28MB HDD - 1961 1800 RPM

The more that things change....

Moore’s law

• Transistor density doubles every 18 months
• Exponential growth is observed in most electronic components:
  •  CPU clock speeds
  •  RAM
  •  Hard Disk Drive storage density
• But not in mechanical components:
  •  Service time (seek latency): limited by actuator arm speed and disk circumference
  •  Throughput (rotational latency): limited by speed of rotation, circumference and data density

Disk trends 2001-2009

[Chart: percentage change, 2001-2009. Data labels paired with measures in the order extracted.]

Measure          % change
IO Rate               260
Disk Capacity       1,635
IO/Capacity          -630
CPU                 1,013
IO/CPU               -390

Solid State Disk

SSD to the rescue?

[Chart: seek time in microseconds (us)]

Device            Seek time (us)
Magnetic Disk              4,000
SSD SATA Flash                80
SSD PCI flash                 25
SSD DDR-RAM                   15

Power consumption

[Chart: power in watts (logarithmic scale, 0.01 to 100) at Idle, Seek and Start up, for Flash SSD and SATA HDD. Data labels as extracted: 8, 10, 20, 0.08, 0.15.]

Economics of SSD

[Chart: cost per IOP and cost per GB by device. Data labels paired with devices in the order extracted.]

Device                      $/IOP     $/GB
Seagate SATA HDD             2.38     0.09
Seagate SAS HDD              1.53     1.00
Intel MLC SATA SSD           0.05     6.88
Intel SLC SATA SSD           0.05    21.88
FusionIO PCI MLC Duo SSD     0.06    24.92
FusionIO PCI SLC SSD         0.06    53.44

Tiered storage management

[Diagram: storage tiers, listed top to bottom; axes labelled $/IOP and $/GB]

• Main Memory
• DDR SSD
• Flash SSD
• Fast Disk (SAS, RAID 0+1)
• Slow Disk (SATA, RAID 5)
• Tape, Flat Files, Hadoop

SSD technology and internals

Flavours of Flash SSD

• DDR RAM Drive
• SATA flash drive
• PCI flash drive
• SSD storage Server

PCI SSD vs SATA SSD

• SATA was designed for traditional disk drives with high latencies
• PCI is designed for high speed devices
• PCI SSD has latency ~ 1/3rd of SATA

Flash SSD Technology

Storage hierarchy:
  •  Cell: one (SLC) or two (MLC) bits
  •  Page: typically 4K
  •  Block: typically 128-512K

Writes:
  •  Read and first write require single page IO
  •  Overwriting a page requires an erase and overwrite of the block

Write endurance:
  •  100,000 erase cycles for SLC before failure
  •  5,000-10,000 erase cycles for MLC

Flash SSD performance

[Chart: operation time in microseconds]

Operation                        Microseconds
Read (4k page seek)                        25
First insert (4k page write)              250
Update (256K block erase)                2000

Flash Disk write degradation

• All blocks empty: write time = 250 us
• 25% part full: write time = (3/4 * 250 us + 1/4 * 2000 us) = 687.5 us
• 75% part full: write time = (1/4 * 250 us + 3/4 * 2000 us) = 1562.5 us

[Diagram legend: Empty, Partially Full]
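
The weighted averages on this slide can be reproduced with a short Python sketch. It assumes, as the slide does, that a write either lands on an empty page (250 us) or forces a block erase and rewrite (2000 us), and that the share of writes hitting partially full blocks equals the stated percentage. The function and constant names are illustrative and do not come from the presentation.

```python
# Timings taken from the "Flash SSD performance" slide.
PAGE_WRITE_US = 250    # first write to an empty 4k page
BLOCK_ERASE_US = 2000  # update that forces a block erase and rewrite


def expected_write_us(fraction_part_full):
    """Average write time when the given fraction of writes needs a block erase."""
    if not 0.0 <= fraction_part_full <= 1.0:
        raise ValueError("fraction_part_full must be between 0 and 1")
    cheap = (1.0 - fraction_part_full) * PAGE_WRITE_US
    costly = fraction_part_full * BLOCK_ERASE_US
    return cheap + costly


for fraction in (0.0, 0.25, 0.75):
    print(f"{fraction:.0%} part full: {expected_write_us(fraction):.1f} us")
```

This is a simple expected-value model; a real device's behaviour also depends on its controller and on how much spare capacity it holds back.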

[Diagram: SSD controller, data insert. Shows the Free Block Pool and the Used Block Pool; page legend: Valid Data Page, Empty Data Page, Invalid Data Page.]

[Diagram: SSD controller, data update. Same pools and page legend.]

[Diagram: SSD controller, garbage collection. Same pools and page legend.]
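
The insert, update and garbage collection behaviour shown in these three diagrams can be approximated with a toy model. The sketch below is a deliberately simplified assumption, not how any particular controller works: the class `ToyFlash`, its methods and the four-page block size are all hypothetical, and garbage collection here runs only when a write finds no empty page, whereas real controllers typically collect in the background and relocate valid pages into blocks from a free pool.

```python
EMPTY, VALID, INVALID = "empty", "valid", "invalid"
PAGES_PER_BLOCK = 4  # toy value chosen for readability, not a real device geometry


class ToyFlash:
    """Hypothetical, highly simplified model of an SSD controller."""

    def __init__(self, num_blocks):
        self.blocks = [[EMPTY] * PAGES_PER_BLOCK for _ in range(num_blocks)]
        self.location = {}  # logical key -> (block index, page index)
        self.erases = 0

    def _find_empty_page(self):
        for b, block in enumerate(self.blocks):
            for p, state in enumerate(block):
                if state == EMPTY:
                    return b, p
        return None

    def write(self, key):
        """Insert a new key, or update an existing one out of place."""
        old = self.location.pop(key, None)
        if old is not None:
            # Update: the old copy cannot be overwritten, so mark it invalid.
            self.blocks[old[0]][old[1]] = INVALID
        slot = self._find_empty_page()
        if slot is None:
            self.garbage_collect()
            slot = self._find_empty_page()
            if slot is None:
                raise RuntimeError("no space left even after garbage collection")
        self.blocks[slot[0]][slot[1]] = VALID
        self.location[key] = slot

    def garbage_collect(self):
        """Erase every block holding invalid pages, keeping its valid pages."""
        for b, block in enumerate(self.blocks):
            if INVALID not in block:
                continue
            survivors = [k for k, (kb, _) in self.location.items() if kb == b]
            self.blocks[b] = [EMPTY] * PAGES_PER_BLOCK  # the expensive block erase
            self.erases += 1
            for p, k in enumerate(survivors):
                self.blocks[b][p] = VALID
                self.location[k] = (b, p)


flash = ToyFlash(num_blocks=2)
for key in range(6):   # six inserts: cheap single-page writes
    flash.write(key)
for key in (0, 1, 2):  # three updates: the third finds no empty page
    flash.write(key)
print("block erases:", flash.erases)
```

In this run the first two updates are absorbed by empty pages, and only the third triggers a block erase, which mirrors why write cost rises as the device fills with invalid pages.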

11g DB flash Cache

Oracle DB flash cache

• Introduced in 11gR2 for OEL and Solaris only
• Secondary cache maintained by the DBWR, but only when idle cycles permit
• Architecture is tolerant of poor flash write performance

Buffer cache and Free buffer waits

[Diagram: Oracle process, Buffer cache, DBWR and Database files. Flows: Read from disk; Read from buffer cache; Write to buffer cache; Write dirty blocks to disk; Free Buffer Waits.]

Free buffer waits often occur when reads are much faster than writes...

Flash Cache

[Diagram: Oracle process, Buffer cache, Flash Cache, DBWR and Database files. Flows: Read from disk; Read from buffer cache; Read from flash cache; Write to buffer cache; Write dirty blocks to disk; Write clean blocks to the flash cache (time permitting).]

DB Flash cache architecture is designed to accelerate buffered reads

Configuration

• Create filesystem from flash device

• Set DB_FLASH_CACHE_FILE and DB_FLASH_CACHE_SIZE.

• Consider setting FILESYSTEMIO_OPTIONS=SETALL, which enables both direct and asynchronous IO

Flash KEEP pool

• You can prioritise blocks for important objects using the FLASH_CACHE storage clause, for example STORAGE (FLASH_CACHE KEEP) on a table or index.

Oracle DB flash cache statistics

http://guyharrison.squarespace.com/storage/flash_insert_stats.sql

Flash Cache Efficiency

http://guyharrison.squarespace.com/storage/flash_time_savings.sql

Flash cache Contents

http://guyharrison.squarespace.com/storage/flashContents.sql

Performance tests

Test systems

• Low end system:
  •  Dell Optiplex dual-core 4GB RAM
  •  2xSeagate 7500RPM Barracuda SATA HDD
  •  Intel X-25E SLC SATA SSD
• Higher end system:
  •  Dell R510 2xquad core, 32 GB RAM
  •  4x300GB 15K RPM, 6Gbps Dell SAS HDD
  •  1xFusionIO ioDrive SLC PCI SSD

Performance: indexed reads (X-25)

[Chart: elapsed time (s), broken down into CPU, db file IO, flash cache IO and Other]

Configuration        Elapsed (s)
No Flash                  529.7
Flash cache               143.27
Flash tablespace           48.17

Performance: Read/Write (X-25)

[Chart: elapsed time (s), broken down into CPU, db file IO, write complete, free buffer, flash cache IO and Other]

Configuration        Elapsed time (s)
No Flash                        3,289
Flash Cache                     1,693
Flash tablespace                  200

Random reads – FusionIO

[Chart: elapsed time (s), broken down into CPU, Other, DB File IO and Flash cache IO]

Configuration                 Elapsed time (s)
SAS disk, no flash cache                 2,211
SAS disk, flash cache                      583
Table on SSD                               121

Updates – Fusion IO

[Chart: elapsed time (s), broken down into DB CPU, db file IO, log file IO, flash cache, free buffer waits and Other]

Configuration                 Elapsed time (s)
SAS disk, no flash cache                 6,219
SAS disk, flash cache                    1,934
Table on SSD                               529

Full table scan – FusionIO

[Chart: elapsed time (s), broken down into CPU, Other, DB File IO and Flash Cache IO]

Configuration                 Elapsed time (s)
SAS disk, no flash cache                   418
SAS disk, flash cache                      398
Table on SSD                                72

Sorting – what we expect

[Chart: time against PGA Memory available (MB), showing Table/Index IO, CPU Time and Temp Segment IO across three regions: Memory Sort, Single Pass Disk Sort, Multi-pass Disk Sort]

Disk Sorts – temporary tablespace

[Chart: elapsed time (s, axis 0 to 4000) against Sort Area Size (axis 0 to 300) for SAS based TTS and SSD based TTS, with Single Pass Disk Sort and Multi-pass Disk Sort regions marked]

Redo performance – Fusion IO

[Chart: elapsed time (s), broken down into CPU and Log IO. Data labels paired with configurations in the order extracted.]

Configuration            Elapsed time (s)
Flash based redo log               292.39
SAS based redo log                 291.93

Concurrent redo workload (x10)

[Chart: elapsed time (s, axis 0 to 4,500) for SAS based redo log and Flash based redo log, broken down into CPU, Other and Log File IO. Data labels as extracted: 1,605, 1,637, 397, 331, 1,944, 1,681.]

Buffer Cache bottlenecks

• Flash cache architecture avoids ‘free buffer waits’ due to flash IO, but write complete waits can still occur on hot blocks.

• Free buffer waits are still likely against the database files, due to high physical read rates created by the flash cache

Write degradation

• In theory, high sustained write IO can lead to SSD degradation when GC fails to cope with the block erase/update cycle

• In practice, this is rarely noticeable from Oracle:
  •  Oracle write IO is largely asynchronous (DBWR)
  •  Almost all write activity has at least an equal amount of read activity
  •  Garbage collection and wear levelling algorithms are sophisticated in decent SSD drives

Fusion IO direct cache

[Diagram: two ways to present the FusionIO device]

Regular Block Device (limited to the size of the SSD):
  •  Stack: File System / Raw Devices / ASM on ioMemory VSL
  •  Uses: Temp Tablespace, Hot Segments, Hot Partitions, DB Flash Cache

Caching Block Device:
  •  Stack: File System / Raw Devices / ASM on directCache on ioMemory VSL, in front of a LUN
  •  Uses: read-intensive, potentially massive tablespaces

Fusion IO direct cache – Table scans

[Chart: elapsed time (s), broken down into CPU, IO and Other]

Configuration               Elapsed time (s)
No cache 1st scan                        147
No cache 2nd scan                        147
direct cache on 1st scan                 147
direct cache on 2nd scan                  36

Exadata

Exadata flash storage

• 4x96GB PCI Flash drives on each storage server
• Flash can be configured as:
  •  Exadata Smart Flash Cache (ESFC)
  •  Solid State Disk available to ASM disk groups
• ESFC is not the same as the DB flash cache:
  •  Maintained by cellsrv, not DBWR
  •  DOES support full table scans
  •  DOES NOT support smart scans, unless CELL_FLASH_CACHE=KEEP
  •  Statistics accessed via the cellcli program
• Considerations for cache vs SSD may be similar

Summary

Recommendations

• Don’t wait for SSD to become as cheap as HDD
  •  Magnetic HDD will always be cheaper per GB, SSD cheaper per IO
• Consider a mixed or tiered storage strategy
  •  Using DB flash cache, selective SSD tablespaces or partitions
  •  Use SSD where your IO bottleneck is greatest and SSD advantage is significant
• DB flash cache offers an easy way to leverage SSD for OLTP workloads, but has few advantages for OLAP or Data Warehouse
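
The "cheaper per GB versus cheaper per IO" trade-off can be illustrated with a rough Python sketch. The figures are the data labels from the "Economics of SSD" chart; pairing them with devices in the order they appear is an assumption. The cost model (pay for whichever of capacity or IO demand is the binding constraint) and the function names are hypothetical simplifications, not something the presentation defines.

```python
# ($/IOP, $/GB) per device, read from the "Economics of SSD" chart labels.
# ASSUMPTION: values are paired with devices in the order the labels appear.
DEVICES = {
    "Seagate SATA HDD": (2.38, 0.09),
    "Seagate SAS HDD": (1.53, 1.00),
    "Intel MLC SATA SSD": (0.05, 6.88),
    "Intel SLC SATA SSD": (0.05, 21.88),
    "FusionIO PCI MLC Duo SSD": (0.06, 24.92),
    "FusionIO PCI SLC SSD": (0.06, 53.44),
}


def storage_cost(device, capacity_gb, iops):
    """Hypothetical model: cost is set by the more expensive requirement."""
    dollars_per_iop, dollars_per_gb = DEVICES[device]
    return max(capacity_gb * dollars_per_gb, iops * dollars_per_iop)


def cheapest_device(capacity_gb, iops):
    """Return the device name with the lowest modelled cost."""
    return min(DEVICES, key=lambda name: storage_cost(name, capacity_gb, iops))


workloads = [("capacity-bound", 5000, 100), ("IO-bound", 100, 20000)]
for label, gb, iops in workloads:
    best = cheapest_device(gb, iops)
    print(f"{label}: {best} (${storage_cost(best, gb, iops):,.0f})")
```

Under these assumptions the large, quiet workload lands on magnetic disk and the small, busy one on flash, which is the case for tiering data by access pattern.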

How to use SSD

• Database flash cache
  •  If your bottleneck is single block (indexed) reads and you are on OEL or Solaris 11gR2
• Flash tablespace
  •  Optimize reads/writes against “hot” segments or partitions
• Flash temp tablespace
  •  If multi-pass disk sorts or hash joins are your bottleneck
• FusionIO direct cache
  •  If you want to optimize both scans and index reads OR you are not on OEL/Solaris 11gR2

References

• Latest version of this presentation:
  •  http://www.slideshare.net/gharriso/ssd-and-the-db-flash-cache
• Guy Harrison blog (guyharrison.net) postings:
  •  All blog posts: http://guyharrison.squarespace.com/blog/tag/ssd
  •  SSD guide (work in progress): http://guyharrison.squarespace.com/ssdguide/
• Kevin Closson:
  •  http://kevinclosson.wordpress.com/2009/12/15/pardon-me-where-is-that-flash-cache-part-ii/
• General articles on SSD:
  •  http://www.anandtech.com/storage/showdoc.aspx?i=3631
  •  http://en.wikipedia.org/wiki/Flash_memory
  •  http://www.virident.com/downloads/Virident_Sustained_Performance_Whitepaper.pdf