systemzilla db2 user group mar2014

Upload: iamsam4u

Post on 12-Oct-2015

  • 5/21/2018 SystemZilla DB2 User Group Mar2014

    1/17

    2011 IBM Corporation, IBM Internal Use Only

    763.228.6463

    Freakish Database Performance With Flash Storage

    Agenda

    Share some experience with using solid-state/flash storage for database workloads:
        OLTP (2TB)
        Warehouse (76TB)
    Which workload characteristics can best leverage flash storage?
    What are some best practices?

    OLTP Workload

    Initial profile
        A brokerage house package
        Batch cycle comprised of five Java programs (only one can be parallelized)
        1.5M transactions in 8 hours after extensive application and SQL tuning
        1.68TB uncompressed
        Online backup time (backup, then gzip): 36 hours

    The Challenge
        Goal: 1.5M transactions in 5.5 hours
        Stretch goal: 1.5M transactions in 2 hours
        Improve backup time
        What is possible if CPU, memory, storage, and network are not constrained?

    The Setup

    No holds barred
        2 x 64 cores, 3.86 GHz, 1TB RAM
        86TB HDD, 256GB cache, 2 ms average response time
        1TB SSD
        10GbE

    Approaches
        Enabled compression
        No database tuning
        All-HDD
        Mixed: SSD (logs & temp), HDD (data & indexes)
        All-SSD

    Results

    Mixed SSD (logs & temp) & HDD (data, indexes)    14%
    All-SSD                                          26%
    Disk utilization                                 < 1% busy
    Average IOPS                                     20
    Throughput                                       450KB/s
    Application engines                              30
    Uncompressed offline backup                      30-40 min
    Compressed online/offline backup (SSD to HDD)    18 min

    Accept all default database settings out of the box
        STMM (Self-Tuning Memory Manager)
        Auto runstats
        Auto online table reorg

    Application Engines Performance

    Most improvements resulted from more CPUs for the application (CPU intensive)
    Verbose application logging
        Application logs generated more IOs than the database!
    More application engines generating transactions to reduce batch elapsed time
    Low database IO profile

    Final Results

    Results                                                  Met
    Goal: 1.5M transactions in 5.5 hours                     Y
    Stretch goal: 1.5M transactions in 2 hours               Y
    Improve backup time: 18 minutes v. 36 hours              Y
    Best result: 1.5M transactions in 1.1 hours! (All-SSD)   Y
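The headline numbers can be turned into rates and ratios with a few lines of arithmetic. The sketch below uses only figures quoted on the slides (1.5M transactions, 8 hours initially, 1.1 hours at best, 36 hours v. 18 minutes for backup); the function and variable names are illustrative, not part of the original deck.

```python
def transactions_per_second(transactions: int, hours: float) -> float:
    """Average transaction rate over a batch window."""
    return transactions / (hours * 3600)

# Figures quoted on the slides.
baseline_tps = transactions_per_second(1_500_000, 8.0)   # initial profile
best_tps = transactions_per_second(1_500_000, 1.1)       # best all-SSD run

# Ratio of the two rates (same work, shorter window).
throughput_gain = best_tps / baseline_tps

# Backup: 36 hours before v. 18 minutes after, both expressed in minutes.
backup_gain = (36 * 60) / 18

print(f"baseline: {baseline_tps:.1f} tps, best: {best_tps:.1f} tps")
print(f"throughput gain: {throughput_gain:.2f}x, backup gain: {backup_gain:.0f}x")
```

The throughput ratio of roughly 7.3x is in line with the "8x throughput" figure mentioned later for the increase from 3 to 30 application engines.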

    Warehouse Workload

    Initial profile
        Servers and storage running at 100% all day long
        Maxed out at around 30-40 active users
        Half-stroked disks to get performance and throughput

    The Challenge
        Aging servers and storage
        Data center floor space, cooling, and power consumption constraints
        Same or better performance

    The Setup

    Approach
        Replacement will be very fast, very small, very simple

    Database IO Improvement for Warehouse Workload
    76TB IBM SSD v. Old HDD

    Sub-millisecond IO response time (sustained)

    Metric                                   Improvement
    Synchronous reads                        21.8x
    Synchronous writes                       13.6x
    Asynchronous reads                       17.6x
    Asynchronous writes                      18.34x
    Data pages per asynchronous request      1.8x

    Note: Asynchronous IOs are ~18x faster, and each asynchronous request is ~2x more effective due to the 32K page size. Combined, that is a ~36x improvement.
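The note's ~36x figure comes from multiplying two rounded factors. As a rough check, the sketch below repeats that multiplication with the rounded values and with the unrounded table values for asynchronous reads. It assumes, as the note implies, that the two effects compound independently; the variable names are illustrative.

```python
# Rounded factors used in the slide's note.
rounded_speedup = 18        # asynchronous IOs are ~18x faster
rounded_pages_factor = 2    # each request moves ~2x more data pages

# Unrounded values from the table (asynchronous reads).
async_read_speedup = 17.6
pages_per_request_factor = 1.8

# Assumption: the two factors multiply, which is the reasoning the note implies.
rounded_combined = rounded_speedup * rounded_pages_factor
table_combined = async_read_speedup * pages_per_request_factor

print(f"rounded: ~{rounded_combined}x, from table values: ~{table_combined:.1f}x")
```

With the table values the combined figure is closer to 32x than 36x, so the note is best read as an order-of-magnitude estimate.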

    Benchmark Queries Improvement for Warehouse Workload
    76TB IBM SSD v. Old HDD

    Benchmark details
        Actual IO- and CPU-intensive queries captured from business users
        Runs weekly to monitor any performance degradation with respect to new and organic growth in the warehouse over time
        Noise queries (75) + benchmark queries (25) = 100

                                 All SSD                   Old
    Noise queries completed      85%                       32%
    BM queries completed         100% (first time ever)    64% (historically never reached 100%)
    CPU utilization              30%                       100%

    Benchmark Queries Speed Up Factor for Warehouse Workload
    (Plotted on Logarithmic Scale)
    76TB IBM SSD v. Old HDD

    Speed up details
        Average: 2.21 (log) or 163.96x faster
        Median: 1.48 (log) or 29.96x faster (half of the queries are at least ~30x faster)
        Low: 0.56 (log) or 3.59x faster
        High: 3.05 (log) or 1,113.56x faster

    Time is measured as elapsed time (prepare + execute + fetch)
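The "(log)" values and the "x faster" factors are the same numbers on two scales. The sketch below assumes a base-10 logarithm, which is not stated on the slide but is consistent with every pair quoted: taking log10 of each factor and rounding to two decimals reproduces the listed log value. Going the other way from the rounded log values only approximates the factors.

```python
import math

# (log value, speed-up factor) pairs as quoted on the slide.
quoted = {
    "average": (2.21, 163.96),
    "median": (1.48, 29.96),
    "low": (0.56, 3.59),
    "high": (3.05, 1113.56),
}

for name, (log_value, factor) in quoted.items():
    # Forward: factor -> log10, rounded the way the slide appears to round.
    recomputed_log = round(math.log10(factor), 2)
    # Backward: rounded log -> factor; this loses precision, so it only
    # approximates the quoted factor.
    approx_factor = 10 ** log_value
    print(f"{name}: log10({factor}) = {recomputed_log}, 10**{log_value} = {approx_factor:.2f}")
```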

    CPU Utilization

    About 30% busy
    BTW: we are also using disk-level encryption (SED)

    EXP30 Ultra SSD

    IO Specifications
        Each drive: SFF (1.8"), 1/5 of 1U, 387GB
        IO drawer: 30 drives (6 x 5). Total raw capacity: 11.6TB (30 x 387GB)
        Cache: 3.1GB
        IOPS: 400K (100% read) / 280K (70/30 R/W) / 165K (100% write)
        Two POWER 740 servers connected to one IO drawer
        PCIe attached via GX++ adapter (8Gb/s)
        Configured as 5+P LUNs (130GB LUNs)

    Deployment Considerations

    IO adapter card (HBA)
        At 120K-400K IOPS per IO drawer, and 32K IO size, it is possible to saturate the HBA
        Plan for an adequate number of HBAs
        If using a SAN, be sure the bandwidth to the storage server is consistent along the whole path, for example, 8Gb/s
        Balance IOs across HBAs and front-end ports for even utilization
        Be cautious about mixing flash storage and HDDs on one HBA
    Fewer, larger LUNs (500GB-700GB)
        LUNs do take up available system memory and CPU cycles on the server
        Multiple logical volumes per LUN; no reason to stripe an LV across LUNs
    Use a large page size (32K) and extent size, but ensure that the database bufferpool(s) are adequately sized to accept big reads
    Optimize data movement with fewer IOPS. It is not about driving up IOPS
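To see why the HBA can become the bottleneck, it helps to convert IOPS at a given IO size into link bandwidth. The sketch below is a back-of-the-envelope estimate under stated assumptions: every IO is a full 32K (taken as 32 x 1024 bytes), the whole 8Gb/s of a link is usable, and protocol overhead is ignored. The function name `required_hbas` is hypothetical.

```python
import math

def required_hbas(iops: int, io_size_bytes: int, hba_gbits_per_s: float) -> int:
    """Minimum number of HBAs needed to carry the given IO rate.

    Assumes every IO transfers io_size_bytes and that the full link
    rate is usable (no protocol overhead), so real deployments will
    likely need more headroom than this returns.
    """
    bits_per_second = iops * io_size_bytes * 8
    return math.ceil(bits_per_second / (hba_gbits_per_s * 1e9))

IO_SIZE = 32 * 1024   # 32K page size, assumed to mean 32 KiB
LINK_GBITS = 8.0      # 8Gb/s link, as in the slide's example

for iops in (120_000, 400_000):
    gbits = iops * IO_SIZE * 8 / 1e9
    print(f"{iops} IOPS x 32K = {gbits:.1f} Gb/s -> at least "
          f"{required_hbas(iops, IO_SIZE, LINK_GBITS)} HBAs")
```

Even the low end of the quoted IOPS range exceeds a single 8Gb/s link several times over at 32K per IO, which is the point behind planning for an adequate number of HBAs.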

    Candidate Application Considerations

    High IO profile
        Indexes, data
        Database logs and temp spaces can already take advantage of cache write through, so they may not be the best candidates
    Applications that can parallelize well to take advantage of higher IO throughput
        Before we can process more transactions per second, the applications need to be able to generate more transactions per second
        For example, we needed to increase the number of application engines from 3 to 30 in order to generate 8x throughput in transaction rate
    Applications that spend more time fetching result sets across a network, rather than executing complex queries in the database, will likely see less improvement (slow consumers)
        client_idle_wait_time (ms): time spent waiting for the client/application to send its next request
        If the database spends more time waiting for the client/application to send work, then improving database response time alone will not improve throughput
            Increase application parallelism
            Look for network congestion issues
            call monreport.dbsummary(600), examine client_idle_wait_time
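One way to act on the client_idle_wait_time advice is to compare it with the time the database spent processing requests over the same interval. The sketch below is a hypothetical helper, not part of DB2: it assumes the two values (in ms) have already been read off the monitoring report, it assumes the request-processing figure corresponds to the total_rqst_time monitor element, and the sample numbers are made up for illustration.

```python
def client_idle_ratio(client_idle_wait_time_ms: float, total_rqst_time_ms: float) -> float:
    """Ratio of time the database waited on clients to time spent processing requests."""
    if total_rqst_time_ms <= 0:
        raise ValueError("total_rqst_time_ms must be positive")
    return client_idle_wait_time_ms / total_rqst_time_ms

def likely_bottleneck(client_idle_wait_time_ms: float, total_rqst_time_ms: float) -> str:
    """Rule of thumb (an assumption, not an official threshold): a ratio
    above 1 means the database mostly waits for the client to send work."""
    ratio = client_idle_ratio(client_idle_wait_time_ms, total_rqst_time_ms)
    return "client" if ratio > 1.0 else "database"

# Made-up sample values for a 600-second monitoring interval.
print(likely_bottleneck(client_idle_wait_time_ms=9000, total_rqst_time_ms=3000))
print(likely_bottleneck(client_idle_wait_time_ms=1000, total_rqst_time_ms=4000))
```

When the result is "client", faster storage alone is unlikely to raise throughput; more application parallelism or a look at network congestion would come first.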

    Why Consider Flash Storage

    Greatly beneficial for high IO workloads
    Much smaller footprint, much more energy efficient
        Servers (11), IO drawers (7), power supply all fit in one rack!
    Achieve high performance and throughput quickly without tuning
    Performance, reliability, price