design and evaluation of architectures for commercial applications

Page 1: Design and Evaluation of Architectures for Commercial Applications

Western Research Laboratory

Design and Evaluation of Architectures for Commercial Applications

Luiz André Barroso

Part I: benchmarks

Page 2: Design and Evaluation of Architectures for Commercial Applications

2 UPC, February 1999

Why should architects learn about commercial applications?

Because they are very different from typical benchmarks
Because they are demanding on many interesting architectural features
Because they are driving the sales of mid-range and high-end systems

Page 3: Design and Evaluation of Architectures for Commercial Applications

Shortcomings of popular benchmarks

SPEC
  uniprocessor-oriented
  small cache footprints
  exacerbates impact of CPU core issues
SPLASH
  small cache footprints
  extremely optimized sharing
STREAMS
  no real sharing/communication
  mainly bandwidth-oriented

Page 4: Design and Evaluation of Architectures for Commercial Applications

SPLASH vs. Online Transaction Processing (OLTP)

A typical SPLASH app. has
  > 3x the issue rate,
  ~26x fewer cycles spent in memory barriers,
  1/4 of the TLB miss ratio,
  < 1/2 the fraction of cache-to-cache transfers,
  ~22x smaller instruction cache miss ratio,
  ~1/2 the L2$ miss ratio
...of an OLTP app.

Page 5: Design and Evaluation of Architectures for Commercial Applications

But the real reason we care? $$$!

Server market:
  Total: > $50 billion
  Numeric/scientific computing: < $2 billion
  Remaining $48 billion?
    – OLTP
    – DSS
    – Internet/Web
Trend is for numerical/scientific to remain a niche

Page 6: Design and Evaluation of Architectures for Commercial Applications

Relevance of server vs. PC market

High profit margins
Performance is a differentiating factor
If you sell the server you will probably sell:
  the client
  the storage
  the networking infrastructure
  the middleware
  the service
  ...

Page 7: Design and Evaluation of Architectures for Commercial Applications

Need for speed in the commercial market

Applications pushing the envelope
  Enterprise resource planning (ERP)
  Electronic commerce
  Data mining/warehousing
  ADSL servers
Specialized solutions
  Intel splitting Pentium line into 3 tiers
  Oracle’s raw iron initiative
  Network Appliance’s machines

Page 8: Design and Evaluation of Architectures for Commercial Applications

Seminar disclaimer

Hardware-centric approach:
  target is to build better machines, not better software
  focus on fundamental behavior, not on software “features”
Stick to general purpose paradigm
Emphasis on CPU+memory system issues
Lots of things missing:
  object-relational and object-oriented databases
  public domain/academic database engines
  many others

Page 9: Design and Evaluation of Architectures for Commercial Applications

Overview

Day I: Introduction and workloads
  Background on commercial applications
  Software structure of a commercial RDBMS
  Standard benchmarks
    – TPC-B
    – TPC-C
    – TPC-D
    – TPC-W
  Cost and pricing trends
  Scaling down TPC benchmarks

Page 10: Design and Evaluation of Architectures for Commercial Applications

Overview (2)

Day II: Evaluation methods/tools
  Introduction
  Software instrumentation (ATOM)
  Hardware measurement & profiling
    – IPROBE
    – DCPI
    – ProfileMe
  Tracing & trace-driven simulation
  User-level simulators
  Complete machine simulators (SimOS)

Page 11: Design and Evaluation of Architectures for Commercial Applications

Overview (3)

Day III: Architecture studies
  Memory system characterization
  Out-of-order processors
  Simultaneous multithreading
  Final remarks

Page 12: Design and Evaluation of Architectures for Commercial Applications

Background on commercial applications

Database applications:
  Online Transaction Processing (OLTP)
    – massive number of short queries
    – read/update indexed tables
    – canonical example: banking system
  Decision Support Systems (DSS)
    – smaller number of complex queries
    – mostly read-only over large (non-indexed) tables
    – canonical example: business analysis

Page 13: Design and Evaluation of Architectures for Commercial Applications

Background (2)

Web/Internet applications
  Web server
    – many requests for small/medium files
  Proxy
    – many short-lived connection requests
    – content caching and coherence
  Web search index
    – DSS with a Web front-end
  E-commerce site
    – OLTP with a Web front-end

Page 14: Design and Evaluation of Architectures for Commercial Applications

Background (3)

Common characteristics
  Large amounts of data manipulation
  Interactive response times required
  Highly multithreaded by design
    – suitable for large multiprocessors
  Significant I/O requirements
  Extensive/complex interactions with the operating system
  Require robustness and resiliency to failures

Page 15: Design and Evaluation of Architectures for Commercial Applications

Database performance bottlenecks

I/O-bound until recently (Thakkar, ISCA’90)
Many improvements since then
  multithreading of DB engine
  I/O prefetching
  VLM (very large memory) database caching
  more efficient OS interactions
  RAIDs
  non-volatile DRAM (NVDRAM)
Today’s bottlenecks:
  Memory system
  Processor architecture

Page 16: Design and Evaluation of Architectures for Commercial Applications

Structure of a database workload

[Diagram: clients -> application server (optional) -> database server]
  Application server: simple logic checks; formulates and issues DB query
  Database server: executes query

Page 17: Design and Evaluation of Architectures for Commercial Applications

Who is who in the database market?

DB engine:
  Oracle is dominant
  other players: Microsoft, Sybase, Informix
Database applications:
  SAP is dominant
  other players: Oracle Apps, PeopleSoft, Baan
Hardware:
  players: Sun, IBM, HP and Compaq

Page 18: Design and Evaluation of Architectures for Commercial Applications

Who is who in the database market? (2)

Historically, mainly mainframe proprietary OS
Today:
  Unix: 40%
  NT: 8%
  Proprietary: 52%
In two years:
  Unix: 46%
  NT: 19%
  Proprietary: 35%

Page 19: Design and Evaluation of Architectures for Commercial Applications

Overview of an RDBMS: Oracle8

Similar in structure to most commercial engines
Runs on:
  uniprocessors
  SMP multiprocessors
  NUMA multiprocessors*
For clusters or message passing multiprocessors:
  Oracle Parallel Server (OPS)

Page 20: Design and Evaluation of Architectures for Commercial Applications

The Oracle RDBMS

Physical structure
  Control files
    – basic info on the database, its structure and status
  Data files
    – tables: actual database data
    – indexes: sorted list of pointers to data
    – rollback segments: keep data for recovery upon a failed transaction
  Log files
    – compressed storage of DB updates

Page 21: Design and Evaluation of Architectures for Commercial Applications

Index files

Critical in speeding up access to data by avoiding expensive scans
The more selective the index, the faster the access
Drawbacks:
  Very selective indexes may occupy lots of storage
  Updates to indexed data are more expensive
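The scan-versus-index trade-off can be illustrated with a small sketch. It assumes a hypothetical in-memory table of (account id, balance) rows and models an index as a sorted list of (key, row position) pairs searched by binary search. The names `rows`, `build_index`, `scan_lookup`, and `index_lookup` are illustrative and are not taken from any database engine.

```python
import bisect

# Hypothetical table: row position -> (account_id, balance), stored unsorted.
rows = [(42, 10.0), (7, 99.5), (19, 3.25), (88, 0.0), (3, 12.0)]

def build_index(table):
    """Index = sorted list of (key, row position) pairs; costs extra storage."""
    return sorted((key, pos) for pos, (key, _) in enumerate(table))

def scan_lookup(table, key):
    """Full scan: examines rows one by one until the key is found."""
    for pos, (k, _) in enumerate(table):
        if k == key:
            return pos
    return None

def index_lookup(index, key):
    """Binary search over the sorted index: O(log n) probes."""
    i = bisect.bisect_left(index, (key, -1))
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return None

index = build_index(rows)
```

Both lookups return the same row position, but any insert or key update would also have to keep `index` sorted, which is the extra update cost the slide mentions.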

Page 22: Design and Evaluation of Architectures for Commercial Applications

Files or raw disk devices

Most DB engines can directly access disks as raw devices
Idea is to bypass the file system
Manageability/flexibility somewhat compromised
Performance boost not large (~10-15%)
Most customer installations use file systems

Page 23: Design and Evaluation of Architectures for Commercial Applications

Transactions & rollback segments

Single transaction can access/update many items
Atomicity is required:
  transaction either happens or not
Example: bank transfer
  Transaction A (accounts X,Y; value M) {
    read account balance(X)
    subtract M from balance(X)
    add M to balance(Y)
    commit
  }
On failure:
  old value of balance(X) is kept in a rollback segment
  rollback: old values restored, all locks released
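A minimal sketch of the bank transfer, assuming balances live in a plain dictionary and that a failure is signalled by an exception. The `rollback_segment` dictionary stands in for the rollback segment; locking is omitted, and `transfer` and `InsufficientFunds` are hypothetical names introduced only for this illustration.

```python
class InsufficientFunds(Exception):
    """Simulated failure raised in the middle of a transaction."""

def transfer(balances, x, y, m):
    """Move m from account x to account y, or leave balances untouched."""
    rollback_segment = {}            # old values saved before each update
    try:
        rollback_segment[x] = balances[x]
        balances[x] -= m
        if balances[x] < 0:
            raise InsufficientFunds(x)
        rollback_segment[y] = balances[y]
        balances[y] += m
        return True                  # commit: saved old values can be dropped
    except (InsufficientFunds, KeyError):
        for account, old in rollback_segment.items():
            balances[account] = old  # rollback: restore old values
        return False
```

For example, with `balances = {"X": 100, "Y": 50}`, a transfer of 30 succeeds, while a transfer of 500 fails after the debit and the saved value of `balances["X"]` is restored.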

Page 24: Design and Evaluation of Architectures for Commercial Applications

Transactions & log files

A transaction is only committed after its side effects are in stable storage
Writing all modified DB blocks would be too expensive
  random disk writes are costly
  a whole DB block has to be written back
  no coalescing of updates
Alternative: write only a log of modifications
  sequential I/O writes (enables NVDRAM optimizations)
  batching of multiple commits
Background process periodically writes dirty data blocks out

Page 25: Design and Evaluation of Architectures for Commercial Applications

Transactions & log files (2)

When a block is written to disk the log file entries are deleted
If the system crashes:
  in-memory dirty blocks are lost
Recovery procedure:
  goes through the log files and applies all updates to the database
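The logging and recovery steps can be sketched as follows. This is an illustration under stated assumptions, not Oracle's actual log format: the log is a Python list of tuples, the on-disk database is a dictionary, and recovery replays only updates whose transaction has a commit record (the slide simply says all logged updates are applied). All function names are hypothetical.

```python
def apply_update(cache, log, txn_id, key, value):
    """Update the in-memory block cache and append a redo record."""
    log.append(("update", txn_id, key, value))
    cache[key] = value

def commit(log, txn_id):
    """A transaction is durable once its commit record is in the log."""
    log.append(("commit", txn_id))

def recover(disk, log):
    """Replay logged updates of committed transactions onto the disk image."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    restored = dict(disk)
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, value = rec
            restored[key] = value
    return restored
```

After a simulated crash the in-memory `cache` is simply discarded, and `recover(disk, log)` rebuilds the committed state from the stale disk image plus the log.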

Page 26: Design and Evaluation of Architectures for Commercial Applications

Transactions & concurrency control

Many transactions in-flight at any given time
  Locking of data items is required
Lock granularity:
  [Figure: lock granularity levels Table, Block, Row, annotated with concurrency and overhead]
Efficient row-level locking is needed for high transaction throughput

Page 27: Design and Evaluation of Architectures for Commercial Applications

Row-level locking

Each new transaction is assigned a unique ID
A transaction table keeps track of all active transactions
Lock: write ID in directory entry for row
Unlock: remove ID from transaction table
  Simultaneous release of all locks
[Figure: a data block with its data block directory, and the transaction table; example transaction IDs shown: 120, 230, 233, 234, 235]
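The locking scheme above can be sketched in a few lines. This is a simplified model, not Oracle's data layout: the transaction table is a Python set of active IDs, each block directory is a list with one slot per row, and the names `Block`, `begin`, `lock`, `is_locked`, and `end` are hypothetical.

```python
class Block:
    """A data block whose directory holds, per row, the locking txn ID."""
    def __init__(self, n_rows):
        self.directory = [None] * n_rows

active = set()      # transaction table: IDs of active transactions
next_id = [233]     # arbitrary starting ID, echoing the IDs in the figure

def begin():
    """Assign a unique ID and register it in the transaction table."""
    txn = next_id[0]
    next_id[0] += 1
    active.add(txn)
    return txn

def is_locked(block, row):
    """A row is locked only if its directory entry names an active txn."""
    return block.directory[row] in active

def lock(block, row, txn):
    if is_locked(block, row) and block.directory[row] != txn:
        return False               # held by another active transaction
    block.directory[row] = txn     # lock: write ID in the directory entry
    return True

def end(txn):
    """Unlock: removing the ID from the table frees all its rows at once."""
    active.discard(txn)
```

The design point is that `end` never visits the rows: stale IDs left in block directories are harmless because they no longer appear in the transaction table.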

Page 28: Design and Evaluation of Architectures for Commercial Applications

Transaction read consistency

A transaction that reads a full table should see a consistent snapshot
For performance, reads shouldn’t lock a table
Problem: intervening writes
Solution: leverage rollback mechanism
  intervening write saves old value in rollback segment
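One way to picture this is a table that stamps every write with a logical clock and keeps the overwritten value. The sketch below assumes such a clock and a per-key list of saved values; it illustrates the idea only, and the `Table` class and its methods are hypothetical.

```python
class Table:
    def __init__(self, rows):
        self.rows = dict(rows)   # current values
        self.rollback = {}       # key -> [(write time, old value), ...]
        self.clock = 0           # logical time, advanced by each write

    def write(self, key, value):
        """Intervening write: save the old value in the rollback segment."""
        self.clock += 1
        self.rollback.setdefault(key, []).append((self.clock, self.rows[key]))
        self.rows[key] = value

    def snapshot_read(self, key, start_time):
        """Return the value of key as of start_time, without locking."""
        for write_time, old_value in self.rollback.get(key, []):
            if write_time > start_time:
                return old_value  # value just before the first later write
        return self.rows[key]
```

A reader records `start = table.clock` when it begins and then calls `snapshot_read(key, start)` for every row, so writes that arrive during the scan do not change what it sees.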

Page 29: Design and Evaluation of Architectures for Commercial Applications

Oracle: software structure

Server processes
  actual execution of transactions
DB writer
  flushes dirty blocks to disk
Log writer
  writes redo logs to disk at commit time
Process and system monitors
  misc. activity monitoring and recovery
Processes communicate through SGA and IPC

Page 30: Design and Evaluation of Architectures for Commercial Applications

Oracle: software structure (2)

SGA:
  shared memory segment mapped by all processes
Block buffer area
  cache of database blocks
  larger portion of physical memory
Metadata area
  where most communication takes place
  synchronization structures
  shared procedures
  directory information
[Figure: System Global Area (SGA) layout along increasing virtual address, with labels: block buffer area, redo buffers, data dictionary, fixed region, shared pool, metadata area]

Page 31: Design and Evaluation of Architectures for Commercial Applications

Oracle: software structure (3)

Hiding I/O latency:
  many server processes/processor
  large block buffer area
Process dynamics:
  server reads/updates database (allocates entries in the redo buffer pool)
  at commit time server signals Log writer and sleeps
  Log writer wakes up, coalesces multiple commits and issues log file write
  after log is written, Log writer signals suspended servers
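The commit handshake can be sketched as a single-threaded simulation. It assumes servers are identified by strings and that one call to `flush` stands for one wake-up of the log writer; real engines use processes and IPC, and the `LogWriter` class here is a hypothetical stand-in.

```python
from collections import deque

class LogWriter:
    def __init__(self):
        self.redo_buffer = deque()  # (server id, redo record) awaiting write
        self.sleeping = set()       # servers suspended until log is durable
        self.log_file = []          # each element is one physical log write

    def commit(self, server, record):
        """Server posts its redo record, signals the log writer, and sleeps."""
        self.redo_buffer.append((server, record))
        self.sleeping.add(server)

    def flush(self):
        """Coalesce all pending commits into one log write, then wake servers."""
        if not self.redo_buffer:
            return []
        batch = [rec for _, rec in self.redo_buffer]
        woken = sorted({srv for srv, _ in self.redo_buffer})
        self.redo_buffer.clear()
        self.log_file.append(batch)  # single sequential I/O for the batch
        self.sleeping.difference_update(woken)
        return woken
```

Three commits that arrive before the log writer wakes up cost one entry in `log_file` rather than three, which is the batching benefit described above.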

Page 32: Design and Evaluation of Architectures for Commercial Applications

Oracle: NUMA issues

Single SGA region complicates NUMA localization
Single log writer process becomes a bottleneck
Oracle8 is incorporating NUMA-friendly optimizations
Current large NUMA systems use OPS even on a single address space

Page 33: Design and Evaluation of Architectures for Commercial Applications

Oracle Parallel Server (OPS)

Runs on clusters of SMPs/NUMAs
Layered on top of RDBMS engine
Shared data through disk
Performance very dependent on how well data can be partitioned
Not supported by most application vendors

Page 34: Design and Evaluation of Architectures for Commercial Applications

Running Oracle: other issues

Most memory allocated to block buffer area
Need to eliminate OS double buffering
Best performance attained by limiting process migration
In large SMPs, dedicating one processor to I/O may be advantageous

Page 35: Design and Evaluation of Architectures for Commercial Applications

TPC Database Benchmarks

Transaction Processing Performance Council (TPC)
  Established about 10 years ago
  Mission: define representative benchmark standards for vendors (hardware/software) to compare their products
  Focus on both performance and price/performance
  Strict rules about how the benchmark is run
  Only widely used benchmarks

Page 36: Design and Evaluation of Architectures for Commercial Applications

TPC pricing rules

Must include
  All hardware
    – server, I/O, networking, switches, clients
  All software
    – OS, any middleware, database engine
  5-year maintenance contract
Can include usual discounts
Audited components must be products

Page 37: Design and Evaluation of Architectures for Commercial Applications

TPC history of benchmarks

TPC-A
  First OLTP benchmark
  Based on Jim Gray’s Debit-Credit benchmark
TPC-B
  Simpler version of TPC-A
  Meant as a stress test of the server only
TPC-C
  Current TPC OLTP benchmark
  Much more complex than TPC-A/B
TPC-D
  Current TPC DSS benchmark
TPC-W
  New Web-based e-commerce benchmark

Page 38: Design and Evaluation of Architectures for Commercial Applications

The TPC-B benchmark

Models a bank with many branches
1 transaction type: account update
  Begin transaction
    Update account balance
    Write entry in history table
    Update teller balance
    Update branch balance
  Commit
Metrics:
  tpsB (transactions/second)
  $/tpsB
Scale requirement:
  1 tpsB needs 100,000 accounts
[Figure: schema with Branch, Teller, Account, and History tables; each branch has 10 tellers and 100,000 accounts]

Page 39: Design and Evaluation of Architectures for Commercial Applications

TPC-B: other requirements

System must be ACID
  (A)tomicity
    – transactions either commit or leave the system as if they were never issued
  (C)onsistency
    – transactions take system from one consistent state to another
  (I)solation
    – concurrent transactions execute as if in some serial order
  (D)urability
    – results of committed transactions are resilient to faults

Page 40: Design and Evaluation of Architectures for Commercial Applications

The TPC-C benchmark

Current TPC OLTP benchmark
Moderately complex OLTP
Models a wholesale supplier managing orders
Workload consists of five transaction types
Users and database scale linearly with throughput
Specification was approved July 23, 1992

Page 41: Design and Evaluation of Architectures for Commercial Applications

TPC-C: schema

Legend: each box is Table Name <cardinality>; arrows are one-to-many relationships labeled with their fan-out; secondary indexes are marked on the diagram.

Table        Cardinality
Warehouse    W
District     W*10
Customer     W*30K
History      W*30K+
Item         100K (fixed)
Stock        W*100K
Order        W*30K+
Order-Line   W*300K+
New-Order    W*5K

One-to-many relationships (fan-out):
  Warehouse -> District (10)
  District -> Customer (3K)
  Customer -> History (1+)
  Customer -> Order (1+)
  Order -> Order-Line (10-15)
  Order -> New-Order (0-1)
  Warehouse -> Stock (100K)
  Item -> Stock (W)

Page 42: Design and Evaluation of Architectures for Commercial Applications

TPC-C: transactions

New-order: enter a new order from a customer
Payment: update customer balance to reflect a payment
Delivery: deliver orders (done as a batch transaction)
Order-status: retrieve status of customer’s most recent order
Stock-level: monitor warehouse inventory

Page 43: Design and Evaluation of Architectures for Commercial Applications

TPC-C: transaction flow

1. Select txn from menu (measure menu response time):
     1. New-Order     45%
     2. Payment       43%
     3. Order-Status   4%
     4. Delivery       4%
     5. Stock-Level    4%
2. Input screen (keying time)
3. Output screen (measure txn response time; think time)
Go back to 1
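A driver that follows this loop has to pick transactions in the stated proportions. The sketch below shows only that selection step, using the percentages from the menu; keying and think times are left out, and `next_transaction` and `sample_mix` are hypothetical helper names.

```python
import random

# Menu weights copied from the slide (percent of transactions).
MIX = {
    "New-Order": 45,
    "Payment": 43,
    "Order-Status": 4,
    "Delivery": 4,
    "Stock-Level": 4,
}

def next_transaction(rng):
    """Pick the next transaction type according to the menu percentages."""
    names = list(MIX)
    return rng.choices(names, weights=[MIX[n] for n in names], k=1)[0]

def sample_mix(n, seed=0):
    """Draw n transactions and count how often each type was chosen."""
    rng = random.Random(seed)
    counts = dict.fromkeys(MIX, 0)
    for _ in range(n):
        counts[next_transaction(rng)] += 1
    return counts
```

Because only New-Order transactions count toward tpmC, less than half of the work a system performs under this mix shows up in the headline metric.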

Page 44: Design and Evaluation of Architectures for Commercial Applications

TPC-C: other requirements

Transparency
  tables can be split horizontally and vertically provided it is hidden from the application
Skew
  1% of new-order txn are to a random remote warehouse
  15% of payment txn are to a random remote warehouse
Metrics:
  performance: new-order transactions/minute (tpmC)
  cost/performance: $/tpmC

Page 45: Design and Evaluation of Architectures for Commercial Applications

TPC-C: scale

Maximum of 12 tpmC per warehouse
Consequently:
  A quad-Xeon system today (~20,000 tpmC) needs
    – over 1668 warehouses
    – over 1 TB of disk storage!!
That’s a VERY expensive benchmark to run!
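The warehouse count follows directly from the per-warehouse ceiling. As a rough check, assuming exactly 20,000 tpmC and the limit of 12 tpmC per warehouse, the arithmetic gives about 1,667 warehouses, close to the figure on the slide (the exact count depends on the precise tpmC result). The helper names are hypothetical; the customer count uses the W*30K cardinality from the TPC-C schema.

```python
import math

MAX_TPMC_PER_WAREHOUSE = 12   # ceiling quoted on the slide

def min_warehouses(tpmc):
    """Smallest warehouse count that permits the given tpmC."""
    return math.ceil(tpmc / MAX_TPMC_PER_WAREHOUSE)

def customers(warehouses):
    """Customer rows scale as W * 30K in the TPC-C schema."""
    return warehouses * 30_000
```

For instance, `min_warehouses(20_000)` yields 1667, and `customers(1667)` yields roughly 50 million customer rows, which gives a feel for why the database is so large.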

Page 46: Design and Evaluation of Architectures for Commercial Applications

TPC-C: side effects of the skew rules

Very small fraction of transactions go to remote warehouses
Transparency rules allow data partitioning
Consequence:
  Clusters of powerful machines show exceptional numbers
  Compaq has current TPC-C record of over 100 KtpmC with an 8-node memory channel cluster
Skew rules are expected to change in the future

Page 47: Design and Evaluation of Architectures for Commercial Applications

The TPC-D benchmark

Current DSS benchmark from TPC
Moderately complex decision support workload
Models a worldwide reseller of parts
Queries ask real world business questions
17 ad hoc DSS queries (Q1 to Q17)
2 update queries

Page 48: Design and Evaluation of Architectures for Commercial Applications

TPC-D: schema

Table      Cardinality
Customer   SF*150K
LineItem   SF*6000K
Order      SF*1500K
Supplier   SF*10K
Nation     25
Region     5
PartSupp   SF*800K
Part       SF*200K

Page 49: Design and Evaluation of Architectures for Commercial Applications

TPC-D: scale

Unlike TPC-C, scale not tied to performance
Size determined by a Scale Factor (SF)
  SF = {1,10,30,100,300,1000,3000,10000}
  SF=1 means a 1GB database size
Majority of current results are in the 100GB and 300GB range
Indices and temporary tables can significantly increase the total disk capacity (3-5x is typical)
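As a worked example of how the scale factor drives database size, the sketch below multiplies the per-table cardinalities from the TPC-D schema slide by SF. The disk estimate reads the "3-5x" remark as total disk being three to five times the raw data size, which is an interpretation; the function names are hypothetical.

```python
# Rows per table at SF = 1, from the schema slide; Nation and Region are fixed.
ROWS_PER_SF = {
    "Customer": 150_000, "LineItem": 6_000_000, "Order": 1_500_000,
    "Supplier": 10_000, "PartSupp": 800_000, "Part": 200_000,
}
FIXED_ROWS = {"Nation": 25, "Region": 5}
VALID_SF = {1, 10, 30, 100, 300, 1000, 3000, 10000}

def cardinalities(sf):
    """Row count of every table at the given scale factor."""
    if sf not in VALID_SF:
        raise ValueError(f"unsupported scale factor: {sf}")
    rows = {name: n * sf for name, n in ROWS_PER_SF.items()}
    rows.update(FIXED_ROWS)
    return rows

def disk_estimate_gb(sf, overhead=(3, 5)):
    """Raw data is about sf GB; indices and temp tables multiply it by 3-5x."""
    return sf * overhead[0], sf * overhead[1]
```

At SF=100 the LineItem table alone holds 600 million rows, and the estimated disk need is between 300 GB and 500 GB.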

Page 50: Design and Evaluation of Architectures for Commercial Applications

TPC-D example query

Forecasting Revenue Query (Q6)
  This query quantifies the amount of revenue increase that would have resulted from eliminating company-wide discounts in a given percentage range in a given year. This type of “what if” query can be used to look for ways to increase revenues.
  Considers all line-items shipped in a year
  Query definition:
    SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE
    FROM LINEITEM
    WHERE L_SHIPDATE >= DATE '[DATE]'
      AND L_SHIPDATE < DATE '[DATE]' + INTERVAL '1' YEAR
      AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01
      AND L_QUANTITY < [QUANTITY]
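To make the shape of Q6 concrete, here is a minimal sketch that runs an equivalent query against a tiny in-memory SQLite table. Several things are assumptions for illustration: SQLite has no `DATE ... + INTERVAL` syntax, so the one-year window is computed in Python and dates are compared as ISO strings; the table carries only the four columns Q6 touches; and the sample rows and parameter values are invented.

```python
import sqlite3
from datetime import date

# Q6 with the date arithmetic moved out of SQL (SQLite-specific adaptation).
Q6 = """
SELECT SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE
FROM LINEITEM
WHERE L_SHIPDATE >= :date_lo
  AND L_SHIPDATE < :date_hi
  AND L_DISCOUNT BETWEEN :discount - 0.01 AND :discount + 0.01
  AND L_QUANTITY < :quantity
"""

# Invented rows: (extended price, discount, quantity, ship date).
SAMPLE_ROWS = [
    (1000.0, 0.06, 10, "1994-03-15"),  # qualifies
    (2000.0, 0.06, 30, "1994-07-01"),  # quantity too large
    (500.0, 0.02, 5, "1994-05-05"),    # discount outside the range
    (3000.0, 0.06, 12, "1995-01-01"),  # shipped after the window
    (400.0, 0.06, 1, "1994-01-01"),    # qualifies
]


def run_q6(rows, start_date, discount, quantity):
    """Load `rows` into an in-memory LINEITEM table and evaluate Q6."""
    start = date.fromisoformat(start_date)
    end = start.replace(year=start.year + 1)  # one-year window
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE LINEITEM (L_EXTENDEDPRICE REAL, L_DISCOUNT REAL, "
            "L_QUANTITY REAL, L_SHIPDATE TEXT)"
        )
        conn.executemany("INSERT INTO LINEITEM VALUES (?, ?, ?, ?)", rows)
        params = {
            "date_lo": start.isoformat(),
            "date_hi": end.isoformat(),
            "discount": discount,
            "quantity": quantity,
        }
        (revenue,) = conn.execute(Q6, params).fetchone()
    finally:
        conn.close()
    return revenue


if __name__ == "__main__":
    print(run_q6(SAMPLE_ROWS, "1994-01-01", 0.06, 24))
```

The query is a single filtered scan with one aggregate and no joins, so its cost is dominated by reading LineItem.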

Page 51: Design and Evaluation of Architectures for Commercial Applications

TPC-D execution rules

Power Test
  Queries submitted in a single stream (i.e., no concurrency)
  Each Query Set is a permutation of the 17 read-only queries
Throughput Test
  Multiple concurrent query streams
  Single update stream

[Figure, Power Test timeline: Cache Flush, Query Set 0 (optional; warm-up, not timed), then the timed sequence UF1, Query Set 0, UF2]
[Figure, Throughput Test timeline: Query Set 1, Query Set 2, ..., Query Set N; Updates: UF1 UF2, UF1 UF2, ..., UF1 UF2]

Page 52: Design and Evaluation of Architectures for Commercial Applications

TPC-D: metrics

Power Metric (QppD): Geometric Mean
Throughput (QthD): Arithmetic Mean
Both metrics represent “Queries per Gigabyte Hour”

$$\mathrm{QppD@Size} = \frac{3600 \times SF}{\sqrt[19]{\prod_{i=1}^{17} QI(i,0) \times \prod_{j=1}^{2} UI(j,0)}}$$

where
  QI(i,0): Timing Interval for Query i, stream 0
  UI(j,0): Timing Interval for Update j, stream 0
  SF: Scale Factor

$$\mathrm{QthD@Size} = \frac{S \times 17 \times 3600}{T_S} \times SF$$

where:
  S: number of query streams
  T_S: elapsed time of test (in seconds)
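A small numerical sketch of the two metrics may help. It reads the power metric as 3600·SF divided by the geometric mean (19th root of the product) of the 17 query and 2 update timing intervals of stream 0, and the throughput metric as S·17·3600·SF / T_S. The function names and the sample timings are invented for illustration.

```python
import math

NUM_QUERIES = 17  # read-only queries per query set
NUM_UPDATES = 2   # update functions UF1 and UF2


def qppd(sf, query_secs, update_secs):
    """Power metric: 3600 * SF over the geometric mean of the 19 intervals."""
    if len(query_secs) != NUM_QUERIES or len(update_secs) != NUM_UPDATES:
        raise ValueError("expected 17 query intervals and 2 update intervals")
    intervals = list(query_secs) + list(update_secs)
    # Geometric mean computed in log space to avoid overflow of the product.
    log_mean = sum(math.log(t) for t in intervals) / len(intervals)
    return 3600.0 * sf / math.exp(log_mean)


def qthd(sf, streams, elapsed_secs):
    """Throughput metric: queries completed per hour, scaled by SF."""
    return streams * NUM_QUERIES * 3600.0 / elapsed_secs * sf


if __name__ == "__main__":
    # Invented timings: every query and update takes 36 seconds at SF=100.
    print(qppd(100, [36.0] * 17, [36.0] * 2))
    # Invented run: 4 streams finishing in 2 hours at SF=100.
    print(qthd(100, 4, 7200.0))
```

Because the power metric uses a geometric mean, halving the time of any one query improves the score by the same factor, whether that query was short or long; the throughput metric instead rewards reducing total elapsed time.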

Page 53: Design and Evaluation of Architectures for Commercial Applications

TPC-D: metrics(2)

Composite Query-Per-Hour Rating (QphD)
  The Power and Throughput metrics are combined to get the composite queries per hour.

$$\mathrm{QphD@Size} = \sqrt{\mathrm{QppD@Size} \times \mathrm{QthD@Size}}$$

Reported metrics are:
  – Power: QppD@Size
  – Throughput: QthD@Size
  – Price/Performance: $/QphD@Size
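The composite rating and the price/performance figure follow directly from the two component metrics. The sketch below assumes the composite is the geometric mean of QppD@Size and QthD@Size; the function names and input values are invented for illustration.

```python
import math


def qphd(qppd_value, qthd_value):
    """Composite rating: geometric mean of the power and throughput metrics."""
    return math.sqrt(qppd_value * qthd_value)


def price_per_qphd(system_price, qppd_value, qthd_value):
    """Price/performance: total system price divided by the composite rating."""
    return system_price / qphd(qppd_value, qthd_value)


if __name__ == "__main__":
    # Invented numbers, not taken from any published result.
    print(qphd(400.0, 100.0))
    print(price_per_qphd(1_000_000.0, 400.0, 100.0))
```

With a geometric mean, a system cannot reach a high composite score by excelling in only one of the two tests.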

Page 54: Design and Evaluation of Architectures for Commercial Applications

TPC-D: other issues

Queries are complex and long-running
Crucial that DB engine parallelizes queries for acceptable performance
Quality of query parallelizer is the most important factor
Large improvements are still observed from generation to generation of software

Page 55: Design and Evaluation of Architectures for Commercial Applications

The TPC-W benchmark

Just introduced
Represents a business that markets and sells over the internet
Includes security/authentication
Uses dynamically generated pages (e.g. cgi-bins)
Metric: Web Interactions Per Second (WIPS)
Transactions:
  Browse, shopping-cart, buy, user-registration, and search

Page 56: Design and Evaluation of Architectures for Commercial Applications

A look at current audited TPC-C systems

Leader in price/performance:
  Compaq ProLiant 7000-6/450, MS SQL 7.0, NT
  – 4x 450MHz Xeons, 2MB cache, 4GB DRAM, 1.4 TB disk
  – 22,479 tpmC, $18.84/tpmC
Leader in non-cluster performance:
  Sun Enterprise 6500, Sybase 11.9, Solaris 7
  – 24x 336MHz UltraSPARC IIs, 4MB cache, 24 GB DRAM, 4TB disk
  – 53,050 tpmC, $76.00/tpmC
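Since $/tpmC is the total priced configuration divided by tpmC, the two published figures imply an approximate total price for each system. The sketch below only multiplies the numbers from this slide; the results are approximate because the published $/tpmC values are rounded, and `implied_price` is a hypothetical helper name.

```python
# (tpmC, $/tpmC) as listed on the slide.
SYSTEMS = {
    "Compaq ProLiant 7000-6/450": (22_479, 18.84),
    "Sun Enterprise 6500": (53_050, 76.00),
}


def implied_price(tpmc, dollars_per_tpmc):
    """Approximate total priced configuration implied by the two metrics."""
    return tpmc * dollars_per_tpmc


if __name__ == "__main__":
    for name, (tpmc, ratio) in SYSTEMS.items():
        print(f"{name}: about ${implied_price(tpmc, ratio):,.0f}")
    compaq = implied_price(*SYSTEMS["Compaq ProLiant 7000-6/450"])
    sun = implied_price(*SYSTEMS["Sun Enterprise 6500"])
    print(f"throughput ratio: {53_050 / 22_479:.2f}x")
    print(f"price ratio:      {sun / compaq:.2f}x")
```

The Sun system delivers a little over twice the throughput at several times the implied price, which is the trade-off the two "leader" categories capture.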

Page 57: Design and Evaluation of Architectures for Commercial Applications

Audited TPC-C systems: price breakdown

Server sub-component prices

                   $/CPU        $/MB DRAM   $/GB Disk
Compaq Proliant    $4,816.00    $3.92       $145.33
Sun E6500          $15,375.00   $9.16       $382.03

[Chart: Server Price Breakdown, stacked bars from 0% to 100% for Compaq Proliant and Sun E6500; components: Base, CPU, Memory, Disk]
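The unit prices can be combined with the configurations listed for the two audited systems (4 CPUs, 4GB DRAM, 1.4 TB disk for the Compaq; 24 CPUs, 24 GB DRAM, 4TB disk for the Sun) to estimate what each component class costs. This is a sketch under assumptions: it uses 1 GB = 1024 MB and 1 TB = 1024 GB, and it leaves out the "Base" component because the slide gives no price for it.

```python
MB_PER_GB = 1024  # assumed binary units
GB_PER_TB = 1024  # assumed binary units


def component_costs(cpus, dram_gb, disk_tb, price_cpu, price_mb, price_gb):
    """Estimate CPU, memory, and disk cost from unit prices and quantities."""
    return {
        "cpu": cpus * price_cpu,
        "memory": dram_gb * MB_PER_GB * price_mb,
        "disk": disk_tb * GB_PER_TB * price_gb,
    }


COMPAQ = component_costs(4, 4, 1.4, 4816.00, 3.92, 145.33)
SUN = component_costs(24, 24, 4, 15375.00, 9.16, 382.03)

if __name__ == "__main__":
    for name, costs in (("Compaq Proliant", COMPAQ), ("Sun E6500", SUN)):
        total = sum(costs.values())
        shares = ", ".join(
            f"{part} {100 * value / total:.0f}%" for part, value in costs.items()
        )
        print(f"{name}: ${total:,.0f} excluding base ({shares})")
```

Under these assumptions disk is the largest of the three components on both systems.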

Page 58: Design and Evaluation of Architectures for Commercial Applications

Using TPC benchmarks for architecture studies

Brute force approach: use full audit-sized system
  Who can afford it?
  How can you run it on top of a simulator?
  How can you explore a wide design space?
Solution: scaling down the size

Page 59: Design and Evaluation of Architectures for Commercial Applications

Careful Scaling of Workloads

Identify architectural issue under study
Apply appropriate scaling to simplify monitoring and enable simulation studies
Most scaling experiments on real machines
  simulation-only is not a viable option!
Validation through sanity checks and comparison with audit-sized runs

Page 60: Design and Evaluation of Architectures for Commercial Applications

Scaling OLTP

Forget about TPC compliance
Determine lower bound on DB size
  monitor contention for smaller tables/indexes
  DB size will change with number of processors
I/O bandwidth requirements vary with fraction of DB resident in memory
  completely in-memory run: no special I/O requirements
  favor more small disks vs. few large ones
  place all redo logs on a separate disk
  reduce OS double-buffering
Limit number of transactions executed

Page 61: Design and Evaluation of Architectures for Commercial Applications

Scaling OLTP(2)

Achieve representative cache behavior
  relevant data structures >> size of hardware caches (metadata area size is key)
  maintain same number of processes/CPU as larger run
Simplify setup by running clients on the server machine
  need to make lighter-weight versions of the clients
Ensure efficient execution
  excessive migration, idle time, OS or application spinning distorts metrics

Page 62: Design and Evaluation of Architectures for Commercial Applications

Scaling DSS

Determine lower bound DB size
  sufficient work in parallel section
Ensure representative cache behavior
  DB >> hardware caches
  maintain same number of processes/CPU as large run
Reduce execution time through sampling
Major difficulty is ensuring representative query plans
DSS results more volatile due to improvements in query optimizers

Page 63: Design and Evaluation of Architectures for Commercial Applications

Tuning, tuning, tuning

Ensure scaled workload is running efficiently
Requires a large number of monitoring runs on actual hardware platform
Resembles “black art” on Oracle
Self-tuning features in Microsoft SQL 7.0 are promising
  ability for user overrides is desirable, but missing

Page 64: Design and Evaluation of Architectures for Commercial Applications

Does Scaling Work?

Page 65: Design and Evaluation of Architectures for Commercial Applications

TPC-C: scaled vs. full size

Breakdown profile of CPU cycles
Platform: 8-proc. AlphaServer 8400

[Pie charts: "TPC-C, scaled" and "TPC-C, full-size"]

Category        TPC-C, scaled   TPC-C, full-size
1-issue         8%              11%
2-issue         8%              8%
tlb             3%              1%
repl trap       5%              2%
br/pc mispr.    2%              3%
mb              3%              6%
scache hit      17%             22%
bcache hit      30%             20%
bcache miss     24%             27%
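One simple way to quantify how closely the scaled run tracks the full-size run is to sum the per-category differences between the two cycle breakdowns. The metric below (sum of absolute differences in percentage points) is a choice made for this illustration, not one taken from the slides; the data are the percentages from the two charts.

```python
# Percent of CPU cycles per category, copied from the two pie charts.
TPCC_SCALED = {
    "1-issue": 8, "2-issue": 8, "tlb": 3, "repl trap": 5, "br/pc mispr.": 2,
    "mb": 3, "scache hit": 17, "bcache hit": 30, "bcache miss": 24,
}
TPCC_FULL = {
    "1-issue": 11, "2-issue": 8, "tlb": 1, "repl trap": 2, "br/pc mispr.": 3,
    "mb": 6, "scache hit": 22, "bcache hit": 20, "bcache miss": 27,
}


def profile_distance(a, b):
    """Sum of absolute per-category differences, in percentage points."""
    if a.keys() != b.keys():
        raise ValueError("profiles must use the same categories")
    return sum(abs(a[k] - b[k]) for k in a)


def largest_gap(a, b):
    """Category where the two profiles differ the most."""
    return max(a, key=lambda k: abs(a[k] - b[k]))


if __name__ == "__main__":
    print(profile_distance(TPCC_SCALED, TPCC_FULL))
    print(largest_gap(TPCC_SCALED, TPCC_FULL))
```

By this measure the largest single discrepancy is in the bcache hit category, while most other categories differ by a few percentage points.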

Page 66: Design and Evaluation of Architectures for Commercial Applications

Using simpler OLTP benchmarks:

Although “obsolete”, TPC-B can be used in architectural studies

[Pie charts: "TPC-C, full-size" and "TPC-B, scaled"]

Category        TPC-C, full-size   TPC-B, scaled
1-issue         11%                7%
2-issue         8%                 6%
tlb             1%                 2%
repl trap       2%                 5%
br/pc mispr.    3%                 2%
mb              6%                 9%
scache hit      22%                16%
bcache hit      20%                16%
bcache miss     27%                37%
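A quick check on the two breakdowns is to add up the cache-related categories. The sketch below assumes that "scache hit", "bcache hit", and "bcache miss" together represent cycles attributed to the cache hierarchy beyond the first level; that grouping is an interpretation made for this illustration.

```python
# Percent of CPU cycles per category, copied from the two pie charts.
TPCC_FULL = {
    "1-issue": 11, "2-issue": 8, "tlb": 1, "repl trap": 2, "br/pc mispr.": 3,
    "mb": 6, "scache hit": 22, "bcache hit": 20, "bcache miss": 27,
}
TPCB_SCALED = {
    "1-issue": 7, "2-issue": 6, "tlb": 2, "repl trap": 5, "br/pc mispr.": 2,
    "mb": 9, "scache hit": 16, "bcache hit": 16, "bcache miss": 37,
}

# Assumed grouping of cache-related categories.
CACHE_CATEGORIES = ("scache hit", "bcache hit", "bcache miss")


def cache_share(profile):
    """Percent of cycles in the cache-related categories."""
    return sum(profile[c] for c in CACHE_CATEGORIES)


if __name__ == "__main__":
    print("TPC-C, full-size:", cache_share(TPCC_FULL))
    print("TPC-B, scaled:   ", cache_share(TPCB_SCALED))
```

The combined share is the same for both workloads, although TPC-B shifts more of it into bcache misses.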

Page 67: Design and Evaluation of Architectures for Commercial Applications

Benchmarks wrap-up

Commercial applications are complex, but need to be considered during design evaluation
TPC benchmarks cover a wide range of commercial application areas
Scaled down TPC benchmarks can be used for architecture studies
Architect needs deep understanding of the workload