Design and Evaluation of Architectures for Commercial Applications
Part I: benchmarks. Luiz André Barroso.
Western Research Laboratory
Design and Evaluation of Architectures for Commercial Applications
Luiz André Barroso
Part I: benchmarks
2 UPC, February 1999
Why should architects learn about commercial applications?
  – Because they are very different from typical benchmarks
  – Because they are demanding on many interesting architectural features
  – Because they are driving the sales of mid-range and high-end systems

Shortcomings of popular benchmarks
SPEC
  – uniprocessor-oriented
  – small cache footprints
  – exacerbates impact of CPU core issues
SPLASH
  – small cache footprints
  – extremely optimized sharing
STREAMS
  – no real sharing/communication
  – mainly bandwidth-oriented

SPLASH vs. Online Transaction Processing (OLTP)
A typical SPLASH app. has
  – more than 3x the issue rate,
  – ~26x fewer cycles spent in memory barriers,
  – 1/4 of the TLB miss ratio,
  – less than 1/2 the fraction of cache-to-cache transfers,
  – ~22x smaller instruction cache miss ratio,
  – ~1/2 the L2$ miss ratio
...of an OLTP app.

But the real reason we care? $$$!
Server market:
  – Total: > $50 billion
  – Numeric/scientific computing: < $2 billion
  – Remaining $48 billion?
      – OLTP
      – DSS
      – Internet/Web
Trend is for numerical/scientific to remain a niche

Relevance of server vs. PC market
High profit margins
Performance is a differentiating factor
If you sell the server you will probably sell:
  – the client
  – the storage
  – the networking infrastructure
  – the middleware
  – the service
  – ...

Need for speed in the commercial market
Applications pushing the envelope
  – Enterprise resource planning (ERP)
  – Electronic commerce
  – Data mining/warehousing
  – ADSL servers
Specialized solutions
  – Intel splitting Pentium line into 3 tiers
  – Oracle’s raw iron initiative
  – Network Appliance’s machines

Seminar disclaimer
Hardware-centric approach:
  – target is to build better machines, not better software
  – focus on fundamental behavior, not on software “features”
Stick to general-purpose paradigm
Emphasis on CPU+memory system issues
Lots of things missing:
  – object-relational and object-oriented databases
  – public domain/academic database engines
  – many others

Overview
Day I: Introduction and workloads
  – Background on commercial applications
  – Software structure of a commercial RDBMS
  – Standard benchmarks
      – TPC-B
      – TPC-C
      – TPC-D
      – TPC-W
  – Cost and pricing trends
  – Scaling down TPC benchmarks

Overview (2)
Day II: Evaluation methods/tools
  – Introduction
  – Software instrumentation (ATOM)
  – Hardware measurement & profiling
      – IPROBE
      – DCPI
      – ProfileMe
  – Tracing & trace-driven simulation
  – User-level simulators
  – Complete machine simulators (SimOS)

Overview (3)
Day III: Architecture studies
  – Memory system characterization
  – Out-of-order processors
  – Simultaneous multithreading
  – Final remarks

Background on commercial applications
Database applications:
  – Online Transaction Processing (OLTP)
      – massive number of short queries
      – read/update indexed tables
      – canonical example: banking system
  – Decision Support Systems (DSS)
      – smaller number of complex queries
      – mostly read-only over large (non-indexed) tables
      – canonical example: business analysis

Background (2)
Web/Internet applications
  – Web server
      – many requests for small/medium files
  – Proxy
      – many short-lived connection requests
      – content caching and coherence
  – Web search index
      – DSS with a Web front-end
  – E-commerce site
      – OLTP with a Web front-end

Background (3)
Common characteristics
  – Large amounts of data manipulation
  – Interactive response times required
  – Highly multithreaded by design
      – suitable for large multiprocessors
  – Significant I/O requirements
  – Extensive/complex interactions with the operating system
  – Require robustness and resiliency to failures

Database performance bottlenecks
I/O-bound until recently (Thakkar, ISCA’90)
Many improvements since then
  – multithreading of DB engine
  – I/O prefetching
  – VLM (very large memory) database caching
  – more efficient OS interactions
  – RAIDs
  – non-volatile DRAM (NVDRAM)
Today’s bottlenecks:
  – Memory system
  – Processor architecture

Structure of a database workload
[Figure: clients → application server (optional) → database server]
  – clients: simple logic checks
  – application server (optional): formulates and issues DB query
  – database server: executes query

Who is who in the database market?
DB engine:
  – Oracle is dominant
  – other players: Microsoft, Sybase, Informix
Database applications:
  – SAP is dominant
  – other players: Oracle Apps, PeopleSoft, Baan
Hardware:
  – players: Sun, IBM, HP and Compaq

Who is who in the database market? (2)
Historically, mainly mainframe proprietary OS
Today:
  – Unix: 40%
  – NT: 8%
  – Proprietary: 52%
In two years:
  – Unix: 46%
  – NT: 19%
  – Proprietary: 35%

Overview of an RDBMS: Oracle8
Similar in structure to most commercial engines
Runs on:
  – uniprocessors
  – SMP multiprocessors
  – NUMA multiprocessors*
For clusters or message-passing multiprocessors:
  – Oracle Parallel Server (OPS)

The Oracle RDBMS
Physical structure
  – Control files
      – basic info on the database, its structure and status
  – Data files
      – tables: actual database data
      – indexes: sorted list of pointers to data
      – rollback segments: keep data for recovery upon a failed transaction
  – Log files
      – compressed storage of DB updates

Index files
Critical in speeding up access to data by avoiding expensive scans
The more selective the index, the faster the access
Drawbacks:
  – Very selective indexes may occupy lots of storage
  – Updates to indexed data are more expensive
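
The trade-off above can be illustrated with a minimal Python sketch. It assumes a toy in-memory table; `rows`, `index`, `scan_lookup` and `index_lookup` are hypothetical names chosen for illustration and do not correspond to Oracle structures.

```python
import bisect

# Toy table: (account_id, balance) rows stored in arbitrary order.
rows = [(42, 10.0), (7, 99.5), (19, 3.25), (88, 0.0), (3, 12.0)]

# Index: sorted list of (key, row position) pairs, i.e. sorted pointers to data.
index = sorted((key, pos) for pos, (key, _) in enumerate(rows))
index_keys = [key for key, _ in index]


def scan_lookup(key):
    """Full scan: examine rows one by one until the key is found."""
    for row in rows:
        if row[0] == key:
            return row
    return None


def index_lookup(key):
    """Index lookup: binary search over the sorted keys, then one row access."""
    i = bisect.bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        return rows[index[i][1]]
    return None
```

Both drawbacks are visible even here: the index is a second copy of every key, and any insert or key update would have to keep `index` sorted in addition to changing `rows`.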

Files or raw disk devices
Most DB engines can directly access disks as raw devices
Idea is to bypass the file system
Manageability/flexibility somewhat compromised
Performance boost not large (~10-15%)
Most customer installations use file systems

Transactions & rollback segments
Single transaction can access/update many items
Atomicity is required:
  – transaction either happens or not
Example: bank transfer
  Transaction A (accounts X,Y; value M) {
    read account balance(X)
    subtract M from balance(X)
    add M to balance(Y)
    commit
  }
On failure:
  – old value of balance(X) is kept in a rollback segment
  – rollback: old values restored, all locks released
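
As a rough illustration of the rollback idea, the sketch below keeps old values in a per-transaction list and restores them on failure. It is a simplification under stated assumptions: the `Bank` class and the `fail_after_debit` switch are hypothetical, locking is left out, and nothing here reflects how Oracle lays out rollback segments.

```python
class Bank:
    """Toy store with a per-transaction rollback segment (an undo list)."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, x, y, m, fail_after_debit=False):
        rollback_segment = []  # (account, old value) saved before each update
        try:
            rollback_segment.append((x, self.balances[x]))
            self.balances[x] -= m
            if fail_after_debit:
                raise RuntimeError("simulated failure between the two updates")
            rollback_segment.append((y, self.balances[y]))
            self.balances[y] += m
            return True  # commit: the saved old values are no longer needed
        except Exception:
            # Rollback: restore old values, most recent first.
            for account, old_value in reversed(rollback_segment):
                self.balances[account] = old_value
            return False
```

A failed transfer leaves both balances exactly as they were, which is the "either happens or not" property.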

Transactions & log files
A transaction is only committed after its side effects are in stable storage
Writing all modified DB blocks would be too expensive
  – random disk writes are costly
  – a whole DB block has to be written back
  – no coalescing of updates
Alternative: write only a log of modifications
  – sequential I/O writes (enables NVDRAM optimizations)
  – batching of multiple commits
Background process periodically writes dirty data blocks out

Transactions & log files (2)
When a block is written to disk the log file entries are deleted
If the system crashes:
  – in-memory dirty blocks are lost
Recovery procedure:
  – goes through the log files and applies all updates to the database
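
The commit, write-back, crash and recovery steps can be sketched in a few lines of Python. This is an illustrative model only: `LoggedDatabase` and its methods are hypothetical names, blocks are reduced to single values, and the log is an in-memory list standing in for stable storage.

```python
class LoggedDatabase:
    def __init__(self):
        self.disk_blocks = {}  # stable storage: block id -> value
        self.redo_log = []     # stable storage: sequential (block id, value) entries
        self.cache = {}        # volatile memory: dirty blocks not yet on disk

    def commit(self, updates):
        """Commit by appending to the log; data blocks stay dirty in memory."""
        for block_id, value in updates.items():
            self.redo_log.append((block_id, value))
            self.cache[block_id] = value

    def write_back(self, block_id):
        """Background writer: flush one dirty block, then drop its log entries."""
        self.disk_blocks[block_id] = self.cache.pop(block_id)
        self.redo_log = [(b, v) for b, v in self.redo_log if b != block_id]

    def crash(self):
        """Everything that lives only in memory is lost."""
        self.cache.clear()

    def recover(self):
        """Go through the log in order and apply all updates to the database."""
        for block_id, value in self.redo_log:
            self.disk_blocks[block_id] = value
        self.redo_log.clear()
```

Committed updates survive the crash even though the corresponding data blocks were never written, because the sequential log already held them.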

Transactions & concurrency control
Many transactions in-flight at any given time
  – Locking of data items is required
Lock granularity:
  – Table → Block → Row: finer granularity gives more concurrency but more overhead
Efficient row-level locking is needed for high transaction throughput

Row-level locking
Each new transaction is assigned a unique ID
A transaction table keeps track of all active transactions
Lock: write ID in directory entry for row
Unlock: remove ID from transaction table
  – simultaneous release of all locks
[Figure: a data block with its data block directory, and the transaction table; transaction ID 233 appears in several directory entries and is removed from the table in one step]
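
One way to read the scheme above: a row counts as locked only while the ID stored in its directory entry is still present in the transaction table. The Python sketch below follows that reading; `RowLockManager` and its method names are hypothetical, and waiting, deadlock handling and block layout are all left out.

```python
import itertools


class RowLockManager:
    def __init__(self):
        self._next_id = itertools.count(1)
        self.transaction_table = set()  # IDs of active transactions
        self.directory = {}             # row -> ID last written by a locker

    def begin(self):
        """Assign a unique ID and record the transaction as active."""
        txn_id = next(self._next_id)
        self.transaction_table.add(txn_id)
        return txn_id

    def holder(self, row):
        """Return the active transaction holding the row lock, or None."""
        txn_id = self.directory.get(row)
        return txn_id if txn_id in self.transaction_table else None

    def lock(self, txn_id, row):
        """Lock: write the ID in the directory entry for the row."""
        current = self.holder(row)
        if current is not None and current != txn_id:
            return False  # held by another active transaction
        self.directory[row] = txn_id
        return True

    def end(self, txn_id):
        """Unlock: one removal from the table releases every row lock at once."""
        self.transaction_table.discard(txn_id)
```

Stale IDs are left behind in the directory on purpose; they are harmless because `holder` treats any ID missing from the transaction table as no lock.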

Transaction read consistency
A transaction that reads a full table should see a consistent snapshot
For performance, reads shouldn’t lock a table
Problem: intervening writes
Solution: leverage rollback mechanism
  – intervening write saves old value in rollback segment
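
A hedged sketch of how saved old values can serve lock-free consistent reads. It assumes a logical clock to decide which version a reader should see; the slides do not say how versions are matched to readers, so the timestamps and the `SnapshotTable` class are illustrative assumptions.

```python
class SnapshotTable:
    def __init__(self, rows):
        self.clock = 0
        # key -> (current value, logical time of the write that produced it)
        self.rows = {k: (v, 0) for k, v in rows.items()}
        # key -> older (value, write time) versions, oldest first
        self.rollback = {k: [] for k in rows}

    def write(self, key, value):
        """Intervening write: save the old version before overwriting."""
        self.clock += 1
        self.rollback[key].append(self.rows[key])
        self.rows[key] = (value, self.clock)

    def begin_read(self):
        """Start a reader; its snapshot is the current logical time."""
        return self.clock

    def read(self, key, snapshot):
        value, written = self.rows[key]
        if written <= snapshot:
            return value
        # Row changed after the snapshot: search saved versions, newest first.
        for old_value, old_written in reversed(self.rollback[key]):
            if old_written <= snapshot:
                return old_value
        raise LookupError("no version old enough for this snapshot")
```

A reader that started before a write keeps seeing the old value without ever locking the table, while a reader that starts afterwards sees the new one.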

Oracle: software structure
Server processes
  – actual execution of transactions
DB writer
  – flushes dirty blocks to disk
Log writer
  – writes redo logs to disk at commit time
Process and system monitors
  – misc. activity monitoring and recovery
Processes communicate through SGA and IPC

Oracle: software structure (2)
SGA:
  – shared memory segment mapped by all processes
Block buffer area
  – cache of database blocks
  – larger portion of physical memory
Metadata area
  – where most communication takes place
  – synchronization structures
  – shared procedures
  – directory information
[Figure: System Global Area (SGA) layout along increasing virtual address; labels: Block buffer area, Redo buffers, Data dictionary, Fixed region, Shared pool, Metadata area]

Oracle: software structure (3)
Hiding I/O latency:
  – many server processes/processor
  – large block buffer area
Process dynamics:
  – server reads/updates database (allocates entries in the redo buffer pool)
  – at commit time server signals Log writer and sleeps
  – Log writer wakes up, coalesces multiple commits and issues log file write
  – after log is written, Log writer signals suspended servers
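
The process dynamics above can be walked through with a small single-threaded simulation. It is only a sketch: `LogWriter` and its methods are hypothetical, sleeping and signalling are modelled as list operations instead of real IPC, and the whole redo buffer is written on each wake-up for simplicity.

```python
class LogWriter:
    def __init__(self):
        self.redo_buffer = []  # entries allocated by servers as they update
        self.waiting = []      # servers sleeping until their commit is durable
        self.log_file = []     # each element is one log file write (a batch)

    def server_update(self, server, change):
        """Server updates the database and allocates a redo buffer entry."""
        self.redo_buffer.append((server, change))

    def server_commit(self, server):
        """Server signals the log writer and goes to sleep."""
        self.waiting.append(server)

    def run_once(self):
        """Log writer wakes up: one write covers all pending commits."""
        if not self.waiting:
            return []
        self.log_file.append(list(self.redo_buffer))  # single sequential write
        self.redo_buffer.clear()
        woken, self.waiting = self.waiting, []
        return woken  # servers signalled after the log write completes
```

Two servers that commit close together are woken by the same log write, which is the coalescing that makes commit cost scale better than one write per transaction.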

Oracle: NUMA issues
Single SGA region complicates NUMA localization
Single log writer process becomes a bottleneck
Oracle8 is incorporating NUMA-friendly optimizations
Current large NUMA systems use OPS even on a single address space

Oracle Parallel Server (OPS)
Runs on clusters of SMPs/NUMAs
Layered on top of RDBMS engine
Shared data through disk
Performance very dependent on how well data can be partitioned
Not supported by most application vendors

Running Oracle: other issues
Most memory allocated to block buffer area
Need to eliminate OS double buffering
Best performance attained by limiting process migration
In large SMPs, dedicating one processor to I/O may be advantageous

TPC Database Benchmarks
Transaction Processing Performance Council (TPC)
  – Established about 10 years ago
  – Mission: define representative benchmark standards for vendors (hardware/software) to compare their products
  – Focus on both performance and price/performance
  – Strict rules about how the benchmark is run
  – Only widely used benchmarks

TPC pricing rules
Must include
  – All hardware
      – server, I/O, networking, switches, clients
  – All software
      – OS, any middleware, database engine
  – 5-year maintenance contract
Can include usual discounts
Audited components must be products

TPC history of benchmarks
TPC-A
  – First OLTP benchmark
  – Based on Jim Gray’s Debit-Credit benchmark
TPC-B
  – Simpler version of TPC-A
  – Meant as a stress test of the server only
TPC-C
  – Current TPC OLTP benchmark
  – Much more complex than TPC-A/B
TPC-D
  – Current TPC DSS benchmark
TPC-W
  – New Web-based e-commerce benchmark

The TPC-B benchmark
Models a bank with many branches
1 transaction type: account update
    Begin transaction
      Update account balance
      Write entry in history table
      Update teller balance
      Update branch balance
    Commit
Metrics:
  – tpsB (transactions/second)
  – $/tpsB
Scale requirement:
  – 1 tpsB needs 100,000 accounts
[Figure: schema with tables Branch, Teller, Account and History; each Branch has 10 Tellers and 100,000 Accounts]
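
The account-update transaction can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch and not a compliant TPC-B implementation: the table and column names are assumptions, and `make_bank` defaults to 100 accounts per branch where the scale rule asks for 100,000.

```python
import sqlite3


def make_bank(branches=1, tellers_per_branch=10, accounts_per_branch=100):
    """Build a scaled-down in-memory bank (the benchmark uses 100,000 accounts)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE branch  (id INTEGER PRIMARY KEY, balance INTEGER);
        CREATE TABLE teller  (id INTEGER PRIMARY KEY, branch_id INTEGER, balance INTEGER);
        CREATE TABLE account (id INTEGER PRIMARY KEY, branch_id INTEGER, balance INTEGER);
        CREATE TABLE history (account_id INTEGER, teller_id INTEGER,
                              branch_id INTEGER, delta INTEGER);
    """)
    for b in range(branches):
        db.execute("INSERT INTO branch VALUES (?, 0)", (b,))
        for t in range(tellers_per_branch):
            db.execute("INSERT INTO teller VALUES (?, ?, 0)",
                       (b * tellers_per_branch + t, b))
        for a in range(accounts_per_branch):
            db.execute("INSERT INTO account VALUES (?, ?, 0)",
                       (b * accounts_per_branch + a, b))
    db.commit()
    return db


def account_update(db, account_id, teller_id, branch_id, delta):
    """The single TPC-B transaction type, in the order listed on the slide."""
    with db:  # commits on success, rolls back if any statement raises
        db.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                   (delta, account_id))
        db.execute("INSERT INTO history VALUES (?, ?, ?, ?)",
                   (account_id, teller_id, branch_id, delta))
        db.execute("UPDATE teller SET balance = balance + ? WHERE id = ?",
                   (delta, teller_id))
        db.execute("UPDATE branch SET balance = balance + ? WHERE id = ?",
                   (delta, branch_id))
```

Every transaction updates the one branch row, so with few branches that row becomes a hot spot, which is why database size is tied to throughput.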

TPC-B: other requirements
System must be ACID
  – (A)tomicity
      – transactions either commit or leave the system as if they were never issued
  – (C)onsistency
      – transactions take system from a consistent state to another
  – (I)solation
      – concurrent transactions execute as if in some serial order
  – (D)urability
      – results of committed transactions are resilient to faults

The TPC-C benchmark
Current TPC OLTP benchmark
Moderately complex OLTP
Models a wholesale supplier managing orders
Workload consists of five transaction types
Users and database scale linearly with throughput
Specification was approved July 23, 1992

TPC-C: schema
Table        Cardinality    One-to-many fan-out
Warehouse    W
District     W*10           10 per Warehouse
Customer     W*30K          3K per District
History      W*30K+         1+ per Customer
Order        W*30K+         1+ per Customer
New-Order    W*5K           0-1 per Order
Order-Line   W*300K+        10-15 per Order
Stock        W*100K         100K per Warehouse, W per Item
Item         100K (fixed)
Legend: each box shows the table name and its <cardinality>; arrows are one-to-many relationships; secondary indexes are marked

TPC-C: transactions
New-order: enter a new order from a customer
Payment: update customer balance to reflect a payment
Delivery: deliver orders (done as a batch transaction)
Order-status: retrieve status of customer’s most recent order
Stock-level: monitor warehouse inventory

TPC-C: transaction flow
1. Select txn from menu:
     New-Order      45%
     Payment        43%
     Order-Status    4%
     Delivery        4%
     Stock-Level     4%
   Measure menu Response Time
2. Input screen (keying time)
   Measure txn Response Time
3. Output screen (think time)
Go back to 1
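
Step 1 of the flow amounts to a weighted random choice. The sketch below shows one way a driver might draw transactions from the menu; `TXN_MIX`, `pick_transaction` and `sample_mix` are hypothetical names, and keying time, think time and response-time measurement are not modelled.

```python
import random

# Menu percentages from the transaction flow above.
TXN_MIX = {
    "New-Order": 45,
    "Payment": 43,
    "Order-Status": 4,
    "Delivery": 4,
    "Stock-Level": 4,
}


def pick_transaction(rng):
    """Draw one transaction type according to the menu percentages."""
    names = list(TXN_MIX)
    weights = list(TXN_MIX.values())
    return rng.choices(names, weights=weights, k=1)[0]


def sample_mix(n, seed=0):
    """Count how often each type is drawn over n menu selections."""
    rng = random.Random(seed)
    counts = dict.fromkeys(TXN_MIX, 0)
    for _ in range(n):
        counts[pick_transaction(rng)] += 1
    return counts
```

Calling `sample_mix(10000)` gives counts that approach the menu percentages as `n` grows; only the New-Order count feeds the tpmC metric.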

TPC-C: other requirements
Transparency
  – tables can be split horizontally and vertically provided it is hidden from the application
Skew
  – 1% of new-order txn are to a random remote warehouse
  – 15% of payment txn are to a random remote warehouse
Metrics:
  – performance: new-order transactions/minute (tpmC)
  – cost/performance: $/tpmC
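
A back-of-the-envelope estimate of how many transactions touch a remote warehouse under these skew rules. It assumes the 45%/43% New-Order/Payment shares from the transaction menu and assumes the other three transaction types are always local; both are stated as assumptions of this sketch, not of the benchmark text.

```python
# Share of each transaction type in the mix (assumed from the menu percentages).
NEW_ORDER_SHARE = 0.45
PAYMENT_SHARE = 0.43

# Fraction of each type that goes to a random remote warehouse (skew rules).
NEW_ORDER_REMOTE = 0.01
PAYMENT_REMOTE = 0.15


def remote_fraction():
    """Expected fraction of all transactions that access a remote warehouse."""
    return (NEW_ORDER_SHARE * NEW_ORDER_REMOTE
            + PAYMENT_SHARE * PAYMENT_REMOTE)
```

Under these assumptions roughly 7% of all transactions are remote, and under half a percent are remote New-Order transactions, the only type counted in tpmC.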

TPC-C: scale
Maximum of 12 tpmC per warehouse
Consequently:
  – A quad-Xeon system today (~20,000 tpmC) needs
      – over 1668 warehouses
      – over 1 TB of disk storage!!
That’s a VERY expensive benchmark to run!
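
The scale rule can be turned into a small calculator. The sketch uses the 12 tpmC per warehouse bound quoted above and the per-warehouse cardinalities from the schema slide; `min_warehouses` and `initial_rows` are hypothetical helper names. For exactly 20,000 tpmC the bound comes out at 1,667 warehouses, in line with the rounded figure on the slide.

```python
import math

MAX_TPMC_PER_WAREHOUSE = 12  # bound quoted on the slide

# Rows per warehouse from the schema slide ('+' tables keep growing during a run).
ROWS_PER_WAREHOUSE = {
    "District": 10,
    "Customer": 30_000,
    "History": 30_000,
    "Order": 30_000,
    "New-Order": 5_000,
    "Order-Line": 300_000,
    "Stock": 100_000,
}


def min_warehouses(tpmc):
    """Smallest warehouse count that permits the given throughput."""
    return math.ceil(tpmc / MAX_TPMC_PER_WAREHOUSE)


def initial_rows(warehouses):
    """Initial row count of every table for a given number of warehouses."""
    rows = {name: per_wh * warehouses
            for name, per_wh in ROWS_PER_WAREHOUSE.items()}
    rows["Warehouse"] = warehouses
    rows["Item"] = 100_000  # fixed, independent of W
    return rows
```

For example, `initial_rows(1668)` gives about 167 million Stock rows and 500 million Order-Line rows before the run even starts, which is where the disk requirement comes from.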

TPC-C: side effects of the skew rules
Very small fraction of transactions go to remote warehouses
Transparency rules allow data partitioning
Consequence:
  – Clusters of powerful machines show exceptional numbers
  – Compaq has current TPC-C record of over 100 KtpmC with an 8-node memory channel cluster
Skew rules are expected to change in the future

The TPC-D benchmark
Current DSS benchmark from TPC
Moderately complex decision support workload
Models a worldwide reseller of parts
Queries ask real world business questions
17 ad hoc DSS queries (Q1 to Q17)
2 update queries

TPC-D: schema
Table      Cardinality (SF = scale factor)
Customer   SF*150K
LineItem   SF*6000K
Order      SF*1500K
Supplier   SF*10K
Nation     25
Region     5
PartSupp   SF*800K
Part       SF*200K

TPC-D: scale
Unlike TPC-C, scale not tied to performance
Size determined by a Scale Factor (SF)
  – SF = {1,10,30,100,300,1000,3000,10000}
  – SF=1 means a 1GB database size
Majority of current results are in the 100GB and 300GB range
Indices and temporary tables can significantly increase the total disk capacity (3-5x is typical)
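
A small sketch of what a scale factor means in rows and disk space, using the per-SF cardinalities from the schema slide. The `disk_capacity_gb` helper reads "3-5x is typical" as total capacity being 3 to 5 times the raw database size, which is an interpretation; all function names are hypothetical.

```python
# Rows per unit of scale factor, from the schema slide.
ROWS_PER_SF = {
    "Customer": 150_000,
    "LineItem": 6_000_000,
    "Order": 1_500_000,
    "Supplier": 10_000,
    "PartSupp": 800_000,
    "Part": 200_000,
}
FIXED_ROWS = {"Nation": 25, "Region": 5}


def table_rows(sf):
    """Row count of every table at the given scale factor."""
    rows = {name: n * sf for name, n in ROWS_PER_SF.items()}
    rows.update(FIXED_ROWS)
    return rows


def disk_capacity_gb(sf, expansion=(3, 5)):
    """(low, high) total disk estimate in GB, given SF=1 is a 1 GB database."""
    return sf * expansion[0], sf * expansion[1]
```

At SF=100, LineItem alone holds 600 million rows and the estimated disk footprint is 300 to 500 GB.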

TPC-D example query
Forecasting Revenue Query (Q6)
This query quantifies the amount of revenue increase that would have resulted from eliminating company-wide discounts in a given percentage range in a given year. This type of “what if” query can be used to look for ways to increase revenues.
Considers all line-items shipped in a year
Query definition:
  SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE
  FROM LINEITEM
  WHERE L_SHIPDATE >= DATE '[DATE]'
    AND L_SHIPDATE < DATE '[DATE]' + INTERVAL '1' YEAR
    AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01
    AND L_QUANTITY < [QUANTITY]
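
To make the predicate structure concrete, here is the same filter-and-sum written as a Python sketch over in-memory rows. The dictionary keys are hypothetical stand-ins for the LINEITEM columns, `Decimal` is used so that the BETWEEN bounds are exact, and `start` is assumed to be the first day of a year (so adding one year is well defined).

```python
from datetime import date
from decimal import Decimal


def q6_revenue(lineitems, start, discount, quantity):
    """Sum extendedprice*discount over rows matching the three Q6 predicates."""
    end = start.replace(year=start.year + 1)  # DATE + INTERVAL '1' YEAR
    margin = Decimal("0.01")
    return sum(
        (li["extendedprice"] * li["discount"]
         for li in lineitems
         if start <= li["shipdate"] < end
         and discount - margin <= li["discount"] <= discount + margin
         and li["quantity"] < quantity),
        Decimal("0"),
    )
```

The query touches every line item once and keeps a single running total, which is why Q6 behaves as a pure sequential scan over the largest table.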

TPC-D execution rules
Power Test
  – Queries submitted in a single stream (i.e., no concurrency)
  – Each Query Set is a permutation of the 17 read-only queries
  – Sequence: cache flush; Query Set 0 (optional warm-up, not timed); then the timed sequence UF1, Query Set 0, UF2
Throughput Test
  – Multiple concurrent query streams (Query Set 1, Query Set 2, ..., Query Set N)
  – Single update stream (UF1 UF2, UF1 UF2, ..., UF1 UF2)

TPC-D: metrics
Power Metric (QppD)
  – Geometric Mean
Throughput (QthD)
  – Arithmetic Mean
Both metrics represent “Queries per Gigabyte Hour”

$$QppD@Size = \frac{3600 \times SF}{\sqrt[19]{\prod_{i=1}^{17} QI(i,0) \times \prod_{j=1}^{2} UI(j,0)}}$$
where
  QI(i,0) = Timing Interval for Query i, stream 0
  UI(j,0) = Timing Interval for Update j, stream 0
  SF = Scale Factor

$$QthD@Size = \frac{S \times 17 \times 3600}{T_S} \times SF$$
where:
  S = number of query streams
  T_S = elapsed time of test (in seconds)
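
The two metrics can be computed directly from their standard definitions: QppD@Size is 3600 × SF divided by the geometric mean of the 17 query and 2 update timing intervals of stream 0, and QthD@Size is S × 17 × 3600 / T_S × SF. The sketch below is a plain transcription; the function names and the sample timings in the usage note are made up for illustration.

```python
import math


def qppd(sf, query_intervals, update_intervals):
    """Power metric: 3600*SF over the geometric mean of 19 intervals (seconds)."""
    if len(query_intervals) != 17 or len(update_intervals) != 2:
        raise ValueError("expected 17 query and 2 update timing intervals")
    intervals = list(query_intervals) + list(update_intervals)
    # Geometric mean computed through logs to avoid overflow in the product.
    log_mean = sum(math.log(t) for t in intervals) / len(intervals)
    return 3600.0 * sf / math.exp(log_mean)


def qthd(sf, streams, elapsed_seconds):
    """Throughput metric: queries completed per hour, scaled by SF."""
    return streams * 17 * 3600.0 / elapsed_seconds * sf
```

With every interval equal to 36 seconds at SF=100, `qppd` gives 10,000; two streams finishing in 17 hours at SF=100 give a `qthd` of 200. Because the power metric uses a geometric mean, halving the time of any one query improves it by the same factor regardless of how long that query was.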

TPC-D: metrics (2)
Composite Query-Per-Hour Rating (QphD)
  – The Power and Throughput metrics are combined to get the composite queries per hour.
Reported metrics are:
  – Power: QppD@Size
  – Throughput: QthD@Size
  – Price/Performance: $/QphD@Size

$$QphD@Size = \sqrt{QppD@Size \times QthD@Size}$$
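
The composite rating is the geometric mean of the power and throughput metrics, and price/performance divides the total system price by it. A short sketch with hypothetical function names:

```python
import math


def qphd(qppd_at_size, qthd_at_size):
    """Composite rating: geometric mean of the power and throughput metrics."""
    return math.sqrt(qppd_at_size * qthd_at_size)


def price_performance(total_system_price, qppd_at_size, qthd_at_size):
    """Dollars per QphD@Size."""
    return total_system_price / qphd(qppd_at_size, qthd_at_size)
```

For example, a power metric of 400 and a throughput metric of 100 combine to a QphD of 200, so a $1,000,000 system would report $5,000/QphD.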

TPC-D: other issues
Queries are complex and long-running
Crucial that DB engine parallelizes queries for acceptable performance
Quality of query parallelizer is the most important factor
Large improvements are still observed from generation to generation of software

The TPC-W benchmark
Just introduced
Represents a business that markets and sells over the internet
Includes security/authentication
Uses dynamically generated pages (e.g. cgi-bins)
Metric: Web Interactions Per Second (WIPS)
Transactions:
  – Browse, shopping-cart, buy, user-registration, and search

A look at current audited TPC-C systems
Leader in price/performance:
  – Compaq ProLiant 7000-6/450, MS SQL 7.0, NT
      – 4x 450MHz Xeons, 2MB cache, 4GB DRAM, 1.4 TB disk
      – 22,479 tpmC, $18.84/tpmC
Leader in non-cluster performance:
  – Sun Enterprise 6500, Sybase 11.9, Solaris 7
      – 24x 336MHz UltraSPARC IIs, 4MB cache, 24 GB DRAM, 4TB disk
      – 53,050 tpmC, $76.00/tpmC

Audited TPC-C systems: price breakdown
Server sub-component prices
                   $/CPU         $/MB DRAM    $/GB Disk
  Compaq ProLiant  $4,816.00     $3.92        $145.33
  Sun E6500        $15,375.00    $9.16        $382.03
[Figure: Server Price Breakdown, stacked bars from 0% to 100% for Compaq ProLiant and Sun E6500, split into Base, CPU, Memory and Disk]
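
As a rough cross-check of where the money goes, the per-unit prices above can be multiplied by the audited configurations (4 CPUs, 4 GB DRAM, 1.4 TB disk for the Compaq; 24 CPUs, 24 GB DRAM, 4 TB disk for the Sun). This sketch assumes 1 GB = 1024 MB and 1 TB = 1000 GB and leaves out the "Base" component, whose price is not given, so the shares it prints are not the ones in the chart.

```python
# Per-unit prices from the sub-component table.
PRICES = {
    "Compaq ProLiant": {"cpu": 4816.00, "mb_dram": 3.92, "gb_disk": 145.33},
    "Sun E6500": {"cpu": 15375.00, "mb_dram": 9.16, "gb_disk": 382.03},
}

# Audited configurations; unit conversions are assumptions of this sketch.
CONFIGS = {
    "Compaq ProLiant": {"cpus": 4, "dram_mb": 4 * 1024, "disk_gb": 1400},
    "Sun E6500": {"cpus": 24, "dram_mb": 24 * 1024, "disk_gb": 4000},
}


def component_costs(system):
    """Estimated dollars spent on CPUs, memory and disk for one system."""
    p, c = PRICES[system], CONFIGS[system]
    return {
        "cpu": p["cpu"] * c["cpus"],
        "memory": p["mb_dram"] * c["dram_mb"],
        "disk": p["gb_disk"] * c["disk_gb"],
    }


def shares(system):
    """Each component's fraction of the CPU + memory + disk total."""
    costs = component_costs(system)
    total = sum(costs.values())
    return {name: cost / total for name, cost in costs.items()}
```

Under these assumptions disk is the largest of the three components on both systems, well ahead of CPUs and memory.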

Using TPC benchmarks for architecture studies
Brute force approach: use full audit-sized system
  – Who can afford it?
  – How can you run it on top of a simulator?
  – How can you explore a wide design space?
Solution: scaling down the size
Careful Scaling of Workloads
Identify architectural issue under study
Apply appropriate scaling to simplify monitoring and enable simulation studies
Most scaling experiments on real machines
– simulation-only is not a viable option!
Validation through sanity checks and comparison with audit-sized runs
Scaling OLTP
Forget about TPC compliance
Determine lower bound on DB size
– monitor contention for smaller tables/indexes
– DB size will change with number of processors
I/O bandwidth requirements vary with fraction of DB resident in memory
– completely in-memory run: no special I/O requirements
– favor more small disks vs. few large ones
– place all redo logs on a separate disk
– reduce OS double-buffering
Limit number of transactions executed
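The dependence of I/O bandwidth on the memory-resident fraction of the database can be illustrated with a deliberately simple model. This is a sketch rather than the method used in the talk: it assumes uniformly random page accesses (real OLTP access patterns are skewed), it counts only read misses and ignores redo-log writes, and every parameter value is hypothetical.

```python
def disk_reads_per_sec(tps, page_accesses_per_txn, resident_fraction):
    """Expected disk reads per second when each page access misses memory
    with probability (1 - resident_fraction).

    The uniform-access assumption is a simplification introduced here.
    """
    if not 0.0 <= resident_fraction <= 1.0:
        raise ValueError("resident_fraction must be in [0, 1]")
    return tps * page_accesses_per_txn * (1.0 - resident_fraction)


# Hypothetical workload: 500 transactions/s, 20 page accesses per transaction.
for fraction in (0.25, 0.5, 0.75, 1.0):
    reads = disk_reads_per_sec(500, 20, fraction)
    print(f"resident fraction {fraction:.2f}: {reads:.0f} disk reads/s")
```

In this model the read traffic falls to zero when the database is fully resident, which matches the remark that a completely in-memory run has no special I/O requirements.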
Scaling OLTP (2)
Achieve representative cache behavior
– relevant data structures >> size of hardware caches (metadata area size is key)
– maintain same number of processes/CPU as larger run
Simplify setup by running clients on the server machine
– need to make lighter-weight versions of the clients
Ensure efficient execution
– excessive migration, idle time, and OS or application spinning distort metrics
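The two cache-behavior conditions above lend themselves to an automated sanity check on a scaled configuration. The following is a minimal sketch: the `RunConfig` fields, the function name, and the 10x margin standing in for ">>" are hypothetical choices made for illustration, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class RunConfig:
    cpus: int
    server_processes: int
    metadata_bytes: int  # size of the relevant shared data structures


def scaling_warnings(scaled, full, cache_bytes, margin=10):
    """Return warnings for a scaled run compared against the full-size run.

    margin is a hypothetical stand-in for ">>": the data structures must be
    at least margin times larger than the hardware cache.
    """
    warnings = []
    if scaled.metadata_bytes < margin * cache_bytes:
        warnings.append("data structures are not much larger than the hardware cache")
    # Compare processes/CPU by cross-multiplying to avoid float rounding.
    if scaled.server_processes * full.cpus != full.server_processes * scaled.cpus:
        warnings.append("processes/CPU differs from the full-size run")
    return warnings


MB = 2**20
full_run = RunConfig(cpus=8, server_processes=64, metadata_bytes=512 * MB)
scaled_run = RunConfig(cpus=4, server_processes=32, metadata_bytes=64 * MB)
print(scaling_warnings(scaled_run, full_run, cache_bytes=4 * MB))
```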
Scaling DSS
Determine lower bound DB size
– sufficient work in parallel section
Ensure representative cache behavior
– DB >> hardware caches
– maintain same number of processes/CPU as large run
Reduce execution time through sampling
Major difficulty is ensuring representative query plans
DSS results more volatile due to improvements in query optimizers
Tuning, tuning, tuning
Ensure scaled workload is running efficiently
Requires a large number of monitoring runs on actual hardware platform
Resembles “black art” on Oracle
Self-tuning features in Microsoft SQL 7.0 are promising
– ability for user overrides is desirable, but missing
Does Scaling Work?
TPC-C: scaled vs. full size
Breakdown profile of CPU cycles
Platform: 8-proc. AlphaServer 8400

Category        TPC-C, scaled   TPC-C, full-size
1-issue         8%              11%
2-issue         8%              8%
tlb             3%              1%
repl trap       5%              2%
br/pc mispr.    2%              3%
mb              3%              6%
scache hit      17%             22%
bcache hit      30%             20%
bcache miss     24%             27%
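One way to put a number on how closely the scaled profile tracks the full-size one is to sum the per-category differences. The sketch below uses the percentages from this slide; the choice of metric (half the sum of absolute differences, i.e. the share of cycles that would have to move between categories to make the profiles identical) is introduced here for illustration and is not part of the talk.

```python
categories = ["1-issue", "2-issue", "tlb", "repl trap", "br/pc mispr.",
              "mb", "scache hit", "bcache hit", "bcache miss"]
# Percent of CPU cycles per category, copied from the slide.
tpcc_scaled = [8, 8, 3, 5, 2, 3, 17, 30, 24]
tpcc_full = [11, 8, 1, 2, 3, 6, 22, 20, 27]


def profile_distance(a, b):
    """Half the sum of absolute per-category differences, in percentage points."""
    return sum(abs(x - y) for x, y in zip(a, b)) / 2


def largest_gap(names, a, b):
    """Category with the largest absolute difference and the size of that gap."""
    return max(((n, abs(x - y)) for n, x, y in zip(names, a, b)),
               key=lambda pair: pair[1])


print("distance:", profile_distance(tpcc_scaled, tpcc_full))
print("largest gap:", largest_gap(categories, tpcc_scaled, tpcc_full))
```

By this measure the two profiles differ by 15 percentage points, with the largest single gap in the bcache hit category.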
Using simpler OLTP benchmarks:
Although “obsolete”, TPC-B can be used in architectural studies

Category        TPC-C, full-size   TPC-B, scaled
1-issue         11%                7%
2-issue         8%                 6%
tlb             1%                 2%
repl trap       2%                 5%
br/pc mispr.    3%                 2%
mb              6%                 9%
scache hit      22%                16%
bcache hit      20%                16%
bcache miss     27%                37%
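Grouping the categories gives a coarser view of how TPC-B compares with full-size TPC-C. The sketch below sums the three cache-related categories for each profile; treating scache hit, bcache hit, and bcache miss together as the cache-hierarchy share is a reading of the labels adopted here, not something the slide states.

```python
# Percent of CPU cycles per category, copied from the slide.
tpcc_full = {"1-issue": 11, "2-issue": 8, "tlb": 1, "repl trap": 2,
             "br/pc mispr.": 3, "mb": 6, "scache hit": 22,
             "bcache hit": 20, "bcache miss": 27}
tpcb_scaled = {"1-issue": 7, "2-issue": 6, "tlb": 2, "repl trap": 5,
               "br/pc mispr.": 2, "mb": 9, "scache hit": 16,
               "bcache hit": 16, "bcache miss": 37}

# Assumption: these three categories together represent cache-hierarchy cycles.
CACHE_CATEGORIES = ("scache hit", "bcache hit", "bcache miss")


def cache_share(profile):
    """Percent of cycles attributed to the three cache-related categories."""
    return sum(profile[c] for c in CACHE_CATEGORIES)


print("TPC-C full-size cache share:", cache_share(tpcc_full))
print("TPC-B scaled cache share:  ", cache_share(tpcb_scaled))
```

Both profiles attribute 69% of cycles to these categories, although TPC-B shifts more of that share toward bcache misses (37% vs. 27%).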
Benchmarks wrap-up
Commercial applications are complex, but need to be considered during design evaluation
TPC benchmarks cover a wide range of commercial application areas
Scaled down TPC benchmarks can be used for architecture studies
Architect needs deep understanding of the workload