thinking clearly (paper)

8/3/2019 Thinking Clearly (Paper)

1/15

2010MethodRCorporation.Allrightsreserved.1

ThinkingClearlyAboutPerformance

CaryMillsapMethodRCorporation,Southlake,Texas,USA

[email protected]

Revised2010/05/11

Creatinghighperformanceasanattributeofcomplexsoftwareisextremelydifficult

business for developers, technology administrators, architects, system analysts, and

project managers. However, by understanding some fundamental principles,

performance problem solving and prevention can be made far simpler and more

reliable. This paper describes those principles, linking them together in a coherent

journey covering the goals, the terms, the tools,and the decisionsthat youneed to

maximize yourapplications chance of having a long,productive,high-performance

life.ExamplesinthispapertouchuponOracleexperiences,butthescopeofthepaper

isnotrestrictedtoOracleproducts.

TABLEOFCONTENTS

ThinkingClearlyAboutPerformance..................................11 AnAxiomaticApproach ....................................................12 WhatisPerformance?............................................... ......... 23 ResponseTimevs.Throughput.....................................24 PercentileSpecifications .................................................. 35 ProblemDiagnosis ..................................................... ......... 36 TheSequenceDiagram.................. .................................... 4

7 TheProfile ................................................... ........................... 58 AmdahlsLaw ............................................. ........................... 59 Skew ............................................. ............................................. 610 MinimizingRisk ................................................. .................. 711 Efficiency. ..................................................... ........................... 712 Load........... ..................................................... ........................... 813 QueueingDelay .................................................. .................. 814 TheKnee ............................................. .................................... 915 RelevanceoftheKnee .............................................. .......1016 CapacityPlanning..............................................................10 17 RandomArrivals................................................................10 18 CoherencyDelay................................................................11 19 PerformanceTesting ................................................ .......1120Measuring.............................................................................12

21 PerformanceisaFeature...............................................12 22 Acknowledgments ..................................................... .......1323 AbouttheAuthor...............................................................13 24 Epilog:OpenDebateaboutKnees..............................13

1 ANAXIOMATICAPPROACHWhenIjoinedOracleCorporationin1989,

performancewhateveryonecalledOracletuning

wasdifficult.Onlyafewpeopleclaimedtheycoulddo

itverywell,andthosepeoplecommandednice,high

consultingrates.Whencircumstancesthrustmeinto

theOracletuningarena,Iwasquiteunprepared.

Recently,Ivebeenintroducedintotheworldof

MySQLtuning,andthesituationseemsverysimilar

towhatIsawinOracleovertwentyyearsago.

ItremindsmealotofhowdifficultIwouldhavetold

youthatbeginningalgebrawas,ifyouhadaskedme

whenIwasabout13yearsold.Atthatage,Ihadto

appealheavilytomymathematicalinstinctstosolve

equationslike3x+4=13.Theproblemwiththatis

thatmanyofusdidnthavemathematicalinstincts.I

canrememberlookingataproblemlike3x+4=13;

findxandbasicallystumblingupontheanswer x=3

usingtrialanderror.

Thetrial-and-errormethodoffeelingmywaythrough

algebraproblemsworkedalbeitslowlyand

uncomfortablyforeasyequations,butitdidntscale

astheproblemsgottougher,like3x+4=14.Now

what?MyproblemwasthatIwasntthinkingclearly

yetaboutalgebra.Myintroductionatagefifteento

JamesR.Harkeyputmeontheroadtosolvingthat

problem.

Mr.Harkeytaughtuswhathecalledanaxiomatic

approachtosolvingalgebraicequations.Heshowedus


2/15

2010MethodRCorporation.Allrightsreserved. 2

asetofstepsthatworkedeverytime(andhegaveus

plentyofhomeworktopracticeon).Inadditionto

workingeverytime,byexecutingthosesteps,we

necessarilydocumentedourthinkingasweworked.

Notonlywerewethinkingclearly,usingareliableand

repeatablesequenceofsteps,wewereprovingto

anyonewhoreadourworkthatwewerethinkingclearly.

OurworkforMr.Harkeylookedlikethis:

3.1x+4 =13 problemstatement

3.1x+44=134 subtractionpropertyofequality

3.1x =9 additiveinverseproperty,

simplification3.1x 3.1 =9 3.1 divisionpropertyofequalityx 2.903 multiplicativeinverseproperty,

simplification

ThiswasMr.Harkeysaxiomaticapproachtoalgebra,

geometry,trigonometry,andcalculus:onesmall,

logical,provable,andauditablestepatatime.Itsthe

firsttimeIeverreallygotmathematics.

Naturally,Ididntrealizeitatthetime,butofcourse

provingwasaskillthatwouldbevitalformysuccess

intheworldafterschool.Inlife,Ivefoundthat,of

course,knowingthingsmatters.Butprovingthose

thingstootherpeoplemattersmore.Withoutgood

provingskills,itsdifficulttobeagoodconsultant,a

goodleader,orevenagoodemployee.

Mygoalsincethemid-1990shasbeentocreatea

similarlyrigorousapproachtoOracleperformance

optimization.Lately,Iamexpandingthescopeofthat

goalbeyondOracle,to:Createanaxiomaticapproach

tocomputersoftwareperformanceoptimization.Ive

foundthatnotmanypeoplereallylikeitwhenItalk

likethat,soletssayitlikethis:

Mygoalistohelpyou thinkclearlyabouthowto

optimizetheperformanceofyourcomputer

software.

2 WHATISPERFORMANCE?Ifyougoogleforthewordperformance,yougetovera

halfabillionhitsonconceptsrangingfrombicycle

racingtothedreadedemployeereviewprocessthat

manycompaniesthesedaysarelearningtoavoid.WhenIgoogledforperformance,mostofthetophits

relatetothesubjectofthispaper:thetimeittakesfor

computersoftwaretoperformwhatevertaskyouask

ittodo.

Andthatsagreatplacetobegin:the task.Ataskisa

business-orientedunitofwork.Taskscannest: print

invoicesisatask;printoneinvoiceasub-taskisalso

atask.Whenacomputerusertalksaboutperformance

heusuallymeansthetimeittakesforthesystemto

executesometask.Responsetimeistheexecution

durationofatask,measuredintimepertask,like

secondsperclick.Forexample,myGooglesearchfor

thewordperformancehadaresponsetimeof

0.24seconds.TheGooglewebpagerenderedthat

measurementrightinmybrowser.Thatisevidence,tome,thatGooglevaluesmyperceptionofGoogle

performance.

Somepeopleareinterestedinanotherperformance

measure:Throughputisthecountoftaskexecutions

thatcompletewithinaspecifiedtimeinterval,like

clickspersecond.Ingeneral,peoplewhoare

responsiblefortheperformanceofgroupsofpeople

worrymoreaboutthroughputthanpeoplewhowork

inasolocontributorrole.Forexample,anindividual

accountantisusuallymoreconcernedaboutwhether

theresponsetimeofadailyreportwillrequirehimto

staylateafterworktoday.Themanagerofagroupof

accountsisadditionallyconcernedaboutwhetherthesystemiscapableofprocessingallthedatathatallof

heraccountantswillbeprocessingtoday.

3 RESPONSETIMEVS.THROUGHPUTThroughputandresponsetimehaveagenerally

reciprocaltypeofrelationship,butnotexactly.The

realrelationshipissubtlycomplex.

Example:Imaginethatyouhavemeasuredyour

throughputat1,000taskspersecondforsome

benchmark.What,then,isyourusersaverage

responsetime?

Itstemptingtosaythatyouraverageresponsetimewas1/1,000=.001secondspertask.Butitsnot

necessarilyso.

Imaginethatyoursystemprocessingthisthroughput

had1,000parallel,independent,homogeneous

servicechannelsinsideit(thatis,itsasystemwith

1,000independent,equallycompetentservice

providersinsideit,eachawaitingyourrequestforservice).Inthiscase,itispossiblethateachrequest

consumedexactly1second.

Now,wecanknowthataverageresponsetimewas

somewherebetween0secondspertaskand

1secondpertask.Butyoucannotderiveresponse

timeexclusively1fromathroughputmeasurement.Youhavetomeasureitseparately.

1Icarefullyincludethewordexclusivelyinthisstatement,

becausetherearemathematicalmodelsthatcancompute

responsetimeforagiventhroughput,butthemodels

requiremoreinputthanjustthroughput.


3/15


Thesubtletyworksintheotherdirection,too.Youcan

certainlyfliptheexampleIjustgavearoundandprove

it.However,ascarierexamplewillbemorefun.

Example:Yourclientrequiresanewtaskthatyoure

programmingtodeliverathroughputof100taskspersecondonasingle-CPUcomputer.Imaginethat

thenewtaskyouvewrittenexecutesinjust.001secondsontheclientssystem.Willyour

programyieldthethroughputthattheclient

requires?

Itstemptingtosaythatifyoucanrunthetaskonce

injustathousandthofasecond,thensurelyyoullbe

abletorunthattaskatleastahundredtimesinthe

spanofafullsecond.Andyoureright,ifthetaskrequestsarenicelyserialized,forexample,sothat

yourprogramcanprocessall100oftheclients

requiredtaskexecutionsinsidealoop,oneafterthe

other.

Butwhatifthe100taskspersecondcomeatyour

systematrandom,from100differentusersloggedintoyourclientssingle-CPUcomputer?Thenthe

gruesomerealitiesofCPUschedulersandserialized

resources(likeOraclelatchesandlocksandwritable

accesstobuffersinmemory)mayrestrictyour

throughputtoquantitiesmuchlessthantherequired

100taskspersecond.

Itmightwork.Itmightnot.Youcannotderivethroughputexclusivelyfromaresponsetime

measurement.Youhavetomeasureitseparately.

Responsetimeandthroughputarenotnecessarily

reciprocals.Toknowthemboth,youneedtomeasure

themboth.

So,whichismoreimportant:responsetime,orthroughput?Foragivensituation,youmightanswer

legitimatelyineitherdirection.Inmany

circumstances,theansweristhatbotharevital

measurementsrequiringmanagement.Forexample,a

systemownermayhaveabusinessrequirementthat

responsetimemustbe1.0secondsorlessforagiven

taskin99%ormoreofexecutions andthesystem

mustsupportasustainedthroughputof

1,000executionsofthetaskwithina10-minute

interval.

4 PERCENTILESPECIFICATIONSInthepriorsection,Iusedthephrasein99%ormore

ofexecutionstoqualifyaresponsetimeexpectation.

Manypeoplearemoreaccustomedtostatementslike,

averageresponsetimemustbersecondsorless.The

percentilewayofstatingrequirementsmapsbetter,

though,tothehumanexperience.

Example:Imaginethatyourresponsetimetolerance

is1secondforsometaskthatyouexecuteonyour

computereveryday.Furtherimaginethatthelistsof

numbersshowninExhibit1representthemeasuredresponsetimesoftenexecutionsofthattask.The

averageresponsetimeforeachlistis1.000seconds.

Whichonedoyouthinkyoudlikebetter?

ListA ListB

1 .924 .7962 .928 .798

3 .954 .802

4 .957 .823

5 .961 .919

6 .965 .977

7 .972 1.076

8 .979 1.216

9 .987 1.273

10 1.373 1.320

Exhibit1.Theaverageresponsetimeforeachof

thesetwolistsis1.000seconds.

Youcanseethatalthoughthetwolistshavethesame

averageresponsetime,thelistsarequitedifferentincharacter.InListA,90%ofresponsetimeswere

1secondorless.InListB,only60%ofresponse

timeswere1secondorless.Statedintheopposite

way,ListBrepresentsasetofuserexperiencesofwhich40%weredissatisfactory,butListA(having

thesameaverageresponsetimeasListB)represents

onlya10%dissatisfactionrate.

InListA,the90thpercentileresponsetimeis.987seconds.InListB,the90thpercentileresponse

timeis1.273seconds.Thesestatementsabout

percentilesaremoreinformativethanmerelysaying

thateachlistrepresentsanaverageresponsetimeof1.000seconds.

AsGEsays,Ourcustomersfeelthevariance,notthe

mean.2Expressingresponsetimegoalsaspercentiles

makeformuchmorecompellingrequirement

specificationsthatmatchwithenduserexpectations:

TheTrackShipmenttaskmustcompleteinless

than.5secondsinatleast99.9%ofexecutions.

5 PROBLEMDIAGNOSISInnearlyeveryperformanceproblemIvebeeninvited

torepair,theproblemstatementhasbeenastatement

aboutresponsetime.Itusedtotakelessthana

secondtodoX;nowitsometimestakes20+.Ofcourse,aspecificproblemstatementlikethatisoften

buriedbehindveneersofotherproblemstatements,

2GeneralElectricCompany:WhatIsSixSigma?The

RoadmaptoCustomerImpactat

http://www.ge.com/sixsigma/SixSigma.pdf


4/15


like,Ourwhole[adjectivesdeleted]systemissoslow

wecantuseit.3

Butjustbecausesomethinghashappenedalotforme

doesntmeanthatitswhatwillhappennextforyou.

Themostimportantthingforyoutodofirstisstate

theproblemclearly,sothatyoucanthenthinkaboutit

clearly.

Agoodwaytobeginistoask,whatisthegoalstate

thatyouwanttoachieve?Findsomespecificsthatyou

canmeasuretoexpressthis.Forexample,Response

timeofXismorethan20secondsinmanycases.Well

behappywhenresponsetimeis1secondorlessinat

least95%ofexecutions.

Thatsoundsgoodintheory,butwhatifyouruser

doesnthaveaspecificquantitativegoallike1second

orlessinatleast95%ofexecutions?Therearetwo

quantitiesrightthere(1and95);whatifyouruser

doesntknoweitheroneofthem?Worseyet,whatif

youruserdoeshavespecificideasabouthisexpectations,butthoseexpectationsareimpossibleto

meet?Howwouldyouknowwhatpossibleor

impossibleevenis?

Letsworkourwayuptothosequestions.

6 THESEQUENCEDIAGRAMAsequencediagram isatypeofgraphspecifiedinthe

UnifiedModelingLanguage(UML),usedtoshowthe

interactionsbetweenobjectsinthesequentialorder

thatthoseinteractionsoccur.Thesequencediagramis

anexceptionallyusefultoolforvisualizingresponse

time.Exhibit2showsastandardUMLsequence

diagramforasimpleapplicationsystemcomposedofa

browser,anapplicationserver,andadatabase.

3CaryMillsap,2009.Mywholesystemisslow.Nowwhat?

athttp://carymillsap.blogspot.com/2009/12/my-whole-

system-is-slow-now-what.html

Exhibit2.ThisUMLsequencediagramshowsthe

interactionsamongabrowser,anapplication

server,andadatabase.

Imaginenowdrawingthesequencediagramtoscale,

sothatthedistancebetweeneachrequestarrowcominganditscorrespondingresponsearrowgoing

outwereproportionaltothedurationspentservicing

therequest.IveshownsuchadiagraminExhibit3.

Exhibit3.AUMLsequencediagramdrawnto

scale,showingtheresponsetimeconsumedat

eachtierinthesystem.

WithExhibit3,youhaveagoodgraphical

representationofhowthecomponentsrepresentedin

yourdiagramarespendingyouruserstime.Youcanfeeltherelativecontributiontoresponsetimeby

lookingatthepicture.

Sequencediagramsarejustrightforhelpingpeople

conceptualizehowtheirresponseisconsumedona

givensystem,asonetierhandscontrolofthetaskto

thenext.Sequencediagramsalsoworkwelltoshow

howsimultaneousthreadsofprocessingworkin

parallel.Sequencediagramsaregoodtoolsfor


5/15


analyzingperformanceoutsideoftheinformation

technologybusiness,too. 4

Thesequencediagramisagoodconceptualtoolfor

talkingaboutperformance,buttothinkclearlyabout

performance,weneedsomethingelse,too.Heresthe

problem.Imaginethatthetaskyouresupposedtofix

hasaresponsetimeof2,468seconds(thats41minutes8seconds).Inthatroughly41minutes,

runningthattaskcausesyourapplicationserverto

execute322,968databasecalls.Exhibit4showswhat

yoursequencediagramforthattaskwouldlooklike.

Exhibit4.ThisUMLsequencediagramshows

322,968databasecallsexecutedbythe

applicationserver.

Therearesomanyrequestandresponsearrows

betweentheapplicationanddatabasetiersthatyoucantseeanyofthedetail.Printingthesequence

diagramonaverylongscrollisntausefulsolution,

becauseitwouldtakeusweeksofvisualinspection

beforewedbeabletoderiveusefulinformationfrom

thedetailswedsee.

Thesequencediagramisagoodtoolfor

conceptualizingflowofcontrolandthecorresponding

flowoftime.Buttothinkclearlyaboutresponsetime,

weneedsomethingelse.

7 THEPROFILEThesequencediagramdoesntscalewell.Todealwithtasksthathavehugecallcounts,weneedaconvenient

aggregationofthesequencediagramsothatwe

understandthemostimportantpatternsinhowour

4CaryMillsap,2009.PerformanceoptimizationwithGlobal

Entry.Ornot?at

http://carymillsap.blogspot.com/2009/11/performance-

optimization-with-global.html

timehasbeenspent.Exhibit5showsanexampleofa

tablecalledaprofile,whichdoesthetrick.Aprofileis

atabulardecompositionofresponsetime,typically

listedindescendingorderofcomponentresponse

timecontribution.

Functioncall R(sec) Calls

1 DB:fetch() 1,748.229 322,9682 App:await_db_netIO() 338.470 322,968

3 DB:execute() 152.654 39,142

4 DB:prepare() 97.855 39,142

5 Other 58.147 89,422

6 App:render_graph() 48.274 7

7 App:tabularize() 23.481 4

8 App:read() 0.890 2

Total 2,468.000

Exhibit5.Thisprofileshowsthedecompositionof

a2,468.000-secondresponsetime.

Example:TheprofileinExhibit5isrudimentary,but

itshowsyouexactlywhereyourslowtaskhasspent

yourusers2,468seconds.Withthedatashownhere,forexample,youcanderivethepercentageof

responsetimecontributionforeachofthefunction

callsidentifiedintheprofile.Youcanalsoderivethe

averageresponsetimeforeachtypeoffunctioncallduringyourtask.

Aprofileshowsyouwhereyourcodehasspentyour

timeandsometimesevenmoreimportantlywhere

ithasnot.Thereistremendousvalueinnothavingto

guessaboutthesethings.

FromthedatashowninExhibit5,you knowthat

70.8%ofyourusersresponsetimeisconsumedby

DB:fetch()calls.Furthermore,ifyoucandrilldownintotheindividualcallswhosedurationswere

aggregatedtocreatethisprofile,youcanknowhow

manyofthoseApp:await_db_netIO()calls

correspondedtoDB:fetch()calls,andyoucanknow

howmuchresponsetimeeachofthoseconsumed.

Withaprofile,youcanbegintoformulatetheanswer

tothequestion,Howlong shouldthistaskrun?

Which,bynow,youknowisanimportantquestion

inthefirststep(section5)ofanygoodproblem

diagnosis.

8 AMDAHLSLAWProfilinghelpsyouthinkclearlyaboutperformance.

EvenifGeneAmdahlhadntgivenusAmdahlsLawbackin1967,youdhaveprobablycomeupwithit

yourselfafterthefirstfewprofilesyoulookedat.

AmdahlsLawstates:

Performanceimprovementisproportionalto

howmuchaprogramusesthethingyou

improved.


6/15


Soifthethingyouretryingtoimproveonly

contributes5%toyourtaskstotalresponsetime,then

themaximumimpactyoullbeabletomakeis5%of

yourtotalresponsetime.Thismeansthatthecloserto

thetopofaprofilethatyouwork(assumingthatthe

profileissortedindescendingresponsetimeorder),

thebiggerthebenefitpotentialforyouroverallresponsetime.

Thisdoesntmeanthatyoualwaysworkaprofilein

top-downorder,though,becauseyoualsoneedto

considerthecostoftheremediesyoullbeexecuting,

too.5

Example:ConsidertheprofileinExhibit6.ItsthesameprofileasinExhibit5,excepthereyoucansee

howmuchtimeyouthinkyoucansaveby

implementingthebestremedyforeachrowinthe

profile,andyoucanseehowmuchyouthinkeach

remedywillcosttoimplement.

Potentialimprovement%andcostofinvestment R(sec) R(%)

1 34.5%superexpensive 1,748.229 70.8%

2 12.3%dirtcheap 338.470 13.7%

3 Impossibletoimprove 152.654 6.2%

4 4.0%dirtcheap 97.855 4.0%

5 0.1%superexpensive 58.147 2.4%

6 1.6%dirtcheap 48.274 2.0%

7 Impossibletoimprove 23.481 1.0%

8 0.0%dirtcheap 0.890 0.0%

Total 2,468.000

Exhibit6.Thisprofileshowsthepotentialfor

improvementandthecorrespondingcost

(difficulty)ofimprovementforeachlineitem

fromExhibit5.

Whatremedyactionwouldyouimplementfirst?

AmdahlsLawsaysthatimplementingtherepairon

line1hasthegreatestpotentialbenefitofsavingabout851seconds(34.5%of2,468seconds).Butifit

istrulysuperexpensive,thentheremedyonline2

mayyieldbetternetbenefitandthatsthe

constrainttowhichwereallyneedtooptimize

eventhoughthepotentialforresponsetimesavings

isonlyabout305seconds.

Atremendousvalueoftheprofileisthatyoucanlearn

exactlyhowmuchimprovementyoushouldexpectfor

aproposedinvestment.Itopensthedoortomaking

muchbetterdecisionsaboutwhatremediesto

implementfirst.Yourpredictionsgiveyouayardstick

formeasuringyourownperformanceasananalyst.

Andfinally,itgivesyouachancetoshowcaseyour

5CaryMillsap,2009.Ontheimportanceofdiagnosing

beforeresolvingat

http://carymillsap.blogspot.com/2009/09/on-importance-of-

diagnosing-before.html

clevernessandintimacywithyourtechnologyasyou

findmoreefficientremediesforreducingresponse

timemorethanexpected,atlower-than-expected

costs.

Whatremedyactionyouimplementfirstreallyboils

downtohowmuchyoutrustyourcostestimates.Does

dirtcheapreallytakeintoaccounttherisksthattheproposedimprovementmayinflictuponthesystem?

Forexample,itmayseemdirtcheaptochangethat

parameterordropthatindex,butdoesthatchange

potentiallydisruptthegoodperformancebehaviorof

somethingouttherethatyourenoteventhinking

aboutrightnow?Reliablecostestimationisanother

areainwhichyourtechnologicalskillspayoff.

Anotherfactorworthconsideringisthepolitical

capitalthatyoucanearnbycreatingsmallvictories.

Maybecheap,low-riskimprovementswontamountto

muchoverallresponsetimeimprovement,buttheres

valueinestablishingatrackrecordofsmallimprovementsthatexactlyfulfillyourpredictions

abouthowmuchresponsetimeyoullsavefortheslow

task.Atrackrecordofpredictionandfulfillment

ultimatelyespeciallyintheareaofsoftware

performance,wheremythandsuperstitionhave

reignedatmanylocationsfordecadesgivesyouthe

credibilityyouneedtoinfluenceyourcolleagues(your

peers,yourmanagers,yourcustomers,)toletyou

performincreasinglyexpensiveremediesthatmay

producebiggerpayoffsforthebusiness.

Awordofcaution,however:Dontgetcarelessasyou

rackupyoursuccessesandproposeeverbigger,

costlier,riskierremedies.Credibilityisfragile.Ittakesalotofworktobuilditupbutonlyonecareless

mistaketobringitdown.

9 SKEWWhenyouworkwithprofiles,yourepeatedly

encountersub-problemslikethisone:

Example:TheprofileinExhibit5revealedthat

322,968DB:fetch()callshadconsumed

1,748.229secondsofresponsetime.Howmuchunwantedresponsetimewouldweeliminateifwe

couldeliminatehalfofthosecalls?

Theanswerisalmostnever,Halfoftheresponse

time.Considerthisfarsimplerexampleforamoment:

Example:Fourcallstoasubroutineconsumedfour

seconds.Howmuchunwantedresponsetimewould

weeliminateifwecouldeliminatehalfofthosecalls?

Theanswerdependsupontheresponsetimesofthe

individualcallsthatwecouldeliminate.Youmight

haveassumedthateachofthecalldurationswasthe


7/15


average4/4=1second.Butnowhereintheproblem

statementdidItellyouthatthecalldurationswere

uniform.

Example:Imaginethefollowingtwopossibilities,

whereeachlistrepresentstheresponsetimesofthefoursubroutinecalls:

A={1,1,1,1}B={3.7,.1,.1,.1}

InlistA,theresponsetimesareuniform,sonomatterwhichhalf(two)ofthecallsweeliminate,

wellreducetotalresponsetimeto2seconds.

However,inlistB,itmakesatremendousdifference

whichtwocallsweeliminate.Ifweeliminatethefirst

twocalls,thenthetotalresponsetimewilldropto

.2seconds(a95%reduction).However,ifweeliminatethefinaltwocalls,thenthetotalresponse

timewilldropto3.8seconds(onlya5%reduction).

Skewisanon-uniformityinalistofvalues.The

possibilityofskewiswhatprohibitsyoufrom

providingapreciseanswertothequestionthatIaskedyouatthebeginningofthissection.Letslookagain:

Example:TheprofileinExhibit5revealedthat

322,968DB:fetch()callshadconsumed

1,748.229secondsofresponsetime.Howmuch

unwantedresponsetimewouldweeliminateifwe

couldeliminatehalfofthosecalls?

Withoutknowinganythingaboutskew,theonlyanswerwecanprovideis,Somewherebetween0

and1,748.229seconds.Thatisthemostprecise

correctansweryoucanreturn.

Imagine,however,thatyouhadtheadditional

informationavailableinExhibit7.Thenyoucouldformulatemuchmoreprecisebest-caseandworst-

caseestimates.Specifically,ifyouhadinformationlike

this,youdbesmarttotrytofigureouthow

specificallytoeliminatethe47,444callswithresponse

timesinthe.01-to.1-secondrange.

Range{mine


8/15


Example:Amiddletierprogrammakes100database

fetchcalls(andthus100networkI/Ocalls)tofetch994rows.Itcouldhavefetched994rowsin10fetch

calls(andthus90fewernetworkI/Ocalls).

Example:ASQLstatement7touchesthedatabase

buffercache7,428,322timestoreturna698-row

resultset.Anextrafilterpredicatecouldhave

returnedthe7rowsthattheenduserreallywantedtosee,withonly52touchesuponthedatabasebuffer

cache.

Certainly,ifasystemhassomeglobalproblemthat

createsinefficiencyforbroadgroupsoftasksacross

thesystem(e.g.,ill-conceivedindex,badlyset

parameter,poorlyconfiguredhardware),thenyou

shouldfixit.Butdonttuneasystemtoaccommodate

programsthatareinefficient.8Thereisalotmore

leverageincuringtheprograminefficiencies

themselves.Eveniftheprogramsarecommercial,off-

the-shelfapplications,itwillbenefityoubetterinthe

longruntoworkwithyoursoftwarevendortomakeyourprogramsefficient,thanitwilltotrytooptimize

yoursystemtobeasefficientasitcanwithinherently

inefficientworkload.

Improvementsthatmakeyourprogrammoreefficient

canproducetremendousbenefitsforeveryoneonthe

system.Itseasytoseehowtop-linereductionof

wastehelpstheresponsetimeofthetaskbeing

repaired.Whatmanypeopledontunderstandaswell

isthatmakingoneprogrammoreefficientcreatesa

side-effectofperformanceimprovementforother

programsonthesystemthathavenoapparent

relationtotheprogrambeingrepaired.Ithappens

becauseoftheinfluenceof loaduponthesystem.

12 LOADLoadiscompetitionforaresourceinducedby

concurrenttaskexecutions. Loadisthereasonthatthe

performancetestingdonebysoftwaredevelopers

doesntcatchalltheperformanceproblemsthatshow

uplaterinproduction.

Onemeasureofloadisutilization.Utilizationis

resourceusagedividedbyresourcecapacityfora

specifiedtimeinterval.Asutilizationforaresource

goesup,sodoestheresponsetimeauserwill

experiencewhenrequestingservicefromthat

resource.Anyonewhohasriddeninanautomobilein

abigcityduringrushhourhasexperiencedthis

7MychoiceofarticleadjectivehereisadeadgiveawaythatIwasintroducedtoSQLwithintheOraclecommunity.

8Admittedly,sometimesyouneedatourniquettokeepfrom

bleedingtodeath.Butdontuseastopgapmeasureasa

permanentsolution.Addresstheinefficiency.

phenomenon.Whenthetrafficisheavilycongested,

youhavetowaitlongeratthetollbooth.

Withcomputersoftware,thesoftwareyouusedoesn't

actuallygoslowerlikeyourcardoeswhenyoure

going30mphinheavytrafficinsteadof60mphonthe

openroad.Computersoftwarealwaysgoesthesame

speed,nomatterwhat(aconstantnumberofinstructionsperclockcycle),butcertainlyresponse

timedegradesasresourcesonyoursystemgetbusier.

Therearetworeasonsthatsystemsgetslowerasload

increases:queueingdelay,andcoherencydelay.Ill

addresseachaswecontinue.

13 QUEUEINGDELAYThemathematicalrelationshipbetweenloadand

responsetimeiswellknown.Onemathematicalmodel

calledM/M/mrelatesresponsetimetoloadin

systemsthatmeetoneparticularlyusefulsetof

specificrequirements.9Oneoftheassumptionsof

M/M/misthatthesystemyouaremodelinghas

theoreticallyperfectscalability.Havingtheoretically

perfectscalabilityisakintohavingaphysicalsystem

withnofriction,anassumptionthatsomany

problemsinintroductoryPhysicscoursesinvoke.

Regardlessofsomeoverreachingassumptionslikethe

oneaboutperfectscalability,M/M/mhasalottoteach

usaboutperformance.Exhibit8showsthe

relationshipbetweenresponsetimeandloadusing

M/M/m.

Exhibit8.Thiscurverelatesresponsetimeasa

functionofutilizationforanM/M/msystemwith

m=8servicechannels.

9CaryMillsapandJeffHolt,2003.OptimizingOracle

Performance.OReilly.SebastopolCA.

0.0 0.2 0.4 0.6 0.8 1.00

1

2

3

4

5

Utilization

Responsetime

R

MM

8 system


9/15


InExhibit8,youcanseemathematicallywhatyoufeel

whenyouuseasystemunderdifferentload

conditions.Atlowload,yourresponsetimeis

essentiallythesameasyourresponsetimewasatno

load.Asloadrampsup,yousenseaslight,gradual

degradationinresponsetime.Thatgradual

degradationdoesntreallydomuchharm,butasloadcontinuestorampup,responsetimebeginstodegrade

inamannerthatsneitherslightnorgradual.Rather,

thedegradationbecomesquiteunpleasantand,infact,

hyperbolic.

Responsetime,intheperfectscalabilityM/M/m

model,consistsoftwocomponents: servicetimeand

queueingdelay.Thatis,R=S+Q.Servicetime(S)is

thedurationthatataskspendsconsumingagiven

resource,measuredintimepertaskexecution,asin

secondsperclick.Queueingdelay(Q) isthetimethata

taskspendsenqueuedatagivenresource,awaitingits

opportunitytoconsumethatresource.Queueingdelay

isalsomeasuredintimepertaskexecution(e.g.,secondsperclick).

So,whenyouorderlunchatTacoTico,yourresponse

time(R)forgettingyourorderisthequeueingdelay

time(Q)thatyouspendqueuedinfrontofthecounter

waitingforsomeonetotakeyourorder,plusthe

servicetime(S)youspendwaitingforyourorderto

hityourhandsonceyoubegintalkingtotheorder

clerk.Queueingdelayisthedifferencebetweenyour

responsetimeforagiventaskandtheresponsetime

forthatsametaskonanotherwiseunloadedsystem

(dontforgetourperfectscalabilityassumption).

14 THEKNEEWhenitcomestoperformance,youwanttwothings

fromasystem:

1. Youwantthebestresponsetimeyoucanget:youdontwanttohavetowaittoolongfortaskstoget

done.

2. Andyouwantthebestthroughputyoucanget:youwanttobeabletocramasmuchloadasyou

possiblycanontothesystemsothatasmany

peopleaspossiblecanruntheirtasksatthesame

time.

Unfortunately,thesetwogoalsarecontradictory.

Optimizingtothefirstgoalrequiresyoutominimize

theloadonyoursystem;optimizingtotheotherone

requiresyoutomaximizeit.Youcantdoboth

simultaneously.Somewhereinbetweenatsomeload

level(thatis,atsomeutilizationvalue)istheoptimal

loadforthesystem.

Theutilizationvalueatwhichthisoptimalbalance

occursiscalledtheknee.10Thekneeistheutilization

valueforaresourceatwhichthroughputismaximized

withminimalnegativeimpacttoresponsetimes.

Mathematically,thekneeistheutilizationvalueat

whichresponsetimedividedbyutilizationisatits

minimum.Onenicepropertyofthekneeisthatitoccursattheutilizationvaluewherealinethroughthe

originistangenttotheresponsetimecurve.Ona

carefullyproducedM/M/mgraph,youcanlocatethe

kneequitenicelywithjustastraightedge,asshownin

Exhibit9.

Exhibit9.Thekneeoccursattheutilizationat

whichalinethroughtheoriginistangenttothe

responsetimecurve.

AnothernicepropertyoftheM/M/mkneeisthatyouonlyneedtoknowthevalueofoneparameterto

computeit.Thatparameteristhenumberofparallel,

homogeneous,independentservicechannels.Aservice

channelisaresourcethatsharesasinglequeuewith

otheridenticalsuchresources,likeaboothinatoll

plazaoraCPUinanSMPcomputer.

TheitalicizedlowercaseminthenameM/M/misthe

numberofservicechannelsinthesystembeing

modeled.11TheM/M/mkneevalueforanarbitrary

10Iamengagedinanongoingdebateaboutwhetheritis

appropriatetousethetermkneeinthiscontext.Forthetimebeing,Ishallcontinuetouseit.Seesection24fordetails.

11Bythispoint,youmaybewonderingwhattheothertwo

MsstandforintheM/M/mqueueingmodelname.They

relatetoassumptionsabouttherandomnessofthetimingof

yourincomingrequestsandtherandomnessofyourservicetimes.See

http://en.wikipedia.org/wiki/Kendall%27s_notation formore

information,orOptimizingOraclePerformanceforeven

more.

0.0 0.2 0.4 0.6 0.8 1.00

2

4

6

8

10

Utilization

Responsetim

eR

MM4, 0.665006

MM16, 0.810695


10/15


systemisdifficulttocalculate,butIvedoneitforyou.

Thekneevaluesforsomecommonservicechannel

countsareshowninExhibit10.

Servicechannelcount

Kneeutilization

1 50%

2 57%4 66%

8 74%

16 81%

32 86%64 89%

128 92%

Exhibit10.M/M/mkneevaluesforcommon

valuesofm.

Whyisthekneevaluesoimportant?Forsystemswith

randomlytimedservicerequests,allowingsustained

resourceloadsinexcessofthekneevalueresultsin

responsetimesandthroughputsthatwillfluctuateseverelywithmicroscopicchangesinload.Hence:

Onsystemswithrandomrequestarrivals,itisvitaltomanageloadsothatitwillnotexceed

thekneevalue.

15 RELEVANCEOFTHEKNEESo,howimportantcanthiskneeconceptbe,really?

Afterall,asIvetoldyou,theM/M/mmodelassumes

thisridiculouslyutopianideathatthesystemyoure

thinkingaboutscalesperfectly.Iknowwhatyoure

thinking:Itdoesnt.

ButwhatM/M/mgivesusistheknowledgethateven

ifyoursystemdidscaleperfectly,youwould stillbe

strickenwithmassiveperformanceproblemsonceyouraverageloadexceededthekneevaluesIvegiven

youinExhibit10.Yoursystemisntasgoodasthe

theoreticalsystemsthatM/M/mmodels.Therefore,

theutilizationvaluesatwhichyoursystemsknees

occurwillbemoreconstrainingthanthevaluesIve

givenyouinExhibit10.(Isaid valuesandkneesin

pluralform,becauseyoucanmodelyourCPUswith

onemodel,yourdiskswithanother,yourI/O

controllerswithanother,andsoon.)

Torecap:

Eachoftheresourcesinyoursystemhasaknee. Thatkneeforeachofyourresourcesislessthan

orequaltothekneevalueyoucanlookupin

Exhibit10.Themoreimperfectlyyoursystem

scales,thesmaller(worse)yourkneevaluewill

be.

Onasystemwithrandomrequestarrivals,ifyouallowyoursustainedutilizationforanyresource

onyoursystemtoexceedyourkneevalueforthat

resource,thenyoullhaveperformanceproblems.

Therefore,itisvitalthatyoumanageyourloadsothatyourresourceutilizationswillnotexceedyourknee

values.

16 CAPACITYPLANNINGUnderstandingthekneecancollapsealotof

complexityoutofyourcapacityplanningprocess.It

workslikethis:

1. Yourgoalcapacityforagivenresourceistheamountatwhichyoucancomfortablyrunyour

tasksatpeaktimeswithoutdrivingutilizations

beyondyourknees.

2. Ifyoukeepyourutilizationslessthanyourknees,yoursystembehavesroughlylinearly:nobighyperbolicsurprises.

3. However,ifyourelettingyoursystemrunanyofitsresourcesbeyondtheirkneeutilizations,then

youhaveperformanceproblems(whetheryoure

awareofitornot).

4. Ifyouhaveperformanceproblems,thenyoudontneedtobespendingyourtimewithmathematical

models;youneedtobespendingyourtimefixing

thoseproblemsbyeitherreschedulingload,

eliminatingload,orincreasingcapacity.

Thatshowcapacityplanningfitsintotheperformance

managementprocess.

17 RANDOMARRIVALSYoumighthavenoticedthatseveraltimesnow,Ihave

mentionedthetermrandomarrivals.Whyisthat

important?

Somesystemshavesomethingthatyouprobablydont

haverightnow:completelydeterministicjob

scheduling.Somesystemsitsrarethesedaysare

configuredtoallowservicerequeststoenterthe

systeminabsoluteroboticfashion,say,atapaceof

onetaskpersecond.Andbyonetaskpersecond,I

dontmeanatanaveragerateofonetaskpersecond(forexample,2tasksinonesecondand0tasksinthe

next),Imeanonetaskpersecond,likearobotmight

feedcarpartsintoabinonanassemblyline.

Ifarrivalsintoyoursystembehavecompletely

deterministicallymeaningthatyouknow exactly

whenthenextservicerequestiscomingthenyou

canrunresourceutilizationsbeyondtheirknee

utilizationswithoutnecessarilycreatinga


11/15


performanceproblem.Onasystemwithdeterministic

arrivals,yourgoalistorunresourceutilizationsupto

100%withoutcrammingsomuchworkloadintothe

systemthatrequestsbegintoqueue.

Thereasonthekneevalueissoimportantonasystem

withrandomarrivalsisthatrandomarrivalstendto

clusteruponyouandcausetemporaryspikesinutilization.Thosespikesneedenoughsparecapacity

toconsumesothatusersdonthavetoendure

noticeablequeueingdelays(whichcausenoticeable

fluctuationsinresponsetimes)everytimeoneof

thosespikesoccurs.

Temporaryspikesinutilizationbeyondyourknee

valueforagivenresourceareokaslongastheydont

exceedjustafewsecondsinduration.Howmany

secondsistoomany?Ibelieve(buthavenotyettried

toprove)thatyoushouldatleastensurethatyour

spikedurationsdonotexceed8seconds.12Theanswer

iscertainlythat,ifyoureunabletomeetyourpercentile-basedresponsetimepromisesoryour

throughputpromisestoyourusers,thenyourspikes

aretoolong.

18 COHERENCYDELAYYoursystemdoesnthavetheoreticallyperfect

scalability.EvenifIveneverstudiedyoursystemspecifically,itsaprettygoodbetthatnomatterwhat

computerapplicationsystemyourethinkingofright

now,itdoesnotmeettheM/M/mtheoretically

perfectscalabilityassumption.Coherencydelayisthe

factorthatyoucanusetomodeltheimperfection. 13

Coherencydelayisthedurationthatataskspends

communicatingandcoordinatingaccesstoashared

resource.Likeresponsetime,servicetime,and

queueingdelay,coherencydelayismeasuredintime

pertaskexecution,asinsecondsperclick.

Iwontdescribehereamathematicalmodelfor

predictingcoherencydelay.Butthegoodnewsisthat

ifyouprofileyoursoftwaretaskexecutions,youllsee

itwhenitoccurs.InOracle,timedeventslikethe

followingareexamplesofcoherencydelay:

enqueue

bufferbusywaits

12Youllrecognizethisnumberifyouveheardofthe8-

secondrule,whichyoucanlearnaboutat

http://en.wikipedia.org/wiki/Network_performance#8-

second_rule.

13NeilGunther,1993.UniversalLawofComputational

Scalability,at

http://en.wikipedia.org/wiki/Neil_J._Gunther#Universal_Law_

of_Computational_Scalability.

latchfree

Youcantmodelcoherencydelayslikethesewith

M/M/m.ThatsbecauseM/M/massumesthatallmof

yourservicechannelsareparallel,homogeneous,and

independent.Thatmeansthemodelassumesthatafter

youwaitpolitelyinaFIFOqueueforlongenoughthat

alltherequeststhatenqueuedaheadofyouhaveexitedthequeueforservice,itllbeyourturntobe

serviced.However,coherencydelaysdontworklike

that.

Example:ImagineanHTMLdataentryforminwhich

onebuttonlabeledUpdateexecutesaSQLupdate

statement,andanotherbuttonlabeledSave

executesaSQLcommitstatement.Anapplicationbuiltlikethiswouldalmostguaranteeabysmal

performance.Thatsbecausethedesignmakesit

possiblequitelikely,actuallyforausertoclick

Update,lookathiscalendar,realizeuh-ohheslate

forlunch,andthengotolunchfortwohoursbefore

clickingSavelaterthatafternoon.Theimpacttoothertasksonthissystemthatwantedtoupdatethesamerowwouldbedevastating.Each

taskwouldnecessarilywaitforalockontherow(or,

onsomesystems,worse:alockontherowspage)

untilthelockinguserdecidedtogoaheadandclick

Save.Oruntiladatabaseadministratorkilledthe

userssession,whichofcoursewouldhaveunsavorysideeffectstothepersonwhohadthoughthehad

updatedarow.

Inthiscase,theamountoftimeataskwouldwaiton

thelocktobereleasedhasnothingtodowithhow

busythesystemis.Itwouldbedependentupon

randomfactorsthatexistoutsideofthesystemsvariousresourceutilizations.Thatswhyyoucant

modelthiskindofthinginM/M/m,anditswhyyou

canneverassumethataperformancetestexecutedin

aunittestingtypeofenvironmentissufficientfora

makingago/no-godecisionaboutinsertionofnew

codeintoaproductionsystem.

19 PERFORMANCETESTINGAllthistalkofqueueingdelaysandcoherencydelays

leadstoaverydifficultquestion.Howcanyoupossibly

testanewapplicationenoughtobeconfidentthat

yourenotgoingtowreckyourproduction

implementationwithperformanceproblems?

Youcanmodel.Andyoucantest. 14However,nothing

youdowillbeperfect.Itisextremelydifficulttocreate

modelsandtestsinwhichyoullforeseeallyour

14TheComputerMeasurementGroupisanetworkof

professionalswhostudytheseproblemsvery,veryseriously.

YoucanlearnaboutCMGathttp://www.cmg.org.


12/15


productionproblemsinadvanceofactually

encounteringthoseproblemsinproduction.

Somepeopleallowtheapparentfutilityofthis

observationtojustifynottestingatall.Dontget

trappedinthatmentality.Thefollowingpointsare

certain:

Youllcatchalotmoreproblemsifyoutrytocatchthempriortoproductionthanifyoudonteven

try.

Youllnevercatchallyourproblemsinpre-productiontesting.Thatswhyyouneedareliable

andefficientmethodforsolvingtheproblemsthat

leakthroughyourpre-productiontesting

processes.

Somewhereinthemiddlebetweennotestingand

completeproductionemulationistherightamount

oftesting.Therightamountoftestingforaircraft

manufacturersisprobablymorethantherightamount

oftestingforcompaniesthatsellbaseballcaps.But

dontskipperformancetestingaltogether.Atthevery

least,yourperformancetestplanwillmakeyoua

morecompetentdiagnostician(andclearerthinker)

whenitcomestimetofixtheperformanceproblems

thatwillinevitablyoccurduringproductionoperation.

20 MEASURINGPeoplefeelthroughputandresponsetime.

Throughputisusuallyeasytomeasure.Measuring

responsetimeisusuallymuchmoredifficult.

(Remember,throughputandresponsetimearenot

reciprocals.)Itmaynotbedifficulttotimeanend-useractionwithastopwatch,butitmightbeverydifficult

togetwhatyoureallyneed,whichistheabilitytodrill

downintothedetailsofwhyagivenresponsetimeis

aslargeasitis.

Unfortunately,peopletendtomeasurewhatseasyto

measure,whichisnotnecessarilywhattheyshouldbe

measuring.Itsabug.Whenitsnoteasytomeasure

whatweneedtomeasure,wetendtoturnour

attentiontomeasurementsthat areeasytoget.

Measuresthatarentwhatyouneed,butthatareeasy

enoughtoobtainandseemrelatedtowhatyouneed

arecalledsurrogatemeasures.Examplesofsurrogatemeasuresincludesubroutinecallcountsandsamples

ofsubroutinecallexecutiondurations.

ImashamedthatIdonthavegreatercommandover

mynativelanguagethantosayitthisway,buthereis

acatchy,modernwaytoexpresswhatIthinkabout

surrogatemeasures:

Surrogatemeasuressuck.

Here,unfortunately,suckdoesntmeannever

work.Itwouldactuallybe betterifsurrogate

measuresneverworked.Thennobodywoulduse

them.Theproblemisthatsurrogatemeasureswork

sometimes.Thisinspirespeoplesconfidencethatthe

measurestheyreusingshouldworkallthetime,and

thentheydont.Surrogatemeasureshavetwobigproblems.Theycantellyouyoursystemsokwhenits

not.Thatswhatstatisticianscall typeIerror,thefalse

positive.Andtheycantellyouthatsomethingisa

problemwhenitsnot.Thatswhatstatisticianscall

typeIIerror,thefalsenegative.Iveseeneachtypeof

errorwasteyearsofpeoplestime.

Whenitcomestimetoassessthespecificsofareal

system,yoursuccessisatthemercyofhowgoodthe

measurementsarethatyoursystemallowsyouto

obtain.IvebeenfortunatetoworkintheOracle

marketsegment,wherethesoftwarevendoratthe

centerofouruniverseparticipatesactivelyinmaking

itpossibletomeasuresystemstherightway.Gettingapplicationsoftwaredeveloperstousethetoolsthat

Oracleoffersisanotherstory,butatleastthe

capabilitiesarethereintheproduct.

21 PERFORMANCEISAFEATUREPerformanceisasoftwareapplicationfeature,justlike

recognizingthatitsconvenientforastringoftheform

Case1234toautomaticallyhyperlinkoverto

case1234inyourbugtrackingsystem.15Performance,

likeanyotherfeature,doesntjusthappen;ithasto

bedesignedandbuilt.Todoperformancewell,you

havetothinkaboutit,studyit,writeextracodeforit,testit,andsupportit.

However,likemanyotherfeatures,youcantknow

exactlyhowperformanceisgoingtoworkoutwhile

itsstillearlyintheprojectwhenyourewriting

studying,designing,andcreatingtheapplication.For

manyapplications(arguably,forthevastmajority),

performanceiscompletelyunknownuntilthe

productionphaseofthesoftwaredevelopmentlife

cycle.Whatthisleavesyouwithisthis:

Sinceyoucantknowhowyourapplicationis

goingtoperforminproduction,youneedto

writeyourapplicationsothatitseasytofixperformanceinproduction.

AsDavidGarvinhastaughtus,itsmucheasierto

managesomethingthatseasytomeasure.16Writing

15FogBugz,whichissoftwarethatIenjoyusing,doesthis.

16DavidGarvin,1993.BuildingaLearningOrganizationin

HarvardBusinessReview,Jul.1993.


13/15


anapplicationthatseasytofixinproductionbegins

withanapplicationthatseasytomeasurein

production.

Mosttimes,whenImentiontheconceptofproduction

performancemeasurement,peopledriftintoastateof

worryaboutthemeasurementintrusioneffectof

performanceinstrumentation.Theyimmediatelyenteramodeofdatacollectioncompromise,leavingonly

surrogatemeasuresonthetable.Wontsoftwarewith

extracodepathtomeasuretimingsbeslowerthanthe

samesoftwarewithoutthatextracodepath?

IlikeananswerthatIheardTomKytegiveoncein

responsetothisquestion. 17Heestimatedthatthe

measurementintrusioneffectofOraclesextensive

performanceinstrumentationisnegative10%or

less.18Hewentontoexplaintoanow-vexed

questionerthattheproductisatleast10%fasternow

becauseoftheknowledgethatOracleCorporationhas

gainedfromitsperformanceinstrumentationcode,morethanmakingupforanyoverheadthe

instrumentationmighthavecaused.

Ithinkthatvendorstendtospendtoomuchtime

worryingabouthowtomaketheirmeasurementcode

pathefficientwithoutfiguringouthowfirsttomakeit

effective.ItlandssquarelyupontheideathatKnuth

wroteaboutin1974whenhesaidthatpremature

optimizationistherootofallevil. 19Thesoftware

designerwhointegratesperformancemeasurement

intohisproductismuchmorelikelytocreateafast

applicationandmoreimportantlyanapplication

thatwillbecomefasterovertime.

22 ACKNOWLEDGMENTSThankyouBaronSchwartzfortheemailconversation

inwhichyouthoughtIwashelpingyou,butinactual

fact,youwerehelpingmecometogripswiththeneed

forintroducingcoherencydelaymoreprominently

intomythinking.ThankyouJeffHolt,RonCrisco,Ken

Ferlita,andHaroldPalacioforthedailyworkthat

keepsthecompanygoingandforthelunchtime

conversationsthatkeepmyimaginationgoing.Thank

youTomKyteforyourcontinuedinspirationand

support.ThankyouMarkFarnhamforyourhelpful

suggestions.AndthankyouNeilGuntherforyour

17TomKyte,2009.Acoupleoflinksandanadvertathttp://tkyte.blogspot.com/2009/02/couple-of-links-and-

advert.html.

18Whereorlessmeansorbetter,asin20%,30%,etc.

19DonaldKnuth,1974.StructuredProgrammingwithGoTo

StatementsinACMJournalComputingSurveys,Vol.6,No.4,

Dec.1974,p268.

patienceandgenerosityinourongoingdiscussions

aboutknees.

23 ABOUTTHEAUTHORCaryMillsapiswellknownintheglobalOracle

communityasaspeaker,educator,consultant,and

writer.HeisthefounderandpresidentofMethodR

Corporation(http://method-r.com),asmallcompany

devotedtogenuinelysatisfyingsoftwareperformance.

MethodRoffersconsultingservices,education

courses,andsoftwaretoolsincludingtheMethodR

Profiler,MRTools,theMethodRSLAManager,andthe

MethodRInstrumentationLibraryforOraclethat

helpyouoptimizeyoursoftwareperformance.

Caryistheauthor(withJeffHolt)of OptimizingOracle

Performance(OReilly),forwhichheandJeffwere

namedOracleMagazines2004AuthorsoftheYear.He

isaco-authorofOracleInsights:TalesoftheOakTable

(Apress).HeistheformerVicePresidentofOracleCorporationsSystemPerformanceGroup,andaco-

founderofhisformercompanycalledHotsos.Caryis

alsoanOracleACEDirectorandafoundingpartnerof

theOakTableNetwork,aninformalassociationof

Oraclescientiststhatarewellknownthroughoutthe

Oraclecommunity.Caryblogsat

http://carymillsap.blogspot.com,andhetweetsat

http://twitter.com/CaryMillsap.

24 EPILOG:OPENDEBATEABOUTKNEESInsections14through16,Iwroteaboutkneesin

performancecurves,theirrelevance,andtheirapplication.However,thereisopendebategoingback

atleast20yearsaboutwhetheritsevenworthwhile

totrytodefinetheconceptof knee,likeIvedonein

thispaper.

Thereissignificanthistoricalbasistotheideathat

suchathingthatIvedescribedasakneeinfactisnt

reallymeaningful.In1988,StephenSamsonargued

that,atleastforM/M/1queueingsystems,thereisno

kneeintheperformancecurve.Hewrote,The

choiceofaguidelinenumberisnoteasy,buttherule-

of-thumbmakersgorighton.Inmostcasesthereis

notaknee,nomatterhowmuchwewishtofind

one.20

Thewholeproblemremindsme,asIwrotein1999,21

oftheparableofthefrogandtheboilingwater.The

20StephenSamson,1988.MVSperformancelegendsinCMG1988ConferenceProceedings.ComputerMeasurement

Group,148159.

21CaryMillsap,1999.Performancemanagement:mythsand

factsavailableathttp://method-r.com.


14/15


storysaysthatifyoudropafrogintoapanofboiling

water,hewillescape.But,ifyouputafrogintoapan

ofcoolwaterandslowlyheatit,thenthefrogwillsit

patientlyinplaceuntilheisboiledtodeath.

Withutilization,justaswithboilingwater,thereis

clearlyadeathzone,arangeofvaluesinwhichyou

cantaffordtorunasystemwithrandomarrivals.Butwhereistheborderofthedeathzone?Ifyouare

tryingtoimplementaproceduralapproachto

managingutilization,youneedtoknow.

Recently,myfriendNeilGunther 22hasdebatedwith

meprivatelythat,first,thetermkneeiscompletely

thewrongwordtousehere,becausekneeisthe

wrongtermtouseintheabsenceofafunctional

discontinuity.Second,heassertsthattheboundary

valueof.5foranM/M/1systemiswastefullylow,that

yououghttobeabletorunsuchasystemsuccessfully

atamuchhigherutilizationvaluethan.5.And,finally,

hearguesthatanysuchspecialutilizationvalueshouldbedefinedexpresslyastheutilizationvalue

beyondwhichyouraverageresponsetimeexceeds

yourtoleranceforaverageresponsetime(Exhibit11).

Thus,Guntherarguesthatanyusefulnot-to-exceed

utilizationvalueisderivableonlyfrominquiriesabout

humanpreferences,notfrommathematics.

Exhibit11.Gunthersmaximumallowable

utilizationvalueTisdefinedastheutilization

producingtheaverageresponsetimeT.

TheproblemIseewiththisargumentisillustratedinExhibit12.Imaginethatyourtoleranceforaverage

responsetimeisT,whichcreatesamaximum

toleratedutilizationvalueof T.Noticethatevenatiny

22Seehttp://en.wikipedia.org/wiki/Neil_J._Guntherfor

moreinformationaboutNeil.See

http://www.cmg.org/measureit/issues/mit62/m_62_15

.htmlfor more information about his argument.

fluctuationinaverageutilizationnear Twillresultin

ahugefluctuationinaverageresponsetime.

Exhibit12.NearTvalue,smallfluctuationsin

averageutilizationresultinlargeresponsetime

fluctuations.

Ibelieve,asIwroteinsection4,thatyourcustomers

feelthevariance,notthemean.Perhapstheysaythey

willacceptaverageresponsetimesuptoT,butIdont

believethathumanswillbetolerantofperformance

onasystemwhena1%changeinaverageutilization

overa1-minuteperiodresultsin,say,aten-times

increaseinaverageresponsetimeoverthatperiod.

Idounderstandtheperspectivethatthekneevalues

Ivelistedinsection14arebelowtheutilizationvalues

thatmanypeoplefeelsafeinexceeding,especiallyfor

lowerordersystemslikeM/M/1.However,Ibelieve

thatitisimportanttoavoidrunningresourcesataverageutilizationvalueswheresmallfluctuationsin

utilizationyieldtoo-largefluctuationsinresponse

time.

Havingsaidthat,Idontyethaveagooddefinitionfor

whatatoo-largefluctuationis.Perhaps,like

responsetimetolerances,differentpeoplehave

differenttolerancesforfluctuation.Butperhapsthere

isafluctuationtolerancefactorthatapplieswith

reasonableuniversalityacrossallhumanusers.The

Apdexstandard,forexample,assumesthatthe

responsetimeFatwhichusersbecomefrustratedis

universallyfourtimestheresponsetimeTatwhichtheirattitudeshiftsfrombeingsatisfiedtomerely

tolerating.23

Theknee,regardlessofhowyoudefineitorwhatwe

endupcallingit,isanimportantparametertothe

capacityplanningprocedurethatIdescribedin

23Seehttp://www.apdex.orgformoreinformationabout

Apdex.

0 0.5 T 0.9000

5

10

15

20

Utilization

Responsetim

eR

MM1 system, T 10.

0 0.744997 T 0.9870

5

10

15

20

Utilization

Responsetime

R

MM8 system, T 10.


15/15


section16,andIbelieveitisanimportantparameter

tothedailyprocessofcomputersystemworkload

management.

Iwillkeepstudying.

thinking clearly (paper)

Documents