tanel poder exadata performance 01
7/27/2019 Tanel Poder Exadata Performance 01
Whitepaper by Tanel Poder http://blog.tanelpoder.com
Are you getting the most out of your Exadata? Part 1: Basic Smart Scans - version 1.02
By Tanel Poder ([email protected])
Welcome to my first whitepaper in the Exadata Performance series. I will spare you yet another echo of the marketing material and copy of the Exadata spec sheets. Instead, I will explain a few scenarios where you might not be getting the most out of your Exadata investment and how to detect that yourself. We'll look into Exadata performance in a Data Warehousing workload context.

Whenever I have been involved in a migration from an old platform to Exadata, the applications have usually run much faster on the Exadata platform, as expected. This can happen thanks to all the fundamental improvements of Exadata (the marketing stuff you constantly hear about), but some of the extra performance may just come from the fact that you migrated to Oracle 11.2 from your old database version. And some of the performance may come from running on way faster CPUs than your 5-year-old big iron box had in it. Yeah, that's a wild generalization, but my point is that even if you are happy with your Exadata experience so far, you might not actually be fully using all the benefits that Exadata can offer!
Checking Whether Smart Scanning Is Used
Let's first see how to identify whether your queries are taking full advantage of the Exadata "secret sauce": Smart Scans. Here's a simple execution plan, which has some Exadata-specific elements in it:

-----------------------------------------------------------------------------
| Id   | Operation                        | Name        | Pstart  | Pstop   |
-----------------------------------------------------------------------------
|    0 | SELECT STATEMENT                 |             |         |         |
| *  1 |  FILTER                          |             |         |         |
|    2 |   HASH GROUP BY                  |             |         |         |
| *  3 |    HASH JOIN                     |             |         |         |
|    4 |     PART JOIN FILTER CREATE      | :BF0000     |         |         |
|    5 |      PARTITION HASH ALL          |             |       1 |      16 |
| *  6 |       HASH JOIN                  |             |         |         |
| *  7 |        TABLE ACCESS STORAGE FULL | ORDERS      |       1 |      16 |
|    8 |        TABLE ACCESS STORAGE FULL | ORDER_ITEMS |       1 |      16 |
|    9 |     PARTITION HASH JOIN-FILTER   |             | :BF0000 | :BF0000 |
| * 10 |      TABLE ACCESS STORAGE FULL   | CUSTOMERS   | :BF0000 | :BF0000 |
-----------------------------------------------------------------------------

1 - filter("C"."CREDIT_LIMIT"
attempt to offload the predicate conditions into the cells. If the smart scan was not used, then predicate offloading did not happen either (as filter predicate offloading is part of smart scan).
Important:
The key thing to remember is that an Exadata smart scan is not only a property of an execution plan - it's actually a runtime decision made separately for each table/index/partition segment accessed with a full scan. It is not possible to determine whether a smart scan happened just by looking at the execution plan; you should measure execution metrics from V$SQL or V$SESSION to be sure.
It is also possible that a smart scan is attempted against a segment, but during the smart scan execution the cells have to fall back to regular block IO mode for some blocks and ship those blocks back to the database for processing - instead of extracting rows from them inside the cell. For example, this happens for blocks for which the consistent reads require access to undo data, or when there are chained rows in a block. There are more reasons, and we explain these in detail in the upcoming Expert Oracle Exadata book.
This should also explain why there is a filter predicate (executed in the database layer) in addition to every storage predicate in the plan: sometimes the filtering cannot be entirely offloaded to the storage cells, and the database layer has to perform the final filtering.
Serial Execution Without Smart Scan
Let's look into an example query now, executed in serial mode at first. Basically it's returning all customers who have ever made orders where the total order amount exceeds their customer-specific credit limit. All the tables are partitioned and their total size is around 11 GB. Note that this query is written against a regular normalized OLTP-style schema, not a star or snowflake schema, which you are likely using in your DW databases. But for the purpose of this demo it should be enough.

SELECT
    c.customer_id
  , c.cust_first_name ||' '||c.cust_last_name
  , c.credit_limit
  , MAX(oi.unit_price * oi.quantity) max_order_total
FROM
    soe.orders o
  , soe.order_items oi
  , soe.customers c
WHERE
-- join conditions
    c.customer_id = o.customer_id
AND o.order_id = oi.order_id
-- constant filter conditions
AND c.nls_territory = 'AMERICA'
AND o.order_mode = 'online'
AND o.order_status = 5
GROUP BY
    c.customer_id
  , c.cust_first_name ||' '||c.cust_last_name
  , c.credit_limit
HAVING
    MAX(oi.unit_price * oi.quantity) > c.credit_limit;
When executed on an otherwise idle quarter rack Exadata V2 installation, it took 151 seconds to run, which seems too much, knowing that the smart scans on even a quarter rack
can scan data on hard disks at multiple gigabytes per second (and even faster from flash cache). So, I will first identify the SQL ID, child cursor number and execution ID of this currently ongoing SQL execution:

SQL> SELECT sql_id, sql_child_number, sql_exec_id
     FROM v$session WHERE sid=200;

SQL_ID        SQL_CHILD_NUMBER SQL_EXEC_ID
------------- ---------------- -----------
9n2fg7abbcfyx                0    16777224
Now, let's look into V$SQL first, using the SQL ID and child cursor number:

SQL> SELECT
         ROUND(physical_read_bytes/1048576) phyrd_mb
       , ROUND(io_cell_offload_eligible_bytes/1048576) elig_mb
       , ROUND(io_interconnect_bytes/1048576) ret_mb
       , (1-(io_interconnect_bytes/NULLIF(physical_read_bytes,0)))*100 "SAVING%"
     FROM
         v$sql
     WHERE
         sql_id = '9n2fg7abbcfyx'
     AND child_number = 0;

  PHYRD_MB    ELIG_MB     RET_MB    SAVING%
---------- ---------- ---------- ----------
     10833          0      10833          0
The physical_read_bytes metric (phyrd_mb) shows that the Oracle database layer has issued 10833 MB worth of IO calls for this SQL. And io_interconnect_bytes (ret_mb) shows that this query has also used 10833 MB worth of IO interconnect traffic (between the database host and storage cells). So, the smart scans were not able to reduce the cell-database IO traffic at all. In fact, when looking at io_cell_offload_eligible_bytes (elig_mb), it's zero. This means that the database has not even tried to do smart scan offloading for this statement. If 10 GB worth of segments had been read via smart scans, then the eligible bytes for offload would also show 10 GB. So, the io_cell_offload_eligible_bytes metric is key to determining whether any offloading has been attempted for a query.

Note that V$SQL accumulates statistics over multiple executions of the same query, so if this cursor has already been executed before (and is still in cache), you should not look at the absolute values in the V$SQL columns, but rather at how much they increase (calculate deltas from before- and after-test values).
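The before/after delta approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dictionaries stand in for V$SQL rows captured before and after a test run, the helper names (offload_delta, offload_attempted) are hypothetical, and the "before" numbers are made up for the example. The "after" snapshot simply adds the 10833 MB run shown above on top of them.

```python
# Hypothetical helpers, not part of any Oracle tool: they only do the
# before/after arithmetic on three V$SQL columns.
COLUMNS = (
    "physical_read_bytes",
    "io_cell_offload_eligible_bytes",
    "io_interconnect_bytes",
)

MB = 1048576


def offload_delta(before, after):
    """Return the per-column increase (in bytes) between two V$SQL snapshots."""
    return {col: after[col] - before[col] for col in COLUMNS}


def offload_attempted(delta):
    """Eligible bytes growing during the test means offloading was attempted."""
    return delta["io_cell_offload_eligible_bytes"] > 0


# Made-up "before" snapshot: the cursor had already been executed earlier.
before = {
    "physical_read_bytes": 500 * MB,
    "io_cell_offload_eligible_bytes": 0,
    "io_interconnect_bytes": 500 * MB,
}
# "After" snapshot: the earlier values plus the 10833 MB run from the text.
after = {
    "physical_read_bytes": (500 + 10833) * MB,
    "io_cell_offload_eligible_bytes": 0,
    "io_interconnect_bytes": (500 + 10833) * MB,
}

delta = offload_delta(before, after)
print({col: value // MB for col, value in delta.items()})
print("offload attempted:", offload_attempted(delta))
```

Working on deltas keeps earlier executions of the same cursor from masking what happened in the run you are actually measuring.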
Another option is to look into what the session(s) executing this SQL are doing and waiting for. You can use SQL trace or ASH for this, or any other tool that is capable of extracting the wait information from Oracle. Here's a wait profile example from the problem session, taken with Oracle Session Snapper (a free Oracle troubleshooting tool downloadable from my blog):

--------------------------------------------------------------------------------
Active% | SQL_ID        | EVENT                         | WAIT_CLASS
--------------------------------------------------------------------------------
    49% | 9n2fg7abbcfyx | cell multiblock physical read | User I/O
    40% | 9n2fg7abbcfyx | ON CPU                        | ON CPU
    10% | 9n2fg7abbcfyx | gc cr multi block request     | Cluster
Indeed, we aren't seeing the cell smart table scan (or cell smart index scan) wait events, but a cell multiblock physical read wait event as the top one, which shows that about half of the query time is spent doing regular multiblock IOs from storage to database. Note that there's still a chance that smart scans are happening for some tables involved in the query (but they don't show up in the top waits thanks to their fast, asynchronous nature), while the regular multiblock reads show up thanks to the rest of the tables not using smart scan. Luckily, it's possible to check exactly which execution plan row sources do use smart scan and how much they benefit from it.
We can get this information from the Real-Time SQL Monitoring views, either by manually querying V$SQL_PLAN_MONITOR or through the Grid Control / Database Control UI tools. I always like to know where the data is coming from, so let's start with a manual query. Note that the sql_id and sql_exec_id values are taken from my previous query against V$SESSION above:

SQL> SELECT
         plan_line_id id
       , LPAD(' ',plan_depth) || plan_operation
         ||' '||plan_options||' '
         ||plan_object_name operation
       , ROUND(physical_read_bytes /1048576) phyrd_mb
       , ROUND(io_interconnect_bytes /1048576) ret_mb
       , (1-(io_interconnect_bytes/NULLIF(physical_read_bytes,0)))*100 "SAVING%"
     FROM
         v$sql_plan_monitor
     WHERE
         sql_id = '9n2fg7abbcfyx'
     AND sql_exec_id = 16777224;

 ID OPERATION                                       PHYRD_MB     RET_MB  SAVING%
--- --------------------------------------------- ---------- ---------- --------
  0 SELECT STATEMENT                                       0          0
  1  FILTER                                                0          0
  2   HASH GROUP BY                                        0          0
  3    HASH JOIN                                           0          0
  4     PART JOIN FILTER CREATE :BF0000                    0          0
  5      HASH JOIN                                         0          0
  6       PART JOIN FILTER CREATE :BF0001                  0          0
  7        PARTITION HASH ALL                              0          0
  8         TABLE ACCESS STORAGE FULL ORDERS            2038       2038        0
  9       PARTITION HASH JOIN-FILTER                       0          0
 10        TABLE ACCESS STORAGE FULL CUSTOMERS          3943       3943        0
 11     PARTITION HASH JOIN-FILTER                         0          0
 12      TABLE ACCESS STORAGE FULL ORDER_ITEMS          4834       4834        0
V$SQL_PLAN_MONITOR doesn't have the io_cell_offload_eligible_bytes column showing how many bytes worth of segments were scanned using smart scans, but nevertheless the io_interconnect_bytes column tells us how many bytes of IO traffic between the database host and cells were done for the given access path. In the above example, exactly the same amount of data was returned by the cells (RET_MB) as requested by the database (PHYRD_MB), so there was no interconnect traffic saving at all. When these numbers exactly match, this is a good indication that no offloading was performed (as all the IO requested by the database was returned in untouched blocks to the database and no data reduction due to filtering and projection offloading was performed in the cells). So, this is additional confirmation that none of the tables were scanned with smart scan.

I caused the above problem deliberately. I just set the undocumented parameter _small_table_threshold to a big value (1000000 blocks) in my testing session. This made the Oracle table scanning function think that all the scanned partitions were small segments,
which should be scanned using regular buffered reads as opposed to direct path reads, which are a prerequisite for smart scans.

Note that the above query was executed in serial mode, not parallel. Starting from Oracle 11g, Oracle can decide to use direct path reads (and, as a result, smart scans) even for serial sessions. The direct path read decision is made at runtime, separately for each table/index/partition segment accessed, and it depends on how many blocks the segment has under its HWM, what the buffer cache size is, etc. So, you might encounter such an issue of Oracle deciding not to use direct path reads (and thus smart scan) in real life too, and as it's an automatic decision, it may change unexpectedly.
Serial Execution With Smart Scan
Let's run the same query without my trick to disable smart scans; I'm using default session parameters now. The query, still executed in serial, took 26 seconds (as opposed to 151 seconds previously). The wait profile looks different: the cell smart table scan wait is present and there are no regular block IO related waits. The CPU usage percentage is higher, but remember that this query completed over 5 times faster than previously. Higher CPU usage with lower response time is a good thing, as high CPU usage means that your query spent less time waiting instead of working and wasn't throttled by IO and other issues.

--------------------------------------------------------------------------------
Active% | SQL_ID        | EVENT                         | WAIT_CLASS
--------------------------------------------------------------------------------
    73% | 9n2fg7abbcfyx | ON CPU                        | ON CPU
    28% | 9n2fg7abbcfyx | cell smart table scan         | User I/O
For simple enough queries you might actually see that only a couple of percent of response time is spent waiting for smart scans and all the rest is CPU usage. This happens thanks to the asynchronous nature of smart scans, where cells work for the database sessions independently and may be able to constantly have some data ready for the DB session to consume.
Now, let's look into the V$SQL entries of that cursor (a new child cursor 1 was parsed thanks to me changing the _small_table_threshold parameter value back):

SQL> SELECT
         ROUND(physical_read_bytes/1048576) phyrd_mb
       , ROUND(io_cell_offload_eligible_bytes/1048576) elig_mb
       , ROUND(io_interconnect_bytes/1048576) ret_mb
       , (1-(io_interconnect_bytes/NULLIF(physical_read_bytes,0)))*100 "SAVING%"
     FROM
         v$sql
     WHERE
         sql_id = '9n2fg7abbcfyx'
     AND child_number = 1;

  PHYRD_MB    ELIG_MB     RET_MB    SAVING%
---------- ---------- ---------- ----------
     10815      10815       3328      69.2%
Apparently all the physical IOs were requested using the smart scanning method, as ELIG_MB is the same as PHYRD_MB. Some filtering and projection was evidently done in the cells, as the interconnect traffic (RET_MB) of that statement is about 69.2% less than the size of the scanned segments on disk.
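To make the arithmetic behind the SAVING% column explicit, here is a minimal Python sketch of the same expression the queries use. The function name saving_pct is just an illustration, and the inputs are the rounded MB figures from the two V$SQL outputs rather than the raw byte counts (the ratio is the same either way, apart from rounding).

```python
def saving_pct(physical_read, io_interconnect):
    """Mirror of (1 - io_interconnect_bytes / NULLIF(physical_read_bytes, 0)) * 100.

    Returns None when nothing was read, just as NULLIF makes the SQL return NULL.
    Both arguments must be in the same unit (bytes or MB).
    """
    if physical_read == 0:
        return None
    return (1 - io_interconnect / physical_read) * 100


# Run with smart scan: 10815 MB read, 3328 MB shipped over the interconnect.
print(round(saving_pct(10815, 3328), 1))   # 69.2
# Run without smart scan: everything read was also shipped back.
print(round(saving_pct(10833, 10833), 1))  # 0.0
```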
Let's drill down into individual execution plan lines, which allows us to look into the efficiency of accessing individual tables. In real-life DWs you are more likely dealing with 10 to 20 table joins, not 3-table joins. Note that the sql_exec_id has increased as I've re-executed the SQL statement:
SQL> SELECT
         plan_line_id id
       , LPAD(' ',plan_depth) || plan_operation
         ||' '||plan_options||' '
         ||plan_object_name operation
       , ROUND(physical_read_bytes /1048576) phyrd_mb
       , ROUND(io_interconnect_bytes /1048576) ret_mb
       , (1-(io_interconnect_bytes/NULLIF(physical_read_bytes,0)))*100 "SAVING%"
     FROM
         v$sql_plan_monitor
     WHERE
         sql_id = '9n2fg7abbcfyx'
     AND sql_exec_id = 16777225;

 ID OPERATION                                       PHYRD_MB     RET_MB  SAVING%
--- --------------------------------------------- ---------- ---------- --------
  0 SELECT STATEMENT                                       0          0
  1  FILTER                                                0          0
  2   HASH GROUP BY                                        0          0
  3    HASH JOIN                                           0          0
  4     PART JOIN FILTER CREATE :BF0000                    0          0
  5      PARTITION HASH ALL                                0          0
  6       HASH JOIN                                        0          0
  7        TABLE ACCESS STORAGE FULL ORDERS             2038          2     99.9
  8        TABLE ACCESS STORAGE FULL ORDER_ITEMS        4834       3125     35.3
  9     PARTITION HASH JOIN-FILTER                         0          0
 10      TABLE ACCESS STORAGE FULL CUSTOMERS            3943        201     94.9
The above output clearly shows that the ORDERS and CUSTOMERS tables benefit from smart scan the most. Note that the execution plan join order has changed too; this is thanks to the automatic cursor re-optimization using cardinality feedback from the previous child cursor's execution statistics.
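The per-line comparison can also be done programmatically once the rows are fetched. The sketch below is a hedged illustration only: the tuples are typed in by hand from the output above (rounded MB values, so the last digit of a percentage may differ slightly from the bytes-based SQL output), and the names plan_lines, saving_pct and rank_by_saving are hypothetical.

```python
# (object name, PHYRD_MB, RET_MB) for the three full scans in the output above.
plan_lines = [
    ("ORDERS", 2038, 2),
    ("ORDER_ITEMS", 4834, 3125),
    ("CUSTOMERS", 3943, 201),
]


def saving_pct(phyrd_mb, ret_mb):
    """Interconnect traffic saved for one plan line, as a percentage."""
    return (1 - ret_mb / phyrd_mb) * 100


def rank_by_saving(lines):
    """Sort plan lines that did physical reads by saving, best first."""
    ranked = [
        (name, saving_pct(phyrd, ret))
        for name, phyrd, ret in lines
        if phyrd > 0  # lines without physical reads have no meaningful ratio
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


ranking = rank_by_saving(plan_lines)
total_ret_mb = sum(ret for _, _, ret in plan_lines)

for name, pct in ranking:
    print(f"{name:<12} {pct:6.1f}%")
print("total RET_MB:", total_ret_mb)
```

The per-line RET_MB values add up to 3328 MB, the same figure V$SQL reported for the whole statement, which is a quick sanity check that the plan-line view and the cursor-level view agree.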
Summary
In order to get the most out of your Exadata performance for your DW application, you'll have to use smart scans. Otherwise you won't be able to take advantage of the cells' computing power and their IO reduction features. Full segment scans (like full table/partition scans and fast full index scans) are a prerequisite for smart scans. Additionally, direct path reads have to be used in order for the smart scans to kick in. For serial sessions, the direct path read decision is made based on the scanned segment size, buffer cache size and some other factors. For parallel execution, direct path access is always used for full scans, unless you use the new parallel_degree_policy=AUTO feature, in which case the decision would again be dynamic.

In this article we only managed to scratch the surface of all the optimizations and benefits that Exadata gives us. We didn't even look into parallel execution yet, although from a smart scanning perspective it's not too different from serial execution. Also, I ran out of space before I got to explain the bloom filter pushdown to the cells, which allows pushing the :BF000x filters that you may see in execution plans with hash joins all the way to the cells - to reduce the amount of returned data even more. Well, hopefully in a future article