TRANSCRIPT
-
A Robust Partitioning Scheme for Ad-Hoc Query Workloads
ANIL SHANBHAG, MIT
Joint work with Alekh Jindal (Microsoft), Sam Madden (MIT), Jorge Quiane (QCRI), Aaron J. Elmore (Univ. Chicago)
-
Today
Data collection is cheap => Lots of data!
-
Data Partitioning
Find average order size for all orders between Sept 10 and Sept 11, 2017
Data skipping: skip data blocks that are not needed
10% selectivity query => 10x faster if data is partitioned on the selection predicate
Orderdate
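A minimal Python sketch of data skipping, assuming each block records the minimum and maximum orderdate it holds. The `Block` class, the one-block-per-day layout, and the order sizes are hypothetical and only illustrate the idea on this slide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Block:
    """A data block plus the min/max orderdate metadata kept for it (hypothetical layout)."""
    min_date: date
    max_date: date
    rows: list  # (orderdate, order_size) tuples


def average_order_size(blocks, lo, hi):
    """Average order size for lo <= orderdate <= hi, skipping blocks outside the range."""
    total, count, scanned = 0.0, 0, 0
    for block in blocks:
        # Skip the block when its date range cannot overlap the query range.
        if block.max_date < lo or block.min_date > hi:
            continue
        scanned += 1
        for orderdate, size in block.rows:
            if lo <= orderdate <= hi:
                total += size
                count += 1
    average = total / count if count else None
    return average, scanned


# Illustrative data: one block per day (Sept 5 to Sept 14), two orders per block.
blocks = [
    Block(date(2017, 9, d), date(2017, 9, d),
          [(date(2017, 9, d), 10.0 * d), (date(2017, 9, d), 10.0 * d + 2)])
    for d in range(5, 15)
]
avg, scanned = average_order_size(blocks, date(2017, 9, 10), date(2017, 9, 11))
print(avg, scanned)
```

In this toy layout only 2 of the 10 blocks are read; the other 8 are skipped from metadata alone.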
-
The Problem
Analytics: Ad-Hoc / Exploratory Analysis + Recurring Workloads
Focus of existing work
Give workload => Return partitioning layout
Problems:
1. Tedious to collect workload
2. May not be known upfront
3. Changes over time
How to get benefits of partitioning in this case?
-
Our Approach
Do everything adaptively!
Two-step process:
1. Upfront, load the dataset partitioned
2. As users query, incrementally improve the partitioning of the data
-
Distributed storage systems like HDFS: files broken into blocks (128 MB chunks)
Same number of blocks created as in HDFS. Each block now has additional metadata
-
Adaptive Re-Partitioning
When a user submits a query, the optimizer tries to improve the partitioning by reorganizing the partitioning tree
Here if queries ask A
-
System Architecture
Predicated Scan Query Example:
FIND employees WITH Age < 30 AND 20k < Salary < 40k
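A rough Python sketch of how a predicated scan could use a partitioning tree to find the blocks it needs to read. The `Node` and `Leaf` classes, the cut points, and the block ids are all hypothetical; the slides do not show the system's actual data structures.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    block_id: int


@dataclass
class Node:
    attr: str    # attribute this node partitions on
    cut: float   # rows with attr <= cut go left, the rest go right
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def lookup(tree, predicates):
    """Return ids of blocks that may hold matching rows.

    predicates maps attribute -> (low, high), both exclusive; None means unbounded.
    The result may over-approximate, but never misses a matching block.
    """
    if isinstance(tree, Leaf):
        return [tree.block_id]
    low, high = predicates.get(tree.attr, (None, None))
    blocks = []
    if low is None or low < tree.cut:      # the range reaches values <= cut
        blocks += lookup(tree.left, predicates)
    if high is None or high > tree.cut:    # the range reaches values > cut
        blocks += lookup(tree.right, predicates)
    return blocks


# Hypothetical tree: Age at the root, Salary below it.
tree = Node("Age", 30,
            Node("Salary", 20_000, Leaf(0),
                 Node("Salary", 40_000, Leaf(1), Leaf(2))),
            Node("Salary", 30_000, Leaf(3), Leaf(4)))

# FIND employees WITH Age < 30 AND 20k < Salary < 40k
query = {"Age": (None, 30), "Salary": (20_000, 40_000)}
print(lookup(tree, query))
```

With this made-up tree the query touches a single block out of five; a query with no predicates falls back to reading every block.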
-
1. Upfront Partitioner
Goal: Generate a partitioning tree WITHOUT an upfront query workload
> Generates a tree with heterogeneous branching
> Balances the partitioning benefit across all attributes
-
Allocation
Goal: Balance partitioning benefit across attributes
Allocation of attribute j ~ average partitioning of the attribute
$\text{Allocation}_j = \sum_{i \,\in\, \text{all nodes}} n_{ij} \, c_{ij}$
Upfront Partitioning Algorithm
Attribute Allocations
Partitioning Tree
Uniform if no workload information
Weighted if we have prior workload information
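A small worked example of the allocation sum, in Python. The slide does not define n_ij and c_ij; this sketch assumes n_ij is the number of ways node i splits attribute j (2 for a binary split) and c_ij is the fraction of the dataset that node i applies to. The tree itself is made up.

```python
from collections import defaultdict


def allocations(nodes):
    """Sum n * c per attribute over all internal nodes of a partitioning tree.

    nodes: iterable of (attribute, n, c), where n is the node's fan-out on that
    attribute and c is the fraction of the dataset the node applies to
    (an assumed reading of n_ij and c_ij).
    """
    alloc = defaultdict(float)
    for attribute, n, c in nodes:
        alloc[attribute] += n * c
    return dict(alloc)


# Hypothetical binary tree: the root splits all data on A; its two children
# each see half the data and split on B and on A respectively.
tree_nodes = [("A", 2, 1.0), ("B", 2, 0.5), ("A", 2, 0.5)]
print(allocations(tree_nodes))
```

Under this reading A ends up with three times the allocation of B, so a partitioner aiming for uniform allocations would presumably favor B for the next splits.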
-
2. Adaptive Query Executor
Goal: Return matching tuples + check if partitioning layout can be improved
Alternatives found via transformations on the partitioning tree
1. Swap Rule
2. Pushup Rule
3. Rotate Rule
-
Getting a plan
-
Cost Model
The system maintains a window W of past queries
Compute Benefit and Repartitioning Cost for the best plan
Repartitioning ONLY happens when the reduction in the total cost of the query workload is greater than the re-partitioning cost.
Solves constant re-partitioning due to random query sequences and bounds the worst-case impact.
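The repartitioning rule above can be sketched in Python as follows. The cost functions, the window size, and all numbers are hypothetical; the slides only state that the saving over the query window must exceed the re-partitioning cost.

```python
from collections import deque


def should_repartition(window, current_cost, candidate_cost, repartition_cost):
    """Repartition only if the total saving over the window exceeds the repartitioning cost.

    current_cost / candidate_cost: functions mapping a query to its estimated
    cost (e.g. blocks scanned) under the current and the candidate layout.
    """
    benefit = sum(current_cost(q) - candidate_cost(q) for q in window)
    return benefit > repartition_cost


W = 5                               # window size (assumed value)
window = deque(maxlen=W)            # old queries fall out automatically


def current(q):
    return 100                      # current layout: every query scans 100 blocks


def candidate(q):
    return 10 if q == "A" else 100  # candidate layout only helps queries on A


for q in ["B", "A", "B", "A"]:
    window.append(q)
decision_before = should_repartition(window, current, candidate, 200)  # benefit 180
window.append("A")
decision_after = should_repartition(window, current, candidate, 200)   # benefit 270
print(decision_before, decision_after)
```

Two queries on A are not enough to pay for the move in this toy setup; the third one tips the balance. A random query sequence that never accumulates enough benefit within the window never triggers repartitioning.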
-
Performance
4 metrics
1) Load time
2) Time taken by first query
3) Aggregate runtime over a workload
4) Incremental improvement with workload hints
-
Load Time
TPC-H: Scale Factor 200 + De-normalized. Data size: 1.4 TB
Loading performance: 1.38 times slower than HDFS
Load time scales almost linearly with data size and is independent of the number of columns
-
Time taken by first query
On average: 45% better than full scan, 20% better than k-d tree
-
Aggregate Workload Runtime
[Figure: time taken (in s) vs. query no. (0 to 200) for full scan, range, range2, and Amoeba]
Workload: 200 queries generated from random initialization of 8 query templates of the TPC-H benchmark
full scan – Baseline
range – partitions on orderdate (1 per date): 1.88x better
range2 – partitions on orderdate (64), r_name (4), c_mktsegment (4), quantity (8): 3.48x better
Amoeba – 3.84x better than baseline
-
Workload Hints
[Figure: time taken (in s) vs. query no. (0 to 200) for default and better init]
Better Init: Starts with custom allocation to mimic range2
6.67x better than full scan
Filtering ratio: default: 0.81, better init: 0.9
-
Conclusion
• Amoeba is a distributed storage system based on an adaptive data partitioning scheme
• Low loading overhead
• Improved first query performance
• Adapts to changes and significantly improves workload runtime
• Can exploit workload hints
• Allows analysts to get started right away and reap the benefits of partitioning without an upfront workload