a robust partitioning scheme for ad-hoc query...

A Robust Partitioning Scheme for Ad-Hoc Query Workloads

Anil Shanbhag, MIT

Joint work with Alekh Jindal (Microsoft), Sam Madden (MIT), Jorge Quiane (QCRI), Aaron J. Elmore (Univ. Chicago)

Today

Data collection is cheap => Lots of data!

Data Partitioning

Find average order size for all orders between Sept 10 and Sept 11, 2017

Data Skipping - Skip data blocks that are not necessary

10% selectivity query => 10x faster if data is partitioned on the selection predicate

Order date
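A minimal sketch of data skipping, assuming each block keeps the minimum and maximum order date of its rows as metadata. The helper names (`make_block`, `average_order_size`) and the sample data are hypothetical and only illustrate the idea: a block whose date range cannot overlap the query range is never read.

```python
from datetime import date

def make_block(rows):
    """Build a block: the rows plus min/max order-date metadata (hypothetical layout)."""
    dates = [d for d, _ in rows]
    return {"rows": rows, "min": min(dates), "max": max(dates)}

def average_order_size(blocks, lo, hi):
    """Average order size for orders dated in [lo, hi], and the number of blocks read."""
    total, count, blocks_read = 0.0, 0, 0
    for b in blocks:
        # Skip the block if its date range cannot overlap the query range.
        if b["max"] < lo or b["min"] > hi:
            continue
        blocks_read += 1
        for d, size in b["rows"]:
            if lo <= d <= hi:
                total += size
                count += 1
    avg = total / count if count else None
    return avg, blocks_read

# Ten blocks, one per day (Sept 5 to Sept 14, 2017): data partitioned on order date.
blocks = [
    make_block([(date(2017, 9, day), 10.0 * day), (date(2017, 9, day), 10.0 * day + 2)])
    for day in range(5, 15)
]

avg, blocks_read = average_order_size(blocks, date(2017, 9, 10), date(2017, 9, 11))
print(avg, blocks_read, len(blocks))
```

In this toy setup only the blocks for Sept 10 and Sept 11 are scanned; the other eight are skipped using metadata alone.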

The Problem

Analytics = Ad-Hoc / Exploratory Analysis + Recurring Workloads

Recurring workloads: focus of existing work

Give workload => Return partitioning layout

Problems:
1. Tedious to collect the workload
2. Workload may not be known upfront
3. Workload changes over time

How to get the benefits of partitioning in this case?

Our Approach

Do everything adaptively!

Two-step process:
1. Upfront, load the dataset partitioned
2. As users query, incrementally improve the partitioning of the data

Distributed storage systems like HDFS: files are broken into blocks (128 MB chunks)

Same number of blocks created as in HDFS. Each block now has additional metadata

Adaptive Re-Partitioning

When a user submits a query, the optimizer tries to improve the partitioning by reorganizing the partitioning tree

Here, if queries ask A

System Architecture

Predicated Scan Query Example:

FIND employees WITH Age < 30 AND 20k < Salary < 40k
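A minimal sketch of how a predicated scan could use a partitioning tree to find the blocks it must read. The `Node`/`Leaf` classes, the cut points, and the block ids are hypothetical, and predicates are simplified to one open interval per attribute. The lookup is conservative: it may visit a block with no matching rows, but it never skips a block that could contain one.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    block_id: int  # a data block on storage

@dataclass
class Node:
    attr: str                    # attribute this node partitions on
    cut: float                   # cut point
    left: Union["Node", Leaf]    # rows with attr <= cut
    right: Union["Node", Leaf]   # rows with attr > cut

def lookup(tree, preds):
    """Return ids of blocks that may hold matching rows.

    preds maps attribute -> (lo, hi), an open interval; None means unbounded.
    """
    if isinstance(tree, Leaf):
        return [tree.block_id]
    lo, hi = preds.get(tree.attr, (None, None))
    blocks = []
    if lo is None or lo < tree.cut:   # some value <= cut may satisfy the predicate
        blocks += lookup(tree.left, preds)
    if hi is None or hi > tree.cut:   # some value > cut may satisfy the predicate
        blocks += lookup(tree.right, preds)
    return blocks

# Hypothetical tree: Age at the root, Salary below it with different cut points.
tree = Node("Age", 30,
            Node("Salary", 40_000, Leaf(0), Leaf(1)),
            Node("Salary", 20_000, Leaf(2), Leaf(3)))

# FIND employees WITH Age < 30 AND 20k < Salary < 40k
preds = {"Age": (None, 30), "Salary": (20_000, 40_000)}
matching = lookup(tree, preds)
print(matching)
```

The scan then reads only the returned blocks and applies the full predicate to their rows.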


1. Upfront Partitioner

Goal: Generate a partitioning tree WITHOUT an upfront query workload

> Generates a tree with heterogeneous branching

> Balances the partitioning benefit across all attributes


Allocation

Goal: Balance partitioning benefit across attributes

Allocation of attribute i ~ average partitioning of attribute i

$\text{Allocation}_i = \sum_{j \in \text{all nodes}} n_{ij} \, c_{ij}$
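A small worked example of the allocation sum. The slide does not define the two terms, so this sketch assumes that n_ij is the number of ways node j partitions on attribute i (2 for a binary split) and that c_ij is the fraction of the dataset covered by node j. Both readings, and the function name `allocations`, are assumptions made for illustration.

```python
def allocations(nodes):
    """Sum n_ij * c_ij per attribute.

    nodes: list of (attribute, n_ways, coverage) tuples, one per tree node.
    The meaning of n_ways and coverage is an assumption, not taken from the slides.
    """
    alloc = {}
    for attr, n_ways, coverage in nodes:
        alloc[attr] = alloc.get(attr, 0.0) + n_ways * coverage
    return alloc

# Hypothetical tree: root splits on A (all data), its children split on B and C (half each).
nodes = [("A", 2, 1.0), ("B", 2, 0.5), ("C", 2, 0.5)]
alloc = allocations(nodes)
print(alloc)
```

Under these assumptions A receives twice the allocation of B or C, because its split applies to the whole dataset rather than to half of it.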

Upfront Partitioning Algorithm

Attribute Allocations => Partitioning Tree

Allocations are uniform if there is no workload information, weighted if we have prior workload information
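One way to turn attribute allocations into a partitioning tree can be sketched as a greedy procedure. This is a simplified illustration, not necessarily the algorithm used by the system: at each node it picks the attribute with the largest remaining allocation, splits at the median of a data sample, and charges the attribute 2 x (fraction of data covered). The function `build_tree`, the charging rule, and the sample data are all assumptions.

```python
import statistics

def build_tree(sample, remaining, depth, coverage=1.0):
    """Greedy sketch: sample is a list of dict rows, remaining maps attr -> allocation left."""
    if depth == 0 or len(sample) < 2:
        return {"leaf": True, "rows": len(sample)}
    # Pick the attribute that still has the most allocation left.
    attr = max(remaining, key=remaining.get)
    cut = statistics.median_low(r[attr] for r in sample)
    left = [r for r in sample if r[attr] <= cut]
    right = [r for r in sample if r[attr] > cut]
    if not right:  # all sampled values are equal, so no useful split exists
        return {"leaf": True, "rows": len(sample)}
    remaining[attr] -= 2 * coverage  # a binary split applied to `coverage` of the data
    return {
        "attr": attr,
        "cut": cut,
        "left": build_tree(left, remaining, depth - 1, coverage / 2),
        "right": build_tree(right, remaining, depth - 1, coverage / 2),
    }

# Hypothetical sample with two attributes.
sample = [{"A": i, "B": (i * 3) % 8} for i in range(8)]

# Uniform allocations: no workload information.
uniform = build_tree(sample, {"A": 2.0, "B": 2.0}, depth=2)

# Weighted allocations: prior workload information favours A.
weighted = build_tree(sample, {"A": 3.0, "B": 1.0}, depth=2)

print(uniform["attr"], uniform["left"]["attr"], uniform["right"]["attr"])
print(weighted["attr"], weighted["left"]["attr"], weighted["right"]["attr"])
```

With the weighted allocations the two subtrees under the root split on different attributes, which is a small instance of heterogeneous branching.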

2. Adaptive Query Executor

Goal: Return matching tuples + check if the partitioning layout can be improved

Alternatives found via transformations on the partitioning tree

1. Swap Rule

2. Pushup Rule

3. Rotate Rule

Getting a plan

Cost Model

The system maintains a window W of past queries

Compute Benefit and Repartitioning Cost for the best plan

Repartitioning ONLY happens when the reduction in the total cost of the query workload is greater than the re-partitioning cost.

This avoids constant re-partitioning due to random query sequences and bounds the worst-case impact.
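The repartitioning rule can be sketched as a comparison between the benefit over the query window and the one-time repartitioning cost. Everything concrete here is an assumption for illustration: cost is measured in blocks read, the cost functions and the constant `REPARTITION_COST` are hypothetical, and the window is a fixed-size queue of the last W queries.

```python
from collections import deque

def should_repartition(window, cost_current, cost_candidate, repartition_cost):
    """Repartition only if the total saving over the window exceeds the one-time cost."""
    benefit = sum(cost_current(q) - cost_candidate(q) for q in window)
    return benefit > repartition_cost

W = 5
window = deque(maxlen=W)  # keeps only the last W queries

# Hypothetical costs in blocks read; a query is represented by the attribute it filters on.
def cost_current(query_attr):
    return 10  # the current layout does not help any query

def cost_candidate(query_attr):
    return 4 if query_attr == "A" else 10  # the candidate layout helps queries on A

REPARTITION_COST = 20  # blocks rewritten, an assumed value

# A mixed query sequence: the benefit is 2 * 6 = 12, which is below the cost.
for attr in ["A", "B", "A", "C", "B"]:
    window.append(attr)
mixed = should_repartition(window, cost_current, cost_candidate, REPARTITION_COST)

# Queries on A start to dominate the window: the benefit is 4 * 6 = 24.
for attr in ["A", "A", "A", "A"]:
    window.append(attr)
skewed = should_repartition(window, cost_current, cost_candidate, REPARTITION_COST)

print(mixed, skewed)
```

Because the saving is summed over the whole window, a single outlier query cannot trigger a reorganization on its own.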

Performance

4 metrics

1) Load time

2) Time taken by first query

3) Aggregate runtime over a workload

4) Incremental improvement with workload hints

Load Time

TPC-H: Scale Factor 200 + De-normalized. Data size: 1.4 TB

Loading performance: 1.38 times slower than HDFS

Load time scales almost linearly with data size and is independent of the number of columns

Time taken by first query

On average: 45% better than full scan, 20% better than k-d tree

Aggregate Workload Runtime

[Figure: Time Taken (in s) vs. Query No (0 to 200) for full scan, range, range2, and Amoeba]

Workload: 200 queries generated from random initialization of 8 query templates of the TPC-H benchmark

full scan – Baseline

range – partitions on orderdate (1 per date): 1.88x better

range2 – partitions on orderdate (64), r_name (4), c_mktsegment (4), quantity (8): 3.48x better

Amoeba – 3.84x better than baseline

Workload Hints

[Figure: Time Taken (in s) vs. Query No (0 to 200) for default and better init]

Better Init: Starts with custom allocation to mimic range2

6.67x better than full scan

Filtering ratio: default: 0.81, better init: 0.9

Conclusion

• Amoeba is a distributed storage system based on an adaptive data partitioning scheme
• Low loading overhead
• Improved first query performance
• Adapts to changes and significantly improves workload runtime
• Can exploit workload hints
• Allows analysts to get started right away and reap the benefits of partitioning without an upfront workload
