TRANSCRIPT
High Performance Computing with doAzureParallel
Using Azure as your Parallel Backend for Embarrassingly Parallel Work
Microsoft
JS Tan
Azure Big Compute
• Commodity, most value for cost
• Fast processors, higher memory-to-core ratio, SSDs
• Most memory, Intel Xeon processors
• HPC / low latency VMs for compute intensive workloads
• GPU enabled VMs for visualization/compute
• Fast processors, lower memory-to-core ratio, SSDs
Azure Infrastructure
What is Batch?
[Diagram: an app with many individual tasks and many computers/VMs; tasks are assigned to computers/VMs]
Scenarios
• A quant back-testing portfolio strategies
• A data scientist optimizing their model & parameter tuning
• A life-science researcher doing genome sequencing
What do they have in common?
• Scale: computationally expensive work, need to scale up in order to get results back quickly
• Minimal IT management: the user is the domain specialist, not an IT specialist
• Elastic compute: temporary need for a lot of capacity
• Cost effective: low cost strategies are important!
+ They are all probably using R…
doAzureParallel is... an R package that uses Azure as a parallel backend for popular open source tools: foreach, caret, dplyr, etc.
foreach using doAzureParallel
foreach (i = 1:100) %dopar% {
  myParallelAlgorithm(...)
}
Microsoft Azure
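The loop on this slide can be tried locally before any Azure setup. The following is a minimal sketch, assuming the foreach package is installed: `registerDoSEQ()` is a local sequential stand-in for the Azure backend, and `my_parallel_algorithm` is a hypothetical placeholder for `myParallelAlgorithm(...)`, here a small Monte Carlo estimate of pi.

```r
library(foreach)

# Local stand-in backend so the sketch runs without an Azure account.
# With doAzureParallel, a registered Azure Batch cluster would take its place.
registerDoSEQ()

# Hypothetical placeholder for the slide's myParallelAlgorithm(...):
# estimate pi from n random points in the unit square.
my_parallel_algorithm <- function(seed, n = 10000) {
  set.seed(seed)
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

# Each iteration is independent of the others, which is what makes the
# work embarrassingly parallel. .combine = c collects results in a vector.
estimates <- foreach(i = 1:100, .combine = c) %dopar% {
  my_parallel_algorithm(seed = i)
}

pi_hat <- mean(estimates)
print(pi_hat)
```

Because no iteration reads another iteration's result, the same loop body can be handed to any registered `%dopar%` backend without modification.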
doAzureParallel on Azure Batch
Azure Batch is a platform service that provides easy job scheduling and cluster management, allowing applications or algorithms to run in parallel at scale.
• Capacity on demand; jobs on demand
• Autoscale (more on this later)
• Minimal cluster management (node failure, install, etc.)
• Hardware choice: use any VM size
• Pay by the minute
• Cost effective: no charge for using it, you only pay for the VMs
• More cost effective: low priority VMs (more on this later)
If you want to run jobs using elastic compute, Batch is a great fit!
Scale
• From 1 to 10,000 VMs for a cluster
• From 1 to millions of tasks
• Your selection of hardware:
  • General compute VMs (A-Series / D-Series)
  • Memory/storage optimized (G-Series)
  • Compute optimized (F-Series)
  • GPU enabled (N-Series)
• Results from computing the Mandelbrot set when scaling up:
[Chart comparing: local machine, 5 parallel workers, 10 parallel workers, 20 parallel workers]
Minimal Cluster Management
• Abstract away complex Azure/cloud concepts
• Zero IT-level management
• Work entirely in RStudio
• Monitor/debug your jobs directly in RStudio
• Manage your cluster and multiple jobs directly in RStudio
• The results of your distributed, large scale work can be returned directly to your R session
Minimal code change
• Minimal code change to use doAzureParallel
• Easy to use, and you can get started in just a few lines of code
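As a rough illustration of those few lines, the sketch below follows the setup steps described in the doAzureParallel README: set credentials, create the cluster, and register it as the `%dopar%` backend. The file names `credentials.json` and `cluster.json` are assumptions, and both files must be filled in by hand beforehand (`generateCredentialsConfig()` and `generateClusterConfig()` write templates). When the package or the files are missing, the sketch falls back to a local sequential backend so it still runs; that fallback is an addition for illustration, not part of the package workflow.

```r
library(foreach)

# Assumed file names; fill both in with your own account and pool settings.
credentials_file <- "credentials.json"
cluster_file <- "cluster.json"

use_azure <- requireNamespace("doAzureParallel", quietly = TRUE) &&
  file.exists(credentials_file) && file.exists(cluster_file)

if (use_azure) {
  doAzureParallel::setCredentials(credentials_file)      # Batch and Storage keys
  cluster <- doAzureParallel::makeCluster(cluster_file)  # provisions the pool
  doAzureParallel::registerDoAzureParallel(cluster)      # Azure becomes the backend
} else {
  registerDoSEQ()  # local fallback so the sketch still runs
}

# The loop itself is identical for either backend.
squares <- foreach(i = 1:10, .combine = c) %dopar% {
  i^2
}
print(squares)

if (use_azure) {
  doAzureParallel::stopCluster(cluster)  # delete the pool to stop paying for VMs
}
```

The only Azure-specific lines are the three setup calls and the final `stopCluster()`; the `foreach` loop is unchanged from what would run on a local backend.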
Elastic Compute
• Compute on demand
• Create/delete your cluster as you need
• Autoscaling pool = maximizing cloud elasticity
• Long running batch jobs / overnight
• Daily scheduled work: pre-provision the cluster so it's ready for you at the beginning of the day
• Bursty work
Cost Effective
• Low-priority = (extremely) low costs
• Provisioning VMs from Azure's surplus capacity at up to an 80% discount
• Your Azure cluster can contain both regular (dedicated) VMs and low-priority VMs
[Diagram: My Local R Session connected to Azure Batch, which runs Dedicated VMs and Low Priority VMs at up to 80% discount]
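To make the discount concrete, here is a small worked calculation. Every number in it is hypothetical: the dedicated price of 0.10 USD per VM-hour, the node counts, and the 8-hour run are made up for illustration, and the full 80% discount is assumed even though the slides state it as an upper bound.

```r
# All prices and node counts below are hypothetical, for illustration only.
dedicated_price_per_hour <- 0.10  # assumed price of one dedicated VM (USD)
low_priority_discount <- 0.80     # upper bound quoted in the slides
low_priority_price_per_hour <-
  dedicated_price_per_hour * (1 - low_priority_discount)

# Total cost of running a mixed cluster for a given number of hours.
cluster_cost <- function(dedicated_nodes, low_priority_nodes, hours) {
  hours * (dedicated_nodes * dedicated_price_per_hour +
             low_priority_nodes * low_priority_price_per_hour)
}

all_dedicated    <- cluster_cost(20, 0, hours = 8)   # 20 dedicated VMs
mixed            <- cluster_cost(5, 15, hours = 8)   # 5 dedicated + 15 low-priority
all_low_priority <- cluster_cost(0, 20, hours = 8)   # 20 low-priority VMs

print(c(all_dedicated = all_dedicated,
        mixed = mixed,
        all_low_priority = all_low_priority))
```

Under these assumptions the all-dedicated cluster costs 16 USD, the mixed cluster 6.40 USD, and the all-low-priority cluster 3.20 USD, so the mixed pool keeps a guaranteed baseline of five nodes while cutting the bill by 60%.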
Cost Effective: More about Low Priority
When should I use it?
• Long running work that can be broken into smaller pieces, and work that doesn't have a strict time limit to complete
• Experimentation, testing, evaluating models
What you need to know when using it:
• Possibility that Azure
  • will not allocate your VMs, OR
  • will take some or all of the capacity back
• If a node is pre-empted
  • Azure Batch will replace your node for you
  • Azure Batch will reschedule your work so that your job can successfully complete
Low Priority Scenarios
[Figure: three Azure Batch Pool charts of capacity over time, showing dedicated and low-priority nodes, with preempted capacity marked]
• Lowest cost
• Lower cost + guaranteed baseline capacity
• Lower cost + maintaining capacity w/ autoscale
Questions?
www.github.com/azure/doazureparallel
https://aka.ms/earl2017
What's new with doAzureParallel?
• Low priority support
• Richer job management experience
• Resource files to preload data
• Parameter tuning integration with caret
• Simple connector to Azure Blob Storage
R + Azure Batch
So what R workloads work great on Azure Batch?
• Simulation based work (VaR calculation, back-testing, Monte Carlo simulations, financial modelling)
• Parameter tuning / model evaluation (grid search, random search, cross validation, etc.)
• Computing against data / ETL jobs / data-prep jobs
What industries/verticals might be interested in using this?
• Financial Services
• Education & Research
• Sports analytics
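The parameter tuning workload listed above maps naturally onto `foreach`, since each candidate value can be evaluated independently. The sketch below is an illustrative example on synthetic data, not taken from the talk: it grid-searches the degree of a polynomial regression with 5-fold cross validation. The data, the helper `cv_error`, and the grid are all made up for this sketch, and `registerDoSEQ()` again stands in for an Azure backend.

```r
library(foreach)
registerDoSEQ()  # local stand-in; an Azure backend would be registered instead

# Synthetic data, made up for this sketch: a cubic signal plus noise.
set.seed(42)
n <- 200
x <- runif(n, -2, 2)
dat <- data.frame(x = x, y = x^3 - x + rnorm(n, sd = 0.5))
folds <- sample(rep(1:5, length.out = n))  # 5-fold assignment

# Cross-validated mean squared error of a degree-d polynomial fit.
# Data and folds are passed in explicitly so each task is self-contained.
cv_error <- function(degree, dat, folds) {
  errs <- sapply(sort(unique(folds)), function(k) {
    train <- dat[folds != k, ]
    test <- dat[folds == k, ]
    fit <- lm(y ~ poly(x, degree), data = train)
    mean((test$y - predict(fit, newdata = test))^2)
  })
  mean(errs)
}

# One task per candidate value; tasks do not depend on each other.
grid <- 1:6
results <- foreach(d = grid, .combine = rbind) %dopar% {
  data.frame(degree = d, cv_mse = cv_error(d, dat, folds))
}

best_degree <- results$degree[which.min(results$cv_mse)]
print(results)
print(best_degree)
```

Passing `dat` and `folds` as arguments keeps every variable the task needs visible in the loop body, which makes it easier for a remote backend to ship them to the workers.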
doAzureParallel (since initial release)
• Initial release in March
• Grassroots strategy
• End-user focused
• Financial Services targeted / key messaging has been around simulation based work
• Interest from the field
• Feedback