![Page 1: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/1.jpg)
SPRUCE: Special PRiority and Urgent Computing Environment
Advisor Demo
Nick Trebon
University of Chicago
Argonne National Laboratory
http://spruce.teragrid.org/
![Page 2: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/2.jpg)
Urgent Computing - 2University of Chicago Argonne National Lab
Urgent Computing Resource Selection
Given an urgent computation and a deadline, how does one select the “best” resource?
• “Best”: the resource that provides the configuration most likely to meet the deadline
• Configuration: a specification of the runtime parameters for an urgent computation on a given resource
  - Runtime parameters: # CPUs, input/output repository, priority, etc.
• Cost function (priority):
  - Normal --> no additional cost
  - SPRUCE --> token + intrusiveness to other users
![Page 3: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/3.jpg)
Probabilistic Bounds on Total Turnaround Time
• Input Phase: (IQ,IB)
• Resource Allocation Phase: (AQ,AB)
• Execution Phase: (EQ,EB)
• Output Phase: (OQ,OB)
• If each phase is independent, then:
  - Overall bound = IB + AB + EB + OB
  - Overall quantile ≥ IQ * AQ * EQ * OQ
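The composition rule above can be illustrated with a minimal Python sketch. The `PhaseBound` type, the `compose` helper, and all numeric values are hypothetical and chosen only for illustration; the sketch assumes each phase is summarized by a (quantile, bound) pair with the bound in seconds.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class PhaseBound:
    """Hypothetical container: a bound that holds with probability `quantile`."""
    quantile: float  # e.g. 0.95
    bound: float     # upper bound on the phase delay, in seconds


def compose(phases):
    """Combine per-phase bounds, assuming the phases are independent.

    Returns (overall quantile lower limit, overall bound).
    """
    overall_bound = sum(p.bound for p in phases)
    overall_quantile = prod(p.quantile for p in phases)
    return overall_quantile, overall_bound


# Illustrative values only: input, allocation, execution, output.
phases = [
    PhaseBound(0.95, 120.0),
    PhaseBound(0.90, 600.0),
    PhaseBound(0.95, 1800.0),
    PhaseBound(0.95, 90.0),
]
overall_quantile, overall_bound = compose(phases)
```

Because the quantiles multiply, four phases at high individual quantiles still yield a noticeably lower composite quantile, which is why the choice of per-phase quantiles matters.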
![Page 4: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/4.jpg)
IB+AB+EB+OB: File Staging Delay
• Methodology is the same for input and output file staging
• Utilize the Network Weather Service (NWS) to generate bandwidth predictions based upon probes
  - Wolski, et al.: use the MSE as the sample variance of a normal distribution to generate an upper confidence interval
• Problems?
  - Predicting bandwidth for output file transfers
  - NWS uses TCP-based probes, while transfers utilize GridFTP
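One way to read the MSE-based bound is sketched below in Python. This is an interpretation, not the Advisor's code: it assumes the forecast error is normally distributed with variance equal to the MSE, takes a one-sided lower limit on bandwidth, and converts that into an upper bound on transfer time. The function name, parameters, and units (megabytes and megabytes per second) are hypothetical, and no NWS API is called.

```python
from math import sqrt
from statistics import NormalDist


def transfer_time_upper_bound(file_size_mb, predicted_bw, mse, quantile):
    """Upper bound (seconds) on transfer time at the given quantile.

    Assumption: bandwidth forecast error ~ Normal(0, mse), so a one-sided
    lower limit on bandwidth is predicted_bw - z * sqrt(mse).
    """
    z = NormalDist().inv_cdf(quantile)          # standard normal quantile
    lower_bw = predicted_bw - z * sqrt(mse)     # pessimistic bandwidth, MB/s
    if lower_bw <= 0:
        # The normal approximation gives no usable bound at this quantile.
        return float("inf")
    return file_size_mb / lower_bw


# Illustrative values only: 1000 MB file, 10 MB/s forecast, MSE of 4 (MB/s)^2.
bound_95 = transfer_time_upper_bound(1000.0, 10.0, 4.0, 0.95)
```

A higher requested quantile lowers the assumed bandwidth and therefore lengthens the bound; when the forecast is noisy relative to its mean, the sketch returns infinity rather than a misleading number.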
![Page 5: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/5.jpg)
IB+AB+EB+OB: Resource Allocation Delay
• Normal priority (no SPRUCE)
  - Utilize the existing Queue Bounds Estimation from Time Series (QBETS)
• SPRUCE policies
  - Next-to-run: modified QBETS, using Monte Carlo simulation on previously observed job history
  - Pre-emption: utilize the Binomial Method Batch Predictor (BMBP) to predict bounds based upon preemption history
  - Other possibilities: elevated priority, checkpoint/restart
![Page 6: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/6.jpg)
Resource Allocation Concerns
• Resource allocation bounds are predicted entirely from historical data, not the current queue state
  - What if the necessary nodes are immediately available?
  - What if it is clear that the bounds may be exceeded? Example: higher-priority jobs already queued
• Possible solution: query the Monitoring and Discovery Services framework (Globus) to return the current queue state
![Page 7: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/7.jpg)
IB+AB+EB+OB: Execution Delay
• Approach: generate an empirical bound for a given urgent application on each warm standby resource
  - Utilize the BMBP methodology to generate probabilistic bounds
  - Methodology is general and non-parametric
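The non-parametric idea behind a binomial-method bound can be sketched in Python as follows. This is only the order-statistic core under the assumption of independent, identically distributed samples; it is not the BMBP implementation, and the function names and sample values are hypothetical.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def quantile_upper_bound(samples, quantile, confidence):
    """Return an observed value that exceeds the true `quantile` of the
    underlying distribution with probability at least `confidence`,
    or None if there are too few samples.

    The k-th smallest sample exceeds the true quantile exactly when at most
    k - 1 samples fall at or below it, and that count is Binomial(n, quantile).
    """
    xs = sorted(samples)
    n = len(xs)
    for k in range(1, n + 1):
        if binomial_cdf(k - 1, n, quantile) >= confidence:
            return xs[k - 1]
    return None


# Illustrative history: 100 past execution times, in seconds.
history = list(range(1, 101))
bound = quantile_upper_bound(history, quantile=0.5, confidence=0.95)
```

No distribution is fitted to the history, so the same routine could in principle be applied to execution times, queue delays, or preemption delays, provided enough observations exist for the requested quantile and confidence.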
![Page 8: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/8.jpg)
Calculating the Composite Probability
• Goal: to determine a probabilistic upper bound for a given configuration
• Approximate approach: query a small subset (e.g., 5) of quantiles for each individual phase, then select the combination that yields the highest composite quantile with a total bound < deadline
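The approximate approach can be sketched as an exhaustive search over the queried quantiles. The Python below is a minimal illustration that assumes independent phases; `best_combination`, the option lists, and every number are hypothetical.

```python
from itertools import product
from math import prod


def best_combination(phase_options, deadline):
    """Pick one (quantile, bound) pair per phase.

    Returns (composite quantile, total bound, chosen pairs) for the
    combination with the highest composite quantile whose total bound is
    strictly below the deadline, or None if no combination qualifies.
    """
    best = None
    for combo in product(*phase_options):
        total_bound = sum(bound for _, bound in combo)
        if total_bound >= deadline:
            continue  # this combination cannot guarantee the deadline
        composite = prod(q for q, _ in combo)
        if best is None or composite > best[0]:
            best = (composite, total_bound, combo)
    return best


# Illustrative (quantile, bound in seconds) options for each phase.
options = [
    [(0.5, 10), (0.95, 30)],    # input staging
    [(0.5, 100), (0.95, 400)],  # resource allocation
    [(0.5, 200), (0.95, 300)],  # execution
    [(0.5, 10), (0.95, 20)],    # output staging
]
result = best_combination(options, deadline=500)
```

With 5 quantiles per phase and 4 phases, the search covers 5^4 = 625 combinations per configuration, which is small enough to enumerate directly.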
![Page 9: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/9.jpg)
Advisor Demo
Screenshots
![Page 10: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/10.jpg)
![Page 11: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/11.jpg)
![Page 12: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/12.jpg)
![Page 13: SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory](https://reader036.vdocuments.net/reader036/viewer/2022062803/56649f335503460f94c501ae/html5/thumbnails/13.jpg)