www.cs.man.ac.uk/cnc
Use Case Scenarios for Performance Control of Grid-based
Metacomputing
John Gurd, Ken Mayes, Graham Riley
3rd Grid Performance Workshop, June 2005
Overview
Preamble
• The case for Performance Control
Context
• Malleable, component-based Grid applications
The PERCO (Performance Control) System
• Design and implementation
Homogeneous Components
• Simple performance control scenarios
More Complex Scenarios
Conclusions
Achieving Performance
Engineering for maximum performance:
• coarse design, then fine tuning
• requires high degree of repeatability
• benefits from homogeneity, symmetry, etc.
Control to achieve (less than maximum) target:
• use negative feedback control at run-time
• necessary to approach dynamic environment
• helps to deal with heterogeneity
How to Control Performance?
Requires (negative) feedback
• needs sensors, actuators and compensators
• timers, control ‘handles’, predictive models
Whole system vs. piece-wise control
• who is responsible for what?
Perception is that a hierarchy is needed
• hence need hierarchical software structure
[Figure: negative feedback control loop; labels: actuator, feedback function, error]
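The feedback loop on this slide can be sketched in a few lines of Python. This is only an illustration, not PERCO code: the proportional compensator, the `gain` value and the toy "plant" model (measured rate proportional to the control handle) are all assumptions made for the example.

```python
def control_step(target, measured, handle, gain=0.25):
    """One negative-feedback iteration (hypothetical proportional compensator)."""
    error = target - measured        # sensor reading compared with the target
    return handle + gain * error     # new setting for the control 'handle'


def simulate(target, steps=50, handle=1.0, rate_per_unit=2.0):
    """Toy plant (assumed): measured rate is proportional to the handle."""
    history = []
    for _ in range(steps):
        measured = rate_per_unit * handle                  # sensor, e.g. a timer
        history.append(measured)
        handle = control_step(target, measured, handle)    # actuate
    return history


history = simulate(target=10.0)
```

With these assumed numbers the error halves at every iteration, so the measured rate settles on the target; a real compensator would use a predictive model of the component rather than a fixed gain.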
Controllable Components?
Several groups have suggested that control should be effected via a component-based software architecture
• degenerates to singleton component
• can reduce the complexity of control
• can form a control hierarchy
Overview of PERCO
Two-tier hierarchical performance control
• CPS (Component Performance Steerer)
- one wrapped around each component
- all attached to APS (see below)
- maximises performance on deployed platform
• APS (Application Performance Steerer)
- (re)deploys components on available resources
- maximises performance on allocated platforms
Requires an external resource allocator (from which to obtain a set of resources in which to effect its deployments)
Modus Operandi
Components progress via a sequence of progress points, at each of which a component calls out to its CPS for any component-specific performance control actions (local actuation; requires component to be malleable)
Certain progress points are also safe-points (i.e. the component is in a state that permits it to be redeployed) and, at these points, the CPS can call out to the APS for redeployment-based performance control actions (the APS means of actuation)
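The call-out pattern described above might look roughly like the following Python sketch. The class and method names (`progress_point`, `at_safe_point`), the component name and the choice of safe-points are hypothetical; they are not taken from the PERCO implementation.

```python
class APS:
    """Application Performance Steerer (hypothetical interface)."""

    def __init__(self):
        self.redeploy_requests = []

    def at_safe_point(self, component_name, point):
        # A real APS would decide here whether to redeploy the component;
        # this sketch only records that it was consulted.
        self.redeploy_requests.append((component_name, point))


class CPS:
    """Component Performance Steerer wrapped around one component."""

    def __init__(self, component_name, aps, safe_points):
        self.component_name = component_name
        self.aps = aps
        self.safe_points = set(safe_points)
        self.local_calls = []

    def progress_point(self, point):
        # Opportunity for component-specific (local) actuation.
        self.local_calls.append(point)
        # Only safe-points are passed up to the APS for possible redeployment.
        if point in self.safe_points:
            self.aps.at_safe_point(self.component_name, point)


def run_component(cps, n_phases):
    """Call out to the CPS at every phase boundary."""
    cps.progress_point(0)
    for phase in range(1, n_phases + 1):
        # ... the component's work for this phase would run here ...
        cps.progress_point(phase)


aps = APS()
cps = CPS("lb3d-0", aps, safe_points={0, 4, 7})
run_component(cps, n_phases=7)
```

In this run the CPS is called at all eight progress points, while the APS is consulted only at the three points marked safe.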
Progress Points
Assume that the execution of components and application proceeds through phases, and that the phase boundaries are marked by progress points.
Can take decisions about performance and (possibly) actuate at the progress points
[Figure: timeline with progress points 0 to 7 marking the boundaries of phases Ph 1 to Ph 7]
Application vs. Component Progress Points
Application progress points need to be safe points
[Figure: application progress points and component progress points shown against time for the APS, CPS and component]
PERCO Infrastructure
Each component is attached to a local loader which is capable of moving the component safely around the distributed Grid hardware according to the APS commands
The local loaders act in concert with the APS to form a virtual loader layer for the application
Each CPS communicates with the local loader on behalf of its component
Controllable Components?
Several groups have suggested that control should be effected via a component-based software architecture
• degenerates to singleton component
• can reduce the complexity of control
• can form a control hierarchy
But where do the components come from?
• a knotty problem (cf. RealityGrid LB3D)
One Answer . . .
Homogeneous components
• each component a copy of the same model
• used e.g. for parameter search
• e.g. LB3D from RealityGrid
Performance control scenarios
• N instances of LB3D, finish as fast as possible
- equates to keeping them in (approximate) timestep with each other (see next slides)
• execute N instances of LB3D at specified rates relative to one another
- e.g. N=2, one instance executes twice as many timesteps per unit of time as the other
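Both scenarios can be read as the same error signal: each instance's timestep count, scaled by its target relative rate, should stay close to the others'. The Python sketch below shows one possible way to compute that signal; the function names and the use of the mean as the reference are assumptions for illustration, not the method used in PERCO.

```python
def normalised_progress(timesteps, weights):
    """Timesteps completed, scaled by each instance's target relative rate."""
    return [t / w for t, w in zip(timesteps, weights)]


def imbalance(timesteps, weights):
    """Per-instance error against mean normalised progress.

    A positive value means the instance is behind and is a candidate for
    more resources; a negative value means it is ahead.
    """
    progress = normalised_progress(timesteps, weights)
    mean = sum(progress) / len(progress)
    return [mean - p for p in progress]


# N=2, second instance should run twice as many timesteps as the first.
errors = imbalance([100, 180], [1, 2])

# Homogeneous case: equal weights means "keep in timestep with each other".
in_step = imbalance([50, 50, 50], [1, 1, 1])
```

In the first call the second instance has completed 180 timesteps where 200 would match its 2:1 target, so its error is positive and it is the one lagging.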
Slightly More Complex Answer . . .
“Almost homogeneous” components
• each component a copy of a similar model, but ...
• ... with different driving parameters
- e.g. LB3D with different resolutions
Performance control scenarios
• TeraGyroid experiment (from RealityGrid; conducted during SC’2003; see next slide)
• IntBioSim “beading” method
• Hurricane “tracking”
Embedded high resolution subdomains
• when does extra resolution become new physics?
Even More Complex Answer: Coupled Models
Many scientific modellers are finding a need to link together multiple models:
• climate/envt. models (ocean + atmosphere + ...)
• multi-scale phenomena (CFD + MD = HybridMD)
• aircraft lightning strike (CEM + a/f structure)
+ others, all needing high performance & ‘Grid’
The individual models seem to constitute ready-made components:
• can these be used for performance control?
Summary
We are investigating the practicalities of component-based performance control in Grid execution environments
A prototype performance control system is being developed and we have shown that it can be used to achieve a scientifically meaningful high-level performance objective
We are ready to apply it to realistic scientific coupled model applications
K.R. Mayes, M. Luján, G.D. Riley, J. Chin, P.V. Coveney, J.R. Gurd, Towards performance control on the Grid, Philosophical Transactions of the Royal Society of London: Series A, to appear, August 2005.
Related Projects at Manchester
FLUME - design of next generation Unified Model software
• funded by The Met Office (led by Mick Carter)
RealityGrid – condensed matter modelling
• EPSRC-funded e-Science (led by Peter Coveney at UCL)
SoftIAM - climate impact, integrated assessment modelling
• funded by the Tyndall Centre (led by Rachel Warren)
IntBioSim – integrated biological simulation
• BBSRC-funded e-Science (led by Mark Sansom at Oxford)
GENIEfy – Earth system modelling
• NERC-funded e-Science (led by Tim Lenton + Tyndall C)
Weblinks
For more information check:
http://www.cs.man.ac.uk/cnc
http://www.realitygrid.org
http://www.intbiosim.org (under construction)