Chalmers e-Science Initiative Seminar
The Mapper project receives funding from the EC's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° RI-261507.
Multiscale Applications on European e-Infrastructures
Marian Bubak, AGH Krakow (PL) and University of Amsterdam (NL)
on behalf of the MAPPER Consortium
http://www.mapper-project.eu/
Chalmers e-Science Initiative Seminar, 2 Dec 2011
2
Academic Computer Centre
CYFRONET AGH (1973)
120 employees
http://www.cyfronet.pl/en/
Department of Computer Science AGH (1980)
800 students, 70 employees
http://www.ki.agh.edu.pl/uk/index.htm
Faculty of Electrical Engineering, Automatics, Computer Science and
Electronics (1946)
4000 students, 400 employees
http://www.eaie.agh.edu.pl/
AGH University of Science and Technology (1919)
15 faculties, 36000 students, 4000 employees
http://www.agh.edu.pl/en
Other 14 faculties
Distributed Computing Environments (DICE) Team (http://dice.cyfronet.pl)
About the speaker
University of Amsterdam, Institute for Informatics, Computational Science
http://www.science.uva.nl/~gvlam/wsvlam
3
DICE team (http://dice.cyfronet.pl)
• Main research interests:
– investigation of methods for building complex scientific collaborative applications and large-scale distributed computing infrastructures
– elaboration of environments and tools for e-Science
– development of a knowledge-based approach to services, components, and their semantic composition and integration
CrossGrid 2002-2005
interactive compute- and data-intensive applications
K-Wf Grid 2004-2007
knowledge-based composition of grid workflow applications
CoreGRID 2004-2008
problem solving environments, programming models
GREDIA 2006-2009
grid platform for media and banking applications
ViroLab 2006-2009
GridSpace virtual laboratory
PL-Grid 2009-2011
advanced virtual laboratory
gSLM 2010-2012
service level management for grid and clouds
UrbanFlood 2009-2012
Common Information Space for Early Warning Systems
MAPPER 2010-2013
computational strategies, software and services for distributed multiscale simulations
VPH-Share 2011-2015
federating cloud resources for development and execution of VPH computationally and data intensive applications
Collage 2011-?
Executable Papers; 1st award of Elsevier Competition at ICCS2011
4
Plan
• Multiscale applications
• Multiscale modeling
• Objectives of the MAPPER project
• Programming and Execution Tools
• MAPPER infrastructure
• ISR application scenario
• Summary
5
Vision
• Distributed Multiscale Computing...
– Strongly science driven, application pull
• ...on existing and emerging European e-Infrastructures, ...
• ... and exploiting as much as possible services and software developed in earlier (EU-funded) projects.
– Strongly technology driven, technology push
6
Nature is multiscale
• Natural processes are multiscale
– 1 H2O molecule
– A large collection of H2O molecules, forming H-bonds
– A fluid called water, and, in solid form, ice.
7
Multiscale modeling
• Scale Separation Map
• Nature acts on all the scales
• We set the scales
• And then decompose the multiscale system in single scale sub-systems
• And their mutual coupling
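One way to make the Scale Separation Map concrete is to treat each single scale sub-system as a rectangle on the temporal and spatial axes and compare the rectangles pairwise. The sketch below is only an illustration under that assumption: the class names, the `relation` helper and all numbers are hypothetical and are not taken from the MAPPER tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """A closed range on one axis, from resolution to total extent."""
    smallest: float  # e.g. time step or grid spacing
    largest: float   # e.g. simulated time or domain size

    def overlaps(self, other: "Scale") -> bool:
        return self.smallest <= other.largest and other.smallest <= self.largest


@dataclass(frozen=True)
class Submodel:
    """A single scale sub-system placed on the scale separation map."""
    name: str
    temporal: Scale  # seconds
    spatial: Scale   # metres


def relation(a: Submodel, b: Submodel) -> str:
    """Describe how two submodels sit relative to each other on the map."""
    in_time = a.temporal.overlaps(b.temporal)
    in_space = a.spatial.overlaps(b.spatial)
    if in_time and in_space:
        return "overlap in time and space"
    if in_space:
        return "overlap in space, separated in time"
    if in_time:
        return "overlap in time, separated in space"
    return "separated in time and space"


# Hypothetical values, chosen only to illustrate the idea.
flow = Submodel("flow", Scale(1e-4, 1.0), Scale(1e-5, 1e-3))
growth = Submodel("growth", Scale(3600.0, 2.6e6), Scale(1e-5, 1e-3))
print(relation(flow, growth))
```

The relation between two rectangles is what later suggests which coupling method is appropriate for that pair of sub-systems.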
[Figure: scale separation map; axes: temporal scale (t, T) and spatial scale (x, L)]
8
From a Multiscale System to many Singlescale Systems
• Identify the relevant scales
• Design specific models which solve each scale
• Couple the subsystems using a coupling method
[Figure: scale separation map; axes: temporal scale (t, T) and spatial scale (x, L)]
9
Why multiscale models?
• There is simply no hope to computationally track complex natural processes at their finest spatio-temporal scales.
– Even with the ongoing growth in computational power.
10
Minimal requirement
\[ \frac{\text{cost of multiscale solver}}{\text{cost of fine scale solver}} \ll 1, \qquad \text{errors in quantities of interest} < \text{tol} \]
11
Multiscale computing
• Inherently hybrid models are best serviced by different types of computing environments
• When simulated in three dimensions, they usually require large scale computing capabilities.
• Such large scale hybrid models require a distributed computing ecosystem, where parts of the multiscale model are executed on the most appropriate computing resource.
• Distributed Multiscale Computing
12
Two paradigms
• Loosely Coupled
– One single scale model provides input to another
– Single scale models are executed once
– workflows
• Tightly Coupled
– Single scale models call each other in an iterative loop
– Single scale models may execute many times
– Dedicated coupling libraries are needed
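The difference between the two paradigms can be sketched in a few lines. This is a toy illustration, not MAPPER code: the function names and the stand-in "solvers" are hypothetical, and a real tightly coupled run would exchange data through a coupling library rather than through direct function calls.

```python
def loosely_coupled(models, data):
    """Workflow style: each single scale model runs once and feeds the next."""
    for model in models:
        data = model(data)
    return data


def tightly_coupled(macro_step, micro_step, state, iterations):
    """Iterative style: two single scale models call each other in a loop."""
    for _ in range(iterations):
        boundary = micro_step(state)         # fine scale responds to the current state
        state = macro_step(state, boundary)  # coarse scale advances using that response
    return state


# Toy stand-ins for real solvers (hypothetical).
def preprocess(x):
    return x + 1


def simulate(x):
    return x * 10


def micro(state):
    return 0.5 * state


def macro(state, boundary):
    return state + boundary


print(loosely_coupled([preprocess, simulate], 1))
print(tightly_coupled(macro, micro, 1.0, 3))
```

In the first function every model executes exactly once; in the second, both models execute on every iteration, which is why a dedicated coupling mechanism is needed once they live in separate processes or on separate sites.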
[Figure: two scale separation maps, one per paradigm; axes: temporal scale (t, T) and spatial scale (x, L)]
13
MAPPER
Multiscale APPlications on European e-infRastructures
University of Amsterdam
Max-Planck Gesellschaft zur Foerderung der Wissenschaften E.V.
University of Ulster
Poznan Supercomputing and Networking Centre
Akademia Gorniczo-Hutnicza im. Stanislawa Staszica w Krakowie
Ludwig-Maximilians-Universität München
University of Geneva
Chalmers Tekniska Högskola
University College London
14
Motivation: user needs
VPH
Fusion
Computational Biology
Material Science
Engineering
Distributed Multiscale Computing Needs
15
Applications
• 7 applications from 5 scientific domains ...
• ... brought under a common generic multiscale computing framework
virtual physiological human, fusion, hydrology, nano material science, computational biology
SSM, coupling topology, (x)MML, task graph, scheduling
16
Ambition
• Develop computational strategies, software and services for distributed multiscale simulations across disciplines, exploiting existing and evolving European e-infrastructure
• Deploy a computational science infrastructure
• Deliver high quality components aiming at large-scale, heterogeneous, high performance multi-disciplinary multiscale computing
• Advance state-of-the-art in high performance computing on e-infrastructures: enable distributed execution of multiscale models across e-Infrastructures
17
High level tools: objectives
• Design and implement an environment for composing multiscale simulations from single scale models
– encapsulated as scientific software components
– distributed in various e-infrastructures
– supporting loosely coupled and tightly coupled paradigms
• Support composition of simulation models:
– using scripting approach
– by reusable “in-silico” experiments
• Allow interaction between software components from different e-Infrastructures in a hybrid way.
• Measure efficiency of the tools
18
Requirements analysis
• Focus on multiscale applications that are described as a set of connected, but independent single scale modules and mappers (converters)
• Support describing such applications in a uniform (standardized) way to:
– analyze application behavior
– support switching between different versions of the modules with the same scale and functionality
– support building different multiscale applications from the same modules (reusability)
• Support computationally intensive simulation modules
– requiring HPC or Grid resources
– often implemented as parallel programs
• Support tight (with loop), loose (without loop) and hybrid (both) connection modes
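Since tight connections are the ones that form a loop and loose connections are the ones that do not, the connection mode of an application can be read off its module graph. The sketch below assumes the application is given as a plain dictionary from module name to the modules it sends data to. That representation, the function names and the example graph are all hypothetical and are not the format used by the MAPPER tools.

```python
def reachable(graph, start):
    """Return all modules reachable from start by following one or more edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def classify(graph):
    """Split modules into tightly coupled groups (on a loop) and loose modules."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    tight_groups, loose, assigned = [], [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        if n in reach[n]:  # n can reach itself, so it lies on a loop
            group = sorted(m for m in reach[n] if n in reach[m])
            tight_groups.append(group)
            assigned.update(group)
        else:
            loose.append(n)
            assigned.add(n)
    return tight_groups, loose


def connection_mode(graph):
    """Return 'tight', 'loose' or 'hybrid' for the whole application."""
    tight_groups, loose = classify(graph)
    if tight_groups and loose:
        return "hybrid"
    return "tight" if tight_groups else "loose"


# Hypothetical application: an initial module, a two-module loop, a post-processor.
app = {"init": ["A"], "A": ["B"], "B": ["A", "post"], "post": []}
print(classify(app))
print(connection_mode(app))
```

The pairwise reachability check is quadratic in the number of modules, which is acceptable for the handful of modules in a typical multiscale application; a strongly connected components algorithm would do the same job in linear time.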
19
Overview of tools
• MAPPER Memory (MaMe) - a semantics-aware persistence store to record metadata about models and scales
• Multiscale Application Designer (MAD) - visual composition tool transforming high level MML description into executable experiment
• GridSpace Experiment Workbench (EW) - execution and result management of experiments on e-infrastructures via interoperability layers (AHE, QCG)
[Figure: architecture of the MAPPER tools]
– User interfaces and visual tools (Task 8.1): Multiscale Application Designer, GridSpace Experiment Workbench, MaMe Web Interface, result and file browsing, provenance interface
– XMML Repository and Mapper Memory (MaMe) (Task 8.2)
– GridSpace Execution Engine, GridSpace Registry of Interpreters (such as MUSCLE), Result Management (Task 8.3)
– Provenance (Task 8.4)
– Interoperability layer (WP4): QCG-Broker, AHE
– Direct experiment hosts (UIs)
– Connections: REST, GS Experiment file, QCG-client API and GridFTP, Java API, ssh (some marked "currently")
– Legend: module implemented in the first prototype; module in the design phase; data flow, current or planned
Software packages created in WP7, adapted by WP4, integrated by WP5 and installed by WP6 on e-infrastructures
20
Multiscale modeling language
• Uniformly describes multiscale models and their computational implementation at an abstract level
• Two representations: graphical (gMML), textual (xMML)
• Includes description of
– scale submodules
– scaleless submodules (so called mappers and filters)
– ports and their operators (for indicating type of connections between modules)
– coupling topology
– implementation
Submodel execution loop in pseudocode
f := finit /*initialization*/
t := 0
while not EC(f, t):
    Oi(f, t) /*intermediate observation*/
    f := S(f, t) /*solving step*/
    t += theta(f)
end
Of(f, t) /*final observation*/
Corresponding symbols in gMML (figure): Oi, Of, S, finit, undefined
Example for the In-stent Restenosis application
IC – initial conditions
DD – drug diffusion
BF – blood flow
SMC – smooth muscle cells
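The submodel execution loop from this slide translates almost line by line into runnable code. The sketch below is a minimal Python rendering in which the operators finit, EC, Oi, S, theta and Of are passed in as ordinary values and functions. The toy decay model and its numbers are assumptions made for illustration and do not come from any MAPPER application.

```python
def run_submodel(f_init, end_condition, observe_intermediate, solve, time_step,
                 observe_final):
    """Run one submodel following the finit / EC / Oi / S / theta / Of pattern."""
    f, t = f_init, 0.0                 # f := finit, t := 0
    while not end_condition(f, t):     # EC(f, t)
        observe_intermediate(f, t)     # Oi(f, t)
        f = solve(f, t)                # f := S(f, t)
        t += time_step(f)              # t += theta(f)
    observe_final(f, t)                # Of(f, t)
    return f, t


# Toy model (hypothetical): explicit Euler steps of df/dt = -f with dt = 0.25.
DT = 0.25
intermediate, final = [], []

state, end_time = run_submodel(
    f_init=1.0,
    end_condition=lambda f, t: t >= 1.0,
    observe_intermediate=lambda f, t: intermediate.append((t, f)),
    solve=lambda f, t: f - DT * f,
    time_step=lambda f: DT,
    observe_final=lambda f, t: final.append((t, f)),
)
print(state, end_time, len(intermediate))
```

In a coupled setting the two observation operators are the points where a submodel sends data out through its ports, which is why they appear as distinct symbols in gMML.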
21
jMML library
Supports xMML analysis:
• Detection of initial models
• Constructing coupling topology (gMML)
• Generating task graph
• Deadlock detection
• Generating Scale Separation Map
• Supports Graphviz or pdf formats
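To give a feel for the Graphviz output, the sketch below writes a coupling topology as DOT text. It is not the jMML API (jMML is a separate library and its interface is not shown on the slide); the function name is hypothetical, and the edges between the IC, SMC, BF and DD submodels are made up for illustration and omit the mappers that a real topology would contain.

```python
def coupling_to_dot(couplings, name="coupling_topology"):
    """Render (source, target) couplings as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for source, target in couplings:
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical couplings between the submodels named on the previous slide.
couplings = [
    ("IC", "SMC"),
    ("SMC", "BF"),
    ("BF", "SMC"),
    ("SMC", "DD"),
    ("DD", "SMC"),
]
print(coupling_to_dot(couplings))
```

The resulting text can be saved to a file and rendered with the standard Graphviz command `dot -Tpdf topology.dot -o topology.pdf`.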
22
MaMe - MAPPER memory
• Provides rich, semantics-aware persistence store for other components to record information
• Based on a well-defined domain model containing MAPPER metadata defined in MML
• Other MAPPER tools store, publish and reuse such metadata throughout the entire Project and its Consortium
• Provides dedicated web interface for human users to browse and curate metadata
[Figure: MaMe web interface; choose/add/delete Mapper A, Mapper B, Submodule A, Submodule B]
23
MAD: Application Designer
• User friendly visual tool for composing multiscale applications
• Supports importing application structure from xMML (section A and B)
• Supports composing multiscale applications in gMML (section B) with additional graphics-specific information such as layout and color (section C)
• Transforms gMML into xMML
• Performs MML analysis to identify its loosely and tightly coupled parts
• Using information from MaMe and GridSpace EW, transforms gMML into executable formats with information needed for actual execution (section D):
– GridSpace Experiment
– MUSCLE connection file (cxa.rb)
24
GridSpace Experiment Workbench
• Supports execution of experiments on e-infrastructures via interoperability layers
• Result management support
• Uses Interpreter-Executor model of computation:
– Interpreter - a software package available on the infrastructure, usually programmatically accessible by DSL or script language, e.g. MUSCLE, LAMMPS, CPMD
– Executor - a common entity for hosts, clusters, grid brokers etc. capable of running Interpreters
• Allows easy configuration of available executors and interpreters
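The Interpreter-Executor model can be pictured as a small dispatch step: each snippet of an experiment names the interpreter it needs, and the workbench picks an executor that offers that interpreter. The sketch below simulates this in a single process. The class names, the toy interpreters and the selection rule (first executor that supports the interpreter) are assumptions made for illustration, not the GridSpace implementation, which submits work to remote resources.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """One piece of an experiment, written for a specific interpreter."""
    interpreter: str
    code: str


class Executor:
    """Stand-in for a host, cluster or broker capable of running interpreters."""

    def __init__(self, name, interpreters):
        self.name = name
        self.interpreters = dict(interpreters)  # interpreter name -> callable

    def supports(self, interpreter):
        return interpreter in self.interpreters

    def run(self, snippet):
        return self.interpreters[snippet.interpreter](snippet.code)


def run_experiment(snippets, executors):
    """Run each snippet on the first executor offering its interpreter."""
    results = []
    for snippet in snippets:
        executor = next(
            (e for e in executors if e.supports(snippet.interpreter)), None
        )
        if executor is None:
            raise LookupError(f"no executor offers {snippet.interpreter!r}")
        results.append((executor.name, executor.run(snippet)))
    return results


# Toy interpreters (hypothetical): they only transform the snippet text.
cluster = Executor("cluster", {"upper": str.upper})
broker = Executor("broker", {"reverse": lambda code: code[::-1]})

experiment = [Snippet("upper", "abc"), Snippet("reverse", "abc")]
print(run_experiment(experiment, [cluster, broker]))
```

Keeping interpreters and executors as separate registries is what lets the same snippet run on a different resource by changing configuration only.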
[Figure: Transforming example MML into executable GS Experiment. Labels: MML with submodules (µm, cm, m) and mappers, tightly coupled part; GS Experiment with Snippets 1-3; Interpreter (MUSCLE), Interpreter A, Interpreter B; Executors A, B, C; Interoperability Layer (QCG, AHE), SSH accessible resources; E-Infrastructures]
25
User environment
Application composition: from MML to executable experiment
Registration of MML metadata: submodules and scales
Result Management
Execution of experiment using interoperability layer on e-infrastructure
26
E-infrastructure
[Figure: timeline from 2011 to 2013 with milestones: MoU signed; Taskforce established; 1st evaluation; 1st EU review; selected two apps on MAPPER e-Infrastructure (EGI and PRACE resources)]
• Joint task force between MAPPER, EGI and PRACE
• Collaborate with EGI and PRACE to introduce new capabilities and policies onto e-Infrastructures
• Deliver new application tools, problem solving environments and services to meet end-user needs
• Work closely with various end-user communities (involved directly in MAPPER) to perform distributed multiscale simulations and complex experiments
[Figure: MAPPER Taskforce spanning Tier-0, Tier-1 and Tier-2 resources]
27
MAPPER e-infrastructure
• MAPPER pre-production infrastructure
– Cyfronet, LMU/LRZ, PSNC, TASK, UCL, WCSS
– Environment for developing, testing and deployment of MAPPER components
• Central services
– GridSpace, MAD, MaMe, monitoring, web-site
• EGI-MAPPER-PRACE task force
– SARA Huygens HPC system
28
2 scenarios in operation
[Figure: two panels, loosely coupled DMC and tightly coupled DMC]
29
In-stent restenosis
• Coronary heart disease (CHD) remains the most common cause of death in Europe, being responsible for approximately 1.92 million deaths each year*
• A stenosis is an abnormal narrowing of a blood vessel
• Solution: a stent placed during balloon angioplasty
• Possible response, in 10% of the cases: abnormal tissue growth in the form of in-stent restenosis
– Multiscale, multiphysics phenomenon involving physics, biology, chemistry, and medicine
30
In-stent restenosis model
• A 3D model of in-stent restenosis (ISR3D)
– why does it occur, when does it stop?
– Ultimate goal:
  • Facilitate stent design
  • Effect of drug eluting stents
• Models:
– cells in the vessel wall;
– blood in the lumen;
– drug diffusion; and
– most importantly their interaction
• 3D model is computationally very expensive
• 2D model has published results*
*H. Tahir, A. G. Hoekstra et al. Interface Focus, 1(3), 365–373
31
Scale separation map
• Four main submodels
• Same spatial scale
• Different temporal scale
32
Coupling topology
• Model
– is tightly coupled (excluding initial condition)
– has a fixed number of synchronization points
– has one instance per submodel
33
MML of ISR3D
[Figure: MML diagram of ISR3D. Legend: start, stop, submodel, mapper; edge heads/tails: initialization, intermediate, finalization]
34
ISR3D on MAPPER
35
Demo: Mapper Memory (MaMe)
• Semantics-aware persistence store
• Records MML-based metadata about models and scales
• Supports exchanging and reusing MML metadata for
– other MAPPER tools via REST interface
– human users via dedicated Web interface
[Figure: MaMe web interface; choose/add/delete Mapper A, Mapper B, Submodule A, Submodule B; ports and their operators]
36
Demo: GridSpace EW for ISR3D
• Obtains a MAD-generated experiment containing a configuration file for the MUSCLE interpreter
• Provides two executors for the MUSCLE interpreter
– SSH on Polish NGI UI – cluster execution
– QCG – multisite execution
• Uses the QCG executor for running the MUSCLE interpreter on QCG and staging input/output files
[Figure: GS Experiment for ISR3D. Labels: MML tightly coupled part with submodules (cm, m) and mapper; Snippet 1; MUSCLE Interpreter; QCG Executor; Interoperability Layer (QCG, AHE), SSH accessible resources; E-Infrastructures]
# declare kernels which can be launched in the CxA
cxa.add_kernel('submodel_instance1', 'my.submodelA')
cxa.add_kernel('submodel_instance2', 'my.submodelB')
…
# configure connection scheme of the CxA
cs = cxa.cs
# configure unidirectional connection between kernels
cs.attach 'submodel_instance1' => 'submodel_instance2' do
  tie 'portA', 'portB'
  …
end
…
37
Computing
• ISR3D is implemented using the multiscale coupling library and environment (MUSCLE)
• Contains Java, Fortran and C++ submodels
• MUSCLE provides uniform communication between tightly coupled submodels
• MUSCLE can be run on a laptop, a cluster or multiple sites.
38
QCG Role
• Provides an interoperability layer between PRACE and EGI infrastructures
• Co-allocates heterogeneous resources according to the requirements of a MUSCLE application using an advance reservation mechanism
• Synchronizes the execution of application kernels in a multi-cluster environment
• Efficiently executes and manages tasks on EGI and UCL resources
• Manages data transfers
39
ISR3D Results
40
ISR3D - conclusion
• Before MAPPER, ISR2D ran fast enough, but ISR3D took too much execution time and a lot of time to couple
• Now, ISR3D runs in a distributed fashion using the MAPPER tools and middleware
• To get scientific results we will have to, and can, run many batch jobs
– Done in the MeDDiCa EU project
– Involves 1000s of runs
• Also, the code can be parallelized to run faster
41
Summary
• Elaboration of a concept of an environment supporting developers and users of multiscale applications for grid and cloud infrastructures
• Design of the formalism for describing connections in multiscale simulations
• Enabling access to e-infrastructures
• Validation of the formalism against the structure of real applications using the tools
• Proof of concept for transforming high level formal description to actual execution using e-infrastructures
More about MAPPER
http://www.mapper-project.eu/
http://dice.cyfronet.pl/