grid07 2 kranzlmuller
TRANSCRIPT
D. Kranzlmüller Grids for Science and Business 2
Defining the “Grid”
• Access to (high performance) computing power
• Distributed parallel computing
• Improved resource utilization through resource sharing
• Increased memory provision
• Controlled access to distributed memory
• Interconnection of arbitrary resources (sensors, instruments, …)
• Collaboration between users/resources
• Corresponding security
• …
D. Kranzlmüller Grids for Science and Business 3
Defining the “Grid”
A Grid is the combination of networked resources and
the corresponding Grid middleware, which provides Grid services
for the user.
Grids for Science and Business 4
Enabling Grids for E-sciencE
EGEE-II INFSO-RI-031688
The EGEE Project
• EGEE
– 1 April 2004 – 31 March 2006
– 71 partners in 27 countries, federated in regional Grids
• EGEE-II
– 1 April 2006 – 31 March 2008
– Expanded consortium
    91 partners in 32 countries
    11 Joint Research Units (48 partners)
– Exploitation of EGEE results
– Emphasis on providing
    production-level infrastructure
    increased support for applications
    interoperation with other Grid infrastructures
    more involvement from Industry
Grids for Science and Business 5
Defining the Grid
• A Grid is the combination of networked resources and the corresponding Grid middleware, which provides Grid services for the user.
Status of EGEE-II (as of May 2007)
Grids for Science and Business 6
EGEE Infrastructure
[Map: countries participating in EGEE]
• > 200 sites in 40 countries
• > 36 000 CPUs
• > 5 PB storage
• > 98k jobs/day
• > 200 Virtual Organizations
Map labels: TERAGRID, OSG, EELA, Baltic Grid, See-Grid, DEISA, EUMedGrid, EUChinaGrid, EUIndiaGrid, NAREGI
Grids for Science and Business 8
Production Grid Middleware
Key factors in EGEE Grid Middleware Development:
• Strict software process
  Use industry standard software engineering methods
  – Software configuration management, version control, defect tracking, automatic build system, …
• Conservative approach in what software to use
  Avoid “cutting-edge” software
  – Deployment on over 100 sites cannot assume a homogeneous environment; middleware needs to work with many underlying software flavors
  Avoid evolving standards
  – Evolving standards change quickly (and sometimes significantly, cf. OGSI vs. WSRF); impossible to keep pace on > 100 sites
Long (and tedious) path from prototypes to production
Grids for Science and Business 9
EGEE Middleware: gLite
• Exploit experience & existing components
– VDT (Condor, Globus)
– EDG/LCG
– AliEn
– …
• Develop a lightweight stack of EGEE generic middleware
– Dynamic deployment
– Pluggable components
• Focus is on re-engineering and hardening
• March 4, 2006: gLite 3.0
[Timeline figure, 2004 to 2006: LCG-2 and gLite moving from prototyping to product, leading to gLite 3.0 in 2006]
Grids for Science and Business 10
Building Software for the Grid
[Layer diagram]
Applications: Environmental Sciences, Life & Pharmaceutical Sciences, Geo Sciences
Middleware: Globus GT4, Condor, APST
Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime
VPN, SSH
Courtesy IBM
Slide courtesy David Abramson
Grids for Science and Business 11
Building Software for the Grid
[Layer diagram]
Applications: Environmental Sciences, Life & Pharmaceutical Sciences, Geo Sciences
Middleware: Globus GT4, Condor, APST
Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime
VPN, SSH
Layer labels: Upper Middleware & Tools, Lower Middleware, Bonds
Courtesy IBM
Slide courtesy David Abramson
Grids for Science and Business 12
Defining the Grid
• A Grid is the combination of networked resources and the corresponding Grid middleware, which provides Grid services for the user.
Status of EGEE-II (as of July 27, 2006)
Grids for Science and Business 13
High Energy Physics
Large Hadron Collider (LHC):
• One of the most powerful instruments ever built to investigate matter
• 4 Experiments: ALICE, ATLAS, CMS, LHCb
• 27 km circumference tunnel
• Due to start up in 2007
[Aerial photo labels: Mont Blanc (4810 m), Downtown Geneva]
Grids for Science and Business 14
Applications Example: WISDOM
• Grid-enabled drug discovery process for neglected diseases
– In silico docking
    compute probability that potential drugs dock with target protein
– To speed up and reduce cost to develop new drugs
• WISDOM (World-wide In Silico Docking On Malaria)
– First biomedical data challenge
– 46 million ligands docked in 6 weeks
    Target proteins from malaria parasite
    Molecular docking applications: Autodock and FlexX
    ~1 million virtual ligands selected
– 1 TB of data produced
– 1000 computers in 15 countries
    Equivalent to 80 CPU years
• Significant results
– Best hits to be re-ranked using Molecular Dynamics
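The WISDOM figures above can be put in perspective with a little arithmetic. The following is a rough sketch, not part of the original slides: it assumes a 365.25-day year and that the 80 CPU years were spread evenly over the 46 million dockings and the 6 weeks. The function names are illustrative only.

```python
# Back-of-envelope figures derived from the WISDOM numbers on the slide.
# Assumption (not from the slide): one year = 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

ligands_docked = 46_000_000   # "46 million ligands docked"
cpu_years = 80                # "Equivalent to 80 CPU years"
campaign_days = 6 * 7         # "6 weeks"
computers = 1000              # "1000 computers in 15 countries"


def seconds_per_docking(total_cpu_years, dockings):
    """Average CPU time per docking, assuming an even spread of work."""
    return total_cpu_years * SECONDS_PER_YEAR / dockings


def average_busy_cpus(total_cpu_years, days):
    """Average number of CPUs kept busy to deliver the CPU years in `days`."""
    return total_cpu_years * 365.25 / days


per_docking = seconds_per_docking(cpu_years, ligands_docked)
busy = average_busy_cpus(cpu_years, campaign_days)

print(f"CPU time per docking: {per_docking:.1f} s")
print(f"Average busy CPUs:    {busy:.0f} of {computers}")
```

Under these assumptions a single docking takes a little under a minute of CPU time, and roughly 700 of the 1000 computers would have been busy on average throughout the six weeks.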
Grids for Science and Business 15
Example: Avian flu
• Avian Flu H5N1
– H5 and N1 = proteins on virus surface
• Biological goal of data challenge
– Study in silico the impact of selected point mutations on the efficiency of existing drugs
– Find new potential drugs
• Data challenge parameters:
– 5 Grid projects: Auvergrid, BioinfoGrid, EGEE, Embrace, TWGrid
– 1 docking software: autodock
– 8 conformations of the target (N1)
– 300 000 selected compounds
    >100 CPU years to dock all configurations on all compounds
• Timescale:
– First contacts established: 1 March 2006
– Data Challenge kick-off: 1 April 2006
– Duration: 4 weeks
[Figures: H5 and N1 proteins. Credit: Y-T Wu]
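The data challenge parameters imply a workload that can be estimated roughly. This is a sketch under stated assumptions, none of which come from the slide: one docking run per (conformation, compound) pair, a 365.25-day year, and all CPU time consumed within the 4-week duration. Because the slide gives ">100 CPU years", the results are lower bounds.

```python
# Rough workload estimate for the avian flu data challenge.
# Assumptions (not from the slide): one docking run per
# (conformation, compound) pair; one year = 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

conformations = 8             # "8 conformations of the target (N1)"
compounds = 300_000           # "300 000 selected compounds"
cpu_years_lower_bound = 100   # ">100 CPU years"
duration_days = 4 * 7         # "Duration: 4 weeks"

# Total number of docking runs under the one-run-per-pair assumption.
docking_runs = conformations * compounds

# Lower bound on the average CPU time per run, in seconds.
min_seconds_per_run = cpu_years_lower_bound * SECONDS_PER_YEAR / docking_runs

# Lower bound on the average number of CPUs busy over the 4 weeks.
min_avg_cpus = cpu_years_lower_bound * 365.25 / duration_days

print(f"Docking runs:           {docking_runs:,}")
print(f"CPU time per run:     > {min_seconds_per_run / 60:.1f} min")
print(f"Average busy CPUs:    > {min_avg_cpus:.0f}")
```

With these assumptions there are 2.4 million docking runs at more than 20 minutes of CPU time each, which would need well over a thousand CPUs working continuously to finish in four weeks.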
Grids for Science and Business 16
Industrial applications
• EGEODE
– Industrial application from Compagnie Générale de Géophysique running on EGEE infrastructure
    Seismic processing platform
    Based on industrial application Geocluster© used at CGG
    Being ported to EGEE for Industry and Academia
• OpenPlast project
– French R&D programme to develop and deploy Grid platform for plastic industry (SMEs)
– Based on experience from EGEE (supported by CS)
– Next: Interoperability with other Grids
Grids for Science and Business 17
EGEE-II Applications Overview
• >200 VOs from several scientific domains
– Astronomy & Astrophysics
– Civil Protection
– Computational Chemistry
– Comp. Fluid Dynamics
– Computer Science/Tools
– Condensed Matter Physics
– Earth Sciences
– Fusion
– High Energy Physics
– Life Sciences
• Further applications under evaluation
98k jobs/day
Applications have moved from testing to routine and daily usage
~80-90% efficiency
Grids for Science and Business 18
EGEE-II Overview
Status of EGEE-II (as of May 2007)
1. Resources
2. Middleware
3. Applications
BUT …
D. Kranzlmüller Grids for Science and Business 19
Perspective
Today:
• New scientific collaborations have been formed thanks to the Grid infrastructure
• Applications are routinely using the Grid on a daily basis
• Scientific applications start to depend on Grid infrastructures
• Business and Industry are getting interested
However, there is a clear need for a long term perspective
D. Kranzlmüller Grids for Science and Business 20
Building Software for the Grid
[Layer diagram: Applications, Upper Middleware & Tools, Lower Middleware, Bonds, Platform Infrastructure]
D. Kranzlmüller Grids for Science and Business 21
A European Vision …
• for a universal e-Infrastructure for research (1)
“An environment where research resources (H/W, S/W & content) can be readily shared and accessed wherever this is necessary to promote better and more effective research”
(1) Malcolm Read (Ed.) http://www.e-irg.org/meetings/2005-UK/A_European_vision_for_a_Universal_e-Infrastructure_for_Research.pdf
D. Kranzlmüller Grids for Science and Business 22
European Commission
“…for Grids we would like to see the move towards long-term sustainable initiatives less dependent upon EU-funded project cycles”
• Viviane Reding, Commissioner, European Commission, at the EGEE’06 Conference, September 25, 2006
D. Kranzlmüller Grids for Science and Business 23
European Grid Initiative
Goals:
• Ensure the long-term sustainability of the European e-infrastructure
• Coordinate the integration and interaction between National Grid Infrastructures
• Operate the European level of the production Grid infrastructure for a wide range of scientific disciplines to link National Grid Infrastructures
D. Kranzlmüller Grids for Science and Business 24
Grids in Europe
• Examples of National Grid projects:
– Austrian Grid Initiative
– Belgium: BEgrid
– DutchGrid
– France: Grid’5000
– Germany: D-Grid; Unicore
– Greece: HellasGrid
– Grid Ireland
– Italy: INFNGrid; GRID.IT
– NDGF
– Portuguese Grid
– Swiss Grid
– UK e-Science: National Grid Service; OMII; GridPP
– …
D. Kranzlmüller Grids for Science and Business 25
Evolution
[Diagram labels: Testbeds, Routine Usage, Utility Service; National, Global; Sustainable European Grid]
D. Kranzlmüller Grids for Science and Business 26
EGI Design Study (EGI_DS)
• Project Proposal, submitted to the European Commission for funding within FP7-INFRASTRUCTURES-2007-1, 1.2.1 Design Studies (May 2, 2007)

Participant no. | Participant organisation name | Short name | Country
1 (Coordinator) | Institut für Graphische und Parallele Datenverarbeitung der Johannes Kepler Universität Linz | GUP | A
2 | Greek Research and Technology Network – GRNET S.A. | GRNET | GR
3 | Istituto Nazionale di Fisica Nucleare | INFN | I
4 | CSC – Scientific Computing Ltd. | CSC | FI
5 | CESNET, z.s.p.o. | CESNET | CZ
6 | European Organization for Nuclear Research | CERN | CH
7 | Verein zur Förderung eines Deutschen Forschungsnetzes – DFN-Verein | DFN | D
8 | Science & Technology Facilities Council | STFC | UK
9 | Centre National de la Recherche Scientifique | CNRS | F
D. Kranzlmüller Grids for Science and Business 27
EGI Design Study (EGI_DS)
• Project for the conceptual setup and operation of a new organizational model of a sustainable pan-European grid infrastructure
• Federated model bringing together NGIs to build a European organisation
• Responsibilities between NGIs and EGI are split to be federated and complementary
Draft
D. Kranzlmüller Grids for Science and Business 28
D. Kranzlmüller Grids for Science and Business 29
Support for EGI Vision and EGI_DS
• 35 European NGIs (EU27+8)
• Asia, Latin-America, USA
• OGF-EU
• PACE
• ETICS
D. Kranzlmüller Grids for Science and Business 30
Building Software for the Grid
[Layer diagram: Applications, Upper Middleware & Tools, Lower Middleware, Bonds, Platform Infrastructure]
D. Kranzlmüller Grids for Science and Business 31
Example 1: Fusion Simulation
D. Kranzlmüller Grids for Science and Business 32
Example 2: Flood Simulation
Cooperation with Slovak Academy of Sciences
D. Kranzlmüller Grids for Science and Business 33
Example 3: Biomedical Display
D. Kranzlmüller Grids for Science and Business 34
Building Software for the Grid
[Layer diagram: Applications, Upper Middleware & Tools, Lower Middleware, Bonds, Platform Infrastructure]
D. Kranzlmüller Grids for Science and Business 35
Conclusions
• Production grids (e.g. EGEE, …) exist and are in use today
• Strong efforts towards establishing large scale, permanent, multidisciplinary grid infrastructures are going on now
• Continuous development of higher level grid services (for more grid applications)
D. Kranzlmüller Grids for Science and Business 36
http://[email protected]