EELA is a project funded by the European Union under contract 026409
E-infrastructure shared between Europe and Latin America. European Meteorological Society, 7th EMS / 8th ECAC, El Escorial (Spain), 1-5 Oct 2007. GRID distributed computation of nested climate simulations and data-mining. On behalf of the EELA project.
1
E-infrastructure shared between Europe and Latin America
www.eu-eela.org
EELA is a project funded by the European Union under contract 026409
V. Fernández-Quiruelas (1), J. Fernández (1), A. S. Cofiño (1), C. Baeza (3), M. Carrillo (2), F. García-Torre (1), R. M. San-Martín (2), R. Abarca (3), J. M. Gutiérrez (1) and R. Mayo (4), on behalf of the EELA team.
(1) Dept. of Applied Mathematics and Computing Sciences, University of Cantabria, Spain
(2) Servicio Nacional de Meteorología e Hidrología, Peru
(3) Universidad de Concepción, Chile
(4) CIEMAT, Spain
GRID distributed computation of nested climate simulations and data-mining.
On behalf of the EELA project
European Meteorological Society
7th EMS / 8th ECAC
El Escorial (Spain), 1-5 Oct 2007
2
GRID Computing
Applications draw computing power from a Computational Grid in the same way electrical devices draw power from an electrical grid
3
GRID Computing
• Developed in the mid-90s.
• Use of distributed, heterogeneous, dynamic and, usually, parallel computational resources.
• Middleware and standard software to build applications (Globus Toolkit, OGSA, …).
• Several research projects (and commercial products) developing this technology.
4
EELA Goals
E-infrastructure shared between Europe and Latin America
• Goal: To build a bridge between consolidated e-Infrastructure initiatives in Europe and emerging ones in Latin America.
• Objectives:
– Establish a human collaboration network between Europe and Latin America
– Set up a pilot e-infrastructure in Latin America
– Identify and promote a sustainable framework for e-Science in Latin America
5
EELA structure
EELA is structured in four Work Packages:
• WP1. Project administrative and technical management
• WP2. Pilot testbed operation and support
• WP3. Identification and support of Grid-enhanced applications
– Task 3.1. Biomed Applications
– Task 3.2. High Energy Physics Applications
– Task 3.3. Additional Applications: E-Learning, Climate
• WP4. Dissemination activities
6
Partners
• EU
– Spain: CIEMAT, CSIC, UPV, RED.ES, UC
– Italy: INFN
– Portugal: LIP
• Latin America
– Venezuela: ULA
– Cuba: CUBAENERGIA
– Chile: UTFSM, REUNA, UDEC
– Peru: SENAMHI
– Mexico: UNAM
– Argentina: UNLP
– Brazil: UFRJ, CNEN, CECIERJ/CEDERJ, RNP, UFF
• International
– CLARA
– CERN
8
WP 3.3b. Climate
We deal with a climate challenge with huge socio-economic impact in Latin America: the El Niño phenomenon.
The Grid helps to access the infrastructure and know-how in a user-friendly way.
Three different applications have been identified:
– Global atmospheric circulation model (CAM)
– Regional weather model (WRF)
– Data-mining clustering tools (SOM)
Scientific challenge: high-resolution regional simulations over Latin American regions for the strong El Niño events of 1982-83 and 1997-98. Comparison with historical local data, including sensitivity studies to SST and parameterizations.
The problem is well suited to execution on the Grid, since many independent simulations will be needed.
9
WP 3.3b. Climate
We are currently performing CAM simulations perturbing the SST.
pertSST_c(t,x) = SST(t,x) + c * Pattern(x)
where c is a random number in the interval (-2.5, 2.5):
c = -2.5: ~regular year
c = 0: ~Niño '97
c > 0: SST anomalies stronger than Niño '97
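As a rough illustration of the perturbation formula, the sketch below applies pertSST_c to a toy SST field. The function names, the toy values and the choice of a uniform distribution for c are assumptions made for this example; the slides only state that c is a random number in (-2.5, 2.5).

```python
import random


def perturb_sst(sst, pattern, c):
    """Return pertSST_c(t, x) = SST(t, x) + c * Pattern(x).

    sst     -- list of time steps, each a list of values over grid points x
    pattern -- list of values over the same grid points x (time-independent)
    c       -- scalar amplitude of the perturbation
    """
    return [[value + c * p for value, p in zip(step, pattern)] for step in sst]


def draw_amplitude(rng, low=-2.5, high=2.5):
    """Draw c between low and high; the uniform distribution is an assumption."""
    return rng.uniform(low, high)


# Toy data (hypothetical values): 2 time steps, 3 grid points.
sst = [[26.0, 27.0, 28.0], [26.5, 27.5, 28.5]]
pattern = [0.0, 0.5, 1.0]

rng = random.Random(0)  # fixed seed so the toy ensemble can be reproduced
ensemble = []
for _ in range(4):
    c = draw_amplitude(rng)
    ensemble.append((c, perturb_sst(sst, pattern, c)))
```

Each ensemble member is independent of the others, which is what makes this kind of experiment a natural fit for many separate Grid jobs.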
10
Computational challenge
Challenging computational problem with nontrivial dependencies among the applications. A cascade of dynamically dependent jobs is adopted.
The cascade of applications interacts with the middleware to:
– Prepare and submit dependent jobs.
– Store and retrieve the generated data sets (data sharing).
– Manage metadata (for the data sets and application status).
– Restart broken experiments.
[Diagram: cascade CAM → WRF (par 1), WRF (par 2), …, WRF (par n) → SOM, SOM, …, with a Storage Element (SE) attached to each stage. Other labels: "SST + other forcings", "SST PDF".]
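The cascade of dependent jobs can be pictured with a small scheduling sketch. It is not the EELA implementation: the `Job` class, the job names and the pairing of one SOM job per WRF job are assumptions, and "finishing" a job is simulated instead of waiting on the Grid. It also simplifies the dynamic behaviour described in the slides, where downstream jobs may be triggered while an upstream job is still producing data.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    depends_on: list = field(default_factory=list)


def build_cascade(n_wrf):
    """One CAM job, n_wrf WRF jobs fed by CAM, and one SOM job per WRF job.

    The one-SOM-per-WRF pairing is an assumption made for this sketch.
    """
    jobs = [Job("CAM")]
    for i in range(1, n_wrf + 1):
        jobs.append(Job(f"WRF-{i}", depends_on=["CAM"]))
        jobs.append(Job(f"SOM-{i}", depends_on=[f"WRF-{i}"]))
    return jobs


def ready_jobs(jobs, finished, submitted):
    """Names of not-yet-submitted jobs whose dependencies have all finished."""
    return [
        job.name
        for job in jobs
        if job.name not in submitted
        and all(dep in finished for dep in job.depends_on)
    ]


def run_cascade(jobs):
    """Submit jobs wave by wave; every submitted job 'finishes' immediately."""
    finished, submitted, waves = set(), set(), []
    while len(submitted) < len(jobs):
        wave = ready_jobs(jobs, finished, submitted)
        if not wave:
            raise ValueError("unsatisfiable dependencies")
        submitted.update(wave)
        finished.update(wave)  # stand-in for waiting on the Grid
        waves.append(wave)
    return waves


waves = run_cascade(build_cascade(n_wrf=2))
```

Jobs within one wave do not depend on each other, so they could be submitted to different computing elements at the same time.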
12
Grid Enabling Layer
CAM
?
15
Grid Enabling Layer
CAM
The use of this additional software layer has several advantages:
• Easier updates of the model
• Easier programming (shell, perl, python, … instead of Fortran)
16
Info & Data Flow
[Diagram. Labels: CAM, WRF, DATA, Metadata, status information, LFC (File catalog), SE (Storage Element), AMGA (Metadata Catalog)]
17
Grid Enabling Layer
CAM
GEL tasks:
• Download model-required files from LFC
• Upload model-generated files to LFC
• Extract metadata from model output files and publish them in AMGA.
• Upload model restart files to LFC and restart information to AMGA
• Publish model status in AMGA
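The GEL tasks listed above can be sketched as a thin wrapper around a model run. This is only an illustration under stated assumptions: plain dictionaries stand in for the LFC and AMGA, and `run_with_gel`, `toy_model`, the record fields and the `lfn:/example/...` names are hypothetical, not part of the EELA software or of the real catalog APIs.

```python
def run_with_gel(model, job_id, input_lfns, file_catalog, metadata_catalog):
    """Run `model` wrapped by GEL-style steps.

    file_catalog     -- dict standing in for the LFC (logical name -> content)
    metadata_catalog -- dict standing in for AMGA (job id -> record)
    """
    record = {"status": "running", "outputs": [], "restart": None}
    metadata_catalog[job_id] = record  # publish model status

    # Download model-required files from the (stand-in) file catalog.
    inputs = {lfn: file_catalog[lfn] for lfn in input_lfns}

    try:
        outputs, restart = model(inputs)
    except Exception as error:
        record["status"] = "failed: " + str(error)
        return record

    for lfn, content in outputs.items():
        file_catalog[lfn] = content  # upload model-generated file
        # Extract (toy) metadata from the output and publish it.
        record["outputs"].append({"lfn": lfn, "size": len(content)})

    if restart is not None:
        restart_lfn, restart_content = restart
        file_catalog[restart_lfn] = restart_content  # upload restart file
        record["restart"] = restart_lfn              # publish restart info

    record["status"] = "finished"
    return record


def toy_model(inputs):
    """Hypothetical model: returns one output file and one restart file."""
    data = "".join(inputs[name] for name in sorted(inputs))
    outputs = {"lfn:/example/output.nc": data.upper()}
    restart = ("lfn:/example/restart.nc", data[:2])
    return outputs, restart


lfc = {"lfn:/example/input.nc": "sst"}
amga = {}
record = run_with_gel(toy_model, "cam-001", ["lfn:/example/input.nc"], lfc, amga)
```

Keeping these steps outside the model is the point of the layer: the wrapper can change with the middleware while the model code stays almost untouched.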
18
Info & Data Flow (past)
[Diagram. Labels: UI, Portal, CAM, WRF, DATA, Metadata, status information, LFC, SE, AMGA]
If a submitted job is not running, the User queries AMGA to find out whether it finished successfully. If it did not, the User looks up the last restart file and restarts the CAM job from it.
While a CAM job is running, the User queries AMGA about the data sets already produced by CAM and then triggers the WRF jobs. The same procedure is followed for the SOMs.
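The manual procedure described here amounts to a small decision rule on the status record of a job. The sketch below assumes a hypothetical record layout (`status` and `restart_files` keys) and invents the action names; the "resubmit" fallback for a failed job without any restart file is also an assumption, since the slides do not cover that case.

```python
def next_action(record):
    """Decide what to do with a CAM job, given its (hypothetical) status record.

    Returns a pair (action, restart_file).
    """
    status = record.get("status")
    if status == "running":
        return ("wait", None)
    if status == "finished":
        return ("done", None)
    # The job is neither running nor successful: restart from the last file.
    restart_files = record.get("restart_files", [])
    if restart_files:
        return ("restart", restart_files[-1])
    return ("resubmit", None)  # assumption: no restart file means start over


examples = [
    next_action({"status": "running"}),
    next_action({"status": "failed", "restart_files": ["r1.nc", "r2.nc"]}),
]
```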
19
Last stable status
Using GENIUS to interact with the applications (CAM+WRF).
20
&camexp
absems_data = 'lfn:/grid/eela/.../abs_ems_factors.nc'
aeroptics = 'lfn:/grid/eela/.../AerosolOptics.nc'
bnd_topo = 'lfn:/grid/eela/.../topo-from-cami.nc'
bndtvaer = 'lfn:/grid/eela/.../AerosolMass.nc'
bndtvo = 'lfn:/grid/eela/.../pcmdio3.nc'
bndtvs = 'lfn:/grid/eela/.../sst_HadOIBl.nc'
caseid = 'nino82d'
iyear_ad = 1982
start_ymd = 19820101
ncdata = 'lfn:/grid/eela/.../cami.nc'
mfilt = 1,4,1
mfilt = 1,4,1
...
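A namelist like the one above could be generated from a dictionary instead of being written by hand. The helper names below are hypothetical, and the sketch only covers strings, integers, booleans and flat lists; the closing "/" is the standard Fortran namelist terminator. The parameter names and values are copied from the slide.

```python
def format_value(value):
    """Render one Python value in Fortran namelist syntax."""
    if isinstance(value, bool):  # must be tested before int
        return ".true." if value else ".false."
    if isinstance(value, str):
        return "'" + value + "'"
    if isinstance(value, (list, tuple)):
        return ",".join(format_value(item) for item in value)
    return str(value)


def format_namelist(group, params):
    """Return the text of a namelist group, one 'key = value' pair per line."""
    lines = ["&" + group]
    for key, value in params.items():
        lines.append(" " + key + " = " + format_value(value))
    lines.append("/")
    return "\n".join(lines)


# Values taken from the slide; only a few parameters are shown.
text = format_namelist(
    "camexp",
    {"caseid": "nino82d", "iyear_ad": 1982, "start_ymd": 19820101, "mfilt": [1, 4, 1]},
)
```

A portal form could fill such a dictionary, which would address the remark elsewhere in the talk that providing the namelist by hand is not yet very user-friendly.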
21
Specific requirements
Climate modeling poses specific challenges for the GRID:
• Big storage requirements
• CPU-intensive
• Dependent jobs
• Long-lasting jobs
• A climate modeling Experiment may consist of several model Realizations, which are likely to be composed of several Jobs
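The Experiment / Realization / Job hierarchy can be written down as a small data model. The class fields, the day-based splitting and the names used below are assumptions for illustration; the slides only state that an Experiment consists of several Realizations, each likely composed of several Jobs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    first_day: int
    last_day: int
    status: str = "pending"


@dataclass
class Realization:
    name: str
    jobs: List[Job] = field(default_factory=list)

    def finished(self):
        return bool(self.jobs) and all(job.status == "finished" for job in self.jobs)


@dataclass
class Experiment:
    name: str
    realizations: List[Realization] = field(default_factory=list)

    def finished(self):
        return bool(self.realizations) and all(r.finished() for r in self.realizations)


def split_into_jobs(total_days, days_per_job):
    """Split one long simulation into consecutive jobs of limited length."""
    jobs = []
    for first in range(1, total_days + 1, days_per_job):
        last = min(first + days_per_job - 1, total_days)
        jobs.append(Job(first, last))
    return jobs


# Hypothetical experiment: 3 realizations of 365 days, at most 100 days per job.
experiment = Experiment(
    "demo",
    [Realization(f"member-{i}", split_into_jobs(365, 100)) for i in range(3)],
)
```

Splitting a realization into jobs keeps each Grid job short enough to survive queue limits, at the cost of making the jobs of one realization depend on each other through restart files.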
22
Info & Data Flow (now)
[Diagram. Labels: coordinator, CAM, DATA, Metadata, status information, LFC, SE, AMGA]
23
Info & Data Flow (future)
[Diagram. Labels: UI, Portal, coordinator, CAM, WRF, DATA, Metadata, status information, LFC, SE, AMGA]
24
Conclusions
• The EU-funded EELA project aims at establishing an e-infrastructure and scientific collaboration between European and Latin American countries.
• Within the climate task, we have implemented a sequence of climate applications, CAM+WRF(+SOM), which runs integrated in the EELA testbed and provides regional simulations for a given SST and other forcings.
• The applications interact with the GRID services, transferring data and status information. They can be easily managed by the User through a web portal.
• Climate modelling poses specific requirements on the GRID which are not met by the current middleware: dependent and long-lasting jobs, and experiments composed of simulations split into jobs.
25
References
CAM Application: http://www.ccsm.ucar.edu/models/atm-cam/
J. R. McCaa, M. Rothstein, B. E. Eaton, J. M. Rosinski, E. Kluzek and M. Vertenstein (2004) "User's Guide to the NCAR Community Atmosphere Model (CAM 3.0)", Climate and Global Dynamics Division, NCAR, Boulder, Colorado, 88 pp.
W. D. Collins, C. M. Bitz, et al. (2006) "The Community Climate System Model: CCSM3", Journal of Climate, Special Issue on CCSM, 19(11).
WRF Application: http://www.wrf-model.org/
W. C. Skamarock, J. B. Klemp, J. Dudhia, D. O. Gill, D. M. Barker, W. Wang and J. G. Powers (2005) "A Description of the Advanced Research WRF Version 2", NCAR Technical Note, 88 pp.
J. Michalakes, J. Dudhia, D. Gill, T. Henderson, J. Klemp, W. Skamarock and W. Wang (2004) "The Weather Research and Forecast Model: Software Architecture and Performance", Proceedings of the 11th ECMWF Workshop on the Use of High Performance Computing in Meteorology, 25-29 October 2004, Reading, U.K. Ed. George Mozdzynski.
SOM Application (grid version): http://www.meteo.unican.es
F. Luengo, A. S. Cofiño and J. M. Gutiérrez (2004) "GRID Oriented Implementation of Self-Organizing Maps for Data Mining in Meteorology", Lecture Notes in Computer Science, 2970, 163-171.
26
Thanks for your attention!
27
CAM output
[Figure: Monthly accumulated precipitation over Peru]
29
Acronyms
Survival guide for reading Grid documents
AMGA : ARDA Metadata Grid App.
ARDA : A Realisation of Distributed Analysis for LHC
CE : Computing element
CLARA : Cooperación latino-americana de redes avanzadas
EGEE : Enabling Grids for E-sciencE
GGF : Global grid forum
GIIS : Grid index info service
GILDA : Grid INFN laboratory for dissemination activities
GMA : Grid monitoring architecture
GRAM : Grid res. alloc. manager
GRIS : Grid resource info service
GSI : Grid security infrastructure
INFN : Istituto nazionale di Fisica nucleare
JDL : Job description language
LCG : LHC computing grid
LDAP : Lightweight directory access protocol
LFC : LCG file catalog
LHC : Large Hadron Collider
MDS : Monitoring & discovery system
NREN : National Research and Education Network
OGSA : Open Grid services architecture
PKI : Public key infrastructure
RB : Resource broker
R-GMA : Relational GMA
SE : Storage element
VOMS : Virtual organization membership service
WS : Web service
30
CAM & WRF
The Community Atmosphere Model (CAM) and the Weather Research and Forecasting (WRF) model are state-of-the-art global and regional atmosphere models, respectively, developed at NCAR.
Output format: NetCDF
The models need to be adapted to interact with the GRID (i.e. with the middleware). Doing this inside the models would require deep modifications. Instead, we only slightly modified the model source code so that it calls other applications, which interact with the GRID on behalf of the model: the Grid Enabling Layer.
32
The CAM namelist needs to be prepared and provided here by the user (not very user-friendly, yet).
37
Currently, the regions are prepared off-line and uploaded to the LFC.
38
All CAM simulations are available for regionalization with WRF, but the CAM -> WRF coupler is NOT yet implemented.
39
However, we have other input data sets for WRF available in the catalog.
40
Finished simulations provide access to their results.
41
The output file can be downloaded in NetCDF format or accessed via a THREDDS- or OPeNDAP-aware application.
42
For instance, the ToolsUI Java application from Unidata can load the OPeNDAP (DODS) address and access only the requested portions of the data.
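Accessing only a portion of a remote data set relies on OPeNDAP constraint expressions, where each dimension of a variable is restricted with a `[start:stride:stop]` index range. The sketch below only builds such a request URL as a string; the server address and the variable name `PRECT` are hypothetical placeholders, not the addresses used by the EELA portal, and some clients may additionally require the brackets to be percent-encoded.

```python
def hyperslab(variable, *ranges):
    """Build a DAP2 projection such as 'PRECT[0:1:11][10:1:20]'.

    Each range is a (start, stride, stop) triple of array indices.
    """
    parts = ["[{}:{}:{}]".format(start, stride, stop) for start, stride, stop in ranges]
    return variable + "".join(parts)


def subset_url(dataset_url, *projections, response="ascii"):
    """Return the URL requesting only the given projections of a data set."""
    return "{}.{}?{}".format(dataset_url, response, ",".join(projections))


# Hypothetical data set address and variable name.
url = subset_url(
    "http://example.org/thredds/dodsC/cam/run.nc",
    hyperslab("PRECT", (0, 1, 11), (10, 1, 20), (30, 1, 40)),
)
```

Only the selected index ranges travel over the network, which matters when the full model output is large.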
43
NOAA provides a user-friendly way to set up regional domains for WRF.
44
This Java Web Start application could in the future be launched from our web portal as a starting point to design a regional simulation.
45
CAM Application Workflow
[Workflow diagram with numbered steps 1-3. Labels: UI, submitter, GenerateCAM.jdl, SubmitCAM.jdl, Resource Broker, Computing element, CAM, AMGA.
AMGA operations shown: "Insert entry in CAM collection"; "Update status"; "Update status / Insert run on CAM collection"; "Upload info: Insert entry in WRF collection"; "Restart checkpoint: Insert entry in CHECKPOINT collection"; "Update history: Insert entry in HISTORYCAM collection".]