Federal Department of Home Affairs FDHA
Federal Office of Meteorology and Climatology MeteoSwiss

Overview of the crCLIM project
Christophe Charpilloz
COSMO users workshop, 20th of January 2017



Cloud-resolving climate modeling on future supercomputing platforms (crCLIM) [1]

An SNF-funded Sinergia project

[1] http://www.c2sm.ethz.ch/research/crCLIM.html

An interdisciplinary project

• MeteoSwiss
• Institute for Atmospheric and Climate Science, ETHZ
• Institute for Computer Systems, ETHZ
• Swiss National Supercomputing Center

The team and subprojects

• Subproject A
  - Oliver Fuhrer
  - Andrea Arteaga
  - Christophe Charpilloz
• Subproject B
  - Torsten Hoefler
  - Salvatore di Girolamo
  - Thomas Schulthess
• Subproject C
  - Christoph Schaer
  - Linda Schlemmer
  - Nikolina Ban
  - David Leutwyler
  - Daniel Luethi
• Subproject D
  - Heini Wernli
  - Michael Sprenger
  - Nicolas Piaget
  - Stefan Ruedisueli*

Bold: leader
*: speaker at the current workshop


Goals of the project

1. Improve our understanding of the processes governing the water cycle in a changing climate
2. Improve the representation of the water cycle in climate models
3. Propose a computational framework that enables large-scale climate simulations and their analysis

Some numbers

• 10-year climate simulation
• On a continental scale
  - See the label “This proposal” on the figure
• Horizontal grid resolution of 2.2 km
• 60 to 80 vertical levels


Problems

• Huge computational cost
• Huge amount of data generated by the simulation
  - Currently 4.4 TB (4400 GB) per year of simulation
  - Difficult, if not impossible, to store
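As a rough sense of scale, the per-year figure can be multiplied out over the 10-year target period. This is only a back-of-the-envelope sketch: it assumes that the current output rate of 4400 GB per simulated year stays constant over the whole run, which the slides do not state.

```python
# Back-of-the-envelope storage estimate (assumption: constant output rate).
GB_PER_SIMULATED_YEAR = 4400   # from the slide: 4.4 TB = 4400 GB per year
SIMULATED_YEARS = 10           # target length of the climate simulation

total_gb = GB_PER_SIMULATED_YEAR * SIMULATED_YEARS
total_tb = total_gb / 1000     # decimal units, as on the slide (1 TB = 1000 GB)

print(f"{total_gb} GB = {total_tb} TB for {SIMULATED_YEARS} simulated years")
```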

Solutions - Huge computational cost

• Use large supercomputer systems
  - Use hybrid CPU-GPU architectures
  - Kesch, Daint, …
• The COSMO model has already been adapted to run on these architectures [2]
  - COSMO-pompa, STELLA, CPP DyCore

[2] O. Fuhrer, C. Osuna, X. Lapillonne, T. Gysi, M. Bianco, and T. Schulthess. "Towards GPU-accelerated operational weather forecasting." In The GPU Technology Conference, GTC. 2013.

Solutions - Huge amount of data

• Being able to run the model in a reasonable amount of time is only one part of the solution
• How do we conduct the analysis if we can’t store the data?
  - We propose a trade-off between computational time and storage
  - The idea is to trade space for time
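The trade between storage and recomputation can be made concrete with a small sketch. It assumes, purely for illustration, that save points are written at a fixed interval (steps 0, k, 2k, ...) and that any requested step is recomputed from the closest earlier save point; the function name and the uniform interval are assumptions, not details from the talk.

```python
from typing import Tuple


def save_point_tradeoff(n_steps: int, interval: int) -> Tuple[int, int]:
    """Return (number of save points stored, worst-case steps to recompute).

    Assumes steps are numbered 0 .. n_steps - 1 and that save points are
    written at steps 0, interval, 2 * interval, ...
    """
    if n_steps <= 0 or interval <= 0:
        raise ValueError("n_steps and interval must be positive")
    stored = (n_steps + interval - 1) // interval      # integer ceiling
    worst_case_recompute = min(interval, n_steps) - 1  # step just before the next save point
    return stored, worst_case_recompute


for interval in (1, 10, 100):
    stored, worst = save_point_tradeoff(1000, interval)
    print(f"interval={interval:4d}  stored={stored:5d}  worst-case re-run={worst:3d} steps")
```

A larger interval stores fewer save points but makes the worst-case re-run longer, which is the space-for-time trade described on the slide.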


Current situation

1. The simulation runs
2. The simulation generates data
3. The generated data is read by the analysis application

Diagram: (1) Run simulation → (2) Store results → (3) Analyse the results

Solutions – The data virtualization layer (DVL)

• Use a data virtualization layer (DVL)
• Developed by Salvatore di Girolamo
  - Subproject B
• The DVL is a layer between the analysis application and the data

DVL – Original simulation

DVL – Writing save points

The DVL – Data access
The DVL – Interception of the access
The DVL – The data is available
The DVL – “Simple” read
The DVL – Data is returned
The DVL – Data not available
The DVL – Re-run
The DVL – The data is computed
The DVL – The data is returned
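The access path named in the slide titles above (interception, a simple read when the data is available, a re-run from a save point when it is not) can be sketched with a toy model. Everything here is hypothetical: `ToyDVL`, `toy_step` and the integer "model state" are illustrations only and do not reflect the actual DVL interface.

```python
from typing import Callable, Dict


class ToyDVL:
    """Toy stand-in for a data virtualization layer (hypothetical API)."""

    def __init__(self, step_fn: Callable[[int], int], save_points: Dict[int, int]):
        if 0 not in save_points:
            raise ValueError("a save point at step 0 is required")
        self.step_fn = step_fn                 # one deterministic model step
        self.save_points = dict(save_points)   # step -> saved model state
        self.available: Dict[int, int] = {}    # step -> output currently stored
        self.reruns = 0                        # number of re-runs triggered

    def read(self, step: int) -> int:
        # Interception: the analysis never touches the storage directly.
        if step in self.available:
            return self.available[step]        # data available: "simple" read
        # Data not available: re-run from the closest earlier save point.
        start = max(s for s in self.save_points if s <= step)
        state = self.save_points[start]
        for _ in range(start, step):
            state = self.step_fn(state)
        self.reruns += 1
        self.available[step] = state           # the data is now computed
        return state


def toy_step(state: int) -> int:
    """Deterministic toy 'model': integer arithmetic keeps results exact."""
    return (3 * state + 1) % 1009


# Original simulation: run 50 steps, keep a save point every 10 steps only.
reference = [7]
for _ in range(50):
    reference.append(toy_step(reference[-1]))
save_points = {s: reference[s] for s in range(0, 51, 10)}

dvl = ToyDVL(toy_step, save_points)
first = dvl.read(37)    # miss: re-run from save point 30
second = dvl.read(37)   # hit: served without a re-run
print(first == reference[37], second == first, dvl.reruns)
```

The analysis code only calls `read`; whether the value comes from storage or from a re-run is hidden behind that call.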

Open questions and research – The DVL

• Open questions regarding data access
  - Performance?
    • Caching
    • Access pattern detection
    • Prefetching
  - Application grouping?
  - Remote Direct Memory Access (RDMA)?
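To illustrate what access pattern detection and prefetching could mean here, the following minimal sketch guesses the next time steps an analysis will request when its recent accesses follow a constant stride. It is one simple possibility under that assumption, not the mechanism used in the DVL; `predict_next_steps` is a hypothetical helper.

```python
from typing import List


def predict_next_steps(history: List[int], count: int) -> List[int]:
    """Guess the next `count` steps if the recent accesses have a constant stride."""
    if len(history) < 3:
        return []                      # too little evidence for a pattern
    recent = history[-3:]
    stride = recent[1] - recent[0]
    if stride == 0 or recent[2] - recent[1] != stride:
        return []                      # no regular pattern detected
    return [recent[-1] + stride * i for i in range(1, count + 1)]


print(predict_next_steps([10, 20, 30], 2))   # regular stride of 10
print(predict_next_steps([1, 2, 4], 2))      # irregular accesses
```

The predicted steps could then be recomputed or loaded ahead of time, so that a later read finds the data already available.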

Solutions – Re-runs (done by the DVL)

• The DVL has to re-run the simulation
  - Multiple times
• The DVL has to choose the optimal re-run depending on
  - The requirements of the re-run
  - The availability of the resources

Solutions – Performance model

• The optimal re-run is determined by a performance model
• For example, the cost of the first “nc_open” calls
  - Developed by Salvatore di Girolamo


Performance model - TODO

• The previous model relies on:
  - An I/O model (reading and writing data and results)
  - A COSMO performance model
• Both are still in development (TODO: use an approach like [5]?)

[5] T. Hoefler, W. Gropp, W. Kramer, and M. Snir. "Performance modeling for systematic performance tuning." In State of the Practice Reports, p. 6. ACM, 2011.
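How an I/O model and a model performance estimate might combine into a choice of re-run can be sketched as follows. This is an assumption-laden illustration: the additive cost formula, the `RerunOption` fields and all numbers are made up, and the actual performance model is described as still in development.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RerunOption:
    name: str
    nodes: int                  # nodes this option would occupy
    restore_seconds: float      # I/O model: time to read the save point
    seconds_per_step: float     # model performance: time per simulated step
    write_seconds: float        # I/O model: time to write the requested output


def estimated_seconds(option: RerunOption, steps: int) -> float:
    """Estimated wall time = restore + steps * time per step + write."""
    return option.restore_seconds + steps * option.seconds_per_step + option.write_seconds


def choose_rerun(options: List[RerunOption], steps: int, free_nodes: int) -> Optional[RerunOption]:
    """Pick the fastest option that fits into the currently free nodes."""
    feasible = [o for o in options if o.nodes <= free_nodes]
    if not feasible:
        return None
    return min(feasible, key=lambda o: estimated_seconds(o, steps))


# All numbers below are made up for illustration.
options = [
    RerunOption("small", nodes=4, restore_seconds=20.0, seconds_per_step=2.0, write_seconds=5.0),
    RerunOption("large", nodes=16, restore_seconds=30.0, seconds_per_step=0.5, write_seconds=5.0),
]
best_with_many_nodes = choose_rerun(options, steps=100, free_nodes=32)
best_with_few_nodes = choose_rerun(options, steps=100, free_nodes=8)
print(best_with_many_nodes.name, best_with_few_nodes.name)
```

With these made-up numbers the larger configuration wins when enough nodes are free, and the smaller one is chosen when resources are scarce, which mirrors the dependence on resource availability mentioned in the slides.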

Proposed approach

The “classic” way
1. Run simulation
2. Store results
3. Analyse the results

The crCLIM way
1. Run simulation
2. Store save points
3. Analyse the results, with data access going through the DVL
4. Restore save points and do re-runs


Problem - Re-runs

• The DVL has to choose the optimal re-run depending on the availability of the resources:
  - These machines may have a CPU or a hybrid CPU-GPU architecture
  - The result of the simulation should be machine independent

Re-runs – Why change architecture?

Example - Re-runs machine selection

• Few re-run instances
  - Many nodes per instance: use CPU nodes
• Many re-run instances
  - Few nodes per instance: use GPU nodes
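The rule of thumb above can be written as a tiny selection function. This is a sketch only: it treats the available nodes as a single pool shared evenly between instances, and `many_nodes_threshold` is a hypothetical cut-off that does not come from the talk.

```python
def select_node_type(rerun_instances: int, available_nodes: int,
                     many_nodes_threshold: int = 8) -> str:
    """Return "CPU" or "GPU" following the rule of thumb on the slide.

    many_nodes_threshold is a hypothetical cut-off, not a value from the talk.
    """
    if rerun_instances <= 0 or available_nodes <= 0:
        raise ValueError("instances and nodes must be positive")
    nodes_per_instance = max(1, available_nodes // rerun_instances)
    # Many nodes per instance -> CPU nodes; few nodes per instance -> GPU nodes.
    return "CPU" if nodes_per_instance >= many_nodes_threshold else "GPU"


print(select_node_type(rerun_instances=2, available_nodes=64))    # few instances
print(select_node_type(rerun_instances=32, available_nodes=64))   # many instances
```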

Solution – Bit-reproducibility

• A re-run can start from any save point
• Re-runs can be executed on different hardware than the original simulation
• The results must always be consistent (perfect match)
• We want bit-reproducibility [4]

[4] A. Arteaga, O. Fuhrer, and T. Hoefler. "Designing bit-reproducible portable high-performance applications." In Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, pp. 1235-1244. IEEE, 2014.

Open questions – Bit-reproducibility

• Can we prove it?
  - Unlikely
• Do we suffer from a performance penalty?
  - First results suggest that this is not the case (the code is memory bound rather than compute bound)
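One well-known reason why results can differ between machines is that floating-point addition is not associative, so a different evaluation order can change the last bits of a sum. The short example below shows the effect and, as a contrast, an order-independent summation from the Python standard library. It is only an illustration of the problem and makes no claim about how the reproducible COSMO version achieves bit-reproducibility.

```python
import math

values = [0.1, 0.2, 0.3]

left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The two orders give results that differ in the last bit.
print(left_to_right.hex(), right_to_left.hex())
print(left_to_right == right_to_left)

# math.fsum returns the correctly rounded sum, so it does not depend on order.
print(math.fsum(values) == math.fsum(reversed(values)))
```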

Achievements

• Simulation on a continental scale at high horizontal resolution [3] (subproject C)
• An early prototype of the DVL (subproject B)
• A reproducible version of COSMO (subproject A)
  - Only tested with a meteorological configuration

[3] D. Leutwyler, O. Fuhrer, X. Lapillonne, D. Lüthi, and C. Schär. "Towards European-scale convection-resolving climate simulations with GPUs: a study with COSMO 4.19." Geoscientific Model Development 9, no. 9 (2016): 3393.

Thank you for your attention

More about the crCLIM project in the next talk (S. Ruedisueli)

References

[1] http://www.c2sm.ethz.ch/research/crCLIM.html

[2] O. Fuhrer, C. Osuna, X. Lapillonne, T. Gysi, M. Bianco, and T. Schulthess. "Towards GPU-accelerated operational weather forecasting." In The GPU Technology Conference, GTC. 2013.

[3] D. Leutwyler, O. Fuhrer, X. Lapillonne, D. Lüthi, and C. Schär. "Towards European-scale convection-resolving climate simulations with GPUs: a study with COSMO 4.19." Geoscientific Model Development 9, no. 9 (2016): 3393.

[4] A. Arteaga, O. Fuhrer, and T. Hoefler. "Designing bit-reproducible portable high-performance applications." In Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, pp. 1235-1244. IEEE, 2014.

[5] T. Hoefler, W. Gropp, W. Kramer, and M. Snir. "Performance modeling for systematic performance tuning." In State of the Practice Reports, p. 6. ACM, 2011.

MeteoSwiss
Operation Center 1
CH-8058 Zurich-Airport
T +41 58 460 91 11
www.meteoswiss.ch

MeteoSvizzera
Via ai Monti 146
CH-6605 Locarno-Monti
T +41 58 460 92 22
www.meteosvizzera.ch

MétéoSuisse
7bis, av. de la Paix
CH-1211 Genève 2
T +41 58 460 98 88
www.meteosuisse.ch

MétéoSuisse
Chemin de l’Aérologie
CH-1530 Payerne
T +41 58 460 94 44
www.meteosuisse.ch