Project Athena Overview
Project Athena: Origins
· The World Modeling Summit (WMS) in May 2008 called for a revolution in climate modeling to more rapidly advance improvements in accuracy and reliability
· The WMS recommended petascale supercomputers dedicated to climate modeling based in at least 3 international facilities
· Dedicated petascale machines are needed to provide enough computational capability and a controlled environment to support long runs and the management, analysis and stewardship of very large (petabyte) data sets
· The U.S. National Science Foundation, recognizing the importance of the problem, realized that a resource (Athena) was available to meet the challenge of the World Modeling Summit and offered to dedicate the Athena supercomputer for 6 months in 2009-2010
· An international collaboration was formed among groups in the U.S., Japan and the U.K. to use Athena to take up the challenge
Project Athena: Collaborating Groups
· COLA - Center for Ocean-Land-Atmosphere Studies, USA (NSF-funded)
· ECMWF - European Centre for Medium-range Weather Forecasts, UK
· JAMSTEC - Japan Agency for Marine-Earth Science and Technology, Research Institute for Global Change, Japan
· University of Tokyo, Japan
· NICS - National Institute for Computational Sciences, USA (NSF-funded)
· Cray Inc.
Codes
· NICAM: Nonhydrostatic Icosahedral Atmospheric Model
· IFS: ECMWF Integrated Forecast System
Supercomputers
· Athena: Cray XT4 - 4512 quad-core Opteron nodes (18,048 cores); #30 on the Top500 list (November 2009); dedicated Oct 2009 - Mar 2010
· Kraken: Cray XT5 - 8256 dual hex-core Opteron nodes (99,072 cores); #3 on the Top500 list (November 2009); replaced Athena; allocation of 5M SUs
Straus/GMU/COLA AGU Dec 2010 3
Athena Experiments
http://wxmaps.org/athena/home/
[Figures: surface pressure and potential vorticity fields]
Blocking Index: 13-month integrations of the ECMWF model (at T159 and T1279), DJFM 1960-2003, compared with ERA-40.
[Figure: blocking index for ERA-40, T159 and T1279]
DJFM Weather Regimes, Euro-Atlantic Region: 500 hPa geopotential height - ERA, DJFM 1960-2007
Winds@850 hPa: DJF 1990-2007
[Figure: 850 hPa wind differences. Panels: (a) T159 - ERA, (b) T511 - T159, (c) T1279 - T511, (d) T2047 - T1279. Domain: 100E-60W, 40S-40N; reference vector 5.0 m/s]
Aircraft observations showing spectra of wind components and temperature, plotted as log(E) vs. log(k), so that the slope of the straight lines indicates the exponent n in the previous slide.
Other in-situ observations have confirmed these results!
Atmospheric Spectra: Power Laws
Two scaling regimes in the log-log plot of energy vs. wavenumber, the second spanning roughly 100 - 10 km.
ECMWF Dec-March Simulations: Eddy Kinetic Energy Spectrum 250 hPa
Total Eddy Kinetic Energy, ECMWF; 250 hPa; 5 DJF seasons
Black: T511 (40 km grid); Red: T1279 (16 km grid); Blue: T2047 (10 km grid)
Note the sudden downturn in the spectra: it suggests a dissipation regime.
Hint of two regimes at T1279 and T2047but not at T511
Slope of Total Eddy Kinetic Energy
Black: T511 (40 km grid); Red: T1279 (16 km grid); Blue: T2047 (10 km grid); 250 hPa level; 5 DJF seasons
Local Spectral Slope b: E_n ~ n^(-b)
Least squares fit of log10(eddy kinetic energy) to a line with slope -b, locally over a range of constant log10(n)
Large slope indicates dissipation regime
y-axis is b; x-axis is log10(n)
T2047, T1279 show weak shallowing of spectra at higher wavenumbers
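The local least-squares slope fit described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the fixed index window, and its width are assumptions.

```python
import numpy as np

def local_spectral_slope(E, n, window=9):
    """Estimate the local slope b in E_n ~ n^(-b) by a least-squares fit
    of log10(E) against log10(n) over a sliding window of wavenumbers
    (here a fixed index window, as a stand-in for a window of constant
    log10(n) width)."""
    logn, logE = np.log10(n), np.log10(E)
    half = window // 2
    b = np.full(len(n), np.nan)  # edges where the window does not fit stay NaN
    for i in range(half, len(n) - half):
        sl = slice(i - half, i + half + 1)
        # np.polyfit returns [slope, intercept]; the fitted slope equals -b
        b[i] = -np.polyfit(logn[sl], logE[sl], 1)[0]
    return b

# Check on a synthetic pure power-law spectrum E_n = n^(-3):
n = np.arange(1, 200)
E = n.astype(float) ** -3.0
b = local_spectral_slope(E, n)  # interior values should be very close to 3
```

On real spectra the slope varies with n, which is exactly what the slide's plot of b against log10(n) shows.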
Athena Results
• Seasonal Length Runs
• Results shown for 5 DJF seasons and 5 JJA seasons
• Results for both ECMWF and NICAM models
Cluster analysis methodology
The (modified) K-means cluster analysis method (K is the number of clusters into which the data will be grouped; this number must be specified in advance) (Straus et al. 2007) can be summarized in the following four steps:
1) Identification of clusters in the reduced phase space defined by the empirical orthogonal functions (EOFs). The leading EOFs (explaining about 80% of the space-time variance) are retained.
2) For a given number k of clusters, the optimum partition of the data into k clusters is found by an algorithm that takes an initial cluster assignment (based on the distance from pseudorandom seed points) and iteratively changes it by assigning each element to the cluster with the closest centroid, until a "stable" classification is achieved. (A cluster centroid is defined by the average of the PC coordinates of all states that lie in that cluster.)
3) This process is repeated many times (using different seeds), and for each partition the ratio r*_k of the variance among cluster centroids (weighted by the population) to the average intra-cluster variance is recorded.
4) The partition that maximises this ratio is the optimal one.
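The four steps above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: it assumes the data have already been projected onto the leading PCs, and the function names and number of restarts are assumptions.

```python
import numpy as np

def kmeans_partition(X, k, rng):
    """One K-means run: assign each state to the nearest of k seed points
    drawn from the data, then reassign to the nearest centroid until the
    classification is stable (steps 1-2)."""
    seeds = X[rng.choice(len(X), k, replace=False)]
    labels = np.argmin(((X[:, None] - seeds[None]) ** 2).sum(-1), axis=1)
    while True:
        # centroid = average of the PC coordinates of the states in the cluster
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        new = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        if np.array_equal(new, labels):
            return labels, centroids
        labels = new

def variance_ratio(X, labels, centroids):
    """Ratio r_k of the population-weighted variance among cluster
    centroids to the average intra-cluster variance (step 3)."""
    k = len(centroids)
    pops = np.array([(labels == j).sum() for j in range(k)])
    gm = X.mean(axis=0)
    among = (pops * ((centroids - gm) ** 2).sum(-1)).sum() / pops.sum()
    within = np.mean([((X[labels == j] - centroids[j]) ** 2).sum(-1).mean()
                      for j in range(k)])
    return among / within

def best_partition(X, k, n_tries=50, seed=0):
    """Repeat with many pseudorandom seeds and keep the partition that
    maximises the variance ratio (step 4)."""
    rng = np.random.default_rng(seed)
    return max((kmeans_partition(X, k, rng) for _ in range(n_tries)),
               key=lambda lc: variance_ratio(X, *lc))
```

The ratio maximised here is the r*_k recorded in step 3 and reused below in the significance test.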
Cluster analysis - Significance
The goal is to assess the strength of the clustering compared to that expected from an appropriate reference distribution, such as a multidimensional Gaussian distribution.
In assessing whether the null hypothesis of multi-normality can be rejected, it is therefore necessary to perform Monte-Carlo simulations using a large number M of synthetic data sets. Each synthetic data set has precisely the same length as the original data set against which it is compared, and it is generated from a series of n dimensional Markov processes, whose mean, variance and first-order auto-correlation are obtained from the observed data set.
A cluster analysis is performed for each one of the simulated data sets. For each k-partition the ratio r^m_k of the variance among cluster centroids to the average intra-cluster variance is recorded. Since the synthetic data are assumed to have a unimodal distribution, the proportion P_k of red-noise samples for which r^m_k < r*_k is a measure of the significance of the k-cluster partition of the actual data, and 1 - P_k is the corresponding confidence level for the existence of k clusters.
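The surrogate-generation and comparison steps can be sketched as below. This is a hedged illustration: it assumes independent AR(1) (first-order Markov) processes per dimension, and `ratio_fn`, standing in for the full clustering-and-ratio computation of the previous slide, is a hypothetical placeholder.

```python
import numpy as np

def ar1_surrogate(X, rng):
    """One synthetic data set of the same length as X, generated from
    first-order Markov (AR(1)) processes whose mean, variance and
    lag-1 autocorrelation match the observed data, dimension by dimension."""
    T, d = X.shape
    mu, sig = X.mean(0), X.std(0)
    z = (X - mu) / sig
    phi = (z[1:] * z[:-1]).mean(0)      # lag-1 autocorrelation estimate
    eps = rng.standard_normal((T, d)) * np.sqrt(1.0 - phi ** 2)
    y = np.empty((T, d))
    y[0] = rng.standard_normal(d)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + eps[t]  # unit-variance red noise
    return mu + sig * y

def cluster_significance(X, k, ratio_fn, M=100, seed=0):
    """P_k: the proportion of M red-noise surrogates whose variance
    ratio r^m_k falls below the actual data's ratio r*_k."""
    rng = np.random.default_rng(seed)
    r_star = ratio_fn(X, k)
    r_noise = np.array([ratio_fn(ar1_surrogate(X, rng), k) for _ in range(M)])
    return float((r_noise < r_star).mean())
```

A `ratio_fn` built from the K-means procedure would return the optimal r*_k for a given data set; the source's Monte Carlo uses a large number M of such surrogates.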
Cluster analysis - How many clusters?
The need to specify the number of clusters can be a disadvantage of the K-means method if we don't know in advance the best cluster partition of the data set in question. However, there are some criteria that can be used to choose the optimal partition.
Significance: the partition with the highest significance with respect to predefined multinormal distributions.
Reproducibility: as a measure of reproducibility we can use the mean-squared error between best-matching cluster centroids from N pairs of randomly chosen half-length datasets drawn from the full one. The partition with the highest reproducibility is chosen.
Consistency: consistency can be assessed both with respect to variable (for example, comparing clusters obtained from dynamically linked variables) and with respect to domain (tests of sensitivity to the lateral or vertical domain).
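The reproducibility measure can be sketched as follows. This is an illustration under stated assumptions: `cluster_fn`, which should return the k cluster centroids for a data set, is a hypothetical placeholder, and the brute-force centroid matching assumes a small k.

```python
import numpy as np
from itertools import permutations

def centroid_match_error(ca, cb):
    """Mean-squared error between two centroid sets under the best
    one-to-one matching (brute force over permutations; fine for small k)."""
    return min(float(((ca - cb[list(p)]) ** 2).mean())
               for p in permutations(range(len(cb))))

def reproducibility(X, k, cluster_fn, n_pairs=25, seed=0):
    """Average centroid-matching error over n_pairs random splits of the
    data into two half-length sets; a smaller error means the partition
    is more reproducible."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(n_pairs):
        idx = rng.permutation(len(X))
        half = len(X) // 2
        ca = cluster_fn(X[idx[:half]], k)   # centroids from first half
        cb = cluster_fn(X[idx[half:]], k)   # centroids from second half
        errs.append(centroid_match_error(ca, cb))
    return float(np.mean(errs))
```

Comparing this error across candidate values of k is one way to operationalise "the partition with the highest reproducibility".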