
Page 1: SCEC Capability Simulations  on  TeraGrid


SCEC Capability Simulations on TeraGrid

Yifeng Cui, San Diego Supercomputer Center

Page 2: SCEC Capability Simulations  on  TeraGrid


SCEC Computational Pathways

Page 3: SCEC Capability Simulations  on  TeraGrid


SCEC Capability Simulations on Kraken and Ranger

• ShakeOut-D: 600 x 300 x 80 km domain, 100 m resolution, 14.4 billion grid points, upper frequency limit of 1 Hz, 3-minute duration, 50k time steps, minimum surface velocity 500 m/s, dynamic source (SGSN), velocity properties from SCEC CVM4.0, 1 terabyte of inputs, 5 terabytes of output

• ShakeOut-K: 600 x 300 x 80 km domain, 100 m resolution, 14.4 billion grid points, upper frequency limit of 1 Hz, 3-minute duration, 50k time steps, minimum surface velocity 500 m/s, kinematic source, velocity properties from SCEC CVM4.0, 1 terabyte of inputs, 5 terabytes of output

• Chino Hills: 180 x 125 x 60 km domain, 50 m resolution, 10.8 billion grid points, 80k time steps, upper frequency limit of 2 Hz, using both the SCEC CVM4 and CVM-H velocity models

• The latest ShakeOut-D simulation completed within 1.8 hours on 64k Kraken XT5 cores; the ShakeOut-D 2-Hz benchmark achieved a sustained 49 teraflop/s.

Source: Yifeng Cui, UCSD
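As a quick consistency check on the mesh sizes quoted above (simple arithmetic, not stated on the slide), a 100 m grid spacing over the 600 x 300 x 80 km ShakeOut domain gives

$$
\frac{600\ \mathrm{km}}{0.1\ \mathrm{km}} \times \frac{300\ \mathrm{km}}{0.1\ \mathrm{km}} \times \frac{80\ \mathrm{km}}{0.1\ \mathrm{km}} = 6000 \times 3000 \times 800 = 1.44\times 10^{10}
$$

mesh points, i.e. the 14.4 billion grid points and the 6000 x 3000 x 800 mesh quoted on the scaling slides.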

Page 4: SCEC Capability Simulations  on  TeraGrid


Validation of Chino Hills Simulations

• Goodness-of-fit at 0.1-0.5 Hz for synthetics relative to data from the M5.4 Chino Hills earthquake.

• Seismogram comparisons of recorded data (black traces), CVM-S synthetics (red traces) and CVM-H synthetics (blue traces)

Page 5: SCEC Capability Simulations  on  TeraGrid


SCEC Capability Simulations Workflow

[Workflow diagram, not reproduced in the transcript. Recoverable stages: INPUT DATA PREPARATION takes the configuration (IN3D) plus the original source and media files, performs source and media partitioning, and stages the partitioned source and media files to the simulation file system and to the archival system via GridFTP and SRB copies. SIMULATION AND VALIDATION checks whether the source and media partitions are ready (looping back if not), runs the ShakeOut simulation, and passes the simulation output files on to validation and visualization through further GridFTP and SRB transfers. DATA ARCHIVAL stores the output files in the archival system.]

• Inputs are TB-sized, with spatial and temporal locality

• Input partitions are transferred between TeraGrid sites

• Simulation outputs are backed up on TACC Ranch and NICS HPSS

• Visualization is done on Ranger

Page 6: SCEC Capability Simulations  on  TeraGrid


Adapting SCEC Applications to Different TeraGrid Architectures

[Flowchart, not reproduced in the transcript. Recoverable content: the solver reads its settings and then the source fault input (source modes 0-4, exploiting temporal locality) and the media input (media modes 0-3, exploiting spatial locality), either reading original files directly or saving and reusing precomputed partitions depending on the mode; an initial stress input can also be read. Run-time switches control checkpointing (0-max checkpoints, with restart from checkpoints), an MD5 verification mode (0-1), the output mode (surface only or surface plus volume), output accumulation (0-1) and performance measurement (0-1); outputs flow through the SAN switch infrastructure to SAM-QFS and HPSS.]

Source: Cui et al. Toward Petascale Earthquake Simulations, Acta Geotechnica, June 2008

Serial or parallel source partitioning and split options

Serial or parallel mesh partitioning and options

Page 7: SCEC Capability Simulations  on  TeraGrid


Mesh Partitioning

Mesh inputs

Mesh 0, Mesh 1, Mesh 2, …, Mesh N

• Serial (part-serial)
• Serial (part-parallel)
• MPI-IO scattered read
• MPI-IO contiguous read

Page 8: SCEC Capability Simulations  on  TeraGrid


Mesh Serial Read

Page 9: SCEC Capability Simulations  on  TeraGrid


Mesh Partitioned in Advance

• Data locality

Page 10: SCEC Capability Simulations  on  TeraGrid


Mesh MPI-IO Scattered Read
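The slide's figure is not reproduced in the transcript. As a minimal sketch of the scattered-read idea (assuming the mesh is one binary file of nx*ny*nz floats with x varying fastest; the file name, sizes and process-grid layout below are illustrative, not taken from the AWP-Olsen-Day code), each rank describes its own nxt x nyt x nzt block with a subarray file view and issues a single collective read:

```c
/* Sketch of an MPI-IO scattered read: every rank reads its own sub-block
 * of one global mesh file through a subarray file view.
 * Hypothetical file name and sizes; run with npx*npy*npz ranks. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int nx = 600, ny = 300, nz = 80;    /* global mesh (example sizes) */
    const int npx = 4, npy = 2, npz = 2;      /* 3-D process grid (16 ranks) */
    const int nxt = nx/npx, nyt = ny/npy, nzt = nz/npz;
    int px = rank % npx, py = (rank/npx) % npy, pz = rank/(npx*npy);

    /* File view picking out this rank's block of the global array
     * (z slowest, x fastest, matching the assumed file layout). */
    int gsizes[3] = { nz, ny, nx };
    int lsizes[3] = { nzt, nyt, nxt };
    int starts[3] = { pz*nzt, py*nyt, px*nxt };
    MPI_Datatype filetype;
    MPI_Type_create_subarray(3, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_FLOAT, &filetype);
    MPI_Type_commit(&filetype);

    float *local = malloc((size_t)nxt*nyt*nzt*sizeof(float));
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "mesh.bin", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_FLOAT, filetype, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, local, nxt*nyt*nzt, MPI_FLOAT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Type_free(&filetype);
    free(local);
    MPI_Finalize();
    return 0;
}
```

Because every rank touches many short, widely separated stretches of the file, this pattern leans on collective buffering in the MPI-IO layer, which is consistent with the comparison table two slides later recommending a large stripe count and stripe size for the scattered approach.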

Page 11: SCEC Capability Simulations  on  TeraGrid


Mesh MPI-IO Contiguous Read

• Data Continuity

• Read XY plane and then redistribute data
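A minimal sketch of the contiguous-read-plus-redistribution idea summarized above (again with illustrative names, sizes and rank layout rather than the production code): a small set of reader ranks pulls in whole XY planes, which sit contiguously in the file, and then forwards each owner's sub-block using derived datatypes:

```c
/* Sketch of contiguous plane reads followed by redistribution.
 * Assumed (hypothetical) layout: one binary file of nx*ny*nz floats,
 * x fastest and z slowest; a 3-D process grid npx x npy x npz with
 * rank = pz*npx*npy + py*npx + px; all sizes divide evenly. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int nx = 600, ny = 300, nz = 80;    /* global mesh (example sizes) */
    const int npx = 4, npy = 2, npz = 2;      /* 3-D process grid (16 ranks) */
    const int nxt = nx/npx, nyt = ny/npy, nzt = nz/npz;
    if (nprocs != npx*npy*npz) MPI_Abort(MPI_COMM_WORLD, 1);

    int px = rank % npx, py = (rank/npx) % npy, pz = rank/(npx*npy);
    float *local = malloc((size_t)nxt*nyt*nzt*sizeof(float));

    /* Step 1: one reader per z-slab (px == py == 0) reads its nzt complete
     * XY planes, which are contiguous in the file. */
    float *slab = NULL;
    if (px == 0 && py == 0) {
        slab = malloc((size_t)nx*ny*nzt*sizeof(float));
        MPI_File fh;
        MPI_File_open(MPI_COMM_SELF, "mesh.bin", MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &fh);
        MPI_Offset off = (MPI_Offset)pz*nzt*nx*ny*sizeof(float);
        MPI_File_read_at(fh, off, slab, nx*ny*nzt, MPI_FLOAT,
                         MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }

    /* Step 2: redistribute.  Every rank expects one message holding its
     * nxt x nyt x nzt block from the reader of its z-slab. */
    MPI_Request recv_req;
    MPI_Irecv(local, nxt*nyt*nzt, MPI_FLOAT, pz*npx*npy, 0,
              MPI_COMM_WORLD, &recv_req);

    if (px == 0 && py == 0) {
        for (int j = 0; j < npy; j++)
            for (int i = 0; i < npx; i++) {
                /* Describe owner (i, j)'s sub-block of the slab in place. */
                int sizes[3]    = { nzt, ny,  nx  };
                int subsizes[3] = { nzt, nyt, nxt };
                int offsets[3]  = { 0, j*nyt, i*nxt };
                MPI_Datatype block;
                MPI_Type_create_subarray(3, sizes, subsizes, offsets,
                                         MPI_ORDER_C, MPI_FLOAT, &block);
                MPI_Type_commit(&block);
                MPI_Send(slab, 1, block, pz*npx*npy + j*npx + i, 0,
                         MPI_COMM_WORLD);
                MPI_Type_free(&block);
            }
    }
    MPI_Wait(&recv_req, MPI_STATUS_IGNORE);

    free(local);
    free(slab);
    MPI_Finalize();
    return 0;
}
```

Only the reader ranks touch the file, and they read long contiguous runs, which is why the comparison table on the next slide lists this variant as high performance but with high communication overhead.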

Page 12: SCEC Capability Simulations  on  TeraGrid


Comparisons of Mesh Approaches

Criterion | Serial IO | Serial IO (partitioned local files) | MPI-IO (scattered) | MPI-IO (contiguous) with data redistribution
Performance | Low | High | Medium | High
System dependence | Low | Low | High | Low
Scalability | Poor | Poor | Depends | Good
Number of files | 1 | npx*npy*npz | 1 | 1
Memory requirement (elements) | nxt*nyt*nzt per core | nxt*nyt*nzt per core | nxt*nyt*nzt per core | nx*ny per sender core (nz cores); nxt*nyt*nzt per receiver core (all cores)
Communication overhead | High | None | None | High
Collective IO | No | No | Yes | Yes
Stripe count (recommended) | Small | Small | Large | Large
Stripe size (recommended) | Small | Small | Large | Larger (nx*ny)

Page 13: SCEC Capability Simulations  on  TeraGrid


Source Partitioning

Source inputs: Source 1, Source 2, Source 3, …, Source 483161; each source's time history is split into time-step blocks (1-600, 601-1199, …, 23401-24000)

• Serial (part-serial)
• Serial (part-parallel)
• MPI-IO scattered read

Page 14: SCEC Capability Simulations  on  TeraGrid


[Scaling plot, not reproduced in the transcript: AWP-Olsen-Day code scaling on Kraken, Ranger and Intrepid for the 1-Hz ShakeOut at 100 m resolution with 14.4 billion mesh points (6000 x 3000 x 800). Series: NICS Kraken-XT4 and TGW BG/L Intrepid, both with synchronous communication. X-axis: number of cores (1,000 to 1,000,000); Y-axis: number of mesh points updated per step per second per core (1E+05 to 1E+07).]

Page 15: SCEC Capability Simulations  on  TeraGrid


[Scaling plot, not reproduced in the transcript: AWP-Olsen-Day code scaling on Kraken, Ranger and Intrepid for the 1-Hz ShakeOut at 100 m resolution with 14.4 billion mesh points (6000 x 3000 x 800). Series: Sun Constellation TACC Ranger, NICS Kraken-XT5, ALCF BG/P Intrepid, NICS Kraken-XT4 and TGW BG/L Intrepid, all with synchronous communication. X-axis: number of cores (1,000 to 1,000,000); Y-axis: number of mesh points updated per step per second per core (1E+05 to 1E+07).]

Page 16: SCEC Capability Simulations  on  TeraGrid


Synchronous Communication

Page 17: SCEC Capability Simulations  on  TeraGrid


Synchronous Communication

Page 18: SCEC Capability Simulations  on  TeraGrid


Asynchronous Communication

Page 19: SCEC Capability Simulations  on  TeraGrid


Asynchronous Communication

Page 20: SCEC Capability Simulations  on  TeraGrid


Asynchronous Communication

Page 21: SCEC Capability Simulations  on  TeraGrid


Asynchronous Communication
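The figures for these slides are not reproduced in the transcript. As a minimal sketch of the asynchronous-communication pattern being compared in the scaling plots that follow (hypothetical buffers and update routines, not the AWP-Olsen-Day code), the idea is to post non-blocking ghost-cell exchanges, update the interior points while the messages are in flight, and only then finish the points that depend on the halo:

```c
/* Sketch of communication/computation overlap with non-blocking MPI.
 * All routine names and buffer layouts here are illustrative. */
#include <mpi.h>

static void update_interior(float *u) { (void)u; /* stencil away from the halo */ }
static void update_boundary(float *u) { (void)u; /* points that need ghost data */ }

void timestep(float *u, float *ghost_in, float *ghost_out,
              int nghost, int left, int right, MPI_Comm comm)
{
    MPI_Request reqs[4];

    /* 1. Post the halo exchange without waiting for it to finish. */
    MPI_Irecv(ghost_in,           nghost, MPI_FLOAT, left,  0, comm, &reqs[0]);
    MPI_Irecv(ghost_in + nghost,  nghost, MPI_FLOAT, right, 1, comm, &reqs[1]);
    MPI_Isend(ghost_out,          nghost, MPI_FLOAT, left,  1, comm, &reqs[2]);
    MPI_Isend(ghost_out + nghost, nghost, MPI_FLOAT, right, 0, comm, &reqs[3]);

    /* 2. Overlap: advance the grid points that do not touch ghost cells. */
    update_interior(u);

    /* 3. Finish communication, then update the halo-dependent points. */
    MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
    update_boundary(u);
}
```

The plots on the next two slides compare this asynchronous scheme against the synchronous version at scale.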

Page 22: SCEC Capability Simulations  on  TeraGrid


[Scaling plot repeated from Page 15, not reproduced in the transcript: AWP-Olsen-Day code scaling on Kraken, Ranger and Intrepid for the 1-Hz ShakeOut at 100 m resolution with 14.4 billion mesh points (6000 x 3000 x 800). Series: Sun Constellation TACC Ranger, NICS Kraken-XT5, ALCF BG/P Intrepid, NICS Kraken-XT4 and TGW BG/L Intrepid, all with synchronous communication. Axes as on Page 15.]

Page 23: SCEC Capability Simulations  on  TeraGrid


[Scaling plot, not reproduced in the transcript: AWP-Olsen-Day code scaling on Kraken, Ranger and Intrepid for the 1-Hz ShakeOut at 100 m resolution with 14.4 billion mesh points (6000 x 3000 x 800). Series: NICS Kraken-XT5, ALCF BG/P Intrepid and Sun Constellation TACC Ranger with asynchronous communication, overlaid on the synchronous-communication results for TACC Ranger, NICS Kraken-XT5, ALCF BG/P Intrepid, NICS Kraken-XT4 and TGW BG/L Intrepid. X-axis: number of cores (1,000 to 1,000,000); Y-axis: number of mesh points updated per step per second per core (1E+05 to 1E+07).]

Page 24: SCEC Capability Simulations  on  TeraGrid


SCEC Capability Simulations Performance on TeraGrid

[Performance table not reproduced in the transcript.]

* Benchmark

Page 25: SCEC Capability Simulations  on  TeraGrid


Other efforts in progress supporting larger-scale SCEC simulations

• Single-CPU optimization: division, for example, is very expensive; by reducing division work we have observed performance improvements of 25-45% on up to 8k cores (see the sketch after this list)

• Workflow: an end-to-end approach to automate the procedures of capability simulations

• Restructuring the code to prepare it as a SCEC community code, emphasizing modularity, re-usability and ease of integration

• Developing a hybrid code with two-level MPI/OpenMP parallelism
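As a small illustration of the division point above (a hedged sketch with made-up loop and variable names, not code from the AWP-Olsen-Day source), the usual transformation is to precompute a reciprocal once and turn a per-point divide into a multiply:

```c
/* Before: a divide inside the inner loop on every iteration. */
void scale_by_density(float *v, const float *rho, int n, float dt)
{
    for (int i = 0; i < n; i++)
        v[i] += dt / rho[i];
}

/* After: rho_inv[i] = 1.0f / rho[i] is computed once outside the time
 * loop, so each iteration costs a multiply instead of a divide. */
void scale_by_density_fast(float *v, const float *rho_inv, int n, float dt)
{
    for (int i = 0; i < n; i++)
        v[i] += dt * rho_inv[i];
}
```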

Page 26: SCEC Capability Simulations  on  TeraGrid


Acknowledgements

• This work has received technical support from various TeraGrid sites, in particular:
  – Tommy Minyard and Karl Schultz of TACC
  – Kwai Lam Wong and Bruce Loftis of NICS
  – Amit Chourasia of SDSC
  – SCEC collaborations