Page 1:

CS267 - April 20th, 2010

Big Bang, Big Iron
High Performance Computing and the Cosmic Microwave Background

Julian Borrill
Computational Cosmology Center, LBL
Space Sciences Laboratory, UCB

with Chris Cantalupo, Ted Kisner, Radek Stompor, Rajesh Sudarsan
and the BOOMERanG, MAXIMA, Planck, EBEX, PolarBear & other experimental collaborations

Page 2:

The Cosmic Microwave Background

About 400,000 years after the Big Bang, the expanding Universe cools through the ionization temperature of hydrogen: p+ + e- => H.

Without free electrons to scatter off, CMB photons free-stream to us today.

• COSMIC - filling all of space.
• MICROWAVE - redshifted by the expansion of the Universe from 3000K to 3K.
• BACKGROUND - primordial photons coming from “behind” all astrophysical sources.
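
As a rough check on the MICROWAVE bullet, Wien's displacement law gives the wavelength at which a blackbody spectrum peaks. The sketch below assumes a pure blackbody and uses the rounded temperatures from the slide; the function names are illustrative and do not come from the talk.

```python
# Wien displacement constant in metre-kelvin (CODATA value).
WIEN_B = 2.897771955e-3


def peak_wavelength_m(temperature_k):
    """Wavelength (metres) at which a blackbody at this temperature peaks."""
    return WIEN_B / temperature_k


def stretch_factor(t_emitted_k, t_observed_k):
    """Blackbody temperature scales as 1/(1+z), so this ratio is 1+z."""
    return t_emitted_k / t_observed_k


if __name__ == "__main__":
    print("peak at 3000 K (m):", peak_wavelength_m(3000.0))
    print("peak at 3 K (m):   ", peak_wavelength_m(3.0))
    print("1 + z:             ", stretch_factor(3000.0, 3.0))
```

With these inputs the peak moves from about one micron (near infrared) to about one millimetre (microwave), a stretch by a factor of about 1000.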

Page 3:

CMB Physics Drivers

• It is the earliest possible photon image of the Universe.
• Its existence supports a Big Bang over a Steady State cosmology (NP1).
• Tiny fluctuations in the CMB temperature (NP2) and polarization encode details of
  – cosmology
    • geometry
    • topology
    • composition
    • history
  – ultra-high energy physics
    • fundamental forces
    • beyond the standard model
    • Inflation & the dark sector (NP3)

Page 4:

The Concordance Cosmology

Supernova Cosmology Project (1998): Cosmic Dynamics (Ω_Λ - Ω_m)

BOOMERanG & MAXIMA (2000): Cosmic Geometry (Ω_Λ + Ω_m)

70% Dark Energy + 25% Dark Matter + 5% Baryons

95% Ignorance

What (and why) is the Dark Universe?

Page 5:

Observing The CMB

1% of static on (untuned) TV

Page 6:

CMB Satellite Evolution

Page 7:

The Planck Satellite

• The primary driver for HPC CMB work for the last decade.
• A joint ESA/NASA satellite mission performing a 2-year+ all-sky survey from L2.
• All-sky survey at 9 microwave frequencies from 30 to 857 GHz.
• The biggest data set to date:
  – O(10^12) observations
  – O(10^8) sky pixels
  – O(10^4) spectral multipoles

Page 8:

Beyond Planck

• EBEX (1x Planck) - Antarctic long-duration balloon flight in 2012.
• PolarBear (10x Planck) - Atacama desert ground-based 2010-13.
• QUIET-II (100x Planck) - Atacama desert ground-based 2012-15.
• CMBpol (1000x Planck) - L2 satellite 2020-ish?

Page 9:

CMB Data Analysis

• In principle very simple
  – Assume Gaussianity and maximize the likelihood
    1. of maps given the data and its noise statistics (analytic).
    2. of power spectra given maps and their noise statistics (iterative).
• In practice very complex
  – Foregrounds, asymmetric beams, non-Gaussian noise, etc.
  – Algorithm & implementation scaling with evolution of
    • Data volume
    • HPC architecture
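
Step 1 (maps given data) has a standard closed form. The notation below is assumed for illustration and is not taken from the slides: d is the time-ordered data, P the pointing matrix that scans the sky map m into the time domain, n the noise, and N its time-domain covariance. For Gaussian noise, maximizing the likelihood over m gives the generalized least-squares map:

```latex
\begin{align}
  d &= P m + n, \qquad \langle n\, n^{T} \rangle = N \\
  -2 \ln \mathcal{L}(m \mid d) &= (d - P m)^{T} N^{-1} (d - P m) + \text{const} \\
  \hat{m} &= \left( P^{T} N^{-1} P \right)^{-1} P^{T} N^{-1} d
\end{align}
```

The matrix P^T N^{-1} P has one row and one column per sky pixel, so forming and inverting it explicitly becomes impractical as the pixel count grows, which is one reason the large-data case is hard in practice.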

Page 10:

The CMB Data Challenge

• Extracting fainter signals (polarization mode, angular resolution) from the data requires:
  – larger data volumes to provide higher signal-to-noise.
  – more complex analyses to remove fainter systematic effects.
• 1000x data increase over next 15 years
  – need to continue to scale on the bleeding edge through the next 10 M-foldings!

Experiment | Date  | Time Samples | Sky Pixels | Gflop/Map
COBE       | 1989  | 10^9         | 10^3       | 1*
BOOMERanG  | 2000  | 10^9         | 10^5       | 10^3
WMAP       | 2001  | 10^10        | 10^6       | 10^4
Planck     | 2009  | 10^11        | 10^7       | 10^5
PolarBear  | 2012  | 10^12        | 10^6       | 10^6
QUIET-II   | 2015  | 10^13        | 10^6       | 10^7
CMBpol     | 2020+ | 10^14        | 10^8       | 10^8
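
The "1000x over 15 years" and "10 M-foldings" figures can be related with a little arithmetic. The sketch below assumes that an "M-folding" means one Moore's-law doubling of capability; that reading is an interpretation, not something the slide states, and the function names are illustrative.

```python
import math


def doublings(growth_factor):
    """Number of factor-of-two steps needed to reach growth_factor."""
    return math.log2(growth_factor)


def doubling_time_years(growth_factor, years):
    """Doubling period implied by growth_factor accumulated over `years`."""
    return years / doublings(growth_factor)


if __name__ == "__main__":
    print("doublings in 1000x:", doublings(1000))
    print("implied doubling time (years):", doubling_time_years(1000, 15))
```

Under that assumption, 1000x is just under 10 doublings, and fitting them into 15 years requires a doubling roughly every 18 months.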

Page 11:

CMB Data Analysis Evolution

Data volume & computational capability dictate analysis approach.

Date   | Data              | System            | Map                                                | Power Spectrum
2000   | B98               | Cray T3E x 700    | Explicit Maximum Likelihood (Matrix Invert - Np^3) | Explicit Maximum Likelihood (Matrix Cholesky + Tri-solve - Np^3)
2002   | B2K2              | IBM SP3 x 3,000   | Explicit Maximum Likelihood (Matrix Invert - Np^3) | Explicit Maximum Likelihood (Matrix Invert + Multiply - Np^3)
2003-7 | Planck subsets    | IBM SP3 x 6,000   | PCG Maximum Likelihood (FFT - Nt log Nt)           | Monte Carlo (Sim + Map - many Nt)
2007+  | Planck full, EBEX | Cray XT4 x 40,000 | PCG Maximum Likelihood (FFT - Nt log Nt)           | Monte Carlo (SimMap - many Nt)
2010+  | Towards CMBpol    | Addressing the challenges of 1000x data & next 10 generations of HPC systems starting with Hopper, Blue Waters, etc.

Page 12:

Scaling In Practice

2000: BOOMERanG-98 temperature map (10^8 samples, 10^5 pixels) calculated on 128 Cray T3E processors;

2005: A single-frequency Planck temperature map (10^10 samples, 10^8 pixels) calculated on 6000 IBM SP3 processors;

2008: EBEX temperature and polarization maps (10^11 samples, 10^6 pixels) calculated on 15360 Cray XT4 cores.

Page 13:

Aside: HPC System Evaluation

• Scientific applications provide realistic benchmarks
  – Exercise all components of a system both individually and collectively.
  – Performance evaluation can be fed back into application codes.
• MADbench2
  – Based on MADspec CMB power spectrum estimation code.
  – Full computational complexity (calculation, communication & I/O).
  – Scientific complexity removed
    • reduces lines of code by 90%.
    • runs on self-generated pseudo-data.
  – Used for NERSC-5 & -6 procurements.
  – First friendly-user Franklin system crash (90 minutes after access).

Page 14:

MADbench2 I/O Evaluation

IO performance comparison
• 6 HPC systems
• Read & write
• Unique & shared files

Asynchronous IO experiment
• N bytes asynchronous read/write
• N flops simultaneous work
• Measure time spent waiting on IO
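
The shape of the asynchronous IO experiment can be sketched on a single node. This is not MADbench2 code: it is a minimal illustration that assumes a background thread stands in for asynchronous IO, a pure-Python loop stands in for the floating-point work, and a temporary file stands in for the parallel file system. All names are hypothetical.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def busy_work(n_flops):
    """Stand-in for computation: roughly n_flops multiply-adds."""
    acc = 0.0
    for i in range(n_flops // 2):
        acc += i * 1.000001
    return acc


def overlap_experiment(n_bytes, n_flops):
    """Start a background write, compute, then time the wait for the write.

    Returns (seconds spent waiting on IO after the work finished,
    number of bytes that reached the file).
    """
    payload = b"\0" * n_bytes
    fd, path = tempfile.mkstemp()
    os.close(fd)

    def write_payload():
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the data out of the OS cache

    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(write_payload)  # IO starts here
            busy_work(n_flops)                   # overlapped computation
            t0 = time.perf_counter()
            future.result()                      # block until IO completes
            wait = time.perf_counter() - t0
        size = os.path.getsize(path)
    finally:
        os.remove(path)
    return wait, size
```

Sweeping `n_flops` at fixed `n_bytes` shows how much work is needed before the wait time stops shrinking, which is one way to judge how well a system hides IO behind computation.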

Page 15:

MADmap for Planck Map Making

A massively parallel, highly optimized, PCG solver for maximum likelihood map(s) given a time-stream of observations and their noise statistics

• 2005: First Planck-scale map
  – 75 billion observations mapped to 150 million pixels
  – First science code to use all 6,000 CPUs of Seaborg
• 2007: First full Planck map-set (FFP)
  – 750 billion observations mapped to 150 million pixels
  – Using 16,000 cores of Franklin
  – IO doesn’t scale
    • write-dominated simulations
    • read-dominated mappings
• May 14th 2009: Planck launches!
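
To make the idea of a PCG map-maker concrete, here is a toy serial sketch. It is not MADmap: it assumes one pixel per sample, an invented three-point stencil as the inverse noise weighting (a production code would apply a measured noise filter, typically with FFTs), a hit-count preconditioner, and that every pixel is observed at least once. All function names are hypothetical.

```python
def project(pix, sky):
    """P: scan a pixelized sky into a time stream."""
    return [sky[p] for p in pix]


def bin_map(pix, tod, n_pix):
    """P^T: sum time samples into the pixels they hit."""
    out = [0.0] * n_pix
    for p, x in zip(pix, tod):
        out[p] += x
    return out


def inv_noise(tod, w0=1.0, w1=0.25):
    """Toy N^-1: a symmetric, diagonally dominant three-point stencil."""
    n = len(tod)
    out = []
    for t in range(n):
        left = tod[t - 1] if t > 0 else 0.0
        right = tod[t + 1] if t < n - 1 else 0.0
        out.append(w0 * tod[t] - w1 * (left + right))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg_map(pix, tod, n_pix, tol=1e-10, max_iter=100):
    """Solve (P^T N^-1 P) m = P^T N^-1 d by preconditioned conjugate gradient."""

    def apply_a(m):
        return bin_map(pix, inv_noise(project(pix, m)), n_pix)

    hits = bin_map(pix, [1.0] * len(pix), n_pix)  # must be > 0 everywhere

    def precond(r):
        return [ri / h for ri, h in zip(r, hits)]

    b = bin_map(pix, inv_noise(tod), n_pix)
    m = [0.0] * n_pix
    r = b[:]                 # residual for the zero starting map
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b)
    for _ in range(max_iter):
        if dot(r, r) <= tol * tol * b_norm:
            break            # relative residual small enough
        ap = apply_a(p)
        alpha = rz / dot(p, ap)
        m = [mi + alpha * pi for mi, pi in zip(m, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return m
```

The solver never forms the pixel-pixel matrix; it only needs to apply P, P^T and N^-1 to vectors, which is what lets this approach scale with the number of time samples rather than the cube of the pixel count.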

Page 16:

Planck First Light Survey

Page 17:

Planck Sim/Map Target

• By the end of the Planck mission in 2013, we need to be able to simulate and map
  – O(10^4) realizations of the entire mission
    • 74 detectors x 2.5 years ~ O(10^16) samples
  – On O(10^5) cores
  – In O(10) wall-clock hours

WAIT ~ 1 day : COST ~ 10^6 CPU-hrs
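
The orders of magnitude on this slide can be checked with back-of-envelope arithmetic. The sketch below assumes a single representative sampling rate of 100 Hz for every detector, which is an illustrative round number and not a figure from the talk; the constant and function names are likewise hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
ASSUMED_SAMPLE_RATE_HZ = 100.0  # assumption: one round rate for all detectors


def mission_samples(n_detectors, years, sample_rate_hz):
    """Time samples in one realization of the mission."""
    return n_detectors * years * SECONDS_PER_YEAR * sample_rate_hz


def core_hours(n_cores, wall_hours):
    """Cost of a run that holds n_cores for wall_hours."""
    return n_cores * wall_hours


per_realization = mission_samples(74, 2.5, ASSUMED_SAMPLE_RATE_HZ)
total_samples = per_realization * 1e4   # 10^4 realizations
cost = core_hours(1e5, 10)              # 10^5 cores for 10 hours

if __name__ == "__main__":
    print("samples per realization:", per_realization)
    print("samples over all realizations:", total_samples)
    print("core-hours:", cost)
```

Under that assumption one realization is a few times 10^11 samples, so 10^4 realizations land within a factor of two of 10^16, and 10^5 cores for 10 hours is the quoted 10^6 CPU-hours.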

Page 18:

[Figure: progress toward the TARGET (10^4 maps, 9 freqs, 2.5 years, 10^5 cores, 10 hours). Labels on the plot: 12x217, FFP1, M3/GCP, CTP3, OTFS, Peta-Scaling.]

Page 19:

On-The-Fly Simulation

• Remove redundant & non-scaling IO from traditional simulate/write then read/map cycle.
• M3-enabled map-maker’s read TOD request translated into runtime (on-the-fly) simulation.
• Trades cycles & memory for disk & IO.
• Currently supports
  – Piecewise stationary noise with arbitrary spectra
    • Truly independent random numbers
  – Symmetric & asymmetric beam-smoothed sky
• Can be combined with explicit TOD (e.g. systematic).
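
The core trick of on-the-fly simulation, answering a read request by regenerating the samples instead of fetching them from disk, can be sketched as follows. This is not the M3 interface: the function names, the fixed chunk length, and the string-keyed seeding are all assumptions made for illustration, and the noise here is white, whereas the slide describes arbitrary spectra.

```python
import random

CHUNK = 1024  # hypothetical length of one stationary interval, in samples


def simulate_chunk(realization, detector, chunk_index, sigma=1.0):
    """Regenerate one interval of white noise from a deterministic seed."""
    rng = random.Random(f"{realization}:{detector}:{chunk_index}")
    return [rng.gauss(0.0, sigma) for _ in range(CHUNK)]


def read_tod(realization, detector, start, n_samples):
    """Serve samples [start, start + n_samples) as if read from a file."""
    out = []
    first = start // CHUNK
    last = (start + n_samples - 1) // CHUNK
    for c in range(first, last + 1):
        chunk = simulate_chunk(realization, detector, c)
        lo = max(start, c * CHUNK) - c * CHUNK
        hi = min(start + n_samples, (c + 1) * CHUNK) - c * CHUNK
        out.extend(chunk[lo:hi])
    return out
```

Because each interval is a pure function of its key, any process can request any range at any time and get identical samples with no files and no coordination. Seeding from a string key gives distinct, reproducible streams but is only a convenience; it does not by itself guarantee the statistical independence that a production simulation needs.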

Page 20:

Current Planck State-Of-The-Art : CTP3

• 1000 each Planck 1-year 1-frequency noise & signal.
• Noise sim/map runs
  – O(10^14) samples, 2TB disk (maps)
  – 2 hours on 20,000 cores

Page 21:

MADmap Scaling Profile

Page 22:

Next Generation HPC Systems

• Large, correlated, CMB data sets exercise all components of HPC systems:
  – Data volume => disk space, IO, floating point operations.
  – Data correlation => memory, communication.
• Different components scale differently over time and across HPC systems
  – Develop trade-offs and tune to system/concurrency.
• IO bottleneck has been ubiquitous
  – now largely solved by replacement with (re-)calculation.
• For all-sky surveys, communication bottleneck is the current challenge
  – Even in a perfect world, all-to-all communication won’t scale
• Challenges & opportunities of next-generation systems:
  – Multi- & many-core (memory-per-core fork)
  – GPUs/accelerators (heterogeneous systems)

Page 23:

Ongoing Research

• Address emerging bottlenecks at very high concurrency in the age of many-core & accelerators:
  – Hopper
  – Blue Waters
  – NERSC-7
• Communication: replace inter-core with inter-node communication.
• Calculation: re-write/tool/compile for GPUs et al.
• Auto-tuning: system- & analysis-specific run-time configuration.

Page 24:

Conclusions

• Roughly 95% of the Universe is known to be unknown
  – the CMB provides a unique window onto the early years.
• The CMB data sets we gather and the HPC systems we analyze them on are both evolving.
• CMB data analysis is a long-term computationally-challenging problem requiring state-of-the-art HPC capabilities.
• The quality of the science derived from present and future CMB data sets will be determined by the limits on
  – our computational capability
  – our ability to exploit it
• CMB analysis codes can be powerful full-system evaluation tools.

We’re always very interested in hiring good computational scientists to do very interesting computational science!