Petascale astronomy and the SKA
Athol Kemball
Department of Astronomy & Center for Extreme-scale Computing (IACAT/NCSA)
University of Illinois, USA
SKA SA 2008
Contemporary scientific methods
• Theory:
– Develop abstract or mathematical models of the physical system or problem.
• Experimental and observational methods:
– Take observational or experimental data to disprove or refine models
• Computational methods:
– Simulate complex multi-scale systems that are beyond the reach of analytic methods
– Process vast amounts of observed or experimental data
Euclid, mathematician of the 3rd century BC, teaching (Raphael)
Very Large Array (VLA); New Mexico, USA
Molecular dynamics simulation: water permeation in aquaporins (Schulten Group, UIUC)
Computational Cosmology: Structure Formation
Nonlinear evolution of the Universe: from 20 million to 14 billion years old.
The cosmological simulation computes the nonlinear evolution of the universe in the context of the standard cosmological model determined by the Wilkinson Microwave Anisotropy Probe (WMAP) experiment. (Cen & Ostriker 2006; Advanced Visualization Laboratory, NCSA)
Computational Science: Ensuring … (President’s Information Technology Advisory Committee)
“Together with theory and experiment, computational science now constitutes the ‘third pillar’ of scientific inquiry, enabling researchers to build and test models of complex phenomena – such as multi-century climate shifts, multidimensional flight stress on aircraft, and stellar explosions – that cannot be replicated in the laboratory, and to manage huge volumes of data rapidly and economically.”
“While it is itself a discipline, computational science serves to advance all of science. The most scientifically important and economically promising research frontiers in the 21st century will be conquered by those most skilled with advanced computing technologies and computational science applications.”
Computational Science and Engineering
• Molecular Science
• Weather & Climate Forecasting
• Earth Science
• Astronomy
• Health
Open Challenges in Modern Astrophysics
“What are the basic properties of the fundamental particles and forces?”
Neutrinos, Magnetic Fields, Gravity, Gravitational Waves, Dark Energy
“What constitutes the missing mass of the Universe?”
Cold Dark Matter (e.g. via lensing), Dark Energy, Hot Dark Matter (neutrinos)
“What is the origin of the Universe and the observed structure and how did it evolve?”
Atomic hydrogen, epoch of reionization, magnetic fields, star-formation history……
“How do planetary systems form and evolve?”
Movies of Planet Formation, Astrobiology, Radio flares from exo-planets……
“Has life existed elsewhere in the Universe, and does it exist elsewhere now?”
SETI
Fundamental questions in physics and astronomy
How does SKA answer these questions?
• Detect and image neutral hydrogen in the very early phases of the universe, when the first stars and galaxies appeared – the “epoch of re-ionization”
• Locate 1 billion galaxies via their neutral hydrogen signature and measure their distribution in space – “dark energy”
How does SKA answer these questions? (continued)
• Time pulsars to test description of gravity in the strong field case (pulsar-Black Hole binaries), and to detect gravitational waves; explore the unknown transient universe
• Origin and evolution of cosmic magnetic fields – “the magnetic universe”
• Planet formation – image Earth-sized gaps in proto-planetary disks
The Large Synoptic Survey Telescope (2014)
(LSST; 8.4m; 3.2 Gpixel camera)
LSST science goals
• Cosmology: probing dark energy and dark matter
• Exploring the transient sky
• Mapping the Milky Way
• Inventory of Solar System objects
(Cerro Pachón (Ivezic et al. 2008))
(LSST deep lensing survey (Ivezic et al. 2008))
The Great Survey Era
SKA-era telescopes & science require:
• Surveys over large cosmic volumes (Ω, z), fine synoptic time-sampling Δt, and/or high completeness
• High receptor count and data acquisition rates
• Software/hardware boundary far closer to receptors than at present
• Efficient, high-throughput survey operations modes
Processing implications
• High sensitivity, A_e/T_sys ~ 10^4 m^2 K^-1, wide-field imaging
• Demanding (t, ω, P) non-imaging analysis
• Large O(10^9) survey catalogs
High associated data rates (TBps), compute processing rates (PF), and PB/EB archives
(HI galaxy surveys, e.g. ALFALFA HI (Giovanelli et al. 2007); SKA requires a billion-galaxy survey.)
(SKA schematic: tiled aperture arrays plus parabolic dishes)
Petascale Computing Challenges in the Great Survey Era
LSST computing and data storage scale
Reference science requires:
• Telescope data output of 15 TB per night
• Archive size ~ O(10^2) PB
• Processing ~ O(1) PF
(LSST data flow (Ivezic et al. 2008))
(LSST focal plane: each square 4k x 4k pixels (Ivezic et al. 2008))
SKA wide-field image formation
Algorithm technologies
• 3-D transform (Perley 1999), facet-based tessellation / polyhedral imaging (Cornwell & Perley 1992), and w-projection (Cornwell et al. 2003).
(Cornwell et al. 2003; facet-based vs w-projection algorithms)
SKA computing and data scale
• LNSD data rates (Perley & Cornwell 2003):
[Equations: visibility data rate V_vis in TBps, proportional to N_chan N_ant (N_ant - 1) / Δt, and its approximate scaling with N, B, and D]
where D = dish diameter, B = max. baseline, Δν = bandwidth, and ν = frequency
• Wide-field imaging cost ~ O(D^-4 to D^-8) (Perley & Clark 2003; Cornwell 2004; Lonsdale et al 2004).
• Full-field continuum imaging cost (derived from Cornwell 2004):
[Equation: computing cost C in PF as a function of N_ant, B, and D]
• Strong dependence on 1/D and B. Data rates of Tbps and computational costs in PF are readily obtained from underlying geometric terms.
• Spectral line imaging costs exceed continuum imaging costs.
• Possible mitigation through FOV tailoring (Lonsdale et al 2004), beam-forming, and antenna aggregation approaches (Wright et al.)
– 550 GBps/n_a^2 (Lonsdale et al 2004)
• Runaway petascale costs for SKA tightly coupled to design choices
(Figure: data volume in TB per hour (axis 0 to 2000) for D = 12.5 m, B = 5 km; D = 12.5 m, B = 35 km; D = 6 m, B = 5 km)
(Figure: peak processing rate in PF (axis 0 to 10) for the same three configurations)
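The quadratic growth of the visibility data rate with antenna count can be illustrated with a short sketch. It assumes only that every antenna pair produces one visibility per channel, per polarization product, per dump; the values of `bytes_per_vis`, `n_pol`, and the example configuration are hypothetical and are not taken from the cited memos.

```python
def n_baselines(n_ant: int) -> int:
    """Number of distinct antenna pairs (cross-correlations only)."""
    return n_ant * (n_ant - 1) // 2


def visibility_rate_bytes_per_s(n_ant, n_chan, dump_time_s,
                                bytes_per_vis=8, n_pol=4):
    """Raw visibility output rate in bytes per second.

    bytes_per_vis and n_pol are illustrative assumptions,
    not SKA specifications.
    """
    return n_baselines(n_ant) * n_chan * n_pol * bytes_per_vis / dump_time_s


if __name__ == "__main__":
    # Hypothetical configuration, chosen only to show the scaling.
    for n_ant in (750, 1500, 3000):
        rate = visibility_rate_bytes_per_s(n_ant, n_chan=10_000, dump_time_s=1.0)
        print(f"N_ant={n_ant}: {n_baselines(n_ant):,} baselines, "
              f"{rate / 1e12:.2f} TB/s")
```

Doubling the number of antennas roughly quadruples the rate, which is why the choice of dish diameter (and hence antenna count for a fixed collecting area) dominates the data-rate budget.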
Directions in Computing Technology
The declining cost of high-performance computing hardware
• Computing hardware system costs vary over key primary axes:
– Time evolution (Moore’s Law)
– Level of commoditization
(Figure: cost in $1000 per TF (axis 0 to 300) for GPU 500 GF, CPU 100-1000 GF, and CPU 100-1000 TF systems. Commoditization effects in computing hardware cost models for general-purpose CPU and GPU accelerators at a fixed epoch (2007). Estimated from public data.)
(Figure: Moore’s Law for general-purpose Intel CPUs.)
(Figure: Trend-line for Top 500 leading-edge performance.)
Computing hardware performance and cost models
• Predicted leading-edge LINPACK R_max performance from Top 500 trend-line (from data t_yr = [1993, 2007]):
$$\frac{R_{max}}{\mathrm{TF}} = 0.05555\, e^{0.6217\,(t_{yr} - 1993)}$$
• Cost per unit teraflop c_TF(t), for a commoditization factor η, Moore’s Law doubling time Δt, and construction lead time Δc:
$$c_{TF}(t) = \eta\, c_{TF}(t_0)\, e^{-\ln(2)\,(t - t_0 - \Delta c)/\Delta t}$$
[with c_TF(t_0) = $300k/TF, t_0 = 2007, η = [0.3-1.0], Δt ~ 1.5 yr, Δc ~ 1-4 yr]
(Figure: predicted R_max in PF (axis 0 to 100) for 2011 to 2016)
(Figure: approximate projected costs in $M (axis 0 to 400) for 1 PF (2012), 10 PF (2012), 7.5 PF (2016), and 90.1 PF (2016))
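The two models on this slide are easy to evaluate numerically. The sketch below uses the trend-line coefficients quoted on the slide (0.05555 TF at 1993, growth rate 0.6217 per year) and the quoted reference cost of $300k/TF in 2007. The way the construction lead time enters (hardware assumed to be purchased `lead_yr` before the target year) is an assumption made for illustration, as are the function and parameter names.

```python
import math


def rmax_tf(year: float) -> float:
    """Top 500 trend-line for leading-edge LINPACK Rmax, in TF."""
    return 0.05555 * math.exp(0.6217 * (year - 1993))


def cost_per_tf(year, eta=1.0, c0=300e3, t0=2007.0,
                doubling_yr=1.5, lead_yr=0.0):
    """Cost per TF in dollars.

    eta         : commoditization factor (slide quotes 0.3 to 1.0)
    doubling_yr : Moore's Law doubling time (slide quotes ~1.5 yr)
    lead_yr     : construction lead time; hardware is ASSUMED to be
                  bought lead_yr before `year` (illustrative choice)
    """
    purchase_year = year - lead_yr
    return eta * c0 * math.exp(-math.log(2) * (purchase_year - t0) / doubling_yr)


if __name__ == "__main__":
    for year in (2012, 2016):
        pf = rmax_tf(year) / 1e3
        cost_musd = rmax_tf(year) * cost_per_tf(year) / 1e6
        print(f"{year}: Rmax ~ {pf:.1f} PF, ~${cost_musd:.0f}M "
              f"(eta=1, no lead time)")
```

With these coefficients the trend-line gives about 7.5 PF for 2012 and about 90.1 PF for 2016, the two non-round system sizes that appear on the projected-cost chart.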
Directions in Computing Technology
Increasing Clock Frequency & Performance
(Figure: clock frequency in MHz (log axis, 10 to 10,000) versus year, 1990 to 2010; Intel Pentium)
“In the past, performance scaling in conventional single-core processors has been accomplished largely through increases in clock frequency (accounting for roughly 80 percent of the performance gains to date).”
Platform 2015, S. Y. Borkar et al., 2006, Intel Corporation
Directions in Computing Technology
Problem with Uni-core Microprocessors
(Figure: power density in Watts/cm² (log axis, 1 to 1000) versus feature size (1.5, 1.0, 0.7, 0.5, 0.35, 0.25, 0.18, 0.13, 0.1, 0.07), showing decreasing feature size and increasing chip frequency. Processors plotted: i386, i486, Pentium, Pentium Pro, Pentium II, Pentium III, Pentium 4 (Willamette), Pentium 4 (Prescott). Reference levels: hot plate, nuclear reactor, rocket nozzle.)
Directions in Computing Technology
From Uni-core to Multi-core Processors
• AMD: uni-, dual-, and quad-core processors
• Intel: multi-core performance
• Intel Teraflops Chip
Directions in Computing Technology
Switch to Multicore Chips
(Figure: clock frequency in MHz (log axis, 10 to 10,000) versus year, 1990 to 2010, with dual-core and quad-core processors marked)
“For the next several years the only way to obtain significant increases in performance will be through increasing use of parallelism:
– 4× now
– 8× in 2009
– 16× in 2011
– …”
Trends at extreme scale
Inconvenient truths
• Moore’s Law holds, but high-performance architectures are evolving rapidly:
– Breakpoint in clock speed evolution (2004)
– Lateral expansion to multi-core processors and processor augmentation with accelerators
• Theoretical performance ≠ actual performance
• Sustained petascale calibration and imaging performance for SKA requires:
– Demonstrated mapping of SKA calibration and imaging algorithms to modern HPC architectures, and proof of feasible scalability to petascale [O(10^5) processor cores].
– Remains a considerable design unknown in both feasibility and cost.
(Figure: number of processors (axis 0 to 100,000) versus peak performance (10 TF, 100 TF, 1 PF))
(Golap, Kemball et al. 2001, Coma cluster, VLA 74 MHz, parallelized facet-based wide-field imaging)
Scalability
Resource | Fastest current NCSA system (abe.ncsa.uiuc.edu*) | Generic petascale system
Peak performance | 0.090 PF | 10-20 PF
Number of processors | 9,600 | 300,000-750,000
Amount of memory | 0.0096 PB | 0.5-1.0 PB
Disk storage | 0.10 PB | 25-50 PB
Archival storage | 0.005 EB | 0.5-1 EB
(Dunning 2007)
*Abe: Dell 1955 blade cluster
– 2.33 GHz Intel Clovertown Quad-Core
• 1,200 blades/9,600 cores
• 89.5 TF; 9.6 TB RAM; 170 TB disk
– Power/Cooling
• 500 KW / 140 tons
US NSF vision for open petascale computing
Challenges and Solutions in Petascale Computing
Petascale Computing Facility
www.ncsa.uiuc.edu/BlueWaters
• Modern Data Center
– 90,000+ ft² total
– 20,000 ft² machine room
• Energy Efficiency
– LEED certified (goal: silver)
– Efficient cooling system
Partners: EYP MCF/Gensler, IBM, Yahoo!
Innovative Computing Technologies
On to Many-core Chips
• Intel Teraflops Chip (80 cores)
• NVIDIA GeForce 8800 GTX (128 cores)
• IBM Cell (1+8 cores)
Innovative Computing Technologies
New Technologies for Petascale Computing
(Figure: Gflops (axis 0 to 400) versus year, 2002 to 2007, for NVIDIA GPUs (1.35 GHz G80, 1.50 GHz G80) and Intel CPUs (3.4 GHz dual-core, 2.66 GHz quad-core). Courtesy of John Owens (UC Davis) & Ian Buck (NVIDIA))
Innovative Computing Technologies
NVIDIA: GeForce 8800 GTX GPU
(Block diagram: Host; Input Assembler; Thread Execution Manager; arrays of Texture units and Parallel Data Caches; Load/store units; Global Memory)
128 cores, 346 GFLOPS (SP), 768 MB DRAM, 86.4 GB/s memory bandwidth; CUDA*
* Compute Unified Device Architecture
Innovative Computing Technologies
NVIDIA: Selected Benchmarks
Application | Description | Kernel speedup (×) | Application speedup (×)
H.264 | SPEC ’06 version, change in guess vector | 20.2 | 1.5
LBM | SPEC ’06 version, change to single precision and print fewer reports | 12.5 | 12.3
FEM | Finite element modeling, simulation of 3D graded materials | 11.0 | 10.1
RPES | Rys polynomial equation solver, 2-electron repulsion integrals | 210.0 | 79.4
PNS | Petri net simulation of a distributed system | 24.0 | 23.7
LINPACK | Single-precision implementation of saxpy, used in Gaussian elimination routine | 19.4 | 11.8
TRACF | Two Point Angular Correlation Function | 60.2 | 21.6
FDTD | Finite-difference time domain analysis of 2D electromagnetic wave propagation | 10.5 | 1.2
MRI-Q | Computing a matrix Q, a scanner’s configuration in MRI reconstruction | 457.0 | 431.0
* W-m. Hwu et al., 2007
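The gap between kernel and whole-application speedup in the benchmark table can be read through Amdahl's law. The sketch below is an interpretation added for illustration, not an analysis from Hwu et al.: it assumes that only the kernel is accelerated and that everything else (including any host-device transfer) takes the same time as before. Under that assumption the fraction of the original run time spent in the kernel follows directly from the two speedup columns.

```python
def kernel_time_fraction(kernel_speedup: float, app_speedup: float) -> float:
    """Fraction f of the original run time spent in the kernel.

    Inverts Amdahl's law, app_speedup = 1 / ((1 - f) + f / kernel_speedup),
    under the assumption that only the kernel gets faster.
    """
    return (1.0 - 1.0 / app_speedup) / (1.0 - 1.0 / kernel_speedup)


if __name__ == "__main__":
    # (kernel speedup, application speedup) pairs from the table above
    benchmarks = {
        "H.264": (20.2, 1.5),
        "LBM": (12.5, 12.3),
        "FDTD": (10.5, 1.2),
        "MRI-Q": (457.0, 431.0),
    }
    for name, (k, a) in benchmarks.items():
        f = kernel_time_fraction(k, a)
        print(f"{name}: kernel ~{100 * f:.1f}% of original run time")
```

Read this way, applications such as H.264 and FDTD gain little because the accelerated kernel is a minority of their run time, which is one concrete sense in which theoretical performance differs from actual performance.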
Computational and Algorithmic Challenges for the SKA
Feasibility: imaging dynamic range
Reference specifications (Schilizzi et al 2007)
• Targeted λ20cm continuum field: 10^7:1.
• Routine λ20cm continuum: 10^6:1.
• Driven by need to achieve thermal noise limit (nJy) over plausible field integrations.
• Spectral dynamic range: 10^5:1.
• Current typical state of practice near λ ~ 20 cm given below.
High-sensitivity deep fields
Reference | Field | Telescope | Band | Sensitivity
Richards 2000 | HDF | VLA | 1.4 GHz | 7.5 μJy
Norris et al 2005 | HDF-S | ATCA | 1.4 GHz | 10 μJy
Middelberg et al 2008 | ELAIS I | ATCA | 1.4 GHz | < 30 μJy
Miller et al 2008 | E-CDF-S | {E}VLA | 1.4 GHz | 6.4 μJy
Dynamic range
Reference | Field | Telescope | Band | Dynamic range
Noordam et al 1982 | 3C84 | WSRT | 1.4 GHz | 10,000:1
Geller et al 2000 | 1935-692 | ATCA | 1.4 GHz | 77,000:1
de Bruyn & Brentjens 2005 | Perseus | WSRT | 92 cm | 400,000:1
de Bruyn et al, 2007 | 3C147 | WSRT | 1.4 GHz | 1,000,000:1
(de Bruyn and Brentjens, 2005)
Feasibility: imaging dynamic range
Basic imaging and calibration equation for radio interferometry (e.g. Hamaker, Bregman, & Sault et al.):
(Equation terms: visibility on baseline m-n; visibility-plane calibration effect; image-plane calibration effect; source brightness (I,Q,U,V); direction on sky ρ)
Key challenges
• Robust, high-fidelity image-plane (ρ) calibration:
– Non-isoplanatism.
– Antenna pointing errors.
– Polarized beam response in (t,ω), …
• Non-linearities, non-closing errors
• Deconvolution and sky model limits
• Dynamic range budget will be set by system design elements.
(Bhatnagar et al. 2004; antenna pointing self-cal: 12 µJy => 1 µJy rms)
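For orientation, a generic form of such an imaging and calibration equation, consistent with the term labels on this slide, can be sketched as follows. The symbols $G_{mn}$, $B_{mn}$, $S$, and $\mathbf{b}_{mn}$ are assumptions chosen for illustration and are not necessarily the notation of the original slide; the simple Fourier kernel also assumes a small field of view.

```latex
V_{mn} \;=\; G_{mn} \int B_{mn}(\rho)\, S(\rho)\,
        e^{-2\pi i\, \mathbf{b}_{mn} \cdot \rho}\; d\rho
```

Here $V_{mn}$ is the visibility on baseline m-n, $G_{mn}$ the visibility-plane calibration effect, $B_{mn}(\rho)$ the image-plane (direction-dependent) calibration effect, $S(\rho)$ the source brightness in (I,Q,U,V), $\rho$ the direction on the sky, and $\mathbf{b}_{mn}$ the baseline vector in wavelengths. Because $B_{mn}(\rho)$ sits inside the integral, it cannot be divided out of the data the way $G_{mn}$ can, which is why the image-plane terms dominate the calibration challenge.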
Feasibility: imaging dynamic range (continued)
Calibration challenges
• Number of free parameters in image-plane terms far greater than visibility-plane terms:
– Requires large-parameter solvers for multiple calibration terms
– Stability, robustness, and convergence remain an open research topic.
• Large-N arrays will almost certainly be operated with reference Global Sky Models (GSM)
– As well-calibrated as possible in routine observing.
– A new paradigm, however …
– Pathfinders will inject reality here
Feasibility: dynamic range assessment
SKA dynamic range assessment – beyond the central pixel
• Current achieved dynamic ranges degrade significantly with radial projected distance from field center, for reasons understood qualitatively (e.g. direction-dependent gains, sidelobe confusion etc.)
• An SKA design with routine uniform, ultra-high dynamic range requires a quantitative dynamic range budget.
• Strategies:
– Real data from similar pathfinders (e.g. MeerKAT) are key.
– Simulations are useful if relative dynamic range contributions or absolute fidelity are being assessed with simple models.
– New statistical methods:
• Assume a convergent, regularized imaging estimator for the brightness distribution within the imaging equation; the sampling distribution of the imaging estimator per pixel is needed, but its PDF is unknown a priori.
• Statistical resampling (Kemball & Martinsek 2005ff) and Bayesian methods (Sutton & Wandelt 2005) offer new approaches.
Direction-dependent variance estimation methods
M1: Np=1; Δt = 60 s
M2: Np=1; Δt = 150 s
M3: Np=1; Δt = 300 s
M4: Np=2; Δt = 900 s
S1: delete frac. 12.5%
S2: delete frac. 25%
S3: delete frac. 50%
S4: delete frac. 75%
(Figure panels: MC, M1, M2, M3, M4, S1, S2, S3, S4. Truth from MC simulation; other estimates from statistical methods. Kemball et al. (2008), AJ)
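The idea behind the "delete fraction" estimates can be shown on a toy problem. The sketch below is a generic delete-d jackknife with randomly drawn subsets, applied to a scalar estimator on independent samples; it is an illustration under those assumptions, not the imaging-specific procedure of Kemball et al., and the function and parameter names are hypothetical.

```python
import random
import statistics


def delete_fraction_variance(data, estimator, delete_frac,
                             n_resamples=2000, seed=0):
    """Delete-d jackknife variance estimate using random subsets.

    Each resample drops d = delete_frac * n points at random and
    re-evaluates the estimator on the remaining n - d points. The
    spread of those estimates is rescaled by (n - d) / d.
    """
    rng = random.Random(seed)
    n = len(data)
    d = int(round(delete_frac * n))
    if not 0 < d < n:
        raise ValueError("delete_frac must remove at least one point but not all")
    estimates = [estimator(rng.sample(data, n - d)) for _ in range(n_resamples)]
    centre = statistics.fmean(estimates)
    spread = sum((e - centre) ** 2 for e in estimates) / n_resamples
    return (n - d) / d * spread


if __name__ == "__main__":
    rng = random.Random(1)
    samples = [rng.gauss(0.0, 1.0) for _ in range(400)]  # toy data
    reference = statistics.variance(samples) / len(samples)
    for frac in (0.125, 0.25, 0.5, 0.75):
        v = delete_fraction_variance(samples, statistics.fmean, frac)
        print(f"delete frac. {frac:.3f}: variance estimate / reference = "
              f"{v / reference:.2f}")
```

For the sample mean the estimate should agree with the textbook value s²/n for any delete fraction; for a nonlinear imaging estimator on correlated visibility data no such closed form exists, which is where resampling becomes useful.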
Software cost models
• Computer operations costs: ~ 10% of system construction costs p.a.
• Software development costs (Boehm et al. 1981):
$$\frac{\mathrm{COST}}{\text{FTE month}} = 2.4\,\beta \left(\frac{\text{Lines of code}}{1000}\right)^{1.05}$$
where β ~ ratio of academic to commercial software construction costs.
• LSST computing costs approximately one quarter of project; order of magnitude smaller data rates than SKA (~ tens of TB per night); total construction costs perhaps a third of SKA.
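The software cost relation is simple enough to evaluate directly. The sketch below assumes the basic form of 2.4 × (thousands of lines of code)^1.05 FTE-months, using the coefficients quoted on the slide, with β applied as a multiplicative factor; the placement of β and the example code sizes are assumptions for illustration.

```python
def software_effort_fte_months(lines_of_code: float, beta: float = 1.0) -> float:
    """Development effort in FTE-months.

    Uses effort = beta * 2.4 * (KLOC ** 1.05). beta (ratio of academic
    to commercial software construction costs) is ASSUMED to enter as a
    simple multiplier.
    """
    kloc = lines_of_code / 1000.0
    return beta * 2.4 * kloc ** 1.05


if __name__ == "__main__":
    # Hypothetical code sizes, chosen only to show the scaling.
    for loc in (100_000, 1_000_000):
        months = software_effort_fte_months(loc)
        print(f"{loc:,} lines: ~{months:,.0f} FTE-months "
              f"(~{months / 12:,.0f} FTE-years) at beta = 1")
```

Because the exponent is slightly above one, effort grows a little faster than code size: a tenfold larger code base costs somewhat more than ten times the effort.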
Solutions for the Petascale Era
Approaching the SKA petascale challenges
• Form interdisciplinary institutes and teams:
– Computer scientists, computer engineers, and applications scientists
– Invest in people not hardware
• Develop international projects and collaborations
• Focus on the (multi-wavelength) science goals
• Revisit current imaging algorithms for extreme scalability
• Learn from other disciplines in the physical sciences preparing for the petascale era
• New sociology needed concerning observing and data practices
Great Lakes Consortium for Petascale Computation
Goal: Facilitate the widespread and effective use of petascale computing to address frontier research questions in science, technology and engineering at research, educational and industrial organizations across the region and nation.
Charter Members
Argonne National Laboratory
Fermi National Accelerator Laboratory
Illinois Mathematics and Science Academy
Illinois Wesleyan University
Indiana University*
Iowa State University
Krell Institute, Inc.
Louisiana State University
Michigan State University*
Northwestern University*
Parkland Community College
Pennsylvania State University*
Purdue University*
The Ohio State University*
Shiloh Community Unit School District #1
Shodor Education Foundation, Inc.
SURA – 60 plus universities
University of Chicago*
University of Illinois at Chicago*
University of Illinois at Urbana-Champaign*
University of Iowa*
University of Michigan*
University of Minnesota*
University of North Carolina–Chapel Hill
University of Wisconsin–Madison*
Wayne City High School
* CIC universities
US SKA calibration & processing working group (TDP)
• Athol Kemball (Illinois) (Chair)
• Sanjay Bhatnagar (NRAO)
• Geoff Bower (UCB)
• Jim Cordes (Cornell; TDP PI)
• Shep Doeleman (Haystack/MIT)
• Joe Lazio (NRL)
• Colin Lonsdale (Haystack/MIT)
• Lynn Matthews (Haystack/MIT)
• Steve Myers (NRAO)
• Jeroen Stil (Calgary)
• Greg Taylor (UNM)
• David Whysong (UCB)
(Map of member institutions: Calgary, Cornell, NRL, UIUC, MIT, NRAO, UCB, UNM)