State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Upload: christine-mccormick

Post on 15-Jan-2016


TRANSCRIPT

Page 1: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

State of HCC 2012

Dr. David R. Swanson, Director, Holland Computing Center

Page 2: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Nature Communications, July 17, 2012

Nebraska Supercomputing Symposium 2012

Page 3: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC CPU Hour Usage 2012

Zeng (Quant Chem) 4.5M
Starace (AMO Phys) 2.7M
Rowe (Climate) 2.0M
NanoScience 6.4M
Comp Bio 3.0M
Comp Sci 1.7M
Physics 0.7M
Mech E 0.4M

Page 4: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

High Performance Computing

• http://t2.unl.edu/status/hcc-status
• Xiao Zeng, Chemistry, UNL (prior slide)
• DFT and Car-Parrinello MD
• HPC – tightly coupled codes
• Requires expensive low-latency local network (InfiniBand)
• Requires high-performance storage (Panasas, Lustre)
• Requires highly reliable hardware

Page 5: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Eureka! A Higgs! (or at least something currently indistinguishable)

• "I think we have it. We have discovered a particle that is consistent with a Higgs boson." – CERN Director-General Rolf Heuer


Page 6: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

US CMS Tier2 Computing


Page 7: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Compact Muon Solenoid (CMS)

5.5 mi

Large Hadron Collider


Page 8: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

CMS Grid Computing Model


Page 9: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Eureka! A Higgs! (or at least something currently indistinguishable)

• Ca. 50 PB of CMS data in entirety
• Over 1 PB currently at HCC’s “Tier2”, 3500 cores
• Collaboration at many scales
  – HCC and Physics Department
  – Over 2700 scientists worldwide
  – International Grid Computing Infrastructure
  – Data grid as well
  – UNL closely linked to KU, KSU physicists via a jointly hosted “Tier3”

Page 10: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Data Intensive HTC

• Huge database
• Requires expensive high-bandwidth wide area network (DWDM fiber)
• Requires high-capacity storage (HDFS, dCache)
• HTC – loosely coupled codes
• Requires hardware

Page 11: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Outline

• HCC Overview
• New User report
• HCC-Go
• Moving Forward (after break)
  – Next purchase
  – It’s the Data, stupid…
  – Other Issues

Page 12: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Outline

• New User report
• HCC-Go
• Moving Forward (next section)
  – Next purchase (motivation)
  – New Communities
  – PIVOT
  – It’s the Data, stupid…

Page 13: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HOLLAND COMPUTING CENTER OVERVIEW


Page 14: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC @ NU

• Holland Computing Center has a University-wide mission to
  – Facilitate and perform computational and data intensive research
  – Engage and train NU researchers, students, and other state communities
  – This includes you!
  – HCC would be delighted to collaborate

Page 15: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Computational Science – 3rd Pillar

Experiment
Theory
Computation/Data

Page 16: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Lincoln Resources

• 10 staff
• Red
• Sandhills
• 5,000 compute cores
• 3 PetaBytes storage in HDFS

Page 17: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Sandhills “Condominium Cluster”

• 44 nodes X 32-core, 128 GB, IB
• Lustre (175 TB)
• Priority Access
  – $HW + $50/month
  – 4 groups currently
• SLURM

Page 18: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Omaha Resources

• 3 Staff
• Firefly
• Tusker
• 10,000 compute cores
• 500 TB storage
• New offices soon: 158J PKI

Page 19: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Tusker

• 106*64 = 6784 cores
• 256 GB/node
• 2 nodes w/ 512 GB
• 360 TB Lustre
  – 100 TB more en route
• QDR IB
• 43 TFlop

Page 20: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Tusker

• ¼ footprint of Firefly
• ¼ the power
• 2X the TFLOPS
• 2X the storage
• Fully utilized
• Maui/Torque

Page 21: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

In between …

• HCC (UNL) to Internet2: 10 Gbps
• HCC (Schorr) to HCC (PKI): 20 Gbps
• Allows us to do some interesting things
  – “overflow” jobs to/from Red
  – DYNES project
  – Xrootd mechanism

Page 22: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC Staff

• HPC Applications Specialists
  – Dr. Adam Caprez
  – Dr. Ashu Guru
  – Dr. Jun Wang
  – Dr. Nicholas Palermo
• System Administrators
  – Dr. Carl Lundstedt
  – Garhan Attebury
  – Tom Harvill
  – John Thiltges
  – Josh Samuelson
  – Dr. Brad Hurst

Page 23: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC Staff

• Other Staff
  – Dr. Brian Bockelman
  – Joyce Young
• GRAs
  – Derek Weitzel
  – Chen He
  – Kartik Vedalaveni
  – Zhe Zhang
• Undergraduates
  – Carson Crawford
  – Kirk Miller
  – Avi Knecht
  – Phil Brown
  – Slav Ketsman
  – Nicholas Nachtigal
  – Charles Cihacek

Page 24: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC Campus Grid

• Holland Computing Center resources are combined into an HTC campus grid
  – 10,000 cores, 500 TB in Omaha
  – 5,000 cores, 3 PB in Lincoln
  – All tied together via a single submission protocol using the OSG software stack
  – Straightforward to expand to OSG sites across the country, as well as to EC2 (cloud)
  – HPC jobs get priority; HTC ensures high utilization

Page 25: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC Model for a Campus Grid

Me, my friends and everyone else

Grid

Campus

Local


Page 26: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC & Open Science Grid

• National, distributed computing partnership for data-intensive research
  – Opportunistic computing
  – Over 100,000 cores
  – Supports the LHC experiments, other science
  – Funded for 5 more years
  – Over 100 sites in the Americas
  – Ongoing support for 2.5 (+3) FTE at HCC

Page 27: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

It Works!


Page 28: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC Networking Monitoring


Page 29: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

OSG Resources


Page 30: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Working philosophy

• Use what we buy
  – These pieces of infrastructure are linked, but improve asynchronously
  – Depreciation is immediate
  – Leasing is still more expensive (for now)
  – Buying at fixed intervals mitigates risk, increases ROI
  – Space, Power and Cooling have a longer life span
• Share what we aren’t using
  – Share opportunistically – retain local ownership
  – Consume opportunistically – there is more to gain!
  – Collaborators, not just consumers
  – Greater good vs. squandered opportunity

Page 31: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Working philosophy

• A Data deluge is upon us
• Support is essential
  – If you only build it, they still may not come
  – Build incrementally and buy time for user training
  – Support can grow more gradually than hardware
• Links to national and regional infrastructure are critical
  – Open Source Community
  – GPN access to Internet2
  – Access to OSG, XSEDE resources
  – Collaborations with fellow OSG experts
  – LHC

Page 32: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC New Users

FY     UNL-City   UNL-East   UNO        UNMC      Outside NU system
2011   424 (74)   33 (10)    75 (19)    30 (17)   112 (26)
2012   519 (95)   50 (17)    105 (30)   35 (5)    130 (18)

Page 33: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

New User Communities

• Theatre, Fine Arts/Digital Media, Architecture
• Psychology, Finance
• UNMC
• Puerto Rico
• PIVOT collaborators

Page 34: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC NEW USER REPORT: HEATH ROEHR

Page 35: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC-GO: DR. ASHU GURU

Page 36: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

MOVING FORWARD


Page 37: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

NEW PURCHASE


Page 38: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

$2M for …

• More computing
  – Need ca. 100 TF to hit Top500 for Jun 2013
  – Likely use all of funds to hit that amount
• More storage
  – Near-line archive (9 PB)
  – HDFS
• Specialty hardware
  – GPGPU/Viz
  – MIC hardware

Page 39: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

More computing

• How much RAM/core?
• Currently almost always oversubscribed
• Large scale jobs almost impossible (> 2000 core)
• Safest investment – will use right away
• Firefly due to be retired soon – EOL

Page 40: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

More computing


Page 41: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

More Computing


Page 42: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

More storage

• Most rapidly growing demand
• Growing contention, can’t just queue up
• Largest unmet need (?)

Page 43: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Storage for $2M

• $2M HDFS cluster
  – 250 nodes
  – 4000 cores (Intel)
  – 9.0 PB (RAW)
  – 128 GB / node
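The cluster figures on this slide can be broken down per node with simple division. The sketch below is illustrative only: the constant and function names are made up for this example, decimal units (1 PB = 1000 TB) are assumed, and the usable-capacity line assumes HDFS's default replication factor of 3, which the slide does not state.

```python
# Figures quoted on the slide.
BUDGET_USD = 2_000_000
NODES = 250
CORES = 4000
RAW_PB = 9.0
RAM_GB_PER_NODE = 128

# Assumption, not from the slide: HDFS default replication factor.
ASSUMED_REPLICATION = 3


def per_node_summary():
    """Derive per-node and per-core figures from the slide's totals."""
    cores_per_node = CORES / NODES
    raw_tb_total = RAW_PB * 1000  # decimal TB, an assumption
    return {
        "cores_per_node": cores_per_node,
        "raw_tb_per_node": raw_tb_total / NODES,
        "usd_per_node": BUDGET_USD / NODES,
        "ram_gb_per_core": RAM_GB_PER_NODE / cores_per_node,
        "usd_per_raw_tb": BUDGET_USD / raw_tb_total,
        "usable_pb_at_replication_3": RAW_PB / ASSUMED_REPLICATION,
    }


if __name__ == "__main__":
    for name, value in per_node_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions each node carries 16 cores, 36 TB of raw disk, and 8 GB of RAM per core at about $8,000 per node. With three-way replication the 9 PB raw would hold roughly 3 PB of unique data.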


Page 44: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Other options

• GPGPUs most Green option for computing
• Highest upside for raw power (Top500)
• MIC even compatible with x86 codes
• SMP uniquely meets some needs, easiest to use/program
• Blue Gene, Tape silo, …

Page 45: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC personnel timeline

HCC Personnel Numbers

Year        1999   2002   2005   2009   2012
Personnel   2      3      5      9      13

Overall increase: 7X

Page 46: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC networking timeline

HCC WAN Bandwidth (Gb/sec)

Year      1999    2002    2005    2009   2012
WAN B/W   0.155   0.155   0.622   10     30

Overall increase: 200X

Page 47: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC cpu timeline

HCC CPU Cores

Year        1999   2002   2005   2009   2012
CPU Cores   16     256    656    6956   14492

Overall increase: 900X

Page 48: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC storage timeline

HCC Storage Capacity (RAW), TeraBytes

Year       1999    2002   2005   2009   2012
Capacity   0.108   1.2    31.2   1200   3250

Overall increase: 30,000X

Page 49: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Composite Timeline

• Data increase / CPU Cores = 33
• Data increase / WAN bandwidth = 150
• It takes a month to move 3 PB at 10 Gb/sec
• Power < 100X increase, largely constant last 3 years
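The ratios and the one-month estimate on this slide can be checked against the 1999 and 2012 values from the timeline slides. The sketch below is illustrative only: the function names are made up for this example, decimal petabytes are assumed, and the transfer estimate assumes the link runs at full line rate with no protocol overhead.

```python
SECONDS_PER_DAY = 86_400


def growth(first, last):
    """Growth factor between the first and last timeline values."""
    return last / first


def transfer_days(petabytes, gbit_per_sec):
    """Days needed to move a dataset at a sustained line rate."""
    bits = petabytes * 1e15 * 8  # decimal PB, an assumption
    seconds = bits / (gbit_per_sec * 1e9)
    return seconds / SECONDS_PER_DAY


# 1999 and 2012 values taken from the timeline slides.
storage_x = growth(0.108, 3250)  # TB raw
cores_x = growth(16, 14492)      # CPU cores
wan_x = growth(0.155, 30)        # Gb/sec

if __name__ == "__main__":
    print(f"storage growth / core growth: {storage_x / cores_x:.1f}")
    print(f"storage growth / WAN growth:  {storage_x / wan_x:.1f}")
    print(f"days to move 3 PB at 10 Gb/s: {transfer_days(3, 10):.1f}")
```

With the unrounded table values the two ratios come out near 33 and 155. The slide's 150 follows from the rounded factors (30,000X and 200X). Moving 3 PB at 10 Gb/sec takes just under 28 days, which matches the "month" quoted above.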


Page 50: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Storage at HCC

• Affordable, Reliable, High Performance, High Capacity
  – Pick 2
  – So multiple options
• /home
• /work
• /share
• Currently, no /archive

Page 51: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

/home

• Reliable
• Low performance
  – No writes from worker nodes
• ZFS
• Rsync’ed pair, one in Omaha, one in Lincoln
• Backed up incrementally, requires severe quotas

Page 52: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

/work

• High performance
• High(er) capacity
• Not permanent storage
• Lenient quotas
• More robust, more reliable “scratch space”
• Subject to purge as needed

Page 53: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

/share

• Purchased by given group
• Exported to both Lincoln and Omaha machines
• Usually for capacity, striped for some reliability

Page 54: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Storage Strategy

• Maintain /home for precious files
  – Could be global
• Maintain /work for runtime needs
  – Remain local to cluster
• Create /share for near-line archive
  – 3-5 year time frame (or less)
  – Use for accumulating intermediate data, then purge
  – Global access

Page 55: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Storage strategy

• Permanent archival has 3 options
  – 1) library
  – 2) Amazon Glacier
    • Currently $120/TB/year
  – 3) tape system
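To put the quoted Glacier rate in context, it can be applied to capacities mentioned elsewhere in the deck (3 PB currently in HDFS, 9 PB for the proposed near-line archive). The sketch below is illustrative only: the function names are made up for this example, decimal units are assumed, and retrieval fees, transfer costs, and future price changes are ignored.

```python
GLACIER_USD_PER_TB_YEAR = 120  # rate quoted on the slide


def annual_archive_cost(terabytes, usd_per_tb_year=GLACIER_USD_PER_TB_YEAR):
    """Yearly storage-only cost at a flat per-TB rate."""
    return terabytes * usd_per_tb_year


def years_to_match(budget_usd, terabytes):
    """Years of storage fees that add up to a one-time budget."""
    return budget_usd / annual_archive_cost(terabytes)


if __name__ == "__main__":
    for petabytes in (3, 9):
        cost = annual_archive_cost(petabytes * 1000)  # 1 PB = 1000 TB assumed
        print(f"{petabytes} PB: ${cost:,.0f} per year")
    print(f"9 PB reaches $2M after {years_to_match(2_000_000, 9000):.2f} years")
```

At this rate 3 PB costs $360,000 per year and 9 PB costs $1,080,000 per year, so storage fees for 9 PB would pass the $2M hardware budget in under two years.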


Page 56: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

HCC Data Visualizations

• Fish!
• HadoopViz
• OSG Google Earth
• Web-based monitoring
  – http://t2.unl.edu/status/hcc-status/
  – http://hcc.unl.edu/gratia/index.php

Page 57: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

Other discussion topics

• Maui vs. SLURM
• Queue length policy
• Education approaches
  – This (!)
  – Tutorials (next!)
  – Afternoon workshops
  – Semester courses
  – Individual presentations/meetings
  – Online materials

Page 58: State of HCC 2012 Dr. David R. Swanson Director, Holland Computing Center

©2007 The Board of Regents of the University of Nebraska

NU Administration (UNL, NRI)

NSF, DOE, EPSCoR, OSG

Holland Foundation

CMS: Ken Bloom, Aaron Dominguez

HCC: Drs. Brian Bockelman, Adam Caprez, Ashu Guru, Brad Hurst, Carl Lundstedt, Nick Palermo, Jun Wang.

Garhan Attebury, Tom Harvill, Josh Samuelson, John Thiltges

Chen He, Derek Weitzel