Pittsburgh Supercomputing Center RP Update July 16, 2009

Page 1: Pittsburgh Supercomputing Center RP Update July 16, 2009

© 2008 Pittsburgh Supercomputing Center

Pittsburgh Supercomputing Center RP Update July 16, 2009

Bob Stock

Associate Director

[email protected]

Page 2: Pittsburgh Supercomputing Center RP Update July 16, 2009

Center for Analysis & Prediction of Storms

• Oklahoma/NOAA Spring Severe Weather Forecast Experiment for 2009

• CAPS used NICS (1 km) and PSC (4 km)

• At PSC

– from 4/20 to 6/5

– Sunday-Thursday: reservations of 2000 cores for 10-12 hours starting at 10:30 a.m. (eastern)

• Lots of data generated: E.g., 66 terabytes ingested into archive during May
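As a rough sense of scale, the reservation figures above can be turned into a core-hour estimate. The sketch below assumes that every Sunday-Thursday date in the stated window was actually reserved, which the slide does not confirm, so the result is an upper bound rather than a reported usage figure.

```python
from datetime import date, timedelta

# Values taken from the slide.
START, END = date(2009, 4, 20), date(2009, 6, 5)
CORES = 2000
HOURS_MIN, HOURS_MAX = 10, 12

# date.weekday() numbers Monday as 0 and Sunday as 6.
SUN_TO_THU = {6, 0, 1, 2, 3}


def reservation_days(start, end):
    """Count Sunday-Thursday dates in the inclusive range [start, end]."""
    n_days = (end - start).days + 1
    return sum(
        1
        for i in range(n_days)
        if (start + timedelta(days=i)).weekday() in SUN_TO_THU
    )


# ASSUMPTION: every eligible day carried a full reservation.
days = reservation_days(START, END)
low = days * CORES * HOURS_MIN
high = days * CORES * HOURS_MAX
print(f"{days} eligible days, {low:,} to {high:,} core-hours (upper bound)")
```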

Page 3: Pittsburgh Supercomputing Center RP Update July 16, 2009

2009 CAPS Spring Experiment on PSC BigBen

• Data Access and Screening

• Create Input Files

• Create Job Scripts

• Remap Radar Data [800 proc, 20 proc each radar]

• Process Initial and Boundary Conditions

• Run Weather Analysis [80 processors]

• Create Ensemble Perturbations

• Run WRF & ARPS Forecast Models [18 x 80 processors]

• Extraction & reformatting of 2-D output

• Archive of 3-D results, over 50 TB data

• Generate derived products

• Data display and interrogation

• Analysis and verification

• Publication

Page 4: Pittsburgh Supercomputing Center RP Update July 16, 2009

Sample 4-km Ensemble Forecast Products

18h Forecasts Valid 1800 UTC, May 8, 2009

[Figure panels: Predicted probability-matched reflectivity; Actual observed radar reflectivity; Predicted spaghetti diagram of 35 dBZ reflectivity; Predicted probability of reflectivity >35 dBZ; Midwest zoom, all ensemble forecast members]

Page 5: Pittsburgh Supercomputing Center RP Update July 16, 2009

Enhancing Operations on Pople

• Automatic Performance Measurement

– Utilize Performance Monitor Unit (PMU)

• Backfilling using Predictive Walltimes

Page 6: Pittsburgh Supercomputing Center RP Update July 16, 2009

Automatic Performance Measurement

Goal: Collect Intel Itanium 2 PMU stats for each job in order to

– Identify underperforming codes (MFLOPS)

– Provide users with PMU stats for their runs

• Based on open source package: Perfmon2 http://perfmon2.sourceforge.net/

• Collection started for each job using pfmon

• Counters collected: CPU_OP_CYCLES_ALL, FP_OPS_RETIRED, L3_REFERENCES, L3_MISSES

• Counter detail for each process and thread collected

• Report issued from digested stats

• Currently testing and evaluating load on system
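The four counters listed above are enough to derive the MFLOPS figure the slide mentions, plus an L3 miss ratio. The following is a minimal sketch of that digest step, not PSC's actual report code. The `PmuCounts` container, the function names, the clock frequency, and the threshold are all hypothetical. It also assumes that CPU_OP_CYCLES_ALL approximates the cycles a process spent executing, so that dividing by the clock rate gives CPU seconds.

```python
from dataclasses import dataclass


@dataclass
class PmuCounts:
    """Per-process totals for the four counters named on the slide."""
    cpu_op_cycles_all: int
    fp_ops_retired: int
    l3_references: int
    l3_misses: int


def mflops(c, clock_hz):
    """Floating-point operations per second, in millions.

    ASSUMPTION: cycles / clock_hz approximates CPU seconds.
    """
    if c.cpu_op_cycles_all == 0:
        return 0.0
    seconds = c.cpu_op_cycles_all / clock_hz
    return c.fp_ops_retired / seconds / 1e6


def l3_miss_ratio(c):
    """Fraction of L3 references that missed."""
    return c.l3_misses / c.l3_references if c.l3_references else 0.0


def is_underperforming(c, clock_hz, threshold_mflops):
    """Flag a process whose rate falls below a site-chosen threshold."""
    return mflops(c, clock_hz) < threshold_mflops


# Made-up counts for illustration; 1.5 GHz is an assumed clock rate.
sample = PmuCounts(3_000_000_000, 600_000_000, 1_000_000, 250_000)
print(mflops(sample, 1.5e9), l3_miss_ratio(sample))
```

A real report would aggregate these values over the per-process and per-thread detail that pfmon collects before comparing against any threshold.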

Page 7: Pittsburgh Supercomputing Center RP Update July 16, 2009

Backfilling using Predictive Walltimes

Goal: Maximize backfilling during drain for larger jobs

Problem: Backfilling for large jobs idles machine due to users overestimating job run times

Solution: Store estimated and actual job run times for each job and statistically predict job run times

• Statistically calculated run time is used to optimize backfilling opportunities

• Database used to store job actual and estimated walltimes for each job

• Lightweight database engine, SQLite, used to store data

– 70,000 jobs in database

– Database uses only 87Kbytes!

• Scheduler uses data from database to select jobs for backfill

• Still studying impact and benefits; shows promise
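The approach described above can be sketched with Python's built-in `sqlite3` module. The slide does not say which statistic PSC used or how its database was laid out, so the schema, the function names, and the predictor (the mean of each user's actual/requested ratio plus one standard deviation, capped at the requested time) are placeholder assumptions chosen only to illustrate the idea.

```python
import sqlite3
import statistics


def open_db(path=":memory:"):
    """Open (or create) a job-history database. Hypothetical schema."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " user TEXT NOT NULL,"
        " requested_s INTEGER NOT NULL,"
        " actual_s INTEGER NOT NULL)"
    )
    return db


def record_job(db, user, requested_s, actual_s):
    """Store the requested and actual walltime of a finished job."""
    db.execute("INSERT INTO jobs VALUES (?, ?, ?)", (user, requested_s, actual_s))
    db.commit()


def predict_walltime(db, user, requested_s, min_history=3):
    """Scale the request by the user's historical actual/requested ratio.

    ASSUMPTION: mean + one standard deviation is a stand-in for the
    unspecified statistical method. Falls back to the request itself
    when the user has fewer than min_history recorded jobs.
    """
    rows = db.execute(
        "SELECT requested_s, actual_s FROM jobs"
        " WHERE user = ? AND requested_s > 0",
        (user,),
    ).fetchall()
    if len(rows) < min_history:
        return requested_s
    ratios = [actual / requested for requested, actual in rows]
    scale = min(1.0, statistics.mean(ratios) + statistics.pstdev(ratios))
    return int(round(requested_s * scale))


def fits_backfill(db, user, requested_s, window_s):
    """True if the predicted run time fits in the idle window."""
    return predict_walltime(db, user, requested_s) <= window_s


db = open_db()
for _ in range(3):
    record_job(db, "alice", 3600, 900)  # made-up history: uses 25% of request
print(predict_walltime(db, "alice", 7200))
print(fits_backfill(db, "alice", 7200, window_s=2000))
```

Capping the scale at 1.0 keeps the prediction from exceeding what the user asked for. A scheduler using such predictions still needs a policy for jobs that outrun them, which this sketch leaves out.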

Page 8: Pittsburgh Supercomputing Center RP Update July 16, 2009

PSC at TG09: Organization

• Shawn Brown: Science Track Co-Chair

• Pallavi Ishwad: EOT Track Chair

• Laura McGinnis: Student Program Chair

• Shandra Williams: Communications Committee Member in charge of signage

• Mike Schneider: Wrote news items about the conference

Page 9: Pittsburgh Supercomputing Center RP Update July 16, 2009

PSC at TG09: Participation

• Phil Blood and Robin Flaus: Presented paper on Computation Exploration (Comp Ex) program in EOT Track

• Greg Foss: Presented visualizations in Visualization Showcase

• Ed Hanna and Rob Light with Dave Hart (SDSC): Presented paper on RDR in Technology Track

• Anirban Jana and Sergiu Sanielevici with several people from other institutions: Presented tutorial Preparing Your Application for TeraGrid Beyond 2010

• Nick Nystrom with several people from other institutions: Presented tutorial Using Tools to Understand Performance Issues on TeraGrid Machines: IPM and the POINT Project

• Josephine Palencia: Presented poster JWAN: PSC's Secure, Federated, Distributed Lustre Filesystem on the WAN (TeraGrid)