
Page 1: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Hank’s Activities
Longhorn/XD AHM

Austin, TX
December 20, 2010

Volume rendering of a 4608^3 combustion data set. Image credit: Mark Howison.

Volume rendering of a flame data set using VisIt + IceT on Longhorn. Image credit: Tom Fogal.

Page 2: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Page 3: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

My perception of my role in Longhorn/XD

- Help users succeed via:
  - Direct support
  - Ensuring necessary algorithms/functionality are in place
  - Researching the most effective way to utilize Longhorn
- Also help test the machine through aggressive usage
- Collaborate with / facilitate for other project members
- Provide visibility for the center externally (outreach, etc.)

Page 4: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Outline

- Researching how to best use Longhorn
  - HW-accelerated volume rendering on Longhorn
  - SW ray-casting on Longhorn
- Collaborations
  - Manta/VisIt
  - VDF/VisIt
- User support
  - Analysis of 4K^3 turbulent data
  - Connected components algorithms
  - Other user support
- Outreach

Page 5: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

HW-accelerated volume rendering on Longhorn

“Large Data Visualization on Distributed Memory Multi-GPU Clusters”, HPG 2010
Authors: Fogal, Childs, Shankar, Krueger, Bergeron, and Hatcher

- Ran VisIt + IceT on Longhorn, varying data size and number of GPUs.
- Stage data on the CPU, transfer to the GPU (high transfer time, but can look at bigger data sets).

Volume rendering of a flame data set using VisIt + IceT on Longhorn. Image credit: Tom Fogal.
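For flavor, here is a minimal VisIt CLI (Python) sketch of this kind of volume-rendering run; it is not the actual experiment script, the file and variable names are hypothetical placeholders, and the GPU/IceT configuration is chosen at launch time rather than in the script:

    # Run with: visit -cli -s render_flame.py
    # (VisIt's Python functions are in scope in the CLI shell.)
    OpenDatabase("flame.bov")            # hypothetical data set
    AddPlot("Volume", "density")         # hypothetical variable
    v = VolumeAttributes()
    v.rendererType = v.RayCasting        # ray-casting renderer
    SetPlotOptions(v)
    DrawPlots()
    s = SaveWindowAttributes()
    s.fileName = "flame_vr"              # image file name prefix
    SetSaveWindowAttributes(s)
    SaveWindow()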

Page 6: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

HW-accelerated volume rendering on Longhorn

Observation about CPU volume rendering:

                     Number of cores
                     Large      Small
    Ray evaluation   Fast       Slow
    Compositing      Slow       Fast

Paper purpose: study the performance characteristics of GPU volume rendering at high concurrency on big data.

Idea: GPU volume rendering has the computational horsepower to do ray evaluation quickly, but will have many fewer MPI participants.
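To make the "fewer MPI participants" argument concrete, here is a hedged mpi4py sketch (not the paper's code): each rank "ray-evaluates" its own brick, then a single reduction composites the per-rank images. A maximum-intensity projection is order-independent, so MPI.MAX suffices; real alpha compositing needs ordered schemes (e.g., binary swap), whose cost grows with the number of compositing participants.

    # Hedged mpi4py sketch of parallel volume rendering's two phases.
    # Run with, e.g.: mpirun -np 4 python mip_composite.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Hypothetical per-rank brick (stands in for real volume data).
    rng = np.random.default_rng(seed=rank)
    brick = rng.random((64, 64, 64))

    # "Ray evaluation": max along the view axis for each pixel. This is
    # the cheap, massively parallel part (threads or CUDA in the papers).
    local_image = brick.max(axis=0)

    # "Compositing": MIP is order-independent, so one MAX reduction works;
    # ordered alpha compositing gets slower as participants increase.
    final = np.empty_like(local_image) if rank == 0 else None
    comm.Reduce(local_image, final, op=MPI.MAX, root=0)

    if rank == 0:
        print("composited image", final.shape, "max =", float(final.max()))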

Page 7: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

[Performance chart: big data, lots of GPUs, fast-ish on small data]

Page 8: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Software ray-casting

Previous work (not XD-related): “MPI-Hybrid Parallelism for Volume Rendering on Large Multi-Core Systems”, EGPGV 2010
Authors: Howison, Bethel, and Childs

- Strong scaling study up to 216,000 cores on the ORNL Jaguar machine, looking at 4608^3 data.
- Study outcome: hybrid parallelism benefits this algorithm, primarily during the compositing phase, since there are fewer participants in MPI communication.
- One of two EGPGV best paper winners; invited for a follow-on article in TVCG.

Volume rendering of combustion data set. Image credit: Mark Howison.

Page 9: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Software ray-casting

TVCG article (unpublished research):
- Add weak scaling study (up to 22K^3) on Jaguar
- GPU scaling study on Longhorn

GPU scaling study:
- Went up to 448 GPUs
- Purpose: similar to the Fogal work, but with a different spin … show that hybrid parallelism is beneficial. Instead of pthreads or OpenMP on the CPU, we are now using CUDA on the GPU.

Page 10: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Scaling results on GPU

[Scaling charts: 2308^3 data and 2308^3 to 4608^3 data]

Page 11: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Software ray-casting on Longhorn

Two caveats:
(1) We didn’t optimize for CUDA, so we could have had favorable numbers to an even higher concurrency level.
(2) But Jaguar @ 46K processors has more memory and can look at way bigger data sets.

Takeaway: for this algorithm and this data size, Longhorn is as powerful as 46K processors of Jaguar.

Page 12: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Manta/VisIt

- Carson Brownlee delivers integration of VisIt and Manta via vtkManta objects.
- Hank does some small work: updates the work from VisIt 2.0 to VisIt 2.2 and makes a branch for Hank and Carson to put fixes on.
- Testing: Carson and Hank create a list of issues and are in the process of tracking them down.

Rendering of an isosurface by VisIt using Manta.

Page 13: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Visualizing and Analyzing Large-Scale Turbulent Flow

Detect, track, classify, and visualize features in large-scale turbulent flow.

Analysis effort by Kelly Gaither (TACC), Hank Childs (LBNL), & Cyrus Harrison (LLNL).

Stresses two algorithms that are difficult in a distributed-memory parallel setting:

1. Can we identify connected components?

2. Can we characterize their shape?

VisIt calculated connected components on a 4K^3 turbulence data set in parallel using TACC's Longhorn machine. Two million components were initially identified, and then the map expression was used to select only the components that had a total volume greater than 15. Data courtesy of P.K. Yeung and Diego Donzis.

Page 14: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Identifying connected components in parallel is difficult.

- Hard to do efficiently
- Tremendous bookkeeping problem
- 4-stage algorithm that finds local connectivity and then merges globally (a serial sketch of the underlying concept follows below)

Participating in a 2011 EGPGV submission describing this algorithm and its performance. Authors: Harrison, Childs, Gaither.
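For intuition only, here is a small serial SciPy sketch of the two steps the slides describe: label connected components, then keep only components above a volume cutoff. The random field, isovalue, and cutoff are hypothetical, and this does none of the distributed-memory bookkeeping that makes the real 4-stage algorithm hard:

    # Serial stand-in for the distributed connected-components pass.
    import numpy as np
    from scipy import ndimage

    rng = np.random.default_rng(0)
    field = ndimage.gaussian_filter(rng.random((64, 64, 64)), sigma=2)

    mask = field > 0.55                        # hypothetical isovalue
    labels, n = ndimage.label(mask)            # face-connected components
    print(n, "components identified")

    # Volume (cell count) per component; analogous to the map-expression
    # step of selecting components with total volume greater than a cutoff.
    volumes = np.bincount(labels.ravel())[1:]  # skip background (label 0)
    keep = np.flatnonzero(volumes > 15) + 1    # surviving component ids
    filtered = np.where(np.isin(labels, keep), labels, 0)
    print(keep.size, "components with volume > 15 cells")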

Page 15: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

We used shape characterization to assist our feature tracking.

Shape characterization metric: chord length distribution. Difficult to perform efficiently in a distributed-memory setting.

[Diagram: Line Scan Filter distributing work across processors P0 through P3, feeding a Line Scan Analysis sink]

1) Choose lines
2) Calculate intersections
3) Segment redistribution
4) Analyze lines
5) Collect results

It is our hope that chord length distributions, a characteristic function, can assist in tracking component behavior over time.
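As a concrete, serial illustration of the metric itself (not of the parallel Line Scan Filter), a chord length distribution can be approximated by collecting run lengths of "inside the feature" along sample lines. Here the lines are axis-aligned and the feature mask is a hypothetical random field:

    # Minimal serial sketch of a chord length distribution.
    import numpy as np

    def chord_lengths_along_axis(mask: np.ndarray) -> np.ndarray:
        """Collect lengths of contiguous True runs along the last axis."""
        lengths = []
        for line in mask.reshape(-1, mask.shape[-1]):
            # Pad with False so runs touching the ends are closed off.
            padded = np.concatenate(([False], line, [False]))
            edges = np.flatnonzero(np.diff(padded.astype(np.int8)))
            starts, stops = edges[0::2], edges[1::2]
            lengths.extend(stops - starts)
        return np.asarray(lengths)

    rng = np.random.default_rng(1)
    mask = rng.random((32, 32, 32)) > 0.7      # hypothetical feature mask
    chords = chord_lengths_along_axis(mask)
    hist, bin_edges = np.histogram(chords, bins=10)
    print("chord length histogram:", hist)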

Page 16: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

My role in this effort

Easily summarized: “use VisIt to get results to Kelly”

Several iterations:
- Started with just statistics of components
- Looked at how variation in isovalue affected statistics
- Added in chord length distributions as a characteristic function
- Took still images of each component for visual inspection
- (recently) Extracted each component as its own surface for combined inspection

Page 17: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

VDF/VisIt

John Clyne and Dan Lagreca add a VDF reader to VisIt.

Hank performs some testing and debugging. Still lots to do:
- Formal commit to the VisIt repo. Also add in the new VisIt multi-res hooks.
- Study how well large features are preserved across refinement levels.
- Use the coarsest versions in conjunction with analysis code from Janine Bennett.

Page 18: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Other user support

Small amount of effort helping Saju Varghese and Kentaro Nagamine of UNLV:
- Fixed a VisIt bug with ray-casting + point meshes
- Helped them format their data into the BOV format (see the sketch below)
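BOV ("brick of values") is VisIt's simplest input format: a raw binary file of values plus a small ASCII header. A hedged sketch of writing one from Python follows; the grid size, variable name, and file names are hypothetical:

    # Write a BOV data set: raw binary values + a header VisIt can open.
    import numpy as np

    nx, ny, nz = 64, 64, 64
    data = np.linspace(0.0, 1.0, nx * ny * nz, dtype=np.float32)
    data.reshape(nz, ny, nx).tofile("density.values")  # native endianness

    # Header keywords per the BOV reader; LITTLE assumes an x86-like host.
    header = (
        "TIME: 0.0\n"
        "DATA_FILE: density.values\n"
        f"DATA_SIZE: {nx} {ny} {nz}\n"
        "DATA_FORMAT: FLOAT\n"
        "VARIABLE: density\n"
        "DATA_ENDIAN: LITTLE\n"
        "CENTERING: zonal\n"
        "BRICK_ORIGIN: 0.0 0.0 0.0\n"
        "BRICK_SIZE: 1.0 1.0 1.0\n"
    )
    with open("density.bov", "w") as f:
        f.write(header)
    # "density.bov" should now open directly in VisIt.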

Page 19: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Outreach & Service

VisIt tutorials:
- SC10 (beginning and advanced), Nov 2010, NOLA
- Users at US ARL, Sep 2010, Aberdeen, MD
- SciDAC 2010, July 2010, Chattanooga, TN

Speaker at NSF Extreme Scale I/O and Data Analysis Workshop, March 2010, Austin, TX

Participant in NSF Workshop on SW Development Environments, Sep 2010, Washington, DC

Given ~10 additional talks at various venues this year

Page 20: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Proposed Future Plans

- Continue collaboration with Kelly on analyzing turbulent flow
- Formally integrate VDF
  - Multi-res study with John & Kelly
  - Would like to do 1T cell runs on Longhorn
- Continued user support
  - Esp. CIG
- Connected components @ EGPGV
- VisIt + GPU

Two trillion cell data set, rendered in VisIt by David Pugmire on the ORNL Jaguar machine.

Page 21: Hank’s Activities Longhorn/XD AHM Austin, TX December 20, 2010

Summary

- Researching how to best use Longhorn
  - HW-accelerated volume rendering on Longhorn
  - SW ray-casting on Longhorn
- Collaborating with other Longhorn/XD members
  - Manta/VisIt
  - VDF/VisIt
- Doing user support
  - Helping Kelly analyze 4K^3 turbulent data
  - Working to make sure the connected components algorithm is up to snuff
  - Some user support and more to come…
- Performing outreach activities