LCSE – Unisys Collaboration
Paul Woodward, LCSE
Unisys Briefing, June 7, 2005

TRANSCRIPT

Page 1:

LCSE – Unisys Collaboration

Paul Woodward, LCSE

Unisys Briefing, June 7, 2005

Page 2:

Unisys Donation, March 2003:

Unisys donated a 32-processor ES7000 to the LCSE & one to MSI.

Microsoft donated software: DataCenter 2003 and SQL Server.

Intel donated chips.

Page 3:

Unisys was initiating an HPC program.

LCSE could demonstrate the power of the ES7000 on scientific problems using the Windows OS.

LCSE could explore the possibility of supporting graphics applications on this machine.

Page 4:

Performance Study for Computational Fluid Dynamics:

LCSE codes ported to ES7000.

Computational kernel performance measured, with excellent results.

Parallel performance study identified issues that were addressed successfully with Unisys assistance.

Page 5:

This is the best performance per CPU that we have obtained anywhere to date.

To achieve this, we did not compromise our code implementation strategy – we can still do completely out-of-core computations on problems of any size.

We worked with Dave Johnson of Unisys to pin our processes down to their CPUs, while we allowed the data read and written to come from and go to any place in the machine (see the affinity sketch at the end of this page).

We are now working to get both 16-CPU partitions computing this efficiently together. Unisys now offers a larger shared-memory configuration that solves this problem, but our approach would still be needed to get multiples of those machines working together.
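
A minimal sketch of this kind of process-to-CPU pinning, assuming a Windows process pinned through the Win32 affinity API; the actual procedure worked out with Unisys is not documented in these slides, and the CPU index below is an arbitrary illustration:

    import ctypes

    kernel32 = ctypes.windll.kernel32        # Windows-only API
    process = kernel32.GetCurrentProcess()   # pseudo-handle to this process
    cpu_index = 3                            # assumed: one CPU chosen per process
    mask = 1 << cpu_index                    # affinity mask, one bit per logical CPU
    if not kernel32.SetProcessAffinityMask(process, mask):
        raise ctypes.WinError()
    # Data buffers stay unpinned, so reads and writes can still use
    # memory anywhere in the machine, as described above.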

Page 6:

Bottom Line:

LCSE performance figures are triple those achieved by the NCSA job mix, or by applications represented at the Natl. Academy “Future of Supercomputing” meeting.

Focusing on running small jobs fast exploits a unique SMP advantage.

Also, SHMOD allows out-of-core, billion-cell simulation.

Page 7:

ES7000 doing Billion-Cell Simulations & Many Smaller Ones

Large memory a great advantage.

Many fast attached disks.

Highly reliable system.

We like Windows.

Serves as central hub of the LCSE.

White papers for Unisys & acknowledgements in scientific papers.

Page 8:

Billion-Cell Simulation Underway:

First large-scale multifluid PPM simulation.

New interface tracking method implemented for Los Alamos.

Page 9:

A parameter study of turbulent shear layer flows was made possible by the ES7000.

Page 10:

Plans:

Integrate ES7000 as data analysis and central control engine of newly funded prototype system.

Explore possibility of greater Unisys participation in a proposal next January for a full-up system.

Page 11:

New NSF Major Research Instrumentation Project:

$300,000 for 1-year prototyping.

Goal is truly interactive visualization of 2 TB data set on PowerWall at full resolution.

Prototype will handle only 1 panel.

Plan January proposal for 10 panels.

Data replication to avoid contention, SATA disks, Infiniband networking.

Page 12:

Motivation:

Move from data presentation to data exploration.

Generate PowerWall movies under interactive user control (just roll the mouse wheel and travel).

Pipeline the data analysis and visualization process, so that it no longer takes days for each step, but is instead immediate.

Page 13:

Motivation for the Motivation:

Need to do this because of the data explosion implied by national supercomputing installations.

Largest machines at NSF centers can easily generate 5 to 10 TB/day (60-120 MB/sec) of useful fluid flow simulation data.

LambdaRail 10 Gbit/s connection can bring this directly to LCSE.
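
A quick arithmetic check of the quoted rates, assuming decimal units and a 24-hour day:

    # Convert TB/day to MB/s (decimal TB and MB assumed)
    seconds_per_day = 24 * 3600                          # 86,400 s
    for tb_per_day in (5, 10):
        mb_per_s = tb_per_day * 1e12 / seconds_per_day / 1e6
        print(f"{tb_per_day} TB/day = {mb_per_s:.0f} MB/s")
    # prints 58 and 116 MB/s, in line with the 60-120 MB/sec quoted above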

Page 14:

Two Modes of Interactive Use:

Mode A: Pre-computed bricks of bytes are replicated on the disks at each node, and the user travels through this 4-D data volume in a virtual vehicle (a brick-indexing sketch follows below).

Mode B: Upon a button click, a raw data snapshot is drawn into the large shared memory, and the user travels through this 3-D volume, looking at any desired variable, in the virtual vehicle.
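
A hypothetical sketch of the Mode A brick lookup; the LCSE brick-of-bytes file layout is not specified in these slides, so the brick edge length, directory scheme, and file extension below are invented purely for illustration:

    BRICK_EDGE = 128  # assumed cells per brick edge

    def brick_path(x: int, y: int, z: int, t: int, root: str = "/bricks") -> str:
        # Map a viewer position (cell coordinates plus time step) to the
        # locally replicated brick file that covers it.
        bx, by, bz = x // BRICK_EDGE, y // BRICK_EDGE, z // BRICK_EDGE
        return f"{root}/t{t:04d}/b{bx:03d}_{by:03d}_{bz:03d}.bob"

    # e.g. brick_path(1000, 300, 2047, 12) -> "/bricks/t0012/b007_002_015.bob"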

Page 15:

Mode A Requirement (full-up system):

2 TB replicated at each node.

Local disk system streams data into graphics card at 400 MB/sec.

80 graphics engines each render at 400 MVoxels/sec to 10 PWall panels.

Peak rendering rate of 32 GVoxel/s produces 2 frames/sec of 8.6 Gvoxel/frame on 10-panel PWall.

(Prototype system does only 1 panel).
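
The aggregate numbers above can be checked with a little arithmetic (a sketch; the raw peak supports nearly 4 frames/sec, so the quoted 2 frames/sec evidently budgets headroom for overheads):

    engines = 80
    per_engine = 400e6                   # voxels/s rendered per graphics engine
    aggregate = engines * per_engine     # 3.2e10 = 32 GVoxels/s, as quoted
    voxels_per_frame = 8.6e9             # 8.6 Gvoxel/frame, as quoted
    print(aggregate / voxels_per_frame)  # ~3.7 fps peak; the slide budgets 2 fps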

Page 16:

Mode B Requirement (full-up system):

SMP memory holds raw data snapshot of 6 × 2 × 2048³ B = 96 GB.

SMP memory holds 32-bit single-variable array of 4 × 2048³ B = 32 GB.

SMP processes data at 80 Gflop/s.

IB4X streams 400 MB/sec to each node simultaneously from SMP.

80 graphics engines render at 400 MVoxels/sec to 10 PWall panels.
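
The memory figures above follow directly from the grid size; a quick check using binary gigabytes, in which the 96 GB and 32 GB come out exactly:

    cells = 2048 ** 3              # 2**33 cells in one 2048^3 snapshot
    raw_snapshot = 6 * 2 * cells   # 6 variables at 2 bytes per cell
    single_var = 4 * cells         # one 32-bit (4-byte) variable
    print(raw_snapshot / 2**30)    # 96.0 GB
    print(single_var / 2**30)      # 32.0 GB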

Page 17:

New NSF Major Research Instrumentation Project:

ES7000 will be the data analysis end of the data processing pipeline.

Goal is interactive visualization directly from raw data, rather than from pregenerated voxel bricks.

Much more I/O intensive.

Large memory shared among many CPUs allows rapid voxel generation.

Page 18:

System now under construction in the LCSE.

Dell PC nodes can act as intelligent storage servers and also as image generation engines.

Dell 670n: dual 3.6 GHz Xeon EM64T, 8 GB DDR2 SDRAM.

Page 19:

Key issues for Unisys:

SATA disks directly attached, or just through IB from PC servers?

PCI Express x16 for high-end graphics engines, or put these into PC nodes & use PCI-e for IB__X?

Infiniband network integrating with other machines and their storage.

Bigger shared memory & more CPUs than the present 16.

Page 20:

Potential Unisys Role:

Memory network 10X faster than cluster network, so can work from raw data, which is 10X larger.

Then can see any quantity on demand.

Entirely new capability for interactive data exploration.

Unisys SMP would need to drive 80 rendering engines & 960 disks, either directly or in PC nodes on IB switch.

Page 21:

Opportunity (keeping options open for multiple possible suppliers):

NSF encourages us to go back for ~$1M after proof of concept.

Schedule gives Unisys time to integrate any essential new technologies – PCI Express x16, Infiniband, SATA.

We can be a testbed, working with Unisys.

Major opportunity on horizon.

Page 22:

Prototyping Effort Now:

Proposed Linux cluster to NSF; have 12 Dell nodes, each with:

Dual P4 Xeon @ 3.6 GHz
8 GB memory
nVidia Quadro 4400 graphics card
12 Seagate 400 GB SATA disks
3Ware 12-channel SATA controller
Infiniband 4X (Topspin) HCA

10 IB4X links to Unisys ES7000.

Page 23:

Near-Term Goals for ES7000:

Infiniband drivers.

Get all 32 CPUs cooperating over IB.

Improve performance of A3D data analysis application.

Integrate with Linux cluster (we are fine with Windows, but government sponsors insist on Linux).

Pipeline data from A3D on ES7000 to HVR on PCs for raw data rendering.

Page 24:

Middle-Term Goals for ES7000:

3Ware SATA controller drivers?

Attach SATA drives directly?

Measure I/O performance.

Experiment, in preparation for January NSF proposal, with IB on more recent Unisys model?

Potential to drive Nvidia graphics?

Experiment with resource sharing and on-demand (preemptive) visualization.

Page 25:

We have not settled on the scaled-up architecture:

Scale up of present system is possible.

IB cluster of Dell nodes is possible.

SMP cluster of Unisys nodes is possible.

Time is short, so options other than the first two are handicapped.

Other vendors unlikely.

Page 26:

Things that now seem definite:

SATA disks (Seagate partnership under negotiation; buying 200 now).

Programmable graphics engine(s) on PCI Express x16 (nVidia, perhaps with SLI, or perhaps even IBM Cell).

Infiniband 4X (12X in full-up system?).

Linux (our sponsors are determined).

Intel CPUs.

Page 27:

Our Guess at a Best Fit Role for Unisys:

Scale up the present system with IB__X.

Dell PC nodes act as storage servers.

Dell PC nodes host programmable graphics engines that cooperatively render images to PowerWall display.

Unisys SMP provides large shared memory and 80 Gflop/s processing power to enable interactive visualization from raw data.