
Page 1: The Neutron Science TeraGrid Gateway: TeraGrid

Presented by

The Neutron Science TeraGrid Gateway: TeraGrid Cyberinfrastructure at ORNL

John W. Cobb, Ph.D.
Computer Science and Mathematics

Office of Cyberinfrastructure
National Science Foundation

In collaboration with many teams:

NSTG, SNS, McStas group, Open Science Grid, Earth Systems Grid, TechX Corp, OGCE, UK eScience effort, and the TeraGrid Partners teams.

Page 2: The Neutron Science TeraGrid Gateway: TeraGrid

2 Cobb_TeraGrid_SC07

Outline

1. Cyberinfrastructure (CI)

2. TeraGrid

3. The Spallation Neutron Source

4. The Neutron Science TeraGrid Gateway (NSTG)

5. Other collaborations



Page 4: The Neutron Science TeraGrid Gateway: TeraGrid

Cyberinfrastructure overview (from 2006)

“Like the physical infrastructure of roads, bridges, power grids, telephone lines, and water systems that support modern society, ‘cyberinfrastructure’ refers to the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor.” ‡

2003 Blue Ribbon Panel: “Atkins Report”

Personal view: A hierarchy of needs†

Scientific discovery enabled

Applications and user consulting

Software stack/middleware

Resources: CPU, storage, …

Networking/connectivity

Physical infrastructure: Space, power, cooling, security, …

†Maslow, A. H. (1943). A Theory of Human Motivation. Psychological Review, 50, 370–396.

‡David Hart in NCSA news release “National Science Foundation Releases New Report from Blue-Ribbon Advisory Panel on Cyberinfrastructure,” February 3, 2003, http://access.ncsa.uiuc.edu/Releases/03Releases/02.03.03_National_S.html as quoted in the Cyberinfrastructure Wikipedia entry.

Page 5: The Neutron Science TeraGrid Gateway: TeraGrid

Cyberinfrastructure (CI) maturing

• CI is now better understood

• Infrastructure is recognized as critical to service and research success

• “We believe we get tremendous competitive advantage by essentially building our own infrastructures.” — Eric Schmidt, Google CEO*

• National focus is now shifting to enabling scientific discovery:

From data to knowledge: Enhancing human cognition and generating new knowledge from a wealth of heterogeneous digital data

Understanding complexity in natural, built, and social systems: Deriving fundamental insights on systems comprising multiple interacting elements

Building virtual organizations: Enhancing discovery and innovation by bringing people and resources together across institutional, geographical, and cultural boundaries

• Cyber-enabled discovery is predicated upon advanced, national-scale cyberinfrastructure

* Source: The New York Times online edition, Sept. 30, 2007

Page 6: The Neutron Science TeraGrid Gateway: TeraGrid


Outline

1. Cyberinfrastructure (CI)

2. TeraGrid

3. The Spallation Neutron Source

4. The Neutron Science TeraGrid Gateway (NSTG)

5. Other collaborations


Page 7: The Neutron Science TeraGrid Gateway: TeraGrid

TeraGrid overview

“The TeraGrid is the world's largest, most powerful and comprehensive distributed cyberinfrastructure for open scientific research. It currently supports more than 1,000 projects and over 4,000 researchers geographically spanning the entire United States.”

— National Science Foundation in press release 07-09

• Noun: Cyberinfrastructure

• Adjectives

Most powerful

Comprehensive

Distributed

For open research

Page 8: The Neutron Science TeraGrid Gateway: TeraGrid

TeraGrid network

• Dedicated links

• Resource to resource

• Bandwidth unit: 10 Gbps

• Substrate for high-performance, high-quality data movement and communications

And …

Page 9: The Neutron Science TeraGrid Gateway: TeraGrid

[Map: TeraGrid partner sites]

Resource Providers (RP): SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, LSU, U Tenn.

Software integration partners: Caltech, USC/ISI, UNC/RENCI, UW

Grid Infrastructure Group (UC)

2 new Resource Providers

Page 10: The Neutron Science TeraGrid Gateway: TeraGrid

TeraGrid resources

• 9 distributed centers (soon 11)

• > 300 TFlops today; +400 TF in January; +1 PF in late 2008; on path to double every year

• > 1.5 PB disk

• Multi-PB archive

• Different resource types:

Capability

Capacity

High throughput

Visualization

Specialized use

• Common user environments

• Tuned data transfer

Page 11: The Neutron Science TeraGrid Gateway: TeraGrid

TeraGrid distributed and unified management

• All TeraGrid resources are allocated via a peer-reviewed, unified national process, XRAC

• Proposals: POPS; see the “Allocations” tab on the portal

• Single sign-on across TeraGrid (GSI)

• Common software stack across the sites

• TeraGrid portal for ease of user access

• Unified user support TeraGrid-wide

Unified ticket system (7x24 coverage)

User services

• Advanced support for TeraGrid applications (ASTA)

• Coordinated high-performance data movement

GridFTP

Wide-area file systems (GPFS in production; Lustre and pNFS in pilots)

https://portal.teragrid.org

Page 12: The Neutron Science TeraGrid Gateway: TeraGrid

TeraGrid serves NSF users and the open science community

[Figure: monthly usage in millions of hours, April 2005–August 2006, split into “Specific” and “Roaming”; caption: “TeraGrid Roaming and Non Roaming Usage.” An accompanying pie chart shows 2005 usage shares.]

Page 13: The Neutron Science TeraGrid Gateway: TeraGrid

Science gateways

• TeraGrid size: How can we manage it?

Today: 4,000+ user accounts across 20 resources

Target: 10–100× increase in user base; increase in number of resources

• TeraGrid Science Gateways

1. Web portal with users in front and TeraGrid services in back

2. Grid-bridging gateways: Extending the reach of a community grid (devoted to a single area of science) so it can use TeraGrid resources

3. Application programs running on users' machines (e.g., workstations and desktops) that access TeraGrid (and other) services

• The NSTG is one of 20+ existing gateways

Charting new territory for wide cyberinfrastructure services:

• Federated user management

• Callback end-user identification with automated service auditing across a federated identity space (“gateway community accounts”)

• Grid interoperability

• Participation in attribute-based authorization

Page 14: The Neutron Science TeraGrid Gateway: TeraGrid


Outline

1. Cyberinfrastructure (CI)

2. TeraGrid

3. The Spallation Neutron Source

4. The Neutron Science TeraGrid Gateway (NSTG)

5. Other collaborations


Page 15: The Neutron Science TeraGrid Gateway: TeraGrid

Spallation Neutron Source: “The next generation of materials research”

• Large world-class user facility: Neutrons as a probe of matter

• Construction complete: 1998–2006, TPC $1.4 billion

• Accelerator power-up: Ongoing; 183 kW record set in August 2007

• Commissioning instruments: 3 to 4 per year

• User program: 3 allocated instruments in Fall 2007:

– Backscattering spectrometer

– Liquids reflectometer

– Magnetism reflectometer

[Figure: Beam power on target (kW), October 2006–August 2007; the previous record was surpassed on August 11.]

Page 16: The Neutron Science TeraGrid Gateway: TeraGrid

SNS data access via portal (reduction, analysis, and simulation as well)

Problem: Turnkey solutions are being requested by users and user groups

SNS approach:

• Neutron Science User Portal

Entirely in Java

Providing advanced tools in an easy fashion while retaining customizability

Function deployment order:

• First: Data acquisition and storage/stewardship

• Then: Reduction, analysis, simulation, and proposals for one-stop access

Acknowledgement: S. Miller and the SNS advisory SW group; J. Kohl, S. Vazhkudai, and the ORNL/CSM division

Page 17: The Neutron Science TeraGrid Gateway: TeraGrid


Neutron science portal view

Page 18: The Neutron Science TeraGrid Gateway: TeraGrid


NeXus: A community-driven standards process

http://www.nexusformat.org/

• Data format for neutron, X-ray, and muon science

• Based on HDF5

• Self-describing

• Use increasing as new facilities come online

SNS

OPAL at ANSTO

ISIS second target station

J-SNS

• McStas (neutron Monte Carlo ray-trace simulation tool) can read and write the NeXus format
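The “self-describing” property above means that NeXus groups carry class attributes (e.g. NXentry, NXdata) that let a reader discover the file structure without external metadata. A toy sketch of that layout convention, using plain Python dicts in place of a real HDF5 library (the group and field names mirror the convention but this is not the actual NeXus API):

```python
# Toy sketch of a NeXus-style self-describing hierarchy.
# Real NeXus files are HDF5 (written with, e.g., h5py or the NeXus API);
# this dict-based stand-in only illustrates the layout convention.

def make_entry(title, counts, time_of_flight):
    """Build a minimal NXentry/NXdata-like tree: every group carries an
    NX_class attribute, so a reader can discover the structure itself."""
    return {
        "attrs": {"NX_class": "NXentry"},
        "title": title,
        "data": {
            "attrs": {"NX_class": "NXdata", "signal": "counts"},
            "counts": counts,
            "time_of_flight": time_of_flight,
        },
    }

def find_signal(entry):
    """Locate the plottable signal by following the attributes alone."""
    data = entry["data"]
    return data[data["attrs"]["signal"]]
```

A reader written this way needs no prior knowledge of the instrument that produced the file: it follows the `NX_class` and `signal` attributes to the data, which is the point of a self-describing format.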

Page 19: The Neutron Science TeraGrid Gateway: TeraGrid

SNS cyberinfrastructure to enable the Neutron Science Portal

TeraGrid:

• Opportunity to access HPC resources via the Neutron Science Portal

Hardware:

• Instrument-local DAS and analysis machines

• Facility centralized services including storage, application hosting, and execution servers

Software and systems:

• Data acquisition system (DAS) at each instrument

• iCAT captures raw data

– Stores data locally to the instrument in a pre-NeXus format

– Broadcasts data downstream to the analysis team

– Catalogues raw data

• Translation service: raw data to NeXus formats

• Messaging service

• Application orchestration

• Integrated authentication with facility software XCAMS

Page 20: The Neutron Science TeraGrid Gateway: TeraGrid


Outline

1. Cyberinfrastructure (CI)

2. TeraGrid

3. The Spallation Neutron Source

4. The Neutron Science TeraGrid Gateway (NSTG)

5. Other collaborations


Page 21: The Neutron Science TeraGrid Gateway: TeraGrid

Neutron Science TeraGrid Gateway (NSTG)
One of 9 TeraGrid partner Resource Providers

Focus areas:

• Neutron science

• Connecting facilities with cyberinfrastructure

• Bridging cyberinfrastructures

• Data movement within and across TeraGrid

Resources provided:

• Outreach to a specific science community (neutron science)

– Expertise

– Technology transfer

– Education, in a broad sense

– User outreach

• A science gateway interface for neutron science

• Exploration of large science facility integration with national-scale cyberinfrastructure

• Combining TeraGrid computational resources with available neutron scattering datasets for analysis and simulation comparison

Page 22: The Neutron Science TeraGrid Gateway: TeraGrid

NSTG operational components

• Wide-area network connections to TeraGrid

• Local connections, at maximum bandwidth, to local facilities (SNS and HFIR)

• Modest local cluster with a complete TeraGrid environment for small local TeraGrid jobs

• Long-term HPSS archival storage via ORNL’s LCF/NCCS

• System operations

• Application consulting

Page 23: The Neutron Science TeraGrid Gateway: TeraGrid

TeraGrid footprint: ORNL

[Map: ORNL sites on the TeraGrid footprint]

Page 24: The Neutron Science TeraGrid Gateway: TeraGrid

NSTG results: Highlights

• Ability to launch McStas neutron Monte Carlo ray-tracing instrument simulations on the TeraGrid directly from the Neutron Science Portal

• Production use of the TeraGrid community account “Jimmy Neutron”

New TeraGrid science gateway scaling technology

• In development: A generalized fitting service for experimental data, integrated into the neutron science portal and using TeraGrid computational resources

• Deployment of a wide-area Lustre pilot client; write speeds of 9.72 Gbps over a 10-Gbps wide-area connection

• Testing of GridFTP servers with 10-Gbps end-to-end (NIC-switch-router-WAN-router-switch-NIC); >90% of theoretical maximum

• First-principles sample kernel simulations of experimentally relevant materials (E. Farhi, ILL)

Page 25: The Neutron Science TeraGrid Gateway: TeraGrid

Porting and deployment of simulation tools to TeraGrid cyberinfrastructure: McStas and IDEAS

Many codes are currently I/O-limited. The next step is to modify the I/O strategy in the codes to take advantage of I/O architectures.

Neutron science simulations. Next step: Provide analysis and relevant simulations in support of neutron science.

Instrument design optimization: Moving simulations to TeraGrid allows larger runs and faster turnaround. Assisted in an effort to redesign an instrument to lower cost. Simulations provided confidence that resolution would not be sacrificed.

Improved and faster data analysis (reflectometry): Faster and better χ² minimization.

[Figures: McStas speedup vs. core count on DataStar /gpfs, DataStar /gpfs-wan (NSF), DataStar /home, and SDSC /gpfs-wan interactive; and fit iterations per hour vs. core count at SDSC, NCSA, UC, PSC (rachel), PSC (lemieux), ORNL, Purdue, and DataStar.]
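The χ² minimization behind such reflectometry fitting can be sketched in a few lines; the exponential model, synthetic data, and grid search below are invented for illustration and are not the portal's actual fitting service:

```python
import math

def chi_squared(params, xs, ys, sigmas, model):
    """Sum of error-weighted squared residuals between data and model."""
    return sum(((y - model(x, params)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))

def best_fit(xs, ys, sigmas, model, candidates):
    """Return the candidate parameter set with the smallest chi-squared."""
    return min(candidates, key=lambda p: chi_squared(p, xs, ys, sigmas, model))

# Toy model: exponential falloff, loosely reminiscent of reflectivity vs. Q.
def decay(x, p):
    amplitude, rate = p
    return amplitude * math.exp(-rate * x)

# Synthetic "measurements" generated from known parameters (2.0, 1.5).
xs = [0.1 * i for i in range(10)]
ys = [decay(x, (2.0, 1.5)) for x in xs]
sigmas = [0.05] * len(xs)

# Coarse parameter grid to search over (amplitude, rate).
grid = [(a, r) for a in (1.0, 1.5, 2.0, 2.5) for r in (0.5, 1.0, 1.5, 2.0)]
```

Searching `grid` recovers (2.0, 1.5) because the data were generated from those parameters; a production fitting service would use a gradient-based or Levenberg–Marquardt minimizer rather than a grid search, but the χ² objective is the same.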

Page 26: The Neutron Science TeraGrid Gateway: TeraGrid

Portal-initiated simulations . . . under the covers!

[Diagram: Interactive user inputs reach the portal applet (simulation GUI) over https and are passed to the simulation servlet on a back-end web server. The servlet authorizes the request as the SNS user, then uses the community certificate and SNS-user private key (grid-proxy-init) to act as the “Jimmy Neutron” community account, sending a “runas” request with parameters over ssh to the Globus front end. Jobs are launched on TeraGrid computational resources with globus-job-submit, monitored via globus-job-status and the job information service, and results are returned with globus-url-copy/GridFTP.]
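As a concrete sketch, the command sequence a servlet like this drives can be assembled as argument vectors; the gatekeeper host, executable, and file URLs below are invented examples, and a real deployment would invoke the Globus tools directly rather than through these toy helpers:

```python
# Hypothetical sketch of the Globus command sequence behind a portal-initiated
# simulation run as the "Jimmy Neutron" community account.
# Host names, paths, and URLs below are invented for illustration.

def proxy_init_cmd(valid="12:00"):
    """Create a short-lived proxy credential from the community certificate."""
    return ["grid-proxy-init", "-valid", valid]

def submit_cmd(gatekeeper, executable, *args):
    """Submit a batch job to a Globus gatekeeper; prints a job contact URL."""
    return ["globus-job-submit", gatekeeper, executable, *args]

def status_cmd(job_contact):
    """Poll the state of a previously submitted job."""
    return ["globus-job-status", job_contact]

def copy_results_cmd(src_url, dst_url):
    """Stage results back over GridFTP."""
    return ["globus-url-copy", src_url, dst_url]

# The servlet's flow, as argument vectors it would hand to subprocess calls:
flow = [
    proxy_init_cmd(),
    submit_cmd("tg-gatekeeper.example.org/jobmanager-pbs",
               "/usr/local/bin/mcstas_sim", "instrument.instr"),
    copy_results_cmd("gsiftp://tg-gatekeeper.example.org/scratch/out.dat",
                     "file:///var/portal/results/out.dat"),
]
```

Keeping the community credential and these invocations on the server side is what lets the portal audit every end user individually while the grid sees only one account.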

Page 27: The Neutron Science TeraGrid Gateway: TeraGrid


NSTG was and is “ahead of the curve”

• Proposed a science gateway as part of the ORNL RP in 2003, before creation of the TG science gateways program

• Included a focus on cyber-enabled science from the outset: Enabling neutron science

• Early (but not first) intergrid production operation

• Early (but not first) concentration on intra-TeraGrid high-performance wide-area data transfer

• Concentration on high-performance transfer between TeraGrid and external endpoints

• Early-adopter and pathfinder postures fit the NSTG RP role

Page 28: The Neutron Science TeraGrid Gateway: TeraGrid


Outline

1. Cyberinfrastructure (CI)

2. TeraGrid

3. The Spallation Neutron Source

4. The Neutron Science TeraGrid Gateway (NSTG)

5. Other collaborations


Page 29: The Neutron Science TeraGrid Gateway: TeraGrid

NSTG participates in several collaborative areas

Neutron science:

• SNS and other facilities

• McStas instrument simulation software

• Portal development tools and deployments

OGCE, GumTree (Eclipse RCP)

Data transport:

• Within TeraGrid partners

• In the wider area (REDDnet)

Inter-grid operation:

• Open Science Grid

• Earth Systems Grid

Page 30: The Neutron Science TeraGrid Gateway: TeraGrid

McStas

• Widely used neutron instrument simulation package for instrument design and experiment planning

• Development team at Risø National Laboratory (Denmark) and Institut Laue-Langevin (France)

• Monte Carlo ray tracing of the neutron path through beamline elements

• Many advanced features

• Use cases:

Instrument design and construction

Experiment interpretation

Experiment planning

Experiment proposal evaluation

• Challenge: Simulation of the sample and sample environment

• E. Farhi (largest NSTG cluster user) has just completed a preliminary study computing scattering kernels using VASP (private communication)

http://neutron.risoe.dk/
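The core technique named above — Monte Carlo ray tracing of neutron trajectories through beamline elements — can be illustrated with a toy, pure-Python sketch. This is an invented illustration of the principle, not McStas code, and the geometry is made up:

```python
import random

def slit_transmission(n_rays, slit_half_width, source_half_width=0.01,
                      divergence=0.01, drift=1.0, seed=0):
    """Toy Monte Carlo ray trace: neutrons leave a uniform source with a
    random divergence angle, drift `drift` metres, and are counted if they
    pass through a slit of the given half-width (all lengths in metres).
    Returns the transmitted fraction."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    transmitted = 0
    for _ in range(n_rays):
        y = rng.uniform(-source_half_width, source_half_width)   # start position
        angle = rng.uniform(-divergence, divergence)             # radians, small-angle
        if abs(y + angle * drift) <= slit_half_width:            # position at the slit
            transmitted += 1
    return transmitted / n_rays
```

McStas composes many such elements (guides, choppers, monochromators, samples, detectors) and propagates weighted rays through them, but the principle is exactly this kind of stochastic transport, which is also why the runs parallelize so well across TeraGrid nodes.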

Page 31: The Neutron Science TeraGrid Gateway: TeraGrid

McStas: Sample kernel simulations (private communication, E. Farhi, ILL)

Page 32: The Neutron Science TeraGrid Gateway: TeraGrid

GumTree

• An integrated scientific workbench

• Eclipse-based rich client

• Neutron science data acquisition and analysis software

• Under development at the Australian Nuclear Science and Technology Organisation's (ANSTO) Bragg Institute for instruments on the OPAL reactor source

• ANSTO team: Nick Hauser, Tony Lam

• Mark Green (TechX, Buffalo) looking to integrate it into SNS portal and application efforts

• See OGCE workshop, November 11–12, 2007

http://gumtree.codehaus.org/

Page 33: The Neutron Science TeraGrid Gateway: TeraGrid

REDDnet: Research and Education Data Depot Network

• National-scale logistical storage network

• Distributed (but acts local!)

• Funding from NSF and the Library of Congress

• >700 TB (500 disk, 200 tape)

• Multiple application domains

Satellite imagery (AmericaView)

High-energy physics: LHC

Terascale Supernova Initiative

Structural biology

Vanderbilt TV news archive

National Geospatial Data Archive

• Data everywhere (under-the-covers tools to manage movement, replication, …)

http://www.reddnet.org/

ORNL REDDnet nodes in NSTG and LCF

See Vanderbilt research booth for SC’07 demo and more details

Page 34: The Neutron Science TeraGrid Gateway: TeraGrid


Open Science Grid (OSG)

• Similar to TeraGrid in many ways:

Distributed resources

Includes Globus toolkit in software stack

Science focus

• Significant operational and cultural differences from TeraGrid

• NSTG cluster is “dual deployed”: Both a TeraGrid and an OSG resource

Supports production jobs on both grids

Reports usage accordingly

Serves TG roaming accounts and most OSG VOs

Pick TG or OSG environment using “@” macros

• Pursued at request of neutron science community

• Significant overlap with REDDnet partners

• Purdue TeraGrid partner: Another TG/OSG integration case

http://www.opensciencegrid.org/

Page 35: The Neutron Science TeraGrid Gateway: TeraGrid


Earth Systems Grid (ESG)

• TeraGrid provides backend transport (ORNL–NCAR)

• ESG host in the NSTG TeraGrid resource is at ORNL

• The ESG host talks to HPSS at ORNL and launches GridFTP transfers for ESG

• See other TeraGrid partners: NCAR

www.earthsystemgrid.org/

Page 36: The Neutron Science TeraGrid Gateway: TeraGrid

Acknowledgements and thanks

Support

• National Science Foundation

• Department of Energy

• ORNL Laboratory Director’s research and development fund

Staff and effort

• NSTG: M. Chen1, J. Cobb1, V. Hazlewood1, S. Hicks1, G. Hinkel1, V. Lynch1, P. Newman1, J. Nichols1, D. Pack1, G. Pike1, Jim Rome1, Bill Wing1

• SNS: J. Bilheux1, G. Granroth1, M. Hagen1, J. Kohl1, D. Mikkelson9, R. Mikkelson9, S. Miller1, P. Peterson1, S. Ren1, M. Reuter1, J. Schwidder1, B. Smith8, T. Swain8, J. Trater1, S. Vazhkudai1

• McStas group: E. Farhi3, K. Lefmann5, P. Willendrup5

• TechX Corp.: M. Green6

• OSG: R. Pordes2

• ESG: D. Bernholdt1, D. Middleton4

• UK eScience: L. Sastry7

• TeraGrid partners and others

Organizations

1ORNL, 2Fermilab, 3ILL, 4NCAR, 5RISO, 6TechX Corp., 7UK eScience, 8UT, 9U Wisc., Stout

Page 37: The Neutron Science TeraGrid Gateway: TeraGrid

Contact

John W. Cobb, Ph.D.

Computer Science and Mathematics
(865) [email protected]