
TRANSCRIPT

Page 1: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

TeraGrid: National Cyberinfrastructure for Terascale Science

Dane Skow, Deputy Director, TeraGrid
www.teragrid.org
The University of Chicago and Argonne National Laboratory

February 2007

Slides courtesy of Charlie Catlett (UC/ANL), Tony Rimovsky (NCSA), and Reagan Moore (SDSC). TeraGrid is supported by the National Science Foundation Office of Cyberinfrastructure.

Petascale

Page 2: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

“NSF Cyberinfrastructure Vision for 21st Century Discovery”

1. Distributed, scalable up to petaFLOPS HPC

2. Data, data analysis, visualization

3. Collaboratories, observatories, virtual organizations

(includes networking, middleware, systems software; "sophisticated" science application software; data to and from instruments)

4. Education and Workforce

• Provide sustainable and evolving CI that is secure, efficient, reliable, accessible, usable, and interoperable
• Provide access to world-class tools and services

Draft 7.1 CI Plan at www.nsf.gov/oci/
Adapted from: Dan Atkins, NSF Office of Cyberinfrastructure

Page 3: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

TeraGrid Mission

• TeraGrid provides integrated, persistent, and pioneering computational resources that will significantly improve our nation's ability and capacity to gain new insights into our most challenging research questions and societal problems.
  – Our vision requires an integrated approach to the scientific workflow, including obtaining access, application development and execution, data analysis, collaboration, and data management.
  – These capabilities must be broadly accessible to the science, engineering, and education community.

Page 4: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

TeraGrid Facility Partners

[Map of partner institutions: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Caltech, USC/ISI, UNC/RENCI, UW, NIU. Legend: Resource Provider (RP); Software Integration Partner; Grid Infrastructure Group (GIG).]

Page 5: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Networking

[Network diagram: TeraGrid sites (SDSC, UC/ANL, PSC, TACC, ORNL, NCSA, NCAR, Cornell, and PU/IU via IP-Grid) interconnected through hubs at LA, CHI, and DEN over dedicated 10-Gbps links, ranging from 1x10G to 3x10G per site; peering with Abilene.]

Page 6: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

TeraGrid Usage Growth

[Chart: quarterly usage in normalized units (millions), divided into specific allocations and roaming allocations.]

TeraGrid currently delivers an average of 400,000 CPU-hours per day to users, equivalent to roughly 20,000 CPUs running continuously.

Page 7: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

TeraGrid User Community Growth

[Chart, log scale, monthly from October 2003 through December 2006: active users, all users ever, and new accounts. Annotations: TeraGrid production services begin (October 2004); NCSA and SDSC core (PACI) systems and users incorporated (April 2006).]

Decommissioning of systems typically causes slight reductions in active users; for example, the December 2006 dip is due to the decommissioning of Lemieux (PSC).

                               FY05     FY06
New User Accounts               948    2,692
Avg. New Users per Quarter      315     365*
Active Users                  1,350    3,228
All Users Ever                1,799    4,491

(*FY06 new users per quarter excludes Mar/Apr 2006)

Page 8: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

TeraGrid Projects by Institution

Legend: Blue = 10 or more PIs; Red = 5-9 PIs; Yellow = 2-4 PIs; Green = 1 PI

1000 projects, 3200 users

TeraGrid allocations are available through peer review to researchers at any US educational institution. Exploratory allocations can be obtained through a biweekly review process. See www.teragrid.org.

Page 9: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

FY06 Quarterly Usage by Discipline

[Chart: percent usage by discipline for each quarter of FY06.]

Page 10: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

TeraGrid Science Gateways Initiative: Service-Oriented Approach

The science and engineering community has been building discipline-specific cyberinfrastructure in the form of portals, applications, and grids. Our objective is to enable these to use TeraGrid resources transparently as “back-ends” to their infrastructure.

The TeraGrid Science Gateways program has developed, in partnership with 20+ communities and multiple major Grid projects, an initial set of processes, policies, and services that enable these gateways to access TeraGrid (or other facilities) resources via web services.

[Diagram: science gateways connect to TeraGrid and to other grids (Grid-X, Grid-Y) through web services.]
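To make the "back end" idea above concrete, here is a minimal sketch, in Python, of how a discipline portal might hand a job to TeraGrid through a web-service interface. The endpoint URL, request fields, and submit_job helper are hypothetical illustrations, not the actual TeraGrid gateway API.

```python
# Minimal sketch of a science gateway submitting work to a TeraGrid resource
# through a web-service front end.  The URL, payload fields, and community
# credential name are hypothetical illustrations.
import json
import urllib.request

GATEWAY_ENDPOINT = "https://gateway.example.org/teragrid/jobs"  # hypothetical

def submit_job(executable, arguments, cpu_count, community_credential):
    """POST a job description to the gateway's web service and return the
    job identifier assigned by the back-end resource."""
    job_description = {
        "executable": executable,            # application already staged on the RP
        "arguments": arguments,
        "cpus": cpu_count,
        "credential": community_credential,  # gateway-held community account
    }
    request = urllib.request.Request(
        GATEWAY_ENDPOINT,
        data=json.dumps(job_description).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["job_id"]

# A portal user never sees TeraGrid directly; the portal itself would call:
# job_id = submit_job("/usr/local/apps/namd2", ["input.conf"], 64, "community-account")
```

The design point is that the portal, not the end user, holds the credential and talks to the resource, which is what lets a gateway serve a whole community from one allocation.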

Page 11: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

TeraGrid User Community in 2006

Use Modality                                      Community Size (est. number of projects)
Batch Computing on Individual Resources            850
Exploratory and Application Porting                650
Workflow, Ensemble, and Parameter Sweep            160
Science Gateway Access                             100
Remote Interactive Steering and Visualization       35
Tightly-Coupled Distributed Computation             10

[Annotation on slide: "Grid-y users"]

Page 12: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Data Storage Resources

• Local Cluster File System
• Global File System
  – GPFS-WAN (250 TB)
• Data Collections
• Archive Storage

Graphic courtesy of SDSC datacentral

Page 13: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Local Cluster Storage

• Normal site user/group permissions apply
  – TeraGrid users typically have individual accounts connected with their project team via the usual uid/gid groups
  – Therefore normal containment/forensic tools work inside the system
• GridFTP transfer from one resource to another (a transfer sketch follows at the end of this slide)
  – Dedicated GridFTP mover nodes for parallel systems
  – Dynamic GridFTP mover "fleet" direct from applications
  – Central TeraGrid listener to gather system aggregate data
    • Modification to the standard set to lift the "veil of privacy" within TeraGrid
    • System metrics and diagnostics
    • Forensics analysis database


• Shared NFS-like file system within a single site
  – GPFS, Lustre, NFS, PVFS, QFS, CXFS, …
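A minimal sketch of the GridFTP transfer path referenced above, assuming the standard globus-url-copy client and a valid grid proxy; the host names and paths are hypothetical:

```python
# Sketch of driving a GridFTP transfer between two TeraGrid sites with the
# standard globus-url-copy client.  Hostnames and paths are hypothetical;
# a valid grid proxy (e.g. from grid-proxy-init) is assumed to exist.
import subprocess

def gridftp_copy(src_url, dst_url, parallel_streams=4):
    """Run globus-url-copy with parallel streams and raise on failure."""
    cmd = [
        "globus-url-copy",
        "-p", str(parallel_streams),   # parallel TCP streams for WAN throughput
        src_url,
        dst_url,
    ]
    subprocess.run(cmd, check=True)

# Example: stage results from one resource provider to another (hypothetical hosts/paths).
gridftp_copy(
    "gsiftp://gridftp.ncsa.example.org/scratch/myuser/results.tar",
    "gsiftp://gridftp.sdsc.example.org/gpfs/myuser/results.tar",
)
```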

Page 14: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

“Global” File System

• TeraGrid has central GPFS-WAN server at SDSC mounted by several clusters across the grid.


• Pros
  – Common namespace
  – POSIX syntax for remote file access (illustrated in the sketch below)
  – Single identity space (X.509) across the WAN
  – High-speed parallel file systems available
• Cons
  – GPFS-WAN: IBM licensing and availability
  – Lustre-WAN: lack of a WAN security model
  – No group authZ construct support
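The practical effect of the POSIX point above is that an application reads data served from SDSC with ordinary file calls and needs no grid client in its code. A minimal sketch, assuming a hypothetical /gpfs-wan mount point and file name:

```python
# Because GPFS-WAN is mounted as an ordinary POSIX file system on the
# participating clusters, remote data is read with plain file I/O.
# The mount point and file name below are hypothetical.
import os

path = "/gpfs-wan/projects/tg-example/input/config.dat"

if os.path.exists(path):
    with open(path, "rb") as f:   # the same call works on every cluster that mounts GPFS-WAN
        header = f.read(4096)
    print(f"read {len(header)} bytes from the wide-area file system")
```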

Page 15: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Archived Storage

• TeraGrid is just now beginning to deal with archival storage as an allocated resource.


• Issues
  – Retention policy/guarantee
  – Media migration
  – Privacy/security/availability of abandoned files
  – Economic model (NCAR has a "Euro" approach with a common currency)

Page 16: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Using an SRB Data Grid - Details

Storage Resource Broker (SRB) servers with a Metadata Catalog database:

1. The user asks for data.
2. The data request goes to an SRB server.
3. The server looks up information in the catalog.
4. The catalog tells which SRB server has the data.
5. The first server asks the second for the data.
6. The data is found and returned.
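From the user's side this indirection is hidden behind the SRB client tools. A minimal sketch of retrieving a file with the SRB S-commands from Python; the collection path and file name are hypothetical, and an authenticated SRB session (Sinit) with the usual client configuration is assumed:

```python
# Sketch of retrieving a file from an SRB data grid with the S-commands.
# The collection and file names are hypothetical; an authenticated SRB
# session (Sinit) and client configuration are assumed to be in place.
import subprocess

def srb_get(logical_path, local_path):
    """Fetch a logical SRB object into a local file.  The SRB servers and
    the metadata catalog resolve where the bytes physically live; the
    client only names the object in the logical namespace."""
    subprocess.run(["Sget", logical_path, local_path], check=True)

srb_get("/tgproject/home/alice.demo/results/run42.dat", "run42.dat")
```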


Page 17: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Lessons Learned

• The lesson from Stakkato was not (just) the scale of the attack, but rather the importance of being able to restore control.
  – In a connected world with agents, this means:
    • Virtual borders -- ALL > collaborators > pair-wise trusts
    • Centralized logging for forensics/IDS
      – USE THE SAME SYSTEM FOR DAILY OPERATIONS/METRICS!
    • We must be able to (perhaps painfully) outpace attackers in cleaning systems
• Ease of use and ubiquity are essential to adoption.
  – AFS's change from file permissions to directory permissions carried a huge adoption-barrier cost.


Page 20: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Lessons Learned

• Work is needed on distributed group authorization/management tooling
  – Group membership and roles are best maintained by the leaders of the group
  – Policy rules are best kept and enforced by the data store
• Security Triad:
  – Who you are
  – Where you can go
  – What you can do
• Some actions are so dangerous that they deserve to have the two-person rule enforced (e.g., archive tape erasure; see the sketch below)
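As a concrete illustration of the two-person rule, here is a minimal sketch (not any site's actual operational tooling) of gating a destructive action on approval by two distinct administrators:

```python
# Minimal sketch of a two-person rule: a destructive action runs only after
# two *different* administrators have approved it.  This is an illustration,
# not any site's actual operational tooling.
def enforce_two_person_rule(action_name, approvers, execute):
    """Run `execute()` only if at least two distinct approvers signed off."""
    distinct = set(approvers)
    if len(distinct) < 2:
        raise PermissionError(
            f"{action_name!r} requires approval from two different people, "
            f"got: {sorted(distinct)}"
        )
    return execute()

# Example: erasing an archive tape needs sign-off from two operators.
enforce_two_person_rule(
    "erase archive tape TG-00417",
    approvers=["operator-a", "operator-b"],
    execute=lambda: print("tape erased"),
)
```

The point of the check is purely procedural: no single compromised account or careless command can trigger an irreversible operation on its own.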

Page 21: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Lessons Learned

• Security is never "done"
  – The coordination team (and the team-building) that came out of the Stakkato incident was THE most valuable result.

Page 22: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Dane Skow ([email protected]) February 2007

Security in Distributed Data Management Systems

Storage Resource Broker

Reagan W. Moore, Wayne Schroeder, Mike Wan, Arcot Rajasekar
{moore, schroede, mwan, sekar}@sdsc.edu

http://www.sdsc.edu/srb
http://irods.sdsc.edu/

Page 23: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

Logical Name Spaces

• Logical user name
  – Unique identifier for each person accessing the system
    • {user-name, project-name}
  – User groups: aggregations of users
    • Membership in multiple groups
  – Data grids (zones)
    • {user-name, project-name, zone-name}
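A minimal sketch of what this logical identifier amounts to as a data structure; the class and field names are illustrative, not SRB's actual schema:

```python
# Illustrative data structure for the logical user namespace: a user is
# identified by {user-name, project-name, zone-name} and may belong to
# several groups.  Names are illustrative, not SRB's actual schema.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class LogicalUser:
    user_name: str
    project_name: str
    zone_name: str          # the data grid ("zone") the identity lives in
    groups: frozenset = field(default_factory=frozenset)

alice = LogicalUser("alice", "tg-astro", "sdsc-zone",
                    groups=frozenset({"curators", "tg-astro-writers"}))
print(f"{alice.user_name}@{alice.project_name}.{alice.zone_name}")
```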

Page 24: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

Authorization - SRB

• Assign access controls on each name space
  – Files
  – Metadata
  – Storage
• Assign roles that represent sets of allowed operations
  – Roles: administrator, curator, read, write, annotate
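An illustrative sketch of the role idea: each role maps to the set of operations it permits, and an access check reduces to set membership. The role names come from the slide; the check itself is a toy, not SRB's implementation:

```python
# Toy illustration of role-based authorization: each role names a set of
# allowed operations, and an access check is set membership.
# This is not SRB's actual implementation.
ROLE_OPERATIONS = {
    "administrator": {"read", "write", "annotate", "delete", "grant"},
    "curator":       {"read", "write", "annotate"},
    "write":         {"read", "write"},
    "annotate":      {"read", "annotate"},
    "read":          {"read"},
}

def is_allowed(role, operation):
    """True if the given role permits the requested operation."""
    return operation in ROLE_OPERATIONS.get(role, set())

assert is_allowed("curator", "annotate")
assert not is_allowed("read", "write")
```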

Page 25: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

Rule-based Data Management

iRODS (integrated Rule-Oriented Data System)
• Map from management policies to rules controlling the execution of remote micro-services
• Manage persistent state information for the results of micro-service execution
• Support an additional three logical name spaces:
  – Rules
  – Micro-services
  – Persistent state information
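A conceptual sketch of the policy-to-rule-to-micro-service mapping, written in Python rather than the iRODS rule language; the event name, micro-service, and state store are invented for illustration:

```python
# Conceptual sketch of rule-based data management: a policy is expressed as a
# rule bound to an event; firing the rule executes micro-services and records
# persistent state about the outcome.  Written in Python for illustration;
# this is not the iRODS rule language, and the names are invented.
persistent_state = []   # stands in for the catalog of execution results

def replicate_to_archive(obj_path):
    """Illustrative micro-service: pretend to replicate an object."""
    return {"object": obj_path, "action": "replicate", "status": "ok"}

# Policy: "every newly ingested object must be replicated to the archive",
# mapped to a rule bound to a 'post-put' event.
RULES = {"post-put": [replicate_to_archive]}

def fire_event(event, obj_path):
    for micro_service in RULES.get(event, []):
        persistent_state.append(micro_service(obj_path))   # keep the audit trail

fire_event("post-put", "/zone/home/alice/run42.dat")
print(persistent_state)
```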

Page 26: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

Controlling Remote Operations

[Diagram of the iRODS (integrated Rule-Oriented Data System) architecture. Data management environment: management policies, management functions, assessment criteria, capabilities, conserved properties, control mechanisms, remote operations. Data management infrastructure: rules, micro-services, persistent state. Physical infrastructure: database, rule engine, storage system.]

Page 27: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

Rule-based Access

• Associate security policies with each digital entity
  – Redaction; access controls on structures within a file
  – Time-dependent access controls (how long to hold data proprietary)
• Associate access controls with each rule
  – Restrict the ability to modify or apply rules
• Associate access controls with each micro-service
  – Explicit control of operation execution within a given collection
  – Much finer control than provided by Unix read/write/execute permissions
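A toy illustration of the time-dependent access control item above: an embargo date is stored with the object as policy metadata and checked on every read. The structure and dates are invented, not iRODS syntax:

```python
# Toy illustration of a time-dependent access control: a digital entity
# carries an embargo date as policy metadata, and reads by non-owners are
# refused until the embargo expires.  Invented example, not iRODS syntax.
from datetime import date

ENTITY_POLICY = {
    "/zone/proj/survey2006.dat": {"owner": "alice", "embargo_until": date(2008, 1, 1)},
}

def can_read(user, path, today):
    policy = ENTITY_POLICY[path]
    if user == policy["owner"]:
        return True                             # owners always see their own data
    return today >= policy["embargo_until"]     # others wait out the embargo

assert can_read("alice", "/zone/proj/survey2006.dat", date(2007, 2, 1))
assert not can_read("bob", "/zone/proj/survey2006.dat", date(2007, 2, 1))
```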

Page 28: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

For More Information

Reagan W. Moore

San Diego Supercomputer Center

[email protected]

http://www.sdsc.edu/srb/

http://irods.sdsc.edu/

Page 29: TeraGrid National Cyberinfrastructure for Terascale Science Dane Skow Deputy Director, TeraGrid

Charlie Catlett ([email protected]) January 2007

Call for Participation

Papers, tutorials, posters, BOFs, and demonstrations are being accepted through February in three tracks: Science; Technology; and Education, Outreach and Training.

Submissions are being accepted through April for three competitions for high school, undergraduate and graduate students:

•Impact of Cyberinfrastructure

•Research posters

•On-site advancing scientific discovery