
Page 1:

The National Grid Cyberinfrastructure: Open Science Grid and TeraGrid

John-Paul “JP” Navarro, TeraGrid Area Co-Director for Software Integration

Mike Wilde, Open Science Grid Education Coordinator

University of Chicago / Argonne National Laboratory, March 25, 2007

Page 2:

Grid Resources in the US

The OSG

• Origins: the national grid projects (iVDGL, GriPhyN, PPDG) and the LHC Software & Computing projects
• Current compute resources:
– 61 Open Science Grid sites
– Connected via Internet2 and NLR, at speeds from 622 Mbps to 10 Gbps
– Compute and storage elements
– All are Linux clusters
– Most are shared with campus grids and local non-grid users
– More than 10,000 CPUs, with a lot of opportunistic usage, so total computing capacity is difficult to estimate; the same holds for storage

The TeraGrid

• Origins: the national supercomputing centers, funded by the National Science Foundation
• Current compute resources:
– 9 TeraGrid sites
– Connected via dedicated multi-Gbps links
– Mix of architectures: ia64 and ia32 Linux, Cray XT3, Alpha (Tru64), SGI SMPs
– Resources are dedicated, but grid users share with local users
– 1000s of CPUs, >40 teraflops
– 100s of terabytes of storage

Page 3:

What is the TeraGrid?

Technology + Support = Science

Page 4:

NSF Funded Research

• An NSF-funded program offering high-end compute, data, and visualization resources to the nation’s academic researchers

• Proposal-based: researchers can use resources at no cost

• Serves a variety of disciplines

Page 5:

TeraGrid PIs by Institution (as of May 2006)

Map legend: Blue: 10 or more PIs; Red: 5-9 PIs; Yellow: 2-4 PIs; Green: 1 PI

Page 6:

TeraGrid Hardware Components

• High-end compute hardware:
– Intel/Linux clusters
– Alpha SMP clusters
– IBM POWER3 and POWER4 clusters
– SGI Altix SMPs
– Sun visualization systems
– Cray XT3
– IBM Blue Gene/L
• Large-scale storage systems: hundreds of terabytes of secondary storage
• Visualization hardware
• Very high-speed network backbone (40 Gb/s): bandwidth for rich interaction and tight coupling

Page 7:

TeraGrid resources at the eight sites (ANL/UC, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC):

• Computational resources:
– ANL/UC: Itanium 2 (0.5 TF), IA-32 (0.5 TF)
– IU: Itanium2 (0.2 TF), IA-32 (2.0 TF)
– NCSA: Itanium2 (10.7 TF), SGI SMP (7.0 TF), Dell Xeon (17.2 TF), IBM p690 (2 TF), Condor Flock (1.1 TF)
– ORNL: IA-32 (0.3 TF)
– PSC: XT3 (10 TF), TCS (6 TF), Marvel SMP (0.3 TF)
– Purdue: heterogeneous (1.7 TF), IA-32 (11 TF, opportunistic)
– SDSC: Itanium2 (4.4 TF), Power4+ (15.6 TF), Blue Gene (5.7 TF)
– TACC: IA-32 (6.3 TF)
• Online storage: ANL/UC 20 TB, IU 32 TB, NCSA 1140 TB, ORNL 1 TB, PSC 300 TB, Purdue 26 TB, SDSC 1400 TB, TACC 50 TB
• Mass storage: 1.2 PB, 5 PB, 2.4 PB, 1.3 PB, 6 PB, and 2 PB at six of the sites
• Network (Gb/s, hub): ANL/UC 30 (CHI), IU 10 (CHI), NCSA 30 (CHI), ORNL 10 (ATL), PSC 30 (CHI), Purdue 10 (CHI), SDSC 10 (LA), TACC 10 (CHI)
• Data collections (number of collections, approximate total size, access methods): 5 collections, >3.7 TB, URL/DB/GridFTP; >30 collections, URL/SRB/DB/GridFTP; 4 collections, 7 TB, SRB/Portal/OPeNDAP; >70 collections, >1 PB, GFS/SRB/DB/GridFTP; 4 collections, 2.35 TB, SRB/Web Services/URL
• Instruments: proteomics X-ray crystallography; SNS and HFIR facilities
• Visualization resources (RI: remote interactive, RB: remote batch, RC: RI/collaborative): RI/RC/RB IA-32 with 96 GeForce 6600GT; RB SGI Prism with 32 graphics pipes plus IA-32; RI/RB IA-32 + Quadro4 980 XGL; RB IA-32, 48 nodes; RB; RI/RC/RB UltraSPARC IV, 512 GB SMP, 16 graphics cards

TeraGrid totals: 100+ TF across 8 distinct architectures, 3 PB of online disk, >100 data collections.

Page 8:

Coordinated TeraGrid Software & Services 4

• CTSS 4 Core Integration Capability:
– Authorization/accounting/security
– Policy
– Software deployment
– Information services
• Capability kits (a data-movement sketch follows this list):
– Remote Compute Capability Kit
– Data Movement and Management Capability Kit
– Remote Login Capability Kit
– Local Parallel Programming Capability Kit
– Grid Parallel Programming Capability Kit
– <more capability kits>
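As a concrete illustration of the Data Movement and Management kit, here is a minimal sketch of moving a file between two CTSS sites with the GridFTP client globus-url-copy. The host names and paths are hypothetical, and it assumes you hold a grid certificate accepted at both sites.

    grid-proxy-init                       # create a temporary proxy credential from your certificate
    globus-url-copy -vb -p 4 \
        gsiftp://gridftp.siteA.example.org/scratch/jdoe/input.dat \
        gsiftp://gridftp.siteB.example.org/scratch/jdoe/input.dat
    # -vb reports transfer progress; -p 4 opens four parallel TCP streams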

Page 9:

Science Gateways: A new initiative for the TeraGrid

• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous in resources, users (from experts to K-12), and software stacks and policies
• Science Gateways:
– Provide “TeraGrid Inside” capabilities
– Leverage community investment
• Three common forms:
– Web-based portals
– Application programs running on users’ machines but accessing services in TeraGrid
– Coordinated access points enabling users to move seamlessly between TeraGrid and other grids

Workflow Composer

Page 10:

Gateways are growing in numbers

• 10 initial projects as part of the TG proposal; >20 gateway projects today
• No limit on how many gateways can use TG resources
– Prepare services and documentation so developers can work independently
• Current gateway projects:
– Open Science Grid (OSG)
– Special PRiority and Urgent Computing Environment (SPRUCE)
– National Virtual Observatory (NVO)
– Linked Environments for Atmospheric Discovery (LEAD)
– Computational Chemistry Grid (GridChem)
– Computational Science and Engineering Online (CSE-Online)
– GEON (GEOsciences Network)
– Network for Earthquake Engineering Simulation (NEES)
– SCEC Earthworks Project
– Network for Computational Nanotechnology and nanoHUB
– GIScience Gateway (GISolve)
– Biology and Biomedicine Science Gateway
– Open Life Sciences Gateway
– The Telescience Project
– Grid Analysis Environment (GAE)
– Neutron Science Instrument Gateway
– TeraGrid Visualization Gateway, ANL
– BIRN
– Gridblast Bioinformatics Gateway
– Earth Systems Grid
– Astrophysical Data Repository (Cornell)
• Many others interested, including SID Grid and HASTAC

Page 11:

The TeraGrid Facility

• Grid Infrastructure Group (GIG):
– University of Chicago
– TeraGrid integration, planning, management, coordination
– Organized into areas: User Services; Operations; Gateways; Data/Visualization/Scheduling; Education, Outreach & Training; Software Integration
• Resource Providers (RPs):
– Currently NCSA, SDSC, PSC, Indiana, Purdue, ORNL, TACC, UC/ANL
– Systems (resources, services) support and user support
– Provide access to resources via policies, software, and mechanisms coordinated by and provided through the GIG

Page 12:

TeraGrid Facility Today

Heterogeneous resources at autonomous Resource Provider sites

Local value-added user environment

Common TeraGrid computing environment:
• A single point of contact for help
• Integrated documentation and training
• A common allocation process
• Coordinated software and services
• A common baseline user environment

Page 13:

Useful links

• TeraGrid website– http://www.teragrid.org

• Policies/procedures posted at:– http://www.paci.org/Allocations.html

• TeraGrid user information overview– http://www.teragrid.org/userinfo/index.html

• Summary of TG Resources– http://www.teragrid.org/userinfo/guide_hardware_table.html

• Summary of machines with links to site-specific user guides (just click on the name of each site)– http://www.teragrid.org/userinfo/guide_hardware_specs.html

• Email: [email protected]

Page 14:

Open Science Grid Overview

The OSG is supported by the National Science Foundation and the U.S. Department of Energy’s Office of Science.

Page 15:

The Open Science Grid Consortium brings:

• the grid service providers (middleware developers; cluster, network, and storage administrators; local-grid communities), and
• the grid consumers (from global collaborations to the single researcher, through campus communities to under-served science domains)

into a cooperative to share and sustain a common heterogeneous distributed facility in the US and beyond. Grid providers serve multiple communities; grid consumers use multiple grids.

Page 16:

OSG Snapshot

• 96 resources across production and integration infrastructures
• 20 Virtual Organizations (+6 for operations); includes 25% non-physics
• ~20,000 CPUs (sites range from 30 to 4,000 CPUs)
• ~6 PB of tape, ~4 PB of shared disk
• Snapshot of jobs on OSG, sustained through OSG submissions:
– 3,000-4,000 simultaneous jobs
– ~10K jobs/day
– ~50K CPU-hours/day
– Peak test loads of 15K jobs a day
• Using production and research networks

Page 17:

[Diagram: The Open Science Grid. User communities / VOs (HEP physics: CMS VO; astronomy: SDSS VO; astrophysics: LIGO VO; nanotech and biology: nanoHub) connect through VO support centers and OSG Operations to the OSG resource providers (Tier-2 sites, BNL and FNAL clusters, the UW campus grid, departmental clusters) via RP support centers.]

Virtual Organization (VO): an organization composed of institutions, collaborations, and individuals that share a common interest, applications, or resources. VOs can be both consumers and providers of grid resources.

Page 18:

Open Science Grid

Page 19:

The OSG ENVironment

• Provides access to grid middleware ($GRID):
– on the gatekeeper node, via shared space
– on the worker node, on local disk via wn-client.pacman
• OSG “tactical” or local storage directories (used in the sketch after this list):
– $APP: global; where you install applications
– $DATA: global; staging area where jobs write output
– SITE_READ/SITE_WRITE: global, but on a Storage Element at the site
– $WN_TMP: local to the worker node, available to the job
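As a rough sketch of how a job uses these directories on a worker node: the VO name (myvo), application (my_analysis), and file names below are hypothetical, and $APP, $DATA, and $WN_TMP are assumed to be set by the site as described above.

    #!/bin/sh
    # minimal OSG job wrapper sketch
    set -e
    WORKDIR="$WN_TMP/myjob.$$"                        # per-job scratch, local to the worker node
    mkdir -p "$WORKDIR" && cd "$WORKDIR"
    cp "$DATA/myvo/input/params.dat" .                # stage input from the globally visible data area
    "$APP/myvo/my_analysis/bin/my_analysis" params.dat > result.out   # application pre-installed under $APP
    cp result.out "$DATA/myvo/output/result.$$.out"   # write output back to $DATA for later retrieval
    cd / && rm -rf "$WORKDIR"                         # clean local scratch before the batch slot is reused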

Page 20:

OSG Middleware

A layered stack, from applications down to infrastructure:

• User science codes and interfaces
• VO middleware:
– HEP: data and workflow management, etc.
– Biology: portals, databases, etc.
– Astrophysics: data replication, etc.
• OSG Release Cache: OSG-specific configurations, utilities, etc.
• Virtual Data Toolkit (VDT): core technologies plus software needed by stakeholders; many components shared with EGEE
• Core grid technology distributions (Condor, Globus, MyProxy): shared with TeraGrid and others
• Existing operating systems, batch systems, and utilities

Page 21:

The OSG Software Cache

• Most software comes from the Virtual Data Toolkit (VDT)
• OSG components include:
– VDT configuration scripts
– Some OSG-specific packages
• Pacman is the OSG meta-packager
– This is how the entire cache is delivered to Resource Providers

Page 22:

What is the VDT?

• A collection of software:
– Grid software: Condor, Globus, and lots more
– Virtual Data System: the origin of the name “VDT”
– Utilities: monitoring, authorization, configuration
– Built for >10 flavors/versions of Linux
• Automated build and test: integration and regression testing

[Chart: number of major VDT components, January 2002 through September 2006, growing from a handful to roughly 40-45. Release milestones along the timeline: VDT 1.0 (Globus 2.0b, Condor-G 6.3.1); VDT 1.1.3, 1.1.4 & 1.1.5 (pre-SC 2002); VDT 1.1.8 (adopted by LCG); VDT 1.1.11 (Grid2003); VDT 1.2.0; VDT 1.3.0; VDT 1.3.6 (for OSG 0.2); VDT 1.3.9 (for OSG 0.4); VDT 1.3.11 (current release, moving to OSG 0.6.0).]

• An easy installation:
– Push a button, everything just works
– Quick update processes
• Responsive to user needs:
– A process to add new components based on community needs
• A support infrastructure:
– Front-line software support
– Triaging between users and software providers for deeper issues

Page 23:

What is in the VDT? (A lot!)

• Condor Group: Condor/Condor-G, DAGMan, Fault Tolerant Shell, ClassAds, NeST
• Globus (pre-WS & GT4 WS): job submission (GRAM), information service (MDS), data transfer (GridFTP), replica location (RLS)
• EDG & LCG: Make Gridmap, certificate revocation list updater, Glue & generic information provider, VOMS
• ISI & UC: Chimera & Pegasus
• NCSA: MyProxy, GSI OpenSSH, UberFTP
• LBL: PyGlobus, NetLogger, DRM
• Caltech: MonALISA, jClarens (WSR)
• VDT: VDT System Profiler, configuration software
• US LHC: GUMS, PRIMA
• Others: KX509 (U. Mich.), Java SDK (Sun), Apache HTTP/Tomcat, MySQL, optional packages, Globus-Core {build}, Globus job-manager(s)

These components cover the core software, user interface, computing element, storage element, authorization (authz) system, and monitoring system.

Page 24:

Pacman

• Pacman is:
– a software environment installer (or meta-packager)
– a language for defining software environments
– an interpreter that allows creation, installation, configuration, update, verification, and repair of installation environments
– it takes care of dependencies
• Pacman makes installation of all types of software easy

Native packaging systems Pacman can pull from: LCG/Scram, ATLAS/CMT, Globus/GPT, NorduGrid/RPM, LIGO/tar/make, D0/UPS-UPD, CMS DPE/tar/make, NPACI/TeraGrid/tar/make, open source/tar/make, commercial/tar/make.

% pacman -get OSG:CE

• Enables us to easily and coherently combine and manage software from arbitrary sources (caches such as ATLAS, NPACI, D-Zero, iVDGL, UCHEP, VDT, CMS/DPE, LIGO).
• Enables remote experts to define installation and configuration updates for everyone at once.

Page 25:

Pacman Installation

1. Download Pacman
– http://physics.bu.edu/~youssef/pacman/
2. Install the “package”
– cd <install-directory>
– pacman -get OSG:OSG_CE_0.2.1
– ls

condor/ globus/ post-install/ setup.sh
edg/ gpt/ replica/ vdt/
ftsh/ perl/ setup.csh vdt-install.log
monalisa/ ...
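Once the install finishes, the generated setup scripts configure your shell for the VDT tools. A minimal post-install check might look like the following sketch; it assumes setup.sh defines VDT_LOCATION, as VDT setup scripts normally do, and uses the standard Globus clients shipped in the VDT.

    cd <install-directory>
    source ./setup.sh            # put the VDT/OSG tools into this shell's environment
    echo $VDT_LOCATION           # should point at the install directory
    which globus-url-copy        # confirm the Globus clients are on the PATH
    grid-proxy-init              # create a proxy from your grid certificate before using them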

Page 26:

Grid Operations Center

• Based at Indiana University; provides a central repository of staff and monitoring systems for:
– Real-time grid monitoring
– Problem tracking via a trouble-ticket system
– Support for developers and sysadmins
– Maintaining infrastructure: VORS, MonALISA, and the registration DB
– Maintaining the OSG software repositories

(Source: “Quick Start Guide to the OSG,” OSG Consortium Meeting, March 2007)

Page 27:

Applications can cross infrastructures, e.g., OSG and TeraGrid

Page 28:

Genome Analysis and Database Update system

• Runs across TeraGrid and OSG; uses the Virtual Data System (VDS) for workflow and provenance.

• Passes through public DNA and protein databases for new and newly updated genomes of different organisms and runs BLAST, Blocks, and Chisel; 1,200 users of the resulting database.

• Request: 1,000 CPUs for 1-2 weeks, once a month, every month. On OSG at the moment: >600 CPUs and 17,000 jobs a week (an illustrative submission sketch follows).
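The system's workflows are actually expressed in VDS, but as a rough illustration of how a batch of analysis jobs reaches an OSG compute element, here is a minimal Condor-G sketch (not the project's real code). The gatekeeper host, executable, and file names are hypothetical.

    # analyze.sub -- Condor-G submit description for 100 analysis jobs
    universe                = grid
    grid_resource           = gt2 ce.example.edu/jobmanager-condor
    executable              = analyze.sh
    arguments               = chunk.$(Process).fasta
    transfer_input_files    = chunk.$(Process).fasta
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    output                  = analyze.$(Process).out
    error                   = analyze.$(Process).err
    log                     = analyze.log
    queue 100

    # then, from the submit host:
    condor_submit analyze.sub     # Condor-G forwards the jobs to the site through GRAM
    condor_q                      # monitor their state locally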

Page 29:

Summary of OSG today

• Providing core services, software and a distributed facility for an increasing set of research communities.

• Helping Virtual Organizations access resources on many different infrastructures.

• Reaching out to others to collaborate and contribute our experience and efforts.

Page 30:

It’s the people… that make the grid a community!