RIT Colloquium (May 23, 2007)
Paul Avery, University of Florida, avery@phys.ufl.edu
Physics Colloquium, RIT (Rochester, NY), May 23, 2007

Open Science Grid: Linking Universities and Laboratories in National Cyberinfrastructure
www.opensciencegrid.org


TRANSCRIPT

Page 1: Open Science Grid: Linking Universities and Laboratories in National Cyberinfrastructure (title slide)
www.opensciencegrid.org

Page 2: Cyberinfrastructure and Grids

Grid: geographically distributed computing resources configured for coordinated use
Fabric: physical resources & networks providing raw capability
Ownership: resources controlled by owners and shared with others
Middleware: software tying it all together: tools, services, etc.

Enhancing collaboration via transparent resource sharing
Example: the US-CMS "Virtual Organization"

Page 3: Motivation: Data Intensive Science

21st century scientific discovery:
Computationally & data intensive
Theory + experiment + simulation
Internationally distributed resources and collaborations

Dominant factor: data growth (1 petabyte = 1000 terabytes)
2000: ~0.5 petabyte
2007: ~10 petabytes
2013: ~100 petabytes
2020: ~1000 petabytes

Powerful cyberinfrastructure needed:
Computation: massive, distributed CPU
Data storage & access: large-scale, distributed storage
Data movement: international optical networks
Data sharing: global collaborations (100s – 1000s)
Software: managing all of the above

How to collect, manage, access and interpret this quantity of data?
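As a sketch of the growth rate implied by the slide's numbers (~0.5 PB in 2000 to ~1000 PB in 2020; the calculation itself is mine, not from the slide), a few lines of Python show this corresponds to data volume doubling roughly every two years:

```python
import math

# Data-volume milestones quoted on the slide (petabytes).
volumes = {2000: 0.5, 2007: 10, 2013: 100, 2020: 1000}

# Overall growth factor and implied doubling time, 2000 -> 2020.
factor = volumes[2020] / volumes[2000]   # a 2000x increase over 20 years
years = 2020 - 2000
doubling_time = years * math.log(2) / math.log(factor)

print(f"growth factor: {factor:.0f}x over {years} years")
print(f"implied doubling time: {doubling_time:.2f} years")  # ~1.8 years
```

A doubling time of under two years is roughly Moore's-law pace, which is why raw hardware growth alone could not absorb it and distributed infrastructure was needed.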

Page 4: Open Science Grid: July 20, 2005

Consortium of many organizations (multiple disciplines)
Production grid cyberinfrastructure
80+ sites, 25,000+ CPUs: US, UK, Brazil, Taiwan

Page 5: The Open Science Grid Consortium

(Diagram: Open Science Grid at the center, linking:)
U.S. grid projects
LHC experiments
Laboratory centers
Education communities
Science projects & communities
Technologists (network, HPC, ...)
Computer science
University facilities
Multi-disciplinary facilities
Regional and campus grids

Page 6: Open Science Grid Basics

Who: computer scientists, IT specialists, physicists, biologists, etc.

What:
Shared computing and storage resources
High-speed production and research networks
Meeting place for research groups, software experts, IT providers

Vision:
Maintain and operate a premier distributed computing facility
Provide education and training opportunities in its use
Expand reach & capacity to meet needs of stakeholders
Dynamically integrate new resources and applications

Members and partners:
Members: HPC facilities; campus, laboratory & regional grids
Partners: interoperation with TeraGrid, EGEE, NorduGrid, etc.

Page 7: Crucial Ingredients in Building OSG

Science "push": ATLAS, CMS, LIGO, SDSS
1999: foresaw overwhelming need for distributed cyberinfrastructure

Early funding: "Trillium" consortium
PPDG: $12M (DOE) (1999 – 2006)
GriPhyN: $12M (NSF) (2000 – 2006)
iVDGL: $14M (NSF) (2001 – 2007)
Supplements + new funded projects

Social networks: ~150 people with many overlaps
Universities, labs, SDSC, foreign partners

Coordination: pooling resources, developing broad goals
Common middleware: Virtual Data Toolkit (VDT)
Multiple Grid deployments/testbeds using VDT
Unified entity when collaborating internationally
Historically, a strong driver for funding agency collaboration

Page 8: OSG History in Context

(Timeline, 1999 – 2009:)
PPDG (DOE), GriPhyN (NSF), iVDGL (NSF) -> Trillium -> Grid3 -> OSG (DOE+NSF)
European Grid + Worldwide LHC Computing Grid
Campus, regional grids
LHC construction & preparation -> LHC operations
LIGO preparation -> LIGO operation

Page 9: Principal Science Drivers

High energy and nuclear physics:
100s of petabytes (LHC), 2007
Several petabytes, 2005

LIGO (gravity wave search):
0.5 – several petabytes, 2002

Digital astronomy:
10s of petabytes, 2009
10s of terabytes, 2001

Other sciences coming forward:
Bioinformatics (10s of petabytes), nanoscience, environmental science, chemistry, applied mathematics, materials science?

(Chart: data growth and community growth, 2001 – 2009)

Page 10: OSG Virtual Organizations

ATLAS     HEP/LHC            HEP experiment at CERN
CDF       HEP                HEP experiment at Fermilab
CMS       HEP/LHC            HEP experiment at CERN
DES       Digital astronomy  Dark Energy Survey
DOSAR     Regional grid      Regional grid in Southwest US
DZero     HEP                HEP experiment at Fermilab
ENGAGE    Engagement effort  A place for new communities
Fermilab  Lab grid           HEP laboratory grid
fMRI      fMRI               Functional MRI
GADU      Bio                Bioinformatics effort at Argonne
Geant4    Software           Simulation project
GLOW      Campus grid        Campus grid, U of Wisconsin, Madison
GRASE     Regional grid      Regional grid in Upstate NY

Page 11: OSG Virtual Organizations (2)

GridChem  Chemistry             Quantum chemistry grid
GPN       Great Plains Network  www.greatplains.net
GROW      Campus grid           Campus grid at U of Iowa
I2U2      EOT                   E/O consortium
LIGO      Gravity waves         Gravitational wave experiment
Mariachi  Cosmic rays           Ultra-high energy cosmic rays
nanoHUB   Nanotech              Nanotechnology grid at Purdue
NWICG     Regional grid         Northwest Indiana regional grid
NYSGRID   NY State Grid         www.nysgrid.org
OSGEDU    EOT                   OSG education/outreach
SBGRID    Structural biology    Structural biology at Harvard
SDSS      Digital astronomy     Sloan Digital Sky Survey
STAR      Nuclear physics       Nuclear physics experiment at Brookhaven
UFGrid    Campus grid           Campus grid at U of Florida

Page 12: Partners: Federating with OSG

Campus and regional:
Grid Laboratory of Wisconsin (GLOW)
Grid Operations Center at Indiana University (GOC)
Grid Research and Education Group at Iowa (GROW)
Northwest Indiana Computational Grid (NWICG)
New York State Grid (NYSGrid) (in progress)
Texas Internet Grid for Research and Education (TIGRE)
nanoHUB (Purdue)
LONI (Louisiana)

National:
Data Intensive Science University Network (DISUN)
TeraGrid

International:
Worldwide LHC Computing Grid Collaboration (WLCG)
Enabling Grids for E-SciencE (EGEE)
TWGrid (from Academia Sinica Grid Computing)
Nordic Data Grid Facility (NorduGrid)
Australian Partnerships for Advanced Computing (APAC)

Page 13: Defining the Scale of OSG: Experiments at the Large Hadron Collider

LHC @ CERN: 27 km tunnel in Switzerland & France
Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM

Physics goals (2007 – ?):
Search for origin of mass
New fundamental forces
Supersymmetry
Other new particles

Page 14: CMS: "Compact" Muon Solenoid

(Photo: the detector, with inconsequential humans for scale)

Page 15: Collision Complexity: CPU + Storage

All charged tracks with pt > 2 GeV
Reconstructed tracks with pt > 25 GeV
(+30 minimum bias events)
10^9 collisions/sec; selectivity: 1 in 10^13
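A sketch of the arithmetic behind that selectivity (using the slide's round numbers; the yearly live time is my assumption, a common HEP rule of thumb):

```python
# Collision rate and trigger selectivity quoted on the slide.
collision_rate = 1e9   # collisions per second
selectivity = 1e-13    # fraction of collisions that are "interesting"

kept_per_second = collision_rate * selectivity   # 1e-4 per second

# Assume ~3.15e7 s of running per calendar year (an assumption, not
# a number from the slide) to get an annual count of selected events.
seconds_per_year = 3.15e7
kept_per_year = kept_per_second * seconds_per_year

print(f"selected events: {kept_per_second:.0e}/s, ~{kept_per_year:.0f}/year")
```

One interesting event per few hours, buried in a billion collisions per second, is what drives both the trigger CPU budget and the storage requirements.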

Page 16: LHC Data and CPU Requirements

Experiments: ATLAS, CMS, LHCb

Storage:
Raw recording rate 0.2 – 1.5 GB/s
Large Monte Carlo data samples
100 PB by ~2013; 1000 PB later in the decade?

Processing:
PetaOps (> 300,000 3 GHz PCs)

Users:
100s of institutes, 1000s of researchers
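For scale, a hedged back-of-envelope (my calculation: the recording rates are from the slide, the ~1e7 s of live data-taking per year is an assumed rule of thumb):

```python
# Raw recording rates quoted on the slide, in GB/s.
rates_gb_s = (0.2, 1.5)

# Assumed live beam time per year (~1e7 s); not a number from the slide.
live_seconds = 1e7

for r in rates_gb_s:
    petabytes = r * live_seconds / 1e6   # GB -> PB (1 PB = 1e6 GB)
    print(f"{r} GB/s -> ~{petabytes:.0f} PB of raw data per year")
```

That gives roughly 2 – 15 PB of raw data per year per experiment before any derived or simulated data, consistent with the 100 PB by ~2013 figure once Monte Carlo and processed copies are included.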

Page 17: OSG and LHC Global Grid

(Diagram: CMS experiment data flow from the online system at CERN out through the tiers)
Tier 0: CERN Computer Center (fed by the online system at 200 – 1500 MB/s)
Tier 1: Fermilab, Korea, Russia, UK
Tier 2: Maryland, Iowa, UCSD, Caltech, U Florida
Tier 3: physics caches; Tier 4: PCs (e.g. FIU)
Inter-tier links: >10 Gb/s, 10 – 40 Gb/s, 2.5 – 10 Gb/s
OSG spans the US sites.

5000 physicists, 60 countries
10s of petabytes/yr by 2009; CERN / outside = 10 – 20%

Page 18: LHC Global Collaborations

(Maps: ATLAS and CMS member institutions)
2000 – 3000 physicists per experiment; USA is 20 – 31% of total

Page 19: LIGO: Search for Gravity Waves

LIGO Grid: 6 US sites + 3 EU sites (UK & Germany: Cardiff, Birmingham, AEI/Golm)
LHO, LLO: LIGO observatory sites
LSC: LIGO Scientific Collaboration

Page 20: Sloan Digital Sky Survey: Mapping the Sky

Page 21: Bioinformatics: GADU / GNARE

GNARE (Genome Analysis Research Environment), running on TeraGrid, OSG and DOE SG.

GADU using Grid:
Applications executed on the Grid as workflows; results stored in an integrated database.

GADU performs:
Acquisition: acquire genome data from a variety of publicly available databases (e.g. NCBI, PIR, KEGG, EMP, InterPro) and store it temporarily on the file system. Data flow is bidirectional.
Analysis: run publicly available and in-house tools on the Grid using the acquired data & data from the integrated database.
Storage: store the parsed data acquired from public databases and the parsed results of the tools and workflows used during analysis.

Integrated database includes:
Parsed sequence data and annotation data from public web sources
Results of the different analysis tools: BLAST, Blocks, TMHMM, ...

Applications (web interfaces) based on the integrated database:
PUMA2: evolutionary analysis of metabolism
Chisel: protein function analysis tool
TARGET: targets for structural analysis of proteins
PATHOS: pathogenic DB for bio-defense research
Phyloblocks: evolutionary analysis of protein families

Services to other groups:
SEED (data acquisition)
Shewanella Consortium (genome analysis)
Others...

Page 22: Bioinformatics (cont)

(Image: Shewanella oneidensis genome)

Page 23: Nanoscience Simulations

nanoHUB.org: courses, tutorials, online simulation, seminars, learning modules

Real users and real usage:
>10,100 users; 1881 simulation users; >53,000 simulations

Page 24: OSG Engagement Effort

Purpose: bring non-physics applications to OSG
Led by RENCI (UNC + NC State + Duke)

Specific targeted opportunities:
Develop the relationship
Direct assistance with technical details of connecting to OSG

Feedback and new requirements for OSG infrastructure (to facilitate inclusion of new communities):
More & better documentation
More automation

Page 25: OSG and the Virtual Data Toolkit

VDT: a collection of software
Grid software: Condor, Globus, VOMS, dCache, GUMS, Gratia, ...
Virtual Data System
Utilities

VDT: the basis for the OSG software stack
Goal is easy installation with automatic configuration
Now widely used in other projects
Has a growing support infrastructure

Page 26: Why Have the VDT?

Everyone could download the software from the providers. But the VDT:
Figures out dependencies between software
Works with providers for bug fixes
Automatically configures & packages software
Tests everything on 15 platforms (and growing):
Debian 3.1; Fedora Core 3; Fedora Core 4 (x86, x86-64); RedHat Enterprise Linux 3 AS (x86, x86-64, ia64); RedHat Enterprise Linux 4 AS (x86, x86-64); ROCKS Linux 3.3; Scientific Linux Fermi 3; Scientific Linux Fermi 4 (x86, x86-64, ia64); SUSE Linux 9 (ia64)

Page 27: VDT Growth Over 5 Years (1.6.1i now)

(Chart: number of major components per VDT release, Jan 2002 – Jan 2007, rising from a handful to ~45 across the VDT 1.1.x, 1.2.x, 1.3.x, 1.4.0, 1.5.x and 1.6.x series; software was both added and removed, plus more dev releases)

Milestones:
VDT 1.0: Globus 2.0b, Condor-G 6.3.1
VDT 1.1.8: adopted by LCG
VDT 1.1.11: Grid2003
VDT 1.3.6: for OSG 0.2
VDT 1.3.9: for OSG 0.4
VDT 1.6.1: for OSG 0.6.0

vdt.cs.wisc.edu

Page 28: Collaboration with Internet2

www.internet2.edu

Page 29: Collaboration with National Lambda Rail

Optical, multi-wavelength, community owned or leased "dark fiber" (10 GbE) networks for R&E
Spawning state-wide and regional networks (FLR, SURA, LONI, ...)
Bulletin: NLR – Internet2 merger announcement

www.nlr.net

Page 30: UltraLight: Integrating Advanced Networking in Applications

10 Gb/s+ network:
Caltech, UF, FIU, UM, MIT
SLAC, FNAL
Int'l partners
Level(3), Cisco, NLR

Funded by NSF
http://www.ultralight.org

Page 31: REDDnet: National Networked Storage

NSF-funded project (Vanderbilt)
8 initial sites (Brazil?)
Multiple disciplines:
Satellite imagery
HEP
Terascale Supernova Initiative
Structural biology
Bioinformatics

Storage: 500 TB disk, 200 TB tape

Page 32: OSG Jobs Snapshot: 6 Months

(Chart, Sep – Mar: 5000 simultaneous jobs from multiple VOs)

Page 33: OSG Jobs Per Site: 6 Months

(Chart, Sep – Mar: 5000 simultaneous jobs at multiple sites)

Page 34: Completed Jobs/Week on OSG

(Chart, Sep – Mar: up to ~400K jobs/week during the CMS "data challenge")

Page 35: # Jobs Per VO

(Chart from the new accounting system, Gratia)

Page 36: Massive 2007 Data Reprocessing by the D0 Experiment @ Fermilab

(Chart: events reprocessed via SAM on OSG and LCG: ~400M total, ~250M on OSG)

Page 37: CDF Discovery of Bs Oscillations

Bs <-> Bs-bar mixing: a produced Bs evolves as
|Bs(t)> = e^(-t/2τ) [ cos(Δms·t/2) |Bs> + i·sin(Δms·t/2) |Bs-bar> ]
Oscillation frequency: f = Δms/2π ≈ 2.8 THz
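The 2.8 THz figure on this slide can be checked against CDF's measured mixing frequency, Δms ≈ 17.77 ps⁻¹ (the published 2006 value; an external number, not stated on the slide):

```python
import math

# CDF's measured Bs-Bsbar mixing frequency: 17.77 inverse picoseconds,
# i.e. 17.77e12 rad/s (an external input, not from the slide).
delta_ms = 17.77e12

# Convert angular frequency to cycles per second.
f = delta_ms / (2 * math.pi)

print(f"Bs oscillation frequency: {f / 1e12:.2f} THz")  # ~2.83 THz
```

A particle-antiparticle identity flip nearly three trillion times per second is why resolving the oscillation required enormous statistics, and hence grid-scale computing.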

Page 38: Communications: International Science Grid This Week

SGTW -> iSGTW, from April 2005
Diverse audience; >1000 subscribers
www.isgtw.org

Page 39: OSG News: Monthly Newsletter

18 issues by Apr. 2007

www.opensciencegrid.org/osgnews

Page 40: Grid Summer Schools

Summer 2004, 2005, 2006: 1 week @ South Padre Island, Texas
Lectures plus hands-on exercises for ~40 students
Students of differing backgrounds (physics + CS), minorities

Reaching a wider audience:
Lectures, exercises, video, on the web
More tutorials, 3 – 4/year
Students, postdocs, scientists
Agency-specific tutorials

Page 41: Project Challenges

Technical constraints:
Commercial tools fall far short, require (too much) invention
Integration of advanced CI, e.g. networks

Financial constraints (see funding slide):
Fragmented & short-term funding injections (recent: $30M/5 years)
Fragmentation of individual efforts

Distributed coordination and management:
Tighter organization within member projects compared to OSG
Coordination of schedules & milestones
Many phone/video meetings, travel
Knowledge dispersed, few people have broad overview

Page 42: Funding & Milestones: 1999 – 2007

(Timeline, 2000 – 2007:)
Grid & networking projects: GriPhyN $12M; PPDG $9.5M; iVDGL $14M; UltraLight $2M; DISUN $10M; OSG $30M (NSF, DOE)
Large experiments: first US-LHC Grid testbeds; CHEPREO $4M; LHC start
Milestones: VDT 1.0; VDT 1.3; Grid3 start; OSG start; LIGO Grid
Education, outreach, training: Grid Summer Schools 2004, 2005, 2006; Digital Divide Workshops '04, '05, '06; Grid communications

Page 43: Challenges from Diversity and Growth

Management of an increasingly diverse enterprise:
Sci/eng projects, organizations, disciplines as distinct cultures
Accommodating new member communities (expectations?)

Interoperation with other grids:
TeraGrid
International partners (EGEE, NorduGrid, etc.)
Multiple campus and regional grids

Education, outreach and training:
Training for researchers, students
... but also project PIs, program officers

Operating a rapidly growing cyberinfrastructure:
25K -> 100K CPUs; 4 -> 10 PB disk
Management of and access to rapidly increasing data stores (slide)
Monitoring, accounting, achieving high utilization
Scalability of support model (slide)

Page 44: Rapid Cyberinfrastructure Growth: LHC

(Chart: projected LHC computing capacity in MSI2000 by year, 2007 – 2010, rising to ~350, broken down by CERN, Tier-1 and Tier-2 centers for ALICE, ATLAS, CMS and LHCb)
2008: ~140,000 PCs

Meeting LHC service challenges & milestones
Participating in worldwide simulation productions

Page 45: OSG Operations

Distributed model (scalability!): VOs, sites, providers
Rigorous problem tracking & routing
Security
Provisioning
Monitoring
Reporting

Partners with EGEE operations

Page 46: Five Year Project Timeline & Milestones

(Timeline, 2006 – 2011; project start -> end of Phase I -> end of Phase II)

LHC: simulations -> support 1000 users, 20 PB data archive -> contribute to Worldwide LHC Computing Grid -> LHC event data distribution and analysis
LIGO: SC5 -> LIGO data run -> contribute to LIGO workflow and data analysis -> Advanced LIGO; LIGO Data Grid dependent on OSG
STAR, CDF, D0, astrophysics: D0 reprocessing; D0 simulations; CDF simulation -> CDF simulation and analysis; STAR data distribution and jobs, 10K jobs per day
Additional science communities: +1 community per interval

Facility security: risk assessment, audits, incident response, management, operations, technical controls (plan v1, 1st audit & risk assessment, then annual audits & risk assessments)
Facility operations and metrics: increase robustness and scale; operational metrics defined and validated each year
Interoperate and federate with campus and regional grids

VDT and OSG software releases: major release every 6 months, minor updates as needed (VDT 1.4.0, 1.4.1, 1.4.2, ...; VDT incremental updates; OSG 0.6.0, 0.8.0, 1.0, 2.0, 3.0, ...)
Capabilities: dCache with role-based authorization; accounting; auditing; VDS with SRM; common s/w distribution with TeraGrid; EGEE using VDT 1.4.x; transparent data and job movement with TeraGrid; transparent data management with EGEE; federated monitoring and information services; integrated network management; data analysis (batch and interactive) workflow
Extended capabilities & increased scalability and performance for jobs and data to meet stakeholder needs: SRM/dCache extensions; "just in time" workload management; VO services infrastructure; improved workflow and resource selection; work with SciDAC-2 CEDS and security with Open Science

Page 47: Extra Slides

Page 48: VDT Release Process (Subway Map)

Day 0 -> Day N:
Gather requirements -> build software -> test -> validation test bed -> ITB release candidate -> integration test bed -> VDT release -> OSG release

From Alain Roy

Page 49: VDT Challenges

How should we smoothly update a production service?
In-place vs. on-the-side
Preserve old configuration while making big changes
Still takes hours to fully install and set up from scratch

How do we support more platforms?
A struggle to keep up with the onslaught of Linux distributions
AIX? Mac OS X? Solaris?

How can we accommodate native packaging formats?
RPM (Fedora Core 3, 4, 6; RHEL 3, 4)
Deb (Debian)
BCCD