The Open Science Grid
Miron Livny, OSG Facility Coordinator, University of Wisconsin-Madison

TRANSCRIPT

Page 1: The Open Science Grid

The Open Science Grid

Miron LivnyOSG Facility Coordinator

University of Wisconsin-Madison

Page 2: The Open Science Grid

2

Some history and background …

Page 3: The Open Science Grid

3

U.S. “Trillium” Grid Partnership: Trillium = PPDG + GriPhyN + iVDGL

Particle Physics Data Grid: $18M (DOE) (1999 – 2006)
GriPhyN: $12M (NSF) (2000 – 2005)
iVDGL: $14M (NSF) (2001 – 2006)

Basic composition (~150 people):
PPDG: 4 universities, 6 labs
GriPhyN: 12 universities, SDSC, 3 labs
iVDGL: 18 universities, SDSC, 4 labs, foreign partners
Experiments: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO

Complementarity of projects:
GriPhyN: CS research, Virtual Data Toolkit (VDT) development
PPDG: “end to end” Grid services, monitoring, analysis
iVDGL: Grid laboratory deployment using the VDT
Experiments provide frontier challenges
Unified entity when collaborating internationally

Page 4: The Open Science Grid

4

From Grid3 to OSG

Timeline, 11/03 (Grid3) through 7/06, marking the OSG releases 0.2.1, 0.4.0, 0.4.1, and 0.6.0.

Page 5: The Open Science Grid

5

What is OSG?

The Open Science Grid is a US national distributed computing facility that supports scientific computing via an open collaboration of science researchers, software developers and computing, storage  and network providers. The OSG Consortium is building and operating the OSG, bringing resources and researchers from universities and national laboratories together and cooperating with other national and international infrastructures to give scientists from a broad range of disciplines access to shared resources worldwide.

Page 6: The Open Science Grid

6

The OSG Project

Co-funded by DOE and NSF at an annual rate of ~$6M for 5 years starting FY-07

Currently the main stakeholders are from physics: the US LHC experiments, LIGO, the STAR experiment, the Tevatron Run II experiments, and astrophysics experiments

A mix of DOE-Lab and campus resources

Active “engagement” effort to add new domains and resource providers to the OSG consortium

Page 7: The Open Science Grid

7

OSG Consortium

Page 8: The Open Science Grid

OSG Project Execution

Executive Director: Ruth Pordes
Resources Managers: Paul Avery, Albert Lazzarini
Applications Coordinators: Torre Wenaus, Frank Würthwein
Facility Coordinator: Miron Livny
Education, Training, Outreach Coordinator: Mike Wilde
OSG PI: Miron Livny
External Projects
Engagement Coordinator: Alan Blatecky
Operations Coordinator: Leigh Grundhoefer
Software Coordinator: Alain Roy
Security Officer: Don Petravick
OSG Executive Board
Deputy Executive Directors: Rob Gardner, Doug Olson
√ Role includes provision of middleware

Page 9: The Open Science Grid

9

OSG Principles

Characteristics:
Provide guaranteed and opportunistic access to shared resources.
Operate a heterogeneous environment, both in the services available at any site and for any VO, with multiple implementations behind common interfaces.
Interface to campus and regional grids.
Federate with other national/international grids.
Support multiple software releases at any one time.

Drivers - delivery to the schedule, capacity and capability of LHC and LIGO:
Contributions to/from and collaboration with the US ATLAS, US CMS, and LIGO software and computing programs.
Support for and collaboration with other physics and non-physics communities.
Partnerships with other grids, especially EGEE and TeraGrid.
Evolution by deployment of externally developed new services and technologies.

Page 10: The Open Science Grid

10

Grid of Grids - from Local to Global

Community, Campus, National

Page 11: The Open Science Grid

11

Who are you?

A resource can be accessed by a user via the campus, community or national grid.

A user can access a resource with a campus, community or national grid identity.

Page 12: The Open Science Grid

12

OSG sites

Page 13: The Open Science Grid

13

running (and monitored) “OSG jobs” in 06/06.

Page 14: The Open Science Grid

14

Example GADU run in 04/06

Page 15: The Open Science Grid

15

CMS Experiment - an exemplar community grid

Figure: data & jobs moving locally, regionally & globally within the CMS grid, transparently across grid boundaries from campus to global; the map spans OSG and EGEE, with sites including CERN, Germany, Taiwan, the UK, Italy, France, and the US sites Florida, Caltech, Wisconsin, UCSD, Purdue, MIT, and UNL.

Page 16: The Open Science Grid

16

The CMS Grid of Grids

Job submission:

16,000 jobs per day submitted across EGEE & OSG via INFN Resource Broker (RB).

Data Transfer: peak I/O of 5 Gbps from FNAL to 32 EGEE and 7 OSG sites.

All 7 OSG sites have reached 5TB/day goal.

3 OSG sites (Caltech, Florida, UCSD) exceeded 10TB/day.
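For scale (an editorial back-of-the-envelope check, not a number from the slide): sustaining the 5 TB/day goal corresponds to an average rate of about 5 TB/day × 8 bits/byte ÷ 86,400 s/day ≈ 0.46 Gbps, so the ~5 Gbps figure above is a burst peak roughly ten times the sustained goal.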

Page 17: The Open Science Grid

17

CMS Xfer on OSG

All sites have exceeded 5 TB per day in June.

Page 18: The Open Science Grid

18

CMS Xfer FNAL to World: the US CMS center at FNAL transfers data to 39 sites worldwide in the CMS global transfer challenge; peak transfer rates of ~5 Gbps are reached.

Page 19: The Open Science Grid

19

EGEE–OSG inter-operability

Agree on a common Virtual Organization Management System (VOMS)

Active Joint Security groups: leading to common policies and procedures.

Condor-G interfaces to multiple remote job execution services (GRAM, Condor-C).

File transfers using GridFTP. SRM V1.1 for managed storage access; SRM V2.1 in test.

Publish the OSG BDII to a shared BDII so that Resource Brokers can route jobs across the two grids. Automate ticket routing between GOCs.
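To make the Condor-G interoperability point above concrete, here is a minimal, illustrative Python sketch of submitting a grid-universe job to a remote GRAM gatekeeper via Condor-G; the hostname, jobmanager, executable, and proxy path are made-up placeholders, and the submit-file keywords are standard Condor submit commands rather than anything specific to the OSG/EGEE setup described here.

```python
#!/usr/bin/env python
# Illustrative sketch only: route a job through Condor-G to a remote GRAM
# gatekeeper. Hostname, jobmanager, executable, and proxy path are placeholders.
import subprocess
import tempfile

SUBMIT_DESCRIPTION = """
universe                = grid
grid_resource           = gt2 osg-ce.example.edu/jobmanager-condor
executable              = analyze.sh
transfer_executable     = True
output                  = job.out
error                   = job.err
log                     = job.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
x509userproxy           = /tmp/x509up_u1234
queue
"""

def submit_grid_job():
    """Write the submit description to a file and hand it to condor_submit."""
    with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
        f.write(SUBMIT_DESCRIPTION)
        path = f.name
    # condor_submit hands the job to the local schedd; Condor-G then forwards
    # it to the remote gatekeeper via GRAM (a Condor-C route would use a
    # "condor" grid_resource instead of "gt2").
    subprocess.run(["condor_submit", path], check=True)

if __name__ == "__main__":
    submit_grid_job()
```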

Page 20: The Open Science Grid

20

OSG Middleware Layering

NSF Middleware Initiative (NMI): Condor, Globus, Myproxy

Virtual Data Toolkit (VDT) Common Services: NMI + VOMS, CEMon (common EGEE components), MonALISA, Clarens, AuthZ

OSG Release Cache: VDT + Configuration, Validation, VO management

Diagram: on top of this common infrastructure sit the application layers: CMS services & framework, ATLAS services & framework, the LIGO Data Grid, CDF/D0 SAMGrid & framework, and others.

Page 21: The Open Science Grid

21

OSG Middleware Pipeline

Domain science requirements.

OSG stakeholders and middleware developer (joint) projects (Condor, Globus, EGEE, etc.).

Integrate into a VDT release; test on a “VO-specific grid”.

Deploy on the OSG integration grid; test interoperability with EGEE and TeraGrid.

Provision in an OSG release and deploy to OSG production.

Page 22: The Open Science Grid

The Virtual Data Toolkit

Alain RoyOSG Software Coordinator

Condor TeamUniversity of Wisconsin-Madison

Page 23: The Open Science Grid

23

What is the VDT?

A collection of software: Grid software (Condor, Globus and lots more), the Virtual Data System (the origin of the name “VDT”), and utilities.

An easy installation. Goal: push a button and everything just works. Two methods:
Pacman: installs and configures it all.
RPM: installs some of the software, no configuration.

A support infrastructure.

Page 24: The Open Science Grid

24

How much software?

Page 25: The Open Science Grid

25

Who makes the VDT?

The VDT is a product of the Open Science Grid (OSG); the VDT is used on all OSG grid sites.

OSG is new, but the VDT has been around since 2002.

Originally, the VDT was a product of GriPhyN/iVDGL and was used on all Grid2003 sites.

Page 26: The Open Science Grid

26

Who makes the VDT?

Miron Livny

Alain Roy

Tim Cartwright

Andy Pavlo

1 Mastermind

+

3 FTEs

Page 27: The Open Science Grid

27

Who uses the VDT?

Open Science Grid

LIGO Data Grid

LCG: the LHC Computing Grid, from CERN

EGEE: Enabling Grids for E-sciencE

Page 28: The Open Science Grid

28

Why should you care?

The VDT gives insight into the technical challenges in building a large grid: What software do you need? How do you build it? How do you test it? How do you deploy it? How do you support it?

Page 29: The Open Science Grid

29

What software is in the VDT?

Security: VOMS (VO membership), GUMS (local authorization), mkgridmap (local authorization), MyProxy (proxy management), GSI SSH, CA CRL updater

Monitoring: MonALISA, gLite CEMon

Accounting: OSG Gratia

Job Management: Condor (including Condor-G & Condor-C), Globus GRAM

Data Management: GridFTP (data transfer), RLS (replica location), DRM (storage management), Globus RFT

Information Services: Globus MDS, GLUE schema & providers

Note: the type, quantity, and variety of software is more important to my talk today than the specific software I’m naming.

Page 30: The Open Science Grid

30

What software is in the VDT? (continued)

Client tools: Virtual Data System, SRM clients (V1 and V2), UberFTP (GridFTP client)

Developer tools: PyGlobus, PyGridWare

Testing: NMI Build & Test, VDT tests

Support: Apache Tomcat, MySQL (with MyODBC), non-standard Perl modules, Wget, Squid, Logrotate, configuration scripts

And more!

Page 31: The Open Science Grid

31

Building the VDT

We distribute binaries: expecting everyone to build from source is impractical, and it is essential to be able to build on many platforms and to replicate builds.

We build all binaries with the NMI Build and Test infrastructure.

Page 32: The Open Science Grid

32

Building the VDT

Diagram: sources (from CVS) and contributor builds are patched, then built, tested, and packaged on the NMI Build & Test Condor pool (70+ computers); the resulting binaries feed the VDT, which users obtain as RPM downloads or as binaries from the Pacman cache, and the packaged VDT is tested again.

Page 33: The Open Science Grid

33

Testing the VDT

Every night, we test: the full VDT install and subsets of the VDT; the current release (you might be surprised how often things break after release!) and the upcoming release; on all supported platforms.

Supported means “we test it every night”; the VDT works on some unsupported platforms.

We care about interactions between the software.
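As a rough illustration of what “test every night, both releases, all supported platforms” implies, the toy Python sketch below enumerates such a test matrix; the platform names come from the next slide, while the install subsets and the run_install_test helper are invented for this sketch (the real tests run on the NMI Build & Test infrastructure, not a script like this).

```python
# Toy sketch of a nightly VDT-style test matrix; illustrative only.
import itertools

PLATFORMS = ["RedHat 9", "RHAS 3", "RHAS 4", "Scientific Linux 3",
             "Fedora Core 3", "Fedora Core 4", "Debian 3.1", "SuSE 9/ia64"]
RELEASES = ["current", "upcoming"]
INSTALLS = ["full-vdt", "client-subset", "server-subset"]  # hypothetical subset names

def run_install_test(platform, release, install):
    """Stand-in for dispatching one install-and-validate job to a test pool."""
    print(f"testing {install} of the {release} release on {platform}")
    return True  # a real harness would report pass/fail per combination

def nightly():
    failures = []
    for platform, release, install in itertools.product(PLATFORMS, RELEASES, INSTALLS):
        if not run_install_test(platform, release, install):
            failures.append((platform, release, install))
    # Results would go to the web pages and the daily email reminder
    # mentioned two slides later.
    print(f"{len(failures)} failing combinations")

if __name__ == "__main__":
    nightly()
```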

Page 34: The Open Science Grid

34

Supported Platforms: RedHat 7, RedHat 9, RHAS 3, RHAS 3/ia64, RHAS 3/x86-64, RHAS 4, Scientific Linux 3, Fedora Core 3, Fedora Core 4, Fedora Core 4/x86-64, Debian 3.1, ROCKS 3.3, SuSE 9/ia64.

The number of Linux distributions grows constantly, and they have important differences. People ask for new platforms, but rarely ask to drop platforms. System administration for heterogeneous systems is a lot of work.

Page 35: The Open Science Grid

35

Tests: results on the web and results via email. A daily reminder!

Page 36: The Open Science Grid

36

Deploying the VDT

We want to support root and non-root installations. We want to assist with configuration. We want it to be simple.

Our solution: Pacman. Developed by Saul Youssef (BU). Downloads and installs with one command. Asks questions during install (optionally). Does not require root. Can install multiple versions at the same time.
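A minimal sketch of what a Pacman-driven install can look like, wrapped in Python for consistency with the other examples in this transcript; the cache URL and package name are placeholders rather than the real VDT cache, and only the general `pacman -get <cache>:<package>` usage is assumed.

```python
# Illustrative sketch: drive a Pacman install of a VDT package from Python.
# The cache URL and package name are placeholders; see http://vdt.cs.wisc.edu
# for the real caches and package names.
import os
import subprocess

CACHE = "http://vdt.example.edu/some_vdt_cache"  # placeholder cache URL
PACKAGE = "VDT-Client"                           # placeholder package name

def pacman_install(install_dir):
    """Install PACKAGE from CACHE into install_dir; no root privileges needed."""
    os.makedirs(install_dir, exist_ok=True)
    # Pacman installs into the current working directory, so run it there.
    subprocess.run(["pacman", "-get", f"{CACHE}:{PACKAGE}"],
                   cwd=install_dir, check=True)

if __name__ == "__main__":
    pacman_install(os.path.expanduser("~/vdt"))
```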

Page 37: The Open Science Grid

37

Challenges we struggle with

How should we smoothly update a production service? In-place vs. on-the-side; preserve the old configuration while making big changes. As easy as we try to make it, it still takes hours to fully install and set up from scratch.

How do we support more platforms? It’s a struggle to keep up with the onslaught of Linux distributions. Mac OS X? Solaris?

Page 38: The Open Science Grid

38

More challenges

Improving testing: we care about interactions between the software. “When using a VOMS proxy with Condor-G, can we run a GT4 job with GridFTP transfer, keeping the proxy in MyProxy, while using PBS as the backend batch system…”

Some people want native packaging formats: RPM, Deb.

What software should we have? New storage management software.

Page 39: The Open Science Grid

39

One more challenge

Hiring: we need high-quality software developers. Creating the VDT involves all aspects of software development. But developers prefer writing new code instead of: writing lots of little bits of code, thorough testing, lots of debugging, and user support.

Page 40: The Open Science Grid

40

Where do you learn more?

http://vdt.cs.wisc.edu

Support: Alain Roy: [email protected] Miron Livny: [email protected] Official Support: [email protected]

Page 41: The Open Science Grid

41

Security Infrastructure

Identity: X.509 certificates. OSG is a founding member of the US TAGPMA. DOEGrids provides script utilities for bulk requests of host certs, CRL checking, etc. The VDT downloads CA information from the IGTF.

Authentication and authorization use VOMS extended attribute certificates. DN-to-account mapping is done at the site (multiple CEs, SEs) by GUMS. Standard authorization callouts to Prima (CE) and gPlazma (SE).
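To make the DN-to-account mapping step concrete, here is a toy Python sketch of the kind of policy a GUMS-like service applies at a site; this is not the GUMS interface or policy language, and the DN, VO names, and account names are invented.

```python
# Toy illustration of site-side (DN, VO, role) -> local account mapping,
# in the spirit of GUMS.  Not GUMS's actual API; all names are invented.

VO_GROUP_ACCOUNTS = {"cms": "cmsgrid", "atlas": "usatlas", "ligo": "ligo"}
ROLE_OVERRIDES = {("cms", "production"): "cmsprod"}

def map_to_account(dn, vo, role=None):
    """Return the local UNIX account for a grid identity, or None to deny."""
    if (vo, role) in ROLE_OVERRIDES:
        return ROLE_OVERRIDES[(vo, role)]
    return VO_GROUP_ACCOUNTS.get(vo)  # None means the site does not support this VO

if __name__ == "__main__":
    dn = "/DC=org/DC=example/OU=People/CN=Some User 12345"  # invented DN
    print(map_to_account(dn, "cms", "production"))  # -> cmsprod
    print(map_to_account(dn, "ligo"))               # -> ligo
```

A CE would make this decision through the Prima callout and an SE through gPlazma, as listed above; the sketch only shows the mapping policy itself.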

Page 42: The Open Science Grid

42

Security Infrastructure

The security process is modeled on NIST procedural controls, starting from an inventory of the OSG assets:
Management: risk assessment, planning, service auditing and checking.
Operational: incident response, awareness and training, configuration management.
Technical: authentication and revocation, auditing and analysis.

End-to-end trust in the quality of code executed on a remote CPU: signatures?

Page 43: The Open Science Grid

43

User and VO Management

A VO registers with the Operations Center: it provides the URL of its VOMS service, to be propagated to the sites. Several VOMS servers are shared with EGEE as part of WLCG.

A user registers through VOMRS or a VO administrator and is added to the VOMS of one or more VOs. The VO is responsible for having its users sign the AUP and for supporting its VOMS service.

A site registers with the Operations Center: it signs the service agreement, decides which VOs to support (striving for “default admit”), populates GUMS from the VOMSes of all supported VOs, and chooses an account UID policy for each VO & role.

VOs and sites provide a Support Center contact and joint operations.

For WLCG: the US ATLAS and US CMS Tier-1s are registered directly with WLCG; other support centers are propagated through the OSG GOC to WLCG.

Page 44: The Open Science Grid

44

Operations and User Support

Virtual Organization (VO): a group of one or more researchers.

Resource Provider (RP): operates Compute Elements and Storage Elements.

Support Center (SC): provides support for one or more VOs and/or RPs.

VO support centers: provide end-user support, including triage of user-related trouble tickets.

Community Support: a volunteer effort to provide an SC for RPs and for VOs without their own SC, and a general help discussion mailing list.

Page 45: The Open Science Grid

45

Operations Model

Real support organizations often play multiple roles.

Lines represent communication paths and, in our model, agreements; we have not progressed very far with agreements yet.

Gray shading indicates that OSG Operations is composed of effort from all the support centers.

Page 46: The Open Science Grid

46

OSG Release Process

Applications, Integration, Provision, Deploy.

Integration Testbed (ITB): 15-20 sites. Production (OSG): 50+ sites, including São Paulo, Taiwan, and S. Korea.

Page 47: The Open Science Grid

47

Integration Testbed

As reported in the GridCat status catalog: site, ITB release, service, facility, status; Ops map; Tier-2 sites.

Page 48: The Open Science Grid

48

Release Schedule

Chart (functionality vs. time, 01/06 through 9/08): OSG 0.4.0 and OSG 0.4.1, then OSG 0.6.0, OSG 0.8.0, and OSG 1.0.0!, each followed by incremental updates (minor releases); external milestones shown include SC4, CMS CSA06, the WLCG Service Commissioned, the ATLAS Cosmic Ray Run, and Advanced LIGO.

Page 49: The Open Science Grid

49

OSG Release Timeline

Timeline, 2/05 through 7/06: integration releases ITB 0.1.2, ITB 0.1.6, ITB 0.3.0, ITB 0.3.4, ITB 0.3.7, and ITB 0.5.0 feeding production releases OSG 0.2.1, OSG 0.4.0, OSG 0.4.1, and OSG 0.6.0.

Page 50: The Open Science Grid

50

Deployment and Maintenance

Distribute s/w through the VDT and OSG caches.

Progress technically via VDT weekly office hours - problems, help, planning - fed from multiple sources (Ops, Int, VDT-Support, mail, phone).

Publish plans and problems through VDT “To do list”, Int-Twiki and ticket systems.

Critical updates and patches follow Standard Operating Procedures.

Page 51: The Open Science Grid

51

Release Functionality

OSG 0.6 (Fall 2006): accounting; Squid (web caching in support of s/w distribution + database information); SRM V2 + AuthZ; CEMon ClassAd-based resource selection; support for MDS-4.

OSG 0.8 (Spring 2007): VM-based Edge Services; just-in-time job scheduling; pull-mode Condor-C; support for sites to run pilot jobs and/or glide-ins using gLExec for identity changes.

OSG 1.0: end of 2007.

Page 52: The Open Science Grid

52

Inter-operability with Campus grids

FermiGrid is an interesting example of the challenges we face when making the resources of a campus grid (in this case a DOE laboratory) accessible to the OSG community.

Page 53: The Open Science Grid

53

OSG Principles

Characteristics:
Provide guaranteed and opportunistic access to shared resources.
Operate a heterogeneous environment, both in the services available at any site and for any VO, with multiple implementations behind common interfaces.
Interface to campus and regional grids.
Federate with other national/international grids.
Support multiple software releases at any one time.

Drivers - delivery to the schedule, capacity and capability of LHC and LIGO:
Contributions to/from and collaboration with the US ATLAS, US CMS, and LIGO software and computing programs.
Support for and collaboration with other physics and non-physics communities.
Partnerships with other grids, especially EGEE and TeraGrid.
Evolution by deployment of externally developed new services and technologies.

Page 54: The Open Science Grid

54

OSG Middleware Layering

NSF Middleware Initiative (NMI): Condor, Globus, Myproxy

Virtual Data Toolkit (VDT) Common Services: NMI + VOMS, CEMon (common EGEE components), MonALISA, Clarens, AuthZ

OSG Release Cache: VDT + Configuration, Validation, VO management

Diagram: on top of this common infrastructure sit the application layers: CMS services & framework, ATLAS services & framework, the LIGO Data Grid, CDF/D0 SAMGrid & framework, and others.

Page 55: The Open Science Grid

55

Summary

The OSG facility opened July 22nd, 2005.

The OSG facility is under steady use: ~2000-3000 jobs at all times; HEP dominates, with large Bio/Eng/Med loads occasionally; moderate other physics (Astro/Nuclear); LIGO expected to ramp up.

The OSG project: a 5-year proposal to DOE & NSF, funded starting 9/06, covering the Facility, Improve/Expand/Extend/Interoperate, and E&O.

Off to a running start … but lots more to do:
Routinely exceeding 1 Gbps at 3 sites; scale by x4 by 2008, and many more sites.
Routinely exceeding 1000 running jobs per client; scale by at least x10 by 2008.
Have reached a 99% success rate for 10,000 jobs per day submission; need to reach this routinely, even under heavy load.

Page 56: The Open Science Grid

56

EGEE–OSG inter-operability

Agree on a common Virtual Organization Management System (VOMS)

Active Joint Security groups: leading to common policies and procedures.

Condor-G interfaces to multiple remote job execution services (GRAM, Condor-C).

File transfers using GridFTP. SRM V1.1 for managed storage access; SRM V2.1 in test.

Publish the OSG BDII to a shared BDII so that Resource Brokers can route jobs across the two grids. Automate ticket routing between GOCs.

Page 57: The Open Science Grid

57

What is FermiGrid?

Integrates resources across most (soon all) owners at Fermilab.

Supports jobs from Fermilab organizations to run on any/all accessible campus FermiGrid and national Open Science Grid resources.

Supports jobs from the OSG to be scheduled onto any/all Fermilab sites.

Unified and reliable common interface and services for FermiGrid gateway - including security, job scheduling, user management, and storage.

More information is available at http://fermigrid.fnal.gov

Page 58: The Open Science Grid

58

Job Forwarding and Resource Sharing

The gateway currently interfaces 5 Condor pools with diverse file systems and >1000 job slots; there are plans to grow to 11 clusters (8 Condor, 2 PBS and 1 LSF).

Job scheduling policies and in-place agreements for sharing allow fast response to changes in resource needs by Fermilab and OSG users.

The gateway provides the single bridge between the OSG wide-area distributed infrastructure and the FermiGrid local sites; it consists of a Globus gatekeeper and a Condor-G. Each cluster has its own Globus gatekeeper.

Storage and job execution policies are applied through site-wide managed security and authorization services.

Page 59: The Open Science Grid

59

Diagram (Access to FermiGrid): OSG general users, Fermilab users, and OSG “agreed” users enter through the FermiGrid gateway (a GT gatekeeper plus Condor-G), which forwards jobs to the CDF, DZero, CMS, and shared Condor pools, each behind its own GT gatekeeper.

Page 60: The Open Science Grid

60

GLOW: UW Enterprise Grid

• Condor pools at various departments integrated into a campus-wide grid
– Grid Laboratory of Wisconsin
• Older private Condor pools at other departments
– ~1000 ~1GHz Intel CPUs at CS
– ~100 ~2GHz Intel CPUs at Physics
– …
• Condor jobs flock from on-campus and off-campus to GLOW
• Excellent utilization
– Especially when the Condor Standard Universe is used
• Preemption, checkpointing, job migration

Page 61: The Open Science Grid

61

Grid Laboratory of Wisconsin

2003 initiative funded by NSF/UW; six GLOW sites:
• Computational Genomics, Chemistry
• Amanda, IceCube, Physics/Space Science
• High Energy Physics/CMS, Physics
• Materials by Design, Chemical Engineering
• Radiation Therapy, Medical Physics
• Computer Science

GLOW phases 1 and 2 plus non-GLOW-funded nodes have ~1000 Xeons + 100 TB of disk.

Page 62: The Open Science Grid

62

How does it work?

• Each of the six sites manages a local Condor pool with its own collector and matchmaker

• Through the High Availability Daemon (HAD) service offered by Condor, one of these matchmakers is elected to manage all GLOW resources

Page 63: The Open Science Grid

63

GLOW Deployment
• GLOW is fully commissioned and is in constant use
– CPU
• 66 GLOW + 50 ATLAS + 108 other nodes @ CS
• 74 GLOW + 66 CMS nodes @ Physics
• 93 GLOW nodes @ ChemE
• 66 GLOW nodes @ LMCG, MedPhys, Physics
• 95 GLOW nodes @ MedPhys
• 60 GLOW nodes @ IceCube
• Total CPU: ~1339
– Storage
• Head nodes at all sites
• 45 TB each @ CS and Physics
• Total storage: ~100 TB
• GLOW resources are used at the 100% level
– Key is to have multiple user groups
• GLOW continues to grow


Page 64: The Open Science Grid

64

GLOW Usage
• GLOW nodes are always running hot!
– CS + guests: largest user; serving guests - many cycles delivered to guests!
– ChemE: largest community
– HEP/CMS: production for the collaboration; production and analysis by local physicists
– LMCG: Standard Universe
– Medical Physics: MPI jobs
– IceCube: simulations

Page 65: The Open Science Grid

65

GLOW Usage 3/04 – 9/05

Over 7.6 million CPU-Hours (865 CPU-Years) served!

Takes advantage of “shadow” jobs and of checkpointing jobs

Leftover cycles available for “Others”
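As a quick consistency check (editorial arithmetic, not from the slide): 7.6 million CPU-hours divided by the 8,760 hours in a year is roughly 870 CPU-years, in line with the 865 CPU-years quoted.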

Page 66: The Open Science Grid

66

Example Uses
• ATLAS
– Over 15 million proton collision events simulated, at 10 minutes each
• CMS
– Over 70 million events simulated, reconstructed and analyzed (total ~10 minutes per event) in the past year
• IceCube / Amanda
– Data filtering used 12 years of GLOW CPU in one month
• Computational Genomics
– Prof. Shwartz asserts that GLOW has opened up a new paradigm of work patterns in his group
• They no longer think about how long a particular computational job will take - they just do it
• Chemical Engineering
– Students do not know where the computing cycles are coming from - they just do it - largest user group

Page 67: The Open Science Grid

67

Open Science Grid & GLOW

• OSG jobs can run on GLOW
– The gatekeeper routes jobs to the local Condor cluster
– Jobs flock campus-wide, including to the GLOW resources
– The dCache storage pool is also a registered OSG storage resource
– Beginning to see some use
• Now actively working on rerouting GLOW jobs to the rest of OSG
– Users do NOT have to adapt to the OSG interface and separately manage their OSG jobs
– New Condor code development

Page 68: The Open Science Grid

68 www.cs.wisc.edu/~miron

Elevating from GLOW to OSG

“Schedd On The Side”: a specialized scheduler operating on the schedd’s jobs.

Diagram: the schedd’s job queue holds Job 1, Job 2, Job 3, Job 4, Job 5, …, and Job 4*.

Page 69: The Open Science Grid

69 www.cs.wisc.edu/~miron

The Grid Universe

Diagram: a schedd runs “RandomSeed” jobs on local startds and, through a gatekeeper, on vanilla site X via the grid universe.

• easier to live with private networks
• may use non-Condor resources
• restricted Condor feature set (e.g. no standard universe over grid)
• must pre-allocate jobs between the vanilla and grid universes

Page 70: The Open Science Grid

70 www.cs.wisc.edu/~miron

Dynamic Routing Jobs

Diagram: a schedd holds “RandomSeed” jobs running on local startds (vanilla site X) while a schedd-on-the-side routes some of them through gatekeepers to sites X, Y, and Z.

• dynamic allocation of jobs between the vanilla and grid universes
• not every job is appropriate for transformation into a grid job
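As a toy illustration of the schedd-on-the-side idea in these two slides, the Python sketch below scans a job queue, decides which vanilla jobs are safe to transform, and creates routed grid-universe copies (the Job 4 to Job 4* step shown earlier); the job records, site list, and is_routable test are invented, and the real mechanism lives inside Condor, not in user code.

```python
# Toy model of dynamically routing vanilla jobs to the grid universe
# ("schedd on the side"); illustrative only, all details invented.

SITES = ["gt2 siteX.example.edu/jobmanager-condor",
         "gt2 siteY.example.edu/jobmanager-pbs",
         "gt2 siteZ.example.edu/jobmanager-lsf"]

def is_routable(job):
    """Not every job can become a grid job: e.g. jobs that need checkpointing
    (standard universe) or a shared local filesystem must stay local."""
    return job["universe"] == "vanilla" and job["self_contained"]

def route_idle_jobs(queue):
    """Create grid-universe copies of idle, routable vanilla jobs."""
    routed = []
    for job in queue:
        if job["status"] == "idle" and is_routable(job):
            copy = dict(job)                       # e.g. Job 4 -> Job 4*
            copy["universe"] = "grid"
            copy["grid_resource"] = SITES[len(routed) % len(SITES)]
            copy["routed_from"] = job["id"]
            routed.append(copy)
    return queue + routed

if __name__ == "__main__":
    queue = [{"id": n, "status": "idle", "universe": "vanilla",
              "self_contained": (n % 2 == 0)} for n in range(1, 6)]
    for job in route_idle_jobs(queue):
        print(job)
```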

Page 71: The Open Science Grid

71

Final Observation …

A production grid is the product of a complex interplay of many forces:

Resource providers, users, software providers, hardware trends, commercial offerings, funding agencies, the culture of all parties involved, …