CHEP 2003 Summary Grid Architecture, Infrastructure, & Middleware Monitoring & Security Andrew Hanushevsky Stanford Linear Accelerator Center


CHEP 2003 Summary
Grid Architecture, Infrastructure, & Middleware
Monitoring & Security

Andrew Hanushevsky
Stanford Linear Accelerator Center


March 25, 2003 2: CHEP 2003

Legal Disclaimer

This summary is from one perspective
  It is not representative of any particular view
    Other than the presenter

This summary is not warranted for any purpose whatsoever
  Participants assume all direct and indirect (consequential or inconsequential) damages

Do you want to stay?


Grid Deployment

Track I talks referenced grid “deployment”
  Deployment has many meanings
    Minimally, if you have it working it had better be usable
  Is it production ready?


Production Grids

LCG experience suggests it is difficult
  Packaging, installation, configuration, & validation issues
  “These issues (and more) make the difference between the research project ending with a demo and the product to be used for production.”
    -- Zdenek Sekera

Assume the LCG (T#184) interpretation of production
  Harsh, but we need a benchmark


What is “production quality”?

It is all of the following, in no particular order:
  availability 24 x 7
  performance
  stability, robustness
  user friendliness
  maintainability
  user support

From LCG T#184


So Where Are We?

Let’s take a look at the presented “grid” projects in alphabetic order
  From Grid to Grid-Like

Disclaimer!
  This is not representative of all such projects


AliEn (M#253)

Distributed environment with Grid interface
SASL (includes GSI) EDG compatible authentication
Distributed RDBMS-based file catalog
Condor-like job scheduling
Attempts to unify grid infrastructures
Adopted by MammoGrid (M#66)


Amanda (M#110)

Ostensibly production ready
  Condor + Bypasses + Local Tools (Grid Navigator)
Uses central s/w and data repositories
Runs a specific application software suite
Plan to integrate Globus middleware as it matures


DIRAC (M#253)

Distributed environment
  Essentially a roll-your-own grid-like solution
Interface to EDG now in test
  EDG stability considered problematic
Successfully deployed on 17 sites


EDG

Workload Management WP1 (M#132 & 137)
  Deployed for 18 months
  Still pre-production stage
    Various problems in reliability & scalability
  Numerous improvements planned
    DAGMan integration
    Grid Accounting
    Resource reservation & co-allocation
      • Globus GARA Approach


EDG (continued)

Data Management WP2 (T#249 & 490)
  Basic use cases satisfied
  Not proven in a “real user environment”
    Pre-production
  Numerous additions planned
    Logical collection
    Enhanced security
      Authorization and delegation
    OGSA direction with future compliance


NorduGrid (M#109)

Modified/Extended Globus + EDG RLS
Pre-production stage
Additional EDG integration as stability improves
Web Services (OGSA) plans


SAM (T#335)

Successful for D0 and CDF
Work under way to integrate with grid middleware
  Production D0 release of SAMGrid (JIM+Condor-G) scheduled for April
One of the arguably successful grid-like projects
  Largely dealing with data management issues


STAR (T#442)

Distributed environment
  Essentially a roll-your-own grid-like solution
Interface to Condor-G
  Uses LBL HRM/DRM
Successful (but limited) deployment
  NERSC & BNL


Storage Resource Broker (T#211)

Successful deployment across multiple fields
Work under way to integrate with Globus data management
One of the arguably successful grid-like projects
  Limited to data management


The Successes

Few projects have achieved “production” status
  Those which have are focused and grid-like
    SAM, SRB
    Soon to follow: AliEn, DIRAC, & STAR
  It is not clear why this is so
    Historical timeline?
    Immediate need for results?
    Funding model?
    Grid protocols in flux (e.g., Globus 2 vs Globus 3)?
    Open software/collaboration issues?
    Sociological phenomena?

Fortunately many plan to integrate with the “standard” grid
  Time will tell…


The Fast Trackers

These projects have only incorporated some grid middleware
  Amanda & NorduGrid
Many difficult issues have been avoided, but…
  Are we entering the OSI model of development?
    Pick and choose from a bag of protocols & tools
  This does not bode well for interoperability


The Simmering

“These” projects have embraced the grid
  EDG (parallels and derivatives)
Problems not being avoided
  Adopted the long range view (2 or more years)
Will this be to the benefit of the HEP community?
  Depends on your view of next generation computing
  It seems that all projects are hedging their bets

You wonder where we would be if all the hundreds of current FTEs were focused on making this model really work


State of Security

Three dominant themes
  Private Key Management
    KCA (T#422), VSC etc. (T#81)
  Virtual Organization Management
    VOMS (T#317) & GUMS (T#363)
  Authorization (a.k.a. Access Control)
    GACL (T#190), SAZ (T#423), Akenti (T#426), CAS (T#441, 518)


Security Convergence

Other than X.509 there is little common ground
  But does there need to be any common ground?
    Key management is a matter of trust policy
    VO administration is a site or multi-lateral prerogative
    Authorization is largely a local issue

It seems that if you can agree on the credentials (i.e., X.509 + endorsements), the rest is relegated to collaboration policy, irrespective of implementation
  This appears to be the direction
    Even if it’s not obvious at the moment
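
The split described on this slide (shared credentials, local authorization) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Credential` shape, the endorsement strings, and the `SITE_POLICY` table are hypothetical and are not taken from any of the tools listed in the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    """Hypothetical stand-in for an X.509 identity plus VO endorsements."""
    subject_dn: str
    endorsements: frozenset = field(default_factory=frozenset)


# Hypothetical local policy: each site decides which endorsements grant which rights.
SITE_POLICY = {
    "vo:example/role=production": {"read", "write"},
    "vo:example": {"read"},
}


def authorize(cred: Credential, action: str, policy: dict = SITE_POLICY) -> bool:
    """Grant the action if any endorsement carried by the credential allows it."""
    return any(action in policy.get(e, set()) for e in cred.endorsements)


user = Credential("/O=Grid/CN=Example User", frozenset({"vo:example"}))
print(authorize(user, "read"))   # True
print(authorize(user, "write"))  # False
```

Two sites could share the `Credential` format while holding entirely different policy tables, which is the sense in which the implementation becomes irrelevant once the credential is agreed.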


Grid Monitoring

There is much activity
  Much of it overlapping
    BOSS (M#84), GMA (M#403), GridMonitor (M#321), MonALISA (M#103), PerfMC (M#522), & R-GMA (M#407)
Some convergence
  Minimum set of events
  Format (XML, yet no “lingua franca” agreement)

This is an area to watch!
  GGF is likely the stomping ground for agreement
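
The gap between agreeing on XML as a format and agreeing on a vocabulary can be seen in a minimal Python sketch. The element and attribute names (`event`, `source`, `time`, `name`, `value`) are hypothetical; the talk reports that no such common schema had been agreed.

```python
import xml.etree.ElementTree as ET


def make_event(source: str, name: str, value: str, timestamp: str) -> str:
    """Serialize one monitoring event as XML (hypothetical element names)."""
    event = ET.Element("event", attrib={"source": source, "time": timestamp})
    ET.SubElement(event, "name").text = name
    ET.SubElement(event, "value").text = value
    return ET.tostring(event, encoding="unicode")


def parse_event(xml_text: str) -> dict:
    """Read the event back; the consumer must already know the vocabulary."""
    event = ET.fromstring(xml_text)
    return {
        "source": event.get("source"),
        "time": event.get("time"),
        "name": event.findtext("name"),
        "value": event.findtext("value"),
    }


xml_text = make_event("site-a", "cpu_load", "0.75", "2003-03-25T12:00:00Z")
print(parse_event(xml_text))
```

A producer that used different element names would still emit well-formed XML, yet `parse_event` would return `None` for the fields it could not find. Shared syntax alone does not give interoperable monitoring.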


The Ultimate Highlights

Virtual Data

XML

Distributed File Systems

Job Scheduling

Peer to Peer Computing

“The” Award


The Innovation Most At Risk

Virtual Data (T#106 & 114)
  Great concept at technological mercy
  The OptIPuter is the menace. Consider…
    Unlimited bandwidth
    Ever decreasing storage costs
    Constant software changes
    Sociological problems of capturing the processing path
  Together these may make VD untenable


Things to Watch For I

XML
  This is rapidly becoming the common syntax
    Yet little effort in developing a common language
    Assumption, perhaps misguided, that WSDL repositories will address the problem
      • Diamonds (iKnow) architecture (Java RMI + JINI)
Distributed Grid File Systems
  Minimal data movement with global access
    AlienFS (R#254)
    There are many others that were not presented


Things to Watch For II

Job to Data Scheduling
  Algorithms to place a job near the data
    Minimize data movement
Peer To Peer Computing
  Marxist scheduling aiming for 100% utilization
    Not yet addressed by current grid architectures
    Ad hoc protocols
    Subversive in that this may be the “real” next thing
  Augernome (R#293)
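
The job-to-data idea on this slide can be sketched in a few lines of Python: score each candidate site by how much input data would have to be moved there, then pick the cheapest. This is only a sketch under stated assumptions. The site names, file names, sizes, and the replica map are hypothetical, and real schedulers would also weigh queue length, bandwidth, and policy.

```python
def bytes_to_move(site: str, inputs: dict, replicas: dict) -> int:
    """Total size of input files that are not already present at `site`."""
    return sum(size for name, size in inputs.items()
               if site not in replicas.get(name, set()))


def pick_site(sites: list, inputs: dict, replicas: dict) -> str:
    """Choose the site needing the least data transfer (ties go to the first listed)."""
    return min(sites, key=lambda s: bytes_to_move(s, inputs, replicas))


# Hypothetical job: two input files with sizes in MB, and where replicas live.
inputs = {"run1.dat": 500, "run2.dat": 300}
replicas = {"run1.dat": {"site_b"}, "run2.dat": {"site_a", "site_b"}}

print(pick_site(["site_a", "site_b"], inputs, replicas))  # site_b
```

Here `site_a` would need 500 MB copied in while `site_b` already holds both files, so the job goes to `site_b`.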


Summarizer’s Award

The project that makes innovative yet practical use of existing grid protocols
  Grid Brick (R#493)
    Parallel ROOT-based query using Globus scheduling
    Uncomplicated and practical needs-based approach
    It’s so obvious you wonder why you didn’t do it first
  It works within a standard grid environment!
  Load balancing and fault tolerance to be explored


Conclusions

Grid efforts are still meandering
  Great for innovation
  Dismal for standardization
Security is a bright spot
  Rapid convergence on authentication issues
  Authorization is more fuss than fury
    There is a light at the end of the tunnel
Monitoring situation is disappointing
  The need is recognized but there is no agreement on how to proceed
    Cross grid monitoring is in serious jeopardy