Interfacing Interactive Data Analysis Tools with the Grid: PPDG CS-11 Activity Doug Olson, LBNL Joseph Perl, SLAC ACAT 2002, Moscow 24 June 2002


Interfacing Interactive Data Analysis Tools with the Grid:

PPDG CS-11 Activity

Doug Olson, LBNL
Joseph Perl, SLAC

ACAT 2002, Moscow
24 June 2002

24 June 2002 D. Olson, PPDG CS-11 for ACAT 2

Contents

• Background on PPDG, CS-11
• Who is involved
• Workshop last week (18, 19 June)
• Themes that emerged
• Near-term goals
• Longer-term planning
• Summary

The 3 US grid projects for HENP are PPDG, GriPhyN, iVDGL

PPDG CS-11

Background

• CS-11 long title: Interfacing and Integrating Interactive Data Analysis Tools with the Grid and Identifying Common Components and Services
– Subtitle: Consider a physicist sitting at her home institution: “What does she need from the grid to carry out physics analysis?”

• CS-11 is:
– Not new funding
– A new work area within the PPDG mission of grid-enabling end-to-end physics applications for US HENP
– Driven by the experiments’ needs; middleware providers want to know whether new or different grid services are needed

18,19 June workshop in Berkeley

Attendees: Chip Watson, Liu, Olson, Perl, Steenberg, Zhu, Johnson, Hjort, Sakrejda, Shoshani, Wu, Romosan, Gu, Sim, Holtman, Fons Rademakers, Pordes, Stockinger, Marco, Martinez Rivero, Turri, Donszelmann, Ballintijn, J. Gowdy, Alexander, Brock, Avery, Deng, Aslakson

Purpose:
• Review experiments’ requirements
• Overview of existing tools & technology
• Discuss existing/planned activities
• Identify opportunities for cooperative work on defining interfaces and prototype integration of analysis tools with common grid services

Review Use Cases for requirements

Review Tools & Technology

Abstract Interfaces for Data Analysis

Java Analysis Studio

PROOF

Clarens

Interactivity in a batched grid environment

MCAT – Metadata Catalog in SRB

SDM Center – bitmap index

Grid Architecture view

EDG testbed

Experiments’ thoughts, plans, activities

• ATLAS
– Python interface between Athena framework and grid services

• CMS
– Grid Analysis Environment (GAE)

• Phobos (& ALICE)
– PROOF-based analysis

• Others (BaBar, JLab, STAR) at meeting without presentations
– Extraction model probably good for BaBar, JLab
– PROOF likely to work for STAR

ATLAS extraction view

CMS Analysis Scope

CMS – Clarens for interconnect (arrows)

PROOF & Grid

ALICE (by proxy)

Themes (or opinions)

• Varying degrees of depth to which grid penetrates interactive analysis:
1. Select data from grid and extract a local (non-grid) copy (proceed with interactive analysis independent of grid)
2. Run analysis as grid batch jobs while having intermediate results returned for monitoring
3. Run analysis as grid jobs while having intermediate results returned, with a control channel to the jobs to interrupt or guide processing
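
The difference between levels 2 and 3 can be illustrated with a small, purely local sketch. It assumes a "job" that is just a Python generator processing events in chunks: each yielded snapshot stands for an intermediate result returned for monitoring (level 2), and the value passed back with `send()` stands for the control channel (level 3). The names `analysis_job` and `run_interactively` are hypothetical and do not come from PROOF, JAS, or any grid middleware.

```python
from collections import Counter


def analysis_job(events, chunk_size=2):
    """Hypothetical stand-in for a grid job (no real grid is involved).

    After each chunk it yields the running histogram (intermediate
    result) and then reads a command sent back by the user.
    """
    histogram = Counter()
    for start in range(0, len(events), chunk_size):
        histogram.update(events[start:start + chunk_size])
        command = yield dict(histogram)  # snapshot out, command in
        if command == "stop":
            return  # the user interrupted processing


def run_interactively(events, stop_when):
    """Drive the job, collecting snapshots until it ends or is stopped."""
    job = analysis_job(events)
    snapshots = []
    try:
        partial = next(job)
        while True:
            snapshots.append(partial)
            command = "stop" if stop_when(partial) else None
            partial = job.send(command)
    except StopIteration:
        pass  # job finished or was interrupted
    return snapshots
```

With `stop_when` always returning `False` the loop behaves like level 2 (monitoring only); a predicate that inspects the partial histogram turns it into level 3, where the user guides or interrupts the job based on what has been seen so far.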

Sample Requirements

• Ability to select/extract data objects from grid at one level below event (raw, ESD, AOD, … components)
– Do not need arbitrarily fine-grained objects from grid (hit, track, …)
• User interface/interaction should be same with or without network connection
– Similar to web browser cache, i.e., same tool, same URL
• Ability to debug grid jobs
• Distributed databases (metadata, calibration/conditions, …)
• Working single sign-on and VO/group/user authorization
• Estimate of time & resources to run an analysis
• Laundry list of requirements being developed in use-cases document, not all shown here
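
The "same tool, same URL" requirement can be sketched as a cache that sits in front of the remote fetch, much as a browser cache does. This is only an illustration under stated assumptions: `OfflineCapableStore`, `make_flaky_fetcher`, and the `grid://example/...` URLs are invented for the example and are not part of any grid toolkit.

```python
class OfflineCapableStore:
    """Hypothetical cache wrapper: the caller uses the same get(url)
    call whether or not the network is currently reachable."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: url -> data
        self._cache = {}

    def get(self, url):
        # Serve a previously extracted copy first, like a browser cache.
        if url in self._cache:
            return self._cache[url]
        # Not cached: go to the grid; raises ConnectionError when offline.
        data = self._fetch_remote(url)
        self._cache[url] = data
        return data


def make_flaky_fetcher(payloads):
    """Build a fake remote fetcher whose network can be switched off."""
    state = {"online": True}

    def fetch(url):
        if not state["online"]:
            raise ConnectionError("no network connection")
        return payloads[url]

    return fetch, state
```

A typical session would fetch a dataset once while connected, then keep calling `store.get()` with the same URL after `state["online"]` is set to `False`; only URLs never fetched before fail when offline.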

Near-term goals

• Interest in common metadata catalog
– ATLAS, CMS collaborating on GriPhyN Virtual Data Catalog, others welcome
– What about SRB/MCAT, AliEn?
• Considering metadata catalog at event-component level
• Considering AIDA, HepRep for results collection, extraction interface
• Interest in PROOF-Grid
• Interest in JAS-Grid
• Finish use cases / requirements document
• Example demos for SC2002
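
To make "metadata catalog at event-component level" concrete, the following is a minimal in-memory sketch. It assumes the catalog simply maps an (event, component) pair, with components such as raw, ESD, or AOD, to the sites holding a copy. The class `EventComponentCatalog` and the site names are hypothetical; this is not the GriPhyN Virtual Data Catalog, SRB/MCAT, or AliEn interface.

```python
from collections import defaultdict


class EventComponentCatalog:
    """Hypothetical catalog keyed one level below the event."""

    def __init__(self):
        # (event_id, component) -> set of locations holding that component
        self._replicas = defaultdict(set)

    def register(self, event_id, component, location):
        """Record that `location` holds `component` of `event_id`."""
        self._replicas[(event_id, component)].add(location)

    def locate(self, event_ids, component):
        """Return, per event, the sorted locations of one component."""
        return {
            event_id: sorted(self._replicas.get((event_id, component), ()))
            for event_id in event_ids
        }


catalog = EventComponentCatalog()
catalog.register(1, "AOD", "siteA")
catalog.register(1, "AOD", "siteB")
catalog.register(2, "ESD", "siteA")
aod_locations = catalog.locate([1, 2], "AOD")
```

Keying on the component rather than on the whole event lets a selection ask for the AOD of many events without touching their raw or ESD data, which matches the granularity named in the sample requirements.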

Longer-term planning

• Develop detailed workplan for Sept. 9 (US Physics Grid Projects week in San Diego)

• Discuss interaction & cooperation with CrossGrid work on interactive analysis

• Consider grid interface to PROOF and JAS as good test of common services

Summary

• Just beginning to consider grid for interactive data analysis
• Aim at interfacing existing tools to grid services
• To identify missing services and collaborate on defining/developing common services
– HEP-specific metadata catalog
– Interactive control/monitor interface?
• Identify a few common projects; possible candidates are:
– PROOF + Grid
– JAS + Grid
– Event component level catalog
– … (work in progress)
• Want close ties with other grid efforts on interactive analysis