conditions metadata for tags

Conditions Metadata for TAGs

Elizabeth Gallas,(Ryan Buckingham, Jeff Tseng) - Oxford

ATLAS Software & Computing WorkshopCERN – April 19-23, 2010

April 2010 Elizabeth Gallas - COMA 2

Outline History of Conditions Metadata / “Current” COMA Schema Evolution in context to overall TAG DB project

Overview of Oracle Schemas for TAGs

1. TAG DB (Event-wise Metadata)2. CATALOG (Dataset Catalogue Metadata)3. COMA (COnditions MetadatA)

Upload procedure/Status Documentation and Links Fundamental design principles Uses of Conditions Metadata

ELSSI usage runBrowser – interface runBrowserReport – interface

Functionality in our Future Some unfortunate current operational issues Conclusions

Upon consideration of limited time, see next slide

…


Current COMA Tables

Okay ! Too much detail for 15 minutes talk (don’t send me into the volcano) … See Backup/DocLinks for many details … I’m going to summarize in a few slides, then talk about the fun part: using COMA in ELSSI, runBrowser, runBrowserReport


Documentation and LinksCOMA Documentation

COMA Schemahttps://gallas.web.cern.ch/gallas/COMA_Visio.pdf

COMA Tableshttps://gallas.web.cern.ch/gallas/COMA_Tables.html

COMA developer documentation (not user doc):https://gallas.web.cern.ch/gallas/Conditions_to_TAGs.html

COMA user documentation Will be integrated into the interfaces below

COMA User Interface Links (test versions)NOTE: The interfaces require a grid certificate in you browserNOTE: The interfaces are not in production server locations

runBrowser (2) Link:https://lxvm0341.cern.ch/tagservices/dev/rbucking/TAGBrowser/

trunk/runBrowser2/runBrowser.php runBrowserReport Link:

https://lxvm0341.cern.ch/tagservices/dev/gallas/TAGS/runBrowser/runBrowserReport.php?q=find+run+142406 for Run 142406

(replace the Run Number in the URL with your Run of interest)

Let’s try them after I mention some design

principles


General Design Principles (1)The fundamental components are1. The COMA Relational Database tables2. The runBroswerReport – the report interface for the COMA Tables3. The runBrowser – the interface for RunLB selection using COMA Tables

1. COMA Tables: Must provide information ELSSI needs to decode TAG attributes Include information for both Online and MC Runs

TAGs for Online/MC have the same attributes (no MC truth) Catalogue for Online/MC reflects similar processing workflows

Overall system must handle gracefully missing information Upload select conditions for Runs of ‘analysis interest’

not all Runs and not all Conditions Refine/Correct/Derive conditions to form more effective criteria

2. runBrowserReport = web report interface to COMA Tables Intended to display what COMA knows about each Run Provides links to information in other systems

runQuery, AMI, Trigger, Data Quality …or reports using COOLCherryPy Will be used by ELSSI & runBrowser to provide more information


General Design Principles (2)3. runBrowser = interface for RunLB selection using COMA Tables

Purpose: Make conditions metadata available as selection criteria in advance of analysis … Envisioned as the Run-level browser for ELSSI … current implementation makes it also available stand-alone. Intermediate results may be what the user is looking for

I.E. show me the Runs taken on this date or during this Fill. Final output (clicking on “Finish” button):

LB level criteria is applied at the final “Finish” stage.Output: A report showing the Run/LBs passing final criteriaOutput: An xml file (GoodRunList) which can be used by ELSSI etc.

runBrowser IS NOT runQuery (browser to all online Runs in COOL) Enables not only Run selection by conditions criteria but also

displays the possible values of remaining criteria and its relationship to other criteria

Criteria can be imposed in any order … some choices open selection to deeper criteria

Where appropriate: Allows radio, checkbox, or text (command line) entry of criteria Allow list and/or ranges of values, wildcards, case insensitivity ...

Incorporate features to customize rows displayed and other tricks to improve performance


Test version of runBrowserhttps://lxvm0341.cern.ch/tagservices/dev/rbucking/TAGBrowser/trunk/

runBrowser2/runBrowser.php Note: “Under Construction”!

Each section expands/collapses

1. Purpose / Instructions

2. Selection Summary

Starts out empty

3. Selection Criteria

A. Data Source

B. Run Type

C. Temporal Selection

D. Project Name

E. Trigger Master Key

F. DAQ Configuration

G. Run Number

These are example criteria …

more criteria will be added


Example demonstrates General Principles: There is no prescribed order of

selection or mandatory selections Expand section of interest, make

selection: available radio/checkbox or use

the textbox to type a list or range of values

Click Submit

I chose Project “data09_900GeV”, then I see there are 99 Runs left I see their run and date range I see the criteria has appeared in the

selection summary I could remove it with button click

I see ALL the other sections have changed to reflect this criteria !

Look at the remaining 99 runs ..

(next slide)

Iterate anynumber of times


The Run Section includes links to other systems Click on the Run Number generates the runBrowserReport Other links are to AMI, RunList, and Trigger Reports for that Run number

Other related selections to be added to runBrowser2:

1. Run Duration

2. Number of LB

3. Number of Events Recorded

runBrowserRun Number Section

Click on the Run Section to open it … the run numbers appear

Run selection is NOT mandatory … you can go onto FINISH without any Run explicitly selected

Next slide:runBrowserReport

for Run number 142406


runBrowserReport OverviewEach Yellow section expands … This

report has 5 Primary sections, the Trigger section has subsections

General Run info Run Date …

COMA Load Status COMA loading messages

DQ LBSUMM assessments COOL tagged/locked

Prescale Evolution How many times did prescales

change during the Run Trigger section

Has an HLT summary Counts of active/disabled

Has expanding subsections HLT Chains (2 subsections)

Physics Others

Level 1 Items


COMA runBrowserReport : Overview

Run 142406 Trigger Summary shows:has 23 active physics chains (of 162)

Click on the Show/Hide link to show/hide the grey rows of chain/items

tables in respective subsections: HLT (show/hide disabled chains)

Physics (complete EF-L2-L1 chains) Others (commissioning chains)

Level 1 (show/hide passive items)The trigger tables show the prescale range

and PS,PT,RR flags of the new derived “Run Aggregate prescale” COMA tables

This new information allows:ELSSI to show only chains which are “active” during the Run.With report like this to show all chains.

https://lxvm0341.cern.ch/tagservices/dev/gallas/TAGS/runBrowser/runBrowserReport.php?q=find+run+142406


Functionality in our Future (1) COMA: Loading programs continue to be improved/refined

Some areas better resolved than others. We are making a list of the basics needed for production running … listed here are things initiated.

Select Runs with many of the criteria now available in runQuery Project Name, Magnetic field, Beam conditions, DAQ config, Master Key

(Trigger, LVL1/HLT PS Keys) … Select Runs in particular Streams (with event counts available)

New: Define Run-wise Aggregate prescales, passthrough, rerun Calculates Run-wise summary flag: PS, PT, Rerun per triggger

Useful to ELSSI to display active/passive/disabled triggers per Run … So that users do not select triggers in a Run which would yield 0 events because that trigger was not active

Useful for runBrowser to find the active chains for each Run Working with Trigger group for validation

Data Quality upload Can load tagged/locked LBSUMM DQ assessments Schema and Loading – groundwork laid, interfaces underway

Can envision ways to include virtual flags with this information Will work with DQ group for verification/refinement


Functionality in our Future (2) … more Trigger related criteria …

Select Runs with one/more particular trigger chains Chains configured Chains active via prescale (have prescale >0) Chains active via passthrough (have passthrough > 0)

The RAW trigger result per event is a TAG DB attribute Chains which may be RERUN.

Some criteria will use information from the CATALOG Metadata: Select Runs uploaded to TAG DB

These Runs ARE the Runs of analysis interest The Runs ATLAS chooses to reconstruct and reprocess These Runs follow the CREM deletion policy

Select Runs with a particular AMI tag Processing related information is tracked in the Catalog


Some Unfortunate Operational Notes Recently, the server for runQuery and trigger web interfaces

became unrecoverable. Some services were moved to new locations Some files on the failed server were lost and are difficult to

recover Some services were in the process of moving to a new server

anyway. This has had 2 effects on COMA/runBrowser/Report:

Some loading programs were using lost data so some runs are now missing that data

Plan: Rewrite those programs to use more stable source. The web report links point to the old server

Plan: Fix the links to point to the new location.

This is not a complaint about these particular web servicesWe need to make COMA loading more robust

we have a plan how to do that.


ConclusionsThis talk reflects an evolving system … current information in the system

is growing based on information available and use cases Adding more dimensions to the Conditions data

With suitable relationships to facilitate queries Making that criteria available in a dynamic useable interface

We want to insure the Metadata is complete enough to satisfy use cases while reflecting accurately its limitations

Interfaces are being constructed to use selection syntax, criteria, and communication in common use in ATLAS

i.e. runQuery, GoodRunList xml …

This facilitates cross checks with other systems

Continuous process: talking with various experts to ensure

data integrity, completeness, compatibility w/other systems


Backup


Early history of this Conditions Metadata project 2007: first Conditions Metadata tables were filled for MC

simulation tests/prototype development Streaming Test, Full Dress Rehearsal (FDR) exercises Run/LB-wise conditions were collected from MC log files

and other sources into relational DB tables (COMA) And INSERTED into then new folders in Conditions DB (COOL) Which formed the prototype for the ATLAS Luminosity calculation Other Trigger/DAQ COOL folders defined at same time

(RichardHawkings, Trigger/TDAQ)

2008: COMA tables used by ELSSI prototype (still just MC) Conditions, particularly trigger configuration … things not

practical to store event-wise 2009: start extracting Run/LB-wise information from COOL

into COMA tables to facilitate efficient access to Conditions Metadata by ELSSI for Online Runs


History of this Conditions Metadata project2009 … (continued … #1) … runBrowser: new interface for finding Runs sharing common

conditions Initially development tool: check data integrity/relationships of COMA

tables We (TAG developers) realized this runBrowser would be more

generally useful … separate Run-browsing (runBrowser) from Event-browsing (ELSSI) … make runBrowser a stand-alone tool

May2009: TAG group developed first DTD for GoodRunList XML This xml was how runBrowser would communicate to ELSSI the Run/LB

selection and selections criteria. This DTD has since been taken up/over by the ATLAS experiment to

communicate good Run/LB ranges between subsystems. NOTE: runBrowser is NOT a replacement for runQuery (the web based

browser to all online Runs in the Conditions DB) runBrowserReport development started

Which will help ELSSI and runBrowser describe underlying COMA info and provides more links to other ATLAS systems

COMA tables expanded to include more conditions in anticipation of use cases / expand selection criteria


History of this Conditions Metadata project2009 … (continued … #2) … As data has been added to COMA tables

runBrowser evolved accordingly At various points … selection criteria included:

Data Source (online,simulation), Data Type, Run Number, Duration, Number of LBs, temporal, DAQ configuration, Filename tag (Project Name), Trigger Super Master Key, Trigger LVL1 and/or HLT Prescale Key, DQ assessments from LBSUMM detectors…

Useful for initial purpose and query development

This implementation demonstrated the basic design principles … but the code was not generally well constructed …


More History of this Conditions Metadata project2009 … (continued #3) … into 2010 … October 2009: Ryan Buckingham (Oxford) joined project

Now up to speed on many aspects of the system. primarily working on runBrowser

Reorganization/enhancement of runBrowser code converting components to object oriented structure improved

internal communication inclusion of criteria to the GRL xml to pass to ELSSI Facilitates eventual system to save user criteria for future sessions and

sharing with others, physics groups Making addition of selection criteria more modular Adding features and functionality to facilitate syntax alignment

with runQuery Improving look/feel of interface


Evolution of overall TAG DB project (last year) TAG DB / ELSSI: evolved to a distributed architecture

Realization: Not possible to upload all TAGs at any one Oracle site Advantageous to have some TAGs at multiple sites …

So ELSSI: needs to know which TAGs are uploaded at which voluntary sites Add new relational schema TAGS CATALOG (Elisabeth Vinek)

Contains processing/upload information needed to deploy distributed TAG services on the grid

TAG DB / CATALOG / COMA Schemas work together Common threads include: Run Number, Stream, Project Name … COMA tables also use the CATALOG

Upload only Run/LB metadata for Runs in CATALOG This reduces handling of conditions anomalies

allowing us to focus on Runs of ‘event analysis interest’ Steps in Database loading – ideally within hours of reconstruction

1. TAGs uploaded to Oracle2. CATALOG tables updated3. COMA tables updated

Now changing to use Tier-0 DB to advance the COMA loading phase


TAG DB Event-wise metadata tables

electron (Et, eta, phi …) muon (Et, eta, phi …) …and references to RAW, ESD, AOD files

Official data processing chain:RAW

Data Catalogue tables Stores information on file and dataset

processing and location Project name AMI tag (what processing occurred)…

Sources: AMI (ATLAS Metadata Catalogue, Tier0 …

‘COMA’ (COnditions MetadatA) tables Conditions of data taking

Beam conditions Trigger and DAQ conditions Magnetic field …

Various sources:Conditions DB, Log files, xml files, email…

Oracle Database: TAG DB and associated metadata tables

RUNS

3. COMA tables

…EventLBRun …EventLBRun

…EventLBRun …EventLBRun

…EventLBRun …EventLBRun…EventLBRun

1. TAG DB: Event-wise metadata

ORACLE DB

2. DATA Catalogue Tables

conditions metadata for tags

Documents

coma tablespurpose

coma tablesthe runbrowser

coma tablesintended

coma tablescoma tables

conditions metadata

tags elizabeth gallas

elssi runbrowser

information elssi