Dave Morrison, CHEP, February 7, 2000
PHENIX Offline Computing
David Morrison
Brookhaven National Laboratory
• What we’re doing
• Why we’re doing it
• What we’ve learned by doing it
a word from our sponsors ...
• large collaboration (>400 physicists)
• large, complex detector
  – ~300,000 channels
  – 11 different detector subsystems
• large volume of data, large number of events
  – 20 MB/sec for 9 months each year
  – 10^9 Au+Au events each year
• broad physics program
  – partly because RHIC itself is very flexible
  – Au+Au at 100+100 GeV/A, spin polarized p+p, and everything in-between
  – muons, electrons, hadrons, photons
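As a rough cross-check of the data volume, the quoted rate can be multiplied out. This is only an upper bound under stated assumptions that are not in the talk: 30-day months, continuous data taking at the full 20 MB/sec, and 1 TB = 10^6 MB.

```latex
\[
20\ \mathrm{MB/s} \times (9 \times 30 \times 86\,400)\ \mathrm{s}
  = 20\ \mathrm{MB/s} \times 2.33 \times 10^{7}\ \mathrm{s}
  \approx 4.7 \times 10^{8}\ \mathrm{MB}
  \approx 470\ \mathrm{TB\ per\ year}
\]
```

That is consistent with the "100's of TB" scale mentioned later in the talk.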
from the PHENIX photo album
DPM, in hardhat
the eightfold way of PHENIX offline computing
• know your physics program
  – for PHENIX, event processing rather than event selection
• know your constraints
  – money, manpower ... and tape mounts
• avoid “not invented here” syndrome: beg, borrow, collaborate
  – doesn’t automatically imply use of commercial products
• focus on modularity, interfaces, abstract base classes
• viciously curtail variety of architecture/OS
  – Linux, Solaris
• data management and data access are really hard problems
  – don’t rely on fine-grained random access to 100’s of TB of data
• everyone has their favorite reference works...
  – Design Patterns (Gamma et al)
    • run-time aggregation, shallow inheritance trees
  – The Mythical Man-Month (Brooks)
    • avoid implementation by committee
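The points about abstract base classes, run-time aggregation, and shallow inheritance trees can be illustrated with a minimal C++ sketch. All names here (`Module`, `HitFinder`, `Tracker`, `Chain`, `run`) are hypothetical and written in modern C++; they are not taken from PHENIX code.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Placeholder event payload (hypothetical).
struct Event {
    int nHits = 0;
    std::vector<std::string> log;
};

// One abstract interface; every concrete module sits directly below it,
// so the inheritance tree is only one level deep.
class Module {
public:
    virtual ~Module() = default;
    virtual void process(Event& event) = 0;
};

class HitFinder : public Module {
public:
    void process(Event& event) override {
        event.nHits = 3;  // pretend we found three hits
        event.log.push_back("hits");
    }
};

class Tracker : public Module {
public:
    void process(Event& event) override {
        assert(event.nHits > 0);  // relies on an upstream module having run
        event.log.push_back("tracks");
    }
};

// The processing chain is assembled at run time by aggregation,
// not by deriving ever more specialized classes.
using Chain = std::vector<std::unique_ptr<Module>>;

// Apply every module in order, using a standard algorithm over the collection.
void run(const Chain& chain, Event& event) {
    std::for_each(chain.begin(), chain.end(),
                  [&event](const std::unique_ptr<Module>& m) { m->process(event); });
}
```

A caller would fill a `Chain` with `std::make_unique<HitFinder>()` and `std::make_unique<Tracker>()` and pass it to `run`. Adding a new step means adding one small class, with no change to existing ones.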
building blocks
• small group of “core” offline developers
  – M. Messer, K. Pope, M. Velkovsky, M. Purschke, D. Morrison, (M. Pollack)
• large number of computer-savvy subsystem physicists
  – recruitment via “help wanted” list of projects that need people
• PHENIX object-oriented library, PHOOL (see talk by M. Messer)
  – object-oriented analysis framework
    • analysis modules all share common interface
  – type-safe, flexible data manager
    • extensive use of RTTI, avoids (void *) casts by users
    • ROOT I/O used for persistency
  – “STL” operations on collection of modules or data nodes
• varied OO views on analysis framework design
  – ranging from passive data to “event, reconstruct thyself”
  – PHOOL follows a hybrid approach
• migrated to PHOOL from STAF in early 1999
  – no user code modified (~120,000 LOC)
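To make the "type-safe data manager, RTTI instead of (void *) casts" idea concrete, here is a minimal sketch. It is not the PHOOL API: `Node`, `DataNode`, and `NodeStore` are hypothetical names, and the code uses modern C++ that postdates the talk.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not the PHOOL interface.
// Common polymorphic base so nodes of any payload type share one container.
class Node {
public:
    virtual ~Node() = default;  // virtual destructor enables RTTI on the hierarchy
};

// A node that owns one payload of concrete type T.
template <typename T>
class DataNode : public Node {
public:
    explicit DataNode(T value) : value_(std::move(value)) {}
    T& data() { return value_; }
private:
    T value_;
};

// Name-keyed store of nodes. Lookups are checked with dynamic_cast, so a
// caller asking for the wrong type gets nullptr rather than a misread object.
class NodeStore {
public:
    template <typename T>
    void add(const std::string& name, T value) {
        assert(!name.empty());  // nodes must be addressable by name
        nodes_[name] = std::make_unique<DataNode<T>>(std::move(value));
    }

    template <typename T>
    T* find(const std::string& name) {
        auto it = nodes_.find(name);
        if (it == nodes_.end()) return nullptr;
        auto* typed = dynamic_cast<DataNode<T>*>(it->second.get());
        return typed ? &typed->data() : nullptr;
    }

private:
    std::map<std::string, std::unique_ptr<Node>> nodes_;
};
```

With this shape, `store.find<double>("nHits")` on a node that holds an `int` returns `nullptr`; the user never writes a cast and never reinterprets memory as the wrong type.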
more blocks
• lots of physics-oriented objects in PHENIX code
  – geometry, address/index objects, track models, reconstruction
• file catalog
  – metadata management, tracks related files, tied in with run info DB
• “data carousel” for retrieving files from HPSS
  – retrieval seen as group-level activity (subsystems, physics working groups)
  – carousel optimizes file retrieval, mediates resource usage between groups
  – scripts on top of IBM-written batch system
• event display(s)
  – very much subsystem-centered efforts; all are ROOT-based
  – clearly valuable for algorithm development and debugging
  – value for PHENIX physics analysis much less clear
• GNU build system, Mozilla-derived recompilation (poster M. Velkovsky)
  – autoconf, automake, libtool, Bonsai, Tinderbox, etc.
  – capable, robust, widely used by large audience on variety of platforms
  – feedback loop for code development
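The talk does not say how the data carousel orders its work. Purely as an illustration of "optimize retrieval, mediate between groups", the sketch below assumes one plausible policy: groups take turns picking the next tape, and every pending request on that tape is served during the same mount. The `Request` fields and the `schedule` function are hypothetical.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical request record; not the carousel's real data model.
struct Request {
    std::string group;  // who asked (subsystem or physics working group)
    std::string tape;   // which tape holds the file
    std::string file;
};

// Assumed policy: groups take turns (in map order) choosing the next tape,
// and all pending requests on that tape, from any group, are served together
// so that each tape is mounted only once.
std::vector<Request> schedule(const std::vector<Request>& pending) {
    std::map<std::string, std::deque<Request>> byGroup;  // FIFO queue per group
    for (const Request& r : pending) byGroup[r.group].push_back(r);

    std::vector<Request> order;
    while (order.size() < pending.size()) {
        for (auto& turn : byGroup) {
            if (turn.second.empty()) continue;
            const std::string tape = turn.second.front().tape;  // this group's pick
            for (auto& other : byGroup) {
                std::deque<Request>& q = other.second;
                for (auto it = q.begin(); it != q.end();) {
                    if (it->tape == tape) {
                        order.push_back(*it);
                        it = q.erase(it);
                    } else {
                        ++it;
                    }
                }
            }
        }
    }
    assert(order.size() == pending.size());  // every request scheduled exactly once
    return order;
}
```

A real system would also need fairness across scheduling rounds, priorities, and failure handling; the sketch only shows why batching by tape matters when tape mounts are a scarce resource.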
databases in PHENIX
• Objectivity used for “archival” database needs
  – Objy used in fairly “mainstream” manner
• all Objy DBs are resident online (not storing event data)
  – autonomous partitions, data replicated between counting house, RCF
  – RCF (D. Stampf) ported Objy to Linux
• PdbCal class library aimed at calibration DB application
  – insulates typical user from Objectivity
  – objects stored with validity period, versioning
  – usable interactively from within ROOT
• mySQL used for other database applications
  – Bonsai, Tinderbox system uses mySQL
  – heavily used in “data carousel”
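Storing calibration objects "with validity period, versioning" implies a lookup of the form "give me the newest version valid at this time". A minimal sketch of that lookup follows. The `CalibEntry` layout and the `lookup` function are assumptions for illustration, not the PdbCal schema or interface.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical record layout; not the PdbCal schema.
struct CalibEntry {
    long validFrom;   // start of validity (inclusive), e.g. a Unix time
    long validUntil;  // end of validity (exclusive)
    int version;      // higher number means a newer version
    double gain;      // example payload
};

// Return the highest-version entry whose validity interval contains `when`,
// or an empty optional if no entry covers that time.
std::optional<CalibEntry> lookup(const std::vector<CalibEntry>& bank, long when) {
    std::optional<CalibEntry> best;
    for (const CalibEntry& e : bank) {
        assert(e.validFrom < e.validUntil);  // reject malformed intervals
        const bool covers = e.validFrom <= when && when < e.validUntil;
        if (covers && (!best || e.version > best->version)) best = e;
    }
    return best;
}
```

Keeping older versions in the bank, rather than overwriting them, is what allows an earlier analysis to be reproduced with the calibrations it originally used.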
simplified data flow
[diagram; labels: counting house, disk, NFS, disk, analysis farm, HPSS, Objectivity federated DB (calibrations & conditions)]
OO ubiquitous, mainstream in PHENIX
• subclasses of abstract “Eventiterator” class used to read raw data
  – from online pool, file, or fake test events - user code unchanged
• online control architecture based on CORBA “publish-subscribe”
• Java used in counting house for GUIs, CORBA
• subsystem reconstruction code uses STL, design patterns
  – not unusual to hear “singleton”, “iterator” at computing meetings
• OO emerging out of subsystems faster than from core offline crew
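The Eventiterator point, that user code stays the same whether events come from the online pool, a file, or a fake source, can be sketched as follows. Only the class name `Eventiterator` comes from the talk. The `next()` interface, the `Event` struct, and the two subclasses are assumptions made for illustration.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>

struct Event { int number; };  // placeholder payload (hypothetical)

// Abstract source of events. The interface shown is an assumption,
// not the real PHENIX class definition.
class Eventiterator {
public:
    virtual ~Eventiterator() = default;
    virtual std::optional<Event> next() = 0;  // empty once the source is exhausted
};

// Generates a fixed number of synthetic events, e.g. for tests.
class FakeEventiterator : public Eventiterator {
public:
    explicit FakeEventiterator(int count) : remaining_(count) { assert(count >= 0); }
    std::optional<Event> next() override {
        if (remaining_ == 0) return std::nullopt;
        --remaining_;
        return Event{nextNumber_++};
    }
private:
    int remaining_;
    int nextNumber_ = 0;
};

// Reads whitespace-separated event numbers from any stream (file, string, ...).
class StreamEventiterator : public Eventiterator {
public:
    explicit StreamEventiterator(std::istream& in) : in_(in) {}
    std::optional<Event> next() override {
        int n;
        if (in_ >> n) return Event{n};
        return std::nullopt;
    }
private:
    std::istream& in_;
};

// "User code": depends only on the abstract interface, so it does not
// change when the event source does.
int countEvents(Eventiterator& source) {
    int n = 0;
    while (source.next()) ++n;
    return n;
}
```

`countEvents` works unchanged with a `FakeEventiterator` or with a `StreamEventiterator` wrapping a `std::istringstream` or a file stream; an iterator reading from an online pool would be one more subclass.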
OO experiences
• no Fortran in new post-simulation code
  – sidestepped many awkward F77/C++ issues, allowed OO to permeate
• loosely coupled, short hierarchy design working well
  – information localization on top of information encapsulation
  – allows decoupled, independent development
• no formal design tools, but lots of cloudy chalkboard diagrams
  – usually just a few interacting classes
• social engineering as important as software engineering
  – OO not science-fiction, not difficult ... and it’s here to stay
  – lots of hands-on examples, people are usually pleasantly surprised
more OO experiences
• OO was oversold (not by us!) as a computing panacea
  – does make big computing problem tractable, not trivial
  – occasional need for internal “public-relations”
• cognizance of “distance” between concepts advocated by developers and those held by users
  – e.g., CORBA IDL a great thing; tough to sell to collaboration at-large
• takes time and effort to “get it”, to move beyond “F77++”
  – general audience OO and C++ tutorials have helped
  – also work closely with someone from each subsystem - helps the OO “meme” take hold
summary
• PHENIX computing is essentially ready for physics data
  – use of PHOOL proven very successful during “mock data challenge”
• ObjectivityDB is primary database technology used throughout PHENIX
• reasonably conventional file-oriented data processing model
• loosely coupled, shallow hierarchy OO design
  – common approach across online and offline computing
• several approaches to recruiting, stretching scarce manpower
  – deliberate, explicit choice by collaboration to move to OO
  – recruit manpower from detector subsystems
  – loosely coupled OO design aids loosely coupled development
• OO has slowed implementation, but has been indispensable for design
• PHENIX will analyze physics data because of OO, not in spite of it