CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
Data Management & Analysis
GS Group Meeting of 2008
April 4th 2008
People
• Ali Boloori: KTH visitor
• Andrew Maier: Staff
• Anton Lechner: Doctoral student
• Birger Koblitz: Staff
• Dan Van Der Ster: Fellow
• Dietrich Liko: Staff
• Fabrizio Furano: Fellow
• Flavia Donno: Staff
• Hurng-Chun Lee: ASGC-funded long-term visitor
• Kuba Moscicki: Staff
• Maarten Litmaath: Staff
• Massimo Lamanna: Staff
• William Ollivier: Technical student
DMA Activities
• Data management
– Several activities
• SRM (and data management at large)
• xrootd (a.k.a. SCALLA)
• AMGA
• Analysis
– Heritage of the ARDA
• Main activity: Ganga
ATLAS Distributed Computing
– Development (M. Lamanna)
  • R. Rocha (Monitor)
    – B. Gaidioz
  • J. Elmsheuser (Distributed analysis)
    – Replacing Dietrich Liko in this role
    – Dan Van Der Ster
  • R. Wenaus (Panda)
  • M. Branco (DDM)
– Operations (A. Klimentov – bd 510 part time)
  • ...
  • S. Campana (Tier 0)
    – Hurng-Chun Lee, Alessandro Di Girolamo
  • B. Koblitz (Central Services)
  • ...
    – Flavia and Maarten
  • ...
Ganga "communities" out there!
HARP, Garfield
Ganga users (most of them happy, I hope :)
• More on projects/activities later...
• Some considerations:
– Data management
  • Key for success. Core business of IT (CASTOR, SRM, FTS, LCG ...) and of previous activities (again SRM, AMGA, xrootd, etc...)
– Analysis
  • Another key for success. Closer to the experiments (users), nevertheless our contribution is central (Ganga, ASAP --> Monitor --> Dashboard, etc...)
DMA Geography
[Diagram; labels: WLCG, IT, IT/GS, GS/DMA, ATLAS, CMS, LHCb, ALICE]
DMA personnel
Massimo Lamanna
Functions
• Section leader
• ATLAS Distributed Computing
– Responsible for the development
• Winding down the roles of ARDA project leader, EGEE NA4 coordinator, EGEE User Forum organiser etc...
– Lots of interesting things done in the last few years
– Several of them still with us
Challenges
• Give my contribution to make ATLAS a success
– This is not a commitment to ATLAS per se. CERN's success coincides with a successful LHC programme, an essential part of which is the success of LCG and of the 4 experiments
• Help you
– to be (even more!) effective in your contribution (as an individual and as a team)
– to discover/use/improve your skills
Dietrich Liko
Dietrich (as seen by Massimo)
• During the 4-year period of ARDA/EGEE1&2 he acted as my deputy
• He is the ATLAS Distributed Analysis Coordinator
– Being replaced by Johannes Elmsheuser (Munich)
• He deserves a great share of the credit for the spectacular growth in users
– Happy users: at the ATLAS SW week in Munich I was personally amazed by the encouraging feedback!
Challenges
• Forget Python, learn physics
– SUSY, I have been told
• Move into new unknown territories:
– CMS and Vienna
• I (Massimo) would like to thank him for the invaluable contribution and wish him all the best for his career and life back in Austria Felix
Hurng-Chun Lee
Hurng-Chun Lee
• Long-term visitor (ASGC) since '05; at NIKHEF starting in May
• WLCG/ATLAS
– first gLite WMS evaluation (ATLAS-EGEE taskforce '05)
– Ganga development: LCG plugin development/maintenance (2006 - now), adopting WMS functionalities and improving grid usability for user analysis
– CERN Panda service test (2008)
– NL T1 manager and contact for ATLAS (2008)
• EGEE/NA4
– exploit and disseminate the grid tools developed by ARDA (DIANE, GANGA, AMGA) in non-HEP communities, especially in the Asia-Pacific area
– initiated the avian flu drug challenge on EGEE in '06: developer, coordinator, and two-way contact person between local biologists and the EGEE biomed community
Hurng-Chun Lee
• Being the interface between ATLAS and the NL T1 grid facility
– understanding both the experiment requirements and the site issues, and trying to improve the communication
– coordinating the production activities within the cloud
• Helping local users to use the grid for data analysis
– maintaining the local analysis environment
– continuing to participate in grid tool development, taking into account the new requirements from local users
Again, special thanks from Massimo and best wishes for the new position at NIKHEF
Flavia Donno
Previous and current activities
Coordinator of the Grid Storage System Deployment (GSSD) Working Group (Jan 2007 – Jan 2008):
• Deploy SRM v2.2 in production by the end of 2007
• Define Glue 1.3 for Storage and validate the corresponding information providers accordingly
• Coordinate the definition and configuration of Storage Classes and Storage Tokens at sites
• Define a plan for migration from SRM v1.1 to SRM v2.2
SRM-enabled Testing Framework
Developer of the S2 testing framework and test families for SRM.
• S2 is distributed through sourceforge.net
• It is used by developers (dCache/CASTOR) to validate their new versions
• It is used by the WLCG certification team
• S2 will be integrated into SAM
Challenges
Technical support for experiments:
• Concentrating on storage management, file transfer, and data access
• … And not only
Member of the Storage Solutions Working Group
• Spec out the addendum to the SRM v2.2 WLCG MoU and detail the identified missing features to be implemented
• Report and follow up storage issues reported by the experiments, suggesting possible workarounds/solutions or coordinating the strategies proposed by the storage providers.
Fabrizio Furano
• CS degree, formerly working in industry (web apps, call switching and modern Contact Center tech), Ph.D. in CS working for INFN-PD (BaBar from 2002 to 2007)
• One of the day-0 core developers of XROOTD (aka Scalla [Structured Cluster Architecture for Low Latency Access])
• Responsible for client side and data transfer optimizations
• Good and long collaboration with the ROOT team
• Now working in IT/GS as xrootd expert (critical for ALICE)
• And for whoever else asks for support or new ideas
• Striving to put its storage efficiency and robustness to new levels
• Making interactive analysis possible
• A passion for high quality sw, music and happy living.
04-Apr-2008 Fabrizio Furano - IT/GS intro
• A carefully crafted solution for 'pure' massive storage
• With hooks for connecting to ext. systems
• Low-latency scalable high performance features
• Structured Clustering provided by cmsd servers (formerly olbd)
• Exponentially scalable and self organizing
• Several 'customers', old and new integrations with many, many systems and tools (bundled with ROOT since 2004)
• Very soon on Savannah as a central point to gather the various information around. Work in progress…
• Some key features:
– Full POSIX access, FUSE support (hence, SRM-open), fully WAN-enabled
– Server clustering (~200K servers per cluster), WAN metaclusters
– Low setup / admin costs. Self organisation. Fault tolerance
– High efficiency (low CPU/byte overhead, small memory footprint)
– Complexity scales linearly (from trivial to overkill, based on requirements)
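To make the "exponentially scalable" clustering claim concrete, here is a rough sketch of how the capacity of a tree of cluster managers grows with its depth. The fan-out of 64 per node is an assumption taken from commonly cited descriptions of xrootd clustering, not from this slide, and both function names are hypothetical.

```python
def max_data_servers(levels: int, fan_out: int = 64) -> int:
    """Upper bound on leaf servers in a tree with `levels` levels below the root.

    fan_out=64 is an assumed value, not a figure stated on the slide.
    """
    return fan_out ** levels


def levels_needed(n_servers: int, fan_out: int = 64) -> int:
    """Smallest number of levels whose capacity covers n_servers."""
    levels, capacity = 0, 1
    while capacity < n_servers:
        capacity *= fan_out
        levels += 1
    return levels


if __name__ == "__main__":
    for depth in range(1, 4):
        print(depth, max_data_servers(depth))
    print(levels_needed(200_000))
```

Under that assumption, three levels are enough to reach the order of magnitude quoted for servers per cluster, and the number of hops needed to locate a file grows only with the tree depth.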
Anton Lechner
• Doctoral student at CERN, working on PhD thesis in Physics: "High precision dosimetry for ion therapy"
– Monte Carlo particle transport simulations using the Geant4 toolkit
– Implementation and validation of EM physics models (low-energy domain)
– Focus on medical applications, primarily heavy ion therapy (MedAustron)
– Distributed computing techniques (Grid) for MC production
• Current and past activities:
– Thorough and quantitative assessment of Geant4 physics models relevant to
the simulation of low-energy heavy ion beams and their secondary products
• Proton Bragg peak validation (examining different EM and hadronic
models)
• Examining precision of computed electron and photon energy deposition
(keV domain)
• Investigation of transmission and backscattering of electrons,...
– Application of Ganga and DIANE for large-scale MC productions on the Grid
Main activities in 2008
• Ion model implementation
– Implementation of various electronic stopping models in Geant4 (for heavy ions): semi-empirical, theoretical, first-principle approaches
– Modelling of straggling
• Validation of new models
– Precise validation of the ion models with respect to reproducibility of experimental data (spatial energy deposition, stopping powers,...)
• Medical simulations
– Application of ion models to medical use cases
– Utilization of Grid infrastructure for medical simulation approaches
Maarten Litmaath
Previous experience in IT/GD
• 2003-2004 Certification section
– Middleware integration, testing, debugging
– Support
  • GD/FIO/PSS, experiments, sites, partner projects
– Liaison to VDT and Globus
• 2005-2007 Service Coordination section
– 2006 Service Challenges (data throughput)
– SRM v2.2, GLUE v1.3
– Representing deployment in gLite Design Team
– Grid Security Vulnerability Group
IT-GS-DMA-Maarten
Plans & Challenges
• Support
– GS/GD/DM/FIO/…
– Experiments
– Sites
• Pilot job frameworks review
• ATLAS Panda server at CERN
• EGEE-OSG interoperability group
• GLUE 2.0
• Grid Security Vulnerability Group
Kuba Moscicki
Kuba
• Ganga development and coordination
– leading the Core developments
– overseeing and organizing the day-to-day life of the project
– integrating new team members with the project infrastructure and the software development environment
• Ganga numbers:
– 1360 users from Jan 2007 (¾ Atlas, ¼ LHCb, ¼ others)
– ~175 users per month on average
– ~15 regular developers in 8 institutes
– 300+ test cases in the testing framework
• Thanks to Adrian Muraru
– he has done an excellent job in supporting Ganga, DIANE, ... over the last 2 years
Kuba
• Development of user-level Grid tools
– DIANE: agent-based lightweight job scheduler improving reliability and efficiency of the basic Grid infrastructure
• Integration of new user communities in the Grid using Ganga and DIANE
– Lattice QCD: 10K jobs in parallel
– ITU: 200K short jobs processed in 2 hours
– regular usage: Geant 4 Monte Carlo, Avian Flu, ...
– 16 distinct applications including commercial and scientific:
• image processing and image recognition
• bioinformatics, theoretical physics
• EGEE Tutorials...
• EGI Application Taskforce
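One common way to realise an agent-based scheduler of the kind described for DIANE is a master/worker pull model: worker agents repeatedly pull small tasks from a shared queue, and failed tasks are put back for another attempt. The sketch below illustrates that general idea with local threads only. It does not use DIANE's actual API; `run_master_worker`, `work_fn` and `max_retries` are hypothetical names chosen for the example.

```python
import queue
import threading


def run_master_worker(tasks, n_workers, work_fn, max_retries=2):
    """Run tasks through a pool of pulling worker agents, retrying failures."""
    todo = queue.Queue()
    for task in tasks:
        todo.put((task, 0))  # (payload, failed attempts so far)
    results, failed = {}, []
    lock = threading.Lock()

    def agent():
        # Each agent keeps pulling work until the queue is empty.
        while True:
            try:
                task, attempts = todo.get_nowait()
            except queue.Empty:
                return
            try:
                value = work_fn(task)
            except Exception:
                if attempts < max_retries:
                    todo.put((task, attempts + 1))  # reschedule the task
                else:
                    with lock:
                        failed.append(task)
            else:
                with lock:
                    results[task] = value

    workers = [threading.Thread(target=agent) for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results, failed


if __name__ == "__main__":
    done, lost = run_master_worker(range(10), n_workers=4, work_fn=lambda x: x * x)
    print(len(done), "tasks finished,", len(lost), "failed")
```

Because agents pull work instead of having it pushed to them, a slow or failing agent only delays the tasks it currently holds, which is where the reliability and efficiency gain of this pattern comes from.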
Andrew Maier
Andrew
• Ganga Core developer
• Responsible for LHCb extensions in Ganga
– Integration of contributions for LHCb in Ganga
– Define release for GangaLHCb
• Interface to DIRAC for Ganga
• Participated in the LHCb bookkeeping task force
• User support for LHCb
– Includes problem solving
– Provide tutorials and training
Andrew
• Participate in Ganga Release Manager tasks
• Initiated and supervised port of Ganga to Windows
• LatticeQCD project: ~30 CPU years in 1 week
Birger Koblitz
Birger Koblitz
• Main activity is coordination of procedures for ATLAS Central Services
– VO-Boxes, Panda, Catalogue services, Production system, Databases
– Service Manager for the VO-Boxes, Panda and the catalogues; very close collaboration with Simone, Maarten & S. Jezequel; contact to FIO
– Test of components, e.g. throughput tests (again with Simone, S. Jezequel)
• Contribution to ADC development
– SRM 2 testing together with Flavia
– Provided Site Index (AMGA based), now in production
Birger Koblitz
• Coordination of AMGA project
– Transition to maintenance mode
– Bug fixes, support
– Supervision of Ali's thesis on WS-DAIR
implementation, collaboration with OGF
Ali Javadzadeh Boloori
• Master student in Software Engineering of Distributed Systems at the Royal Institute of Technology (KTH), Stockholm, Sweden
• Working on the implementation and evaluation of a WS-DAIR compatible web service interface for AMGA, under supervision of Birger Koblitz
• WS-DAIR (Web Services Data Access and Integration - The Relational Realization) is the extension of the DAIS family of specifications for relational database access on the Grid, recommended by OGF. Expected to become a standard
First implementation of WS-DAIR
• Sending result sets in chunks, using iterations
• Caching result sets
• Scrolling in result set, in any direction
• Third party delivery of results
• Standard encoding of data (Java WebRowSet)
• Using GAP application for Avian Flu Drug Discovery as a testbed
– The results have been demonstrated at
  • Super Computing 07, San Diego, Nov. 2007, by Birger
  • EGEE User Forum, Clermont Ferrand, Feb. 2008, by me
[Plot: query throughput [entries/sec] vs. # entries, for text streaming, direct access and indirect access]
[Plot: query throughput [entries/sec] vs. size of chunks, for query sizes 5000, 10000, 20000, 50000 and 100000]
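The chunked delivery, caching and scrolling features listed on this slide can be pictured with a small in-memory model. This is only an illustration of the idea, not the WS-DAIR interface or the AMGA implementation; the class `ChunkedResultSet` and its methods are hypothetical names.

```python
from typing import Any, Iterator, List, Sequence


class ChunkedResultSet:
    """Cached query result that is handed out in fixed-size chunks."""

    def __init__(self, rows: Sequence[Any], chunk_size: int) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self._rows = list(rows)  # result set cached on the service side
        self._chunk_size = chunk_size

    def __len__(self) -> int:
        return len(self._rows)

    def fetch(self, start: int, count: int) -> List[Any]:
        """Return up to `count` rows from index `start` (scroll to any position)."""
        if start < 0 or count < 0:
            raise ValueError("start and count must be non-negative")
        return self._rows[start:start + count]

    def chunks(self) -> Iterator[List[Any]]:
        """Iterate over the whole result set one chunk at a time."""
        for start in range(0, len(self._rows), self._chunk_size):
            yield self.fetch(start, self._chunk_size)


if __name__ == "__main__":
    result = ChunkedResultSet(range(10), chunk_size=4)
    for chunk in result.chunks():
        print(chunk)
    print(result.fetch(4, 2))  # scroll back to an earlier position
```

The chunk size is the tuning knob studied in the second plot: small chunks mean many round trips, large chunks mean large messages.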
William Ollivier
• Starting a 6-month Tech Student contract
• Background
– French, 22 years old
– After high school, 2 years of intensive mathematics and physics courses
– Then 2½ years in a Telecommunications Engineering school, with a 1-year computer science specialization
• The position is part of my school work
My work at CERN
• 2 phases
• 1st phase: Working on improving Dashboard
– Installing and discovering how dashboard works
– Adding new functionalities
• ATLAS SAM
• 2nd phase: Dashboard
– Details to be seen (when ATLAS SAM is in place)
Dan Van Der Ster
Daniel van der Ster
• About me:
– Dutch/Canadian from Vancouver area
– Education: University of Victoria, Computer Engineering
• Research Experience:
– Thesis: “Resource Allocation and Scheduling Strategies using Utility and the Knapsack
Problem on Computational Grids”
• The grid scheduler has many options for allocating a particular task: location,
performance, malleability, reliability.
• Each option has utility to the user, resource owner, or others.
• To optimize the global utility sum, we formulate the allocation as a 0-1 MMKP (multiple-choice multidimensional knapsack problem) and use heuristics to solve it.
• Provides an effective system for enforcing policies on the grid (QoS, Economic,
Timeliness, Reliability)
• Practical Grid Experience:
– Worked with ATLAS and BaBar (SLAC) computing community in Canada.
– GridX1: GT2 grid, Condor-G RB, GRAM i/f to RB. The GridX1 RB was an LCG CE.
– Gavia: GT4 grid with WS-GRAM interface to Condor-G
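The 0-1 MMKP formulation mentioned in the thesis summary can be sketched as follows: each task is a group of options, exactly one option is chosen per group, each option has a utility and a resource vector, and the chosen options must fit within the capacity in every dimension. The greedy upgrade heuristic below is a generic textbook-style approach chosen for illustration; it is not the heuristic from the thesis, and all names are hypothetical.

```python
from typing import List, Optional, Tuple

Option = Tuple[float, List[float]]  # (utility, resource usage per dimension)


def greedy_mmkp(groups: List[List[Option]],
                capacity: List[float]) -> Optional[List[int]]:
    """Pick one option index per group without exceeding any capacity."""
    dims = range(len(capacity))

    # Start from the cheapest option of every group (smallest total usage).
    choice = [min(range(len(g)), key=lambda i: sum(g[i][1])) for g in groups]
    used = [sum(groups[g][choice[g]][1][d] for g in range(len(groups)))
            for d in dims]
    if any(used[d] > capacity[d] for d in dims):
        return None  # even the cheapest selection does not fit

    while True:
        best = None  # (score, group index, option index)
        for g, options in enumerate(groups):
            cur_u, cur_r = options[choice[g]]
            for i, (u, r) in enumerate(options):
                gain = u - cur_u
                if gain <= 0:
                    continue  # only consider upgrades in utility
                new_used = [used[d] - cur_r[d] + r[d] for d in dims]
                if any(new_used[d] > capacity[d] for d in dims):
                    continue  # upgrade would break a capacity constraint
                extra = sum(max(r[d] - cur_r[d], 0.0) for d in dims)
                score = gain / (extra + 1e-9)  # utility gained per extra resource
                if best is None or score > best[0]:
                    best = (score, g, i)
        if best is None:
            return choice  # no feasible upgrade left
        _, g, i = best
        old_r, new_r = groups[g][choice[g]][1], groups[g][i][1]
        used = [used[d] - old_r[d] + new_r[d] for d in dims]
        choice[g] = i


if __name__ == "__main__":
    example_groups = [
        [(1.0, [1, 1]), (3.0, [2, 2]), (6.0, [5, 3])],
        [(2.0, [1, 1]), (4.0, [3, 1])],
    ]
    print(greedy_mmkp(example_groups, capacity=[6, 4]))
```

A greedy pass like this is fast but not guaranteed to find the optimum, which is the usual trade-off when heuristics are used for an NP-hard problem such as the MMKP.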
Challenges
• Become a Ganga expert in no time
– Easy ;)
• Replace Dietrich
– More difficult!!!