The Technical Side
John Gordon
John Gordon - RAL
Outline
How will we manage the technical side of the project?
  Formal Structure
  Pragmatic Structure
Tasks to be done
  MegaTasks
  Important Tasks
Immediate Future
Experiments Board
A BIG Board
From the proposal:
  Chair: 1
  Deputy Chair: 1
  Chairs of each Working Group: 11
  Co-ordinator of the EU DataGrid: 1
  CERN contacts: 2
  Co-ordinators of Tier-1 and all Tier-2 centres: 5
  Cross-members from the Experiments Board: 2
  Total: 23
Too Big?
A board of this size can bring a wide range of technical knowledge to the project
Useful in steering the technical work and reviewing progress...
...but not nimble enough to push the direct day-to-day work forward
  Reduce its size
  Or introduce yet another body
First Consider Workgroups
Have we got the right number?
  Covering the right areas?
  With the right expertise?
Information Flow
GridPP Team
Dissemination/Collaborations (Roger Jones)
Software Support; Experiments; Testbed
(Nick Brook) (Dave Newbold) (Andrew McNab)
Information Services and Data Management (Alex Martin)
Monitoring Services (Robin Middleton)
Mass Storage and Fabric Management (Themis Bowcock)
Workload Management (Dave Colling)
Security (Dave Kelsey)
Networking (Pete Clarke)
CERN (Les Robertson)
What have we learned from EU-DataGrid?
They missed whole areas of work (security)
Topics were spread across WPs (information service)
Topics were duplicated across WPs (information service)
  and needed rationalising
WPs didn’t understand what the others were doing
WPs didn’t understand what they were doing!
Some topics required meetings/workshops with most other WPs
  lots of bi- and tri-lateral meetings required
  more WPs = more meetings
UK
What have we learned from EU-DataGrid?
PTB too big to be effective
A small body of techies (ATF) can make progress
Need to expose work to experiments regularly
  but not every experimenter
Regular lightweight meetings that don’t go into detail but just action the relevant people to progress outside the meeting have proven effective (executive group)
UK
Proposal Submitted
11 workgroups (9 UK technical + CERN + Dissemination)
Fewer Workgroups?
Experiments
“Middleware”: Workload Management, Data Management, Information Services, Monitoring Services, Fabric Management, Mass Storage
“Testbed”: Prototype Grid, Software Support
Networking: Security, Networking
Fewer Workgroups?
Experiments
Networking: Security, Networking
“Site Management”: Fabric Management, Mass Storage
“Grid Management”: Workload Management, Data Management, Information Services, Monitoring Services
“Testbed”: Prototype Grid, Software Support
Do we cover everything?
Other groups?
ATF?
Cross-group themes
Still A BIG Board
Reducing to 3 workgroups:
  Chair: 1
  Deputy Chair: 1
  Chairs of each Working Group: 3
  Co-ordinator of the EU DataGrid: 1
  CERN contacts: 2
  Co-ordinators of Tier-1 and all Tier-2 centres: 5
  Cross-members from the Experiments Board: 2
  Total: 15
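The two board-size tallies (23 with 11 workgroups, 15 with 3) differ only in the number of workgroup chairs. A minimal Python sketch of that arithmetic, using the seat counts shown on the slides; the function name and dictionary keys are illustrative and do not come from the proposal:

```python
# Seats that do not depend on the number of workgroups, as tallied on the slides.
FIXED_SEATS = {
    "chair": 1,
    "deputy_chair": 1,
    "eu_datagrid_coordinator": 1,
    "cern_contacts": 2,
    "tier_centre_coordinators": 5,
    "experiments_board_cross_members": 2,
}


def board_size(n_workgroups: int) -> int:
    """Total board seats: one chair per workgroup plus the fixed seats."""
    return n_workgroups + sum(FIXED_SEATS.values())


print(board_size(11))  # proposal as submitted
print(board_size(3))   # after reducing to 3 workgroups
```

Even with only 3 workgroups, the fixed seats alone account for 12 places, which is why the board stays large.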
Compromise
No ideal
  Too many groups, too many interfaces
  Too few groups, too big a job to lead them
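One rough way to see the "too many interfaces" side of this trade-off: if every workgroup may need to talk to every other one, the number of pairwise interfaces grows as n(n-1)/2. The sketch below is illustrative only; the all-pairs model is an assumption, not something stated in the proposal.

```python
def pairwise_interfaces(n_groups: int) -> int:
    """Number of distinct pairs among n_groups (n choose 2)."""
    return n_groups * (n_groups - 1) // 2


# Compare the submitted structure (11 groups) with 4 or 3 supergroups.
for n in (11, 4, 3):
    print(n, "groups ->", pairwise_interfaces(n), "possible pairwise interfaces")
```

Under this assumption 11 groups give 55 possible interfaces, while 3 or 4 supergroups give 3 or 6, at the cost of each group leader covering a much wider area.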
How to work?
Keep the proposed Technical Board, reflecting any agreed changes to the groups
Workgroups work by themselves to produce their deliverables
Small team to interface workgroups, reconcile overall plan and deal with problems
  Small team to meet weekly/monthly by phone/video
  Representatives of the 3/4 supergroups plus TB chair and PM
  Improve the way we use:
    video
    email
    web
Proposal Deliverables
Prototype I (Mar 2002): Performance and scalability testing of components of the computing fabric (clusters, disk storage, mass storage system, system installation, system monitoring) using straightforward physics applications. Testing of the job scheduling and data replication software from the first DataGrid release.
Prototype II (Mar 2003): Prototyping of the integrated local computing fabric, with emphasis on scaling, reliability and resilience to errors. Performance testing of LHC applications. Distributed HEP and other science application models using the second DataGrid release.
Prototype III (Mar 2004): Full scale testing of the LHC computing model with fabric management and Grid management software for Tier-0 and Tier-1 centres, with some Tier-2 components. This is the prototype system that will be used to define the parameters for the acquisition of the initial LHC production system. This will use the software from the final DataGrid release.
158 Workgroup Deliverables
WG  Name
A   Planning and Management
A   Installation and test job submission via scheduler
A   Develop JCL/JDL
B   Develop Project Plan, Coord. + Manage
B   Schema Repository
B   Releases A
C   Release for Testbed-1
C   Release for Testbed-2
C   Release for Testbed-3
C   Evaluation Report
D   Develop project plan
D   COTS systems development B
D   Integration of existing fabric
D   Fabric benchmarking/evaluation
D   User Portals
D   Fabric demonstrator(s)
D   Evaluation and API Design
D   Prototype API
D   Further Refinement and testing of API
D   Definition of Metadata
D   Prototype Metadata
D   Metadata refinement and testing
E   Gather requirements
E   Survey and track technology
E   Design, implement and test
E   Integrate with other WG/Grids
E   Management of WG
E   DataGrid Security
F   Net-1-A
F   Net-1-B
F   Net-2-A
F   Net-2-B
F   Net-4-A
F   Net-4-B
G   GRID IS
G   Network ops
G   Tier-1 centre ops
G   Management
H   Deployment tools
H   Globus support
H   Testbed team
H   Management
J   Begin foundational package A
K   Support prototypes
K   Extension of Castor for LHC capacity, performance
K   Fabric network management, and resilience
K   Support fabric prototypes
K   High bandwidth WAN – file transfer/access performance
K   WAN traffic instrumentation & monitoring
K   Grid authentication – PKI
K   Authorisation infrastructure for Grid applications – PMI
K   Base technology for collaborative tools
K   Support for grid prototypes
K   Evaluation of emerging object relational technology
A   Installation and test job submission via scheduler
B   Query Optimisation and Data Mining A
C   Technology Evaluation
C   Evaluation Report
D   Implementation of Production API
D   Implementation of production metadata
E   Production phase
F   Net-2-C
F   Net-2-D
F   Net-2-G
F   Net-3-A
G   Security operations
G   GRID IS
G   Tier-1 centre ops
H   Upper middleware/application support
J   Focus and engage A
K   LAN performance
K   High bandwidth firewall/defences
A   Modify SAM
A   Further testing and refinement
A   Profiling HEP jobs and scheduler optimisation
A   Super scheduler development
B   Directory Services
B   Distributed SQL Development
B   Data Replication
B   Query Optimisation and Data Mining B
B   Releases B
B   Liaison
C   Architecture & Design
C   Technology Evaluation
C   Release for Testbed-2
C   Release for Testbed-3
D   Fabric Management Model
D   Establish ICT-industry leader partnerships
D   COTS systems development A
D   Proprietary systems development
D   FM information dissemination
D   Evaluation Report
D   Tape Exchange evaluation & design
D   Design Refinement
D   Tape Exchange Prototype Version
D   Tape Exchange Production Version
E   Architecture
E   Security development
E   DataGrid Security development
F   Net-2-E
F   Net-2-F
F   Net-3-B
H   Globus development
H   S/w development support
H   Upper middleware/application support
J   Begin production phase A
J   QCDGrid – full Grid access of lattice datasets
K   Scalable fabric error and performance monitoring system
K   Automated, scalable installation system
K   Automated software maintenance system
K   Scalable, automated (re-)configuration system
K   Automated, self-diagnosing and repair system
K   Implement grid-standard APIs, meta-data formats
K   Data replication and synchronisation
K   Performance and monitoring of wide area data transfer
K   Integration of LAN and Grid-level monitoring
K   Adaptation of databases to Grid replication and caching
K   Preparation of training courses, material
K   Adaptation of application – science A
K   Adaptation of application – science B
I   ATLAS
I   CMS
I   LHCb
I   ALICE
I   BaBar
I   UKDMC
I   H1
I   CDF
I   D0
K   Provision of basic physics environment for prototypes
K   Support of grid testbeds
K   Adaptation of physics core software to the grid environment
K   Exploitation of the grid environment by physics applications
K   Support for testbeds
K   Middleware support for other sciences
K   Bibliographic metadata
I   ATLAS
I   CMS
I   LHCb
I   BaBar
I   CDF
I   D0
J   Begin foundational package B
J   Focus and engage B
J   Begin production phase B
J   Begin exploitation phase
J   Expand exploitation
J   Value added through Comp. Sci. A
K   Lambda switching prototypes
K   Security monitoring in a Grid environment
K   Portal prototyping
K   Integration of & performance issues with mass storage management at different testbed sites
K   Support of the simulation framework
K   Development of the simulation framework
K   Adaptation to and exploitation of grid environment
K   Development of portal components
K   Development of the base framework
K   Middleware packaging for other sciences
Deliverables
GridPP
  DataGrid
  UK PP Grid (non-DataGrid)
  US
  LEP/HERA?
  Other HEP
  non PP grid (other RC, industry)
Priority Deliverables
Not in order, and not necessarily independent
EU DataGrid
  Prototype Grid Tier1
Running US experiments
  BaBar, CDF, D0 all have feasible grid demonstrators that will be of long-term use to them
  GridPP must foster this work
UK e-science demonstrators
  Show Hey, Halliday, Taylor, CS & industry that we can deliver.
  Vital for continued funding, spending review
Delivery
Experiment Board and Technical Boards need to work together to get workgroups delivering these, whatever they are, as soon as possible.
Discuss
Technical collaborations
People have asked about technical collaborations
Contacts can be made at political levels but collaborations need to be between technical groups
Coordination or just information exchange between technical groups can reduce duplication
The more contacts, the more meetings, the less work
A definite tension exists