The Lab’s Computing Support Strategy for CDF and D0 Victoria White, Associate Lab Director for Computing and CIO October 3, 2011

Page 1

The Lab’s Computing Support Strategy for CDF and D0

Victoria White, Associate Lab Director for Computing and CIO

October 3, 2011

Page 2

Our job in the Computing Sector

• Is to enable science and to optimize the support (human and technological) of the scientific programs of the lab (including the Experiment program)

    • Within funding and resource constraints
    • In the face of growing demands
    • To meet emerging needs
    • To deal with rapidly changing technology

• We also have to provide computing to support the lab’s operations and provide all the standard services that an organization needs (and often expects 24x7)

Page 3

Computing Division -> Computing Sector

Page 4

Fermilab Computing Facilities

• Lattice Computing Center (LCC)
    • High Performance Computing (HPC)
    • Accelerator Simulation, Cosmology nodes
    • No UPS
• Feynman Computing Center (FCC)
    • High availability services – e.g. core network, email, etc.
    • Tape Robotic Storage (3 10,000-slot libraries)
    • UPS & Standby Power Generation
    • ARRA project: upgrade cooling and add HA computing room (completed)
• Grid Computing Center (GCC)
    • High Density Computational Computing
    • CMS, RUNII, Grid Farm batch worker nodes
    • Lattice HPC nodes
    • Tape Robotic Storage (4 10,000-slot libraries)
    • UPS & taps for portable generators

EPA Energy Star award 2010

Page 5

Facilities: more than just space power and cooling – continuous planning

ARRA-funded new high availability computer room in Feynman Computing Center

Page 6

Cooling problems at GCC this summer

• Soaker hoses to cool concrete condenser pad
• Increased computer room operating temperatures
• Numerous air management improvements inside the computer room, including cold aisle containment test
• Extended monitoring outside to the condenser pad
• Executed load shed plan twice during hottest days
• Rented portable air conditioning for use in CRB & outside under the condensers (the latter was effective, not efficient)

The air intake to the condensers can reach temperatures of 120 °F (20–25 °F above ambient on the pad), causing the cooling to shut down

• $650–950k to move condensers to a platform for Computer Rooms B and C
    • Rough estimate from FESS
    • Does not include Computer Room A
    • Better estimate when the study is complete in November

Page 7

Need to fix Grid Computing Center quickly – ready for next summer

• Need to be able to use the computer rooms as designed and plan for that going forward.

• Need to move forward with CRA renovations for greater power per rack.

• We cannot do this and run everyone ragged and be unreliable every summer

Page 8

Run II Computing Strategy

• Production processing and Monte-Carlo production capability after the end of data taking
    • Ability to do some reprocessing if needed
    • Monte Carlo production at the current rate through mid-2013?
• Analysis computing capability for at least 5 years, but diminishing after end of 2012
    • Push for 2012 conferences for many results – no large drop in computing requirements through this period
• Continued support for up to 5 years for
    • Code management and science software infrastructure
    • Data handling for production (+MC) and Analysis
    • Operations
• Curation of the data: > 10 years with possibly some support for continuing analyses

Page 9

We have pushed/insisted on sharing strategies for computing for many years – why?

• Cost
• Coherent technical approaches and architectures
• Support over the entire lifecycle of an experiment/project

Page 10

Experiment/Project Lifecycle and funding

• Early period: R&D, simulations, LOI, proposals
• Mature phase: construction, operations, analysis
• Final data-taking and beyond: final analysis, data preservation and access

[Diagram labels: Shared services; Expt or Project specific]

Page 11

Sharing via the Grid – FermiGrid

[Diagram: FermiGrid]

• Grids: TeraGrid, WLCG, NDGF, Open Science Grid
• User Login & Job Submission
• FermiGrid services: Site Gateway, Authentication/Authorization Services, Infrastructure Services, Monitoring/Accounting Services
• Job slots by cluster: Grid Farm 3284 slots, CMS 7485 slots, CDF 5600 slots, D0 6916 slots

Page 12

Budget/resource allocation for 2012 +

• There is always upward pressure for computing
    • More disk and more CPU lead to faster results and greater flexibility
    • More help with software & operations is always requested
• Within a fixed budget each experiment can usually optimize between tape drives, tapes, disk, CPU, servers, assuming basic shared services are provided.
• With so many experiments in so many different stages we intend to convene a “Scientific Computing Portfolio Management Team” to examine the needs/computing models of the different Fermilab-based experiments and help in allocating the finite dollars to optimize scientific output.

Page 13

“Data Preservation” for Tevatron data

• Data will be stored and migrated to new tape technologies for ~10 years
    • Eventually 16 PB of data will seem modest
• If we want to maintain the ability to reprocess and do analysis on the data there is a lot of work to be done to keep the entire environment viable
    • Code, access to databases, libraries, I/O routines, Operating Systems, documentation…
• If there is a goal to provide “open data” that scientists outside of CDF and Dzero could use there is even more work to do.
• 4th Data Preservation Workshop at Fermilab in May
• The collaboration has to decide – soon – if we need to do more than maintain data for collaboration use.