The OSG Open Facility: A sharing ecosystem

Bodhitha Jayatilaka, Tanya Levshina, Chander Sehgal, Marko Slyz (Fermilab)
Mats Rynge (ISI/USC)

CHEP 2015, Okinawa, Japan
April 13, 2015



Bo Jayatilaka April 13, 2015

The Open Science Grid

• Goal: Advance science through open distributed high throughput computing (DHTC) in the US
− Operates as a partnership between resource providers (sites) and stakeholders (scientific communities of users)
− OSG project provides middleware and R&D effort
− Over 120 sites (computing and storage elements)
▪ Mix of university and national labs; LHC experiment computing sites and campus clusters
− Uses the Virtual Organization (VO) model
▪ Most VOs correspond to large HEP experiments and university communities
• Funded by the US Dept. of Energy and National Science Foundation
− Second 5 year period of support


State of the OSG


http://display.grid.iu.edu


Growth of the OSG


April 2010-April 2015


Expanding usage of the OSG

• Most use of the OSG is by large physics experiments (e.g. ATLAS and CMS) and individual university communities
− Primarily operate on their own sites
▪ Utilization is high but not 100% 24/7
− 85% of OSG hours in 2011 were from large HEP experiments
• Want to connect unused cycles with researchers who are not part of large VOs
• Reaching other sciences takes more work
− Few have the organized computing efforts HEP experiments have
− Not every research community can have their own VO (although many do)

Figure: 2011 OSG usage by VO (labels: atlas, cms, ligo, cdf, dzero)

The OSG open facility

• Implement OSG VO as a VO for individual researchers
− Allow any researcher from US institutions
− Resources accessed by the OSG VO are entirely opportunistic
− All OSG sites encouraged to support OSG VO running
• An XSEDE service provider
− Currently the only DHTC provider

OSG VO Growth 2011-2014 (wall hours per year)

Year    Wall hours
2011    400,568
2012    3,168,025
2013    47,931,106
2014    97,009,409
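The year-over-year growth implied by these wall-hour totals can be checked with a few lines of Python. This is a minimal sketch; it assumes the four values read off the chart map to 2011 through 2014 in order, and the names `wall_hours` and `growth_factors` are illustrative.

```python
# Wall hours per year for the OSG VO, as read off the chart
# (assumed to map to 2011-2014 in order).
wall_hours = {
    2011: 400_568,
    2012: 3_168_025,
    2013: 47_931_106,
    2014: 97_009_409,
}


def growth_factors(series):
    """Return {year: value / previous year's value} for consecutive years."""
    years = sorted(series)
    return {y: series[y] / series[p] for p, y in zip(years, years[1:])}


factors = growth_factors(wall_hours)
for year, factor in factors.items():
    print(f"{year}: {factor:.1f}x the previous year")
```

Under that assumption the usage grew by roughly 8x, 15x, and 2x in successive years.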

http://xsede.org


On-ramp to the open facility

Figure: Easier “On-Ramp” to the OSG DHTC Fabric (diagram dated March 14, 2014). Diagram labels: OSG DHTC Fabric (>100 sites); GlideinWMS; OSG Flocking Node; Interactive Login Node; XSEDE Users; OSG-Direct Users; OSG-Connect; Duke-Connect; iPlant; Virginia Tech; BakerLab; ISI; Others.

All access operates under the OSG VO using glideinWMS

OSG Connect

• Make OSG seem like a virtual campus cluster
− Login host
− Environmental modules
− Job scheduling
− Job data (including monitoring)
• Jobs run on the OSG as part of the OSG VO
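To make the "job scheduling" item concrete: glideinWMS is built on HTCondor, so a job sent from a login host is typically described by an HTCondor submit description. The sketch below only renders such a description as text. The script name, project name, resource requests, and the `+ProjectName` attribute are assumptions for illustration and are not taken from the talk.

```python
def render_submit_description(executable, n_jobs, project):
    """Render a minimal HTCondor submit description as text.

    All values passed in are illustrative; nothing here comes from the talk.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "arguments = $(Process)",
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        "request_cpus = 1",
        "request_memory = 2 GB",
        # Assumed: a custom job attribute naming the project to account against.
        f'+ProjectName = "{project}"',
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical script and project names.
submit_text = render_submit_description("simulate.sh", 100, "MyProject")
print(submit_text)
```

The rendered text would be saved to a file and handed to `condor_submit` on the login host; each of the 100 queued jobs receives its own `$(Process)` index as an argument.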


Science on the open facility

30-40 unique projects use the Open Facility every month

23 peer-reviewed publications in 2014

Projects using the open facility


https://oim.grid.iu.edu/oim/project

154 registered projects (39 XSEDE)


Science on the open facility


Bo Jayatilaka March 25, 2015

Detector Design

• Non-invasive diagnosis of coronary artery disease requires imaging of the myocardium
− Predominantly with SPECT (single photon emission computed tomography)
− Radio-pharmaceuticals+collimators and radiation detectors allow 3D imaging of the myocardium
− Literally see areas not getting enough blood!
• Design of new C-SPECT optimized for cardiac imaging (cheaper x2 and more sensitive x2-3)
− Extensive Monte Carlo simulations utilized in design studies as well as in calibrating prototypes

John Strologas, Wei Chang, Rush University

Agent-based modeling of disease transmission in white-tailed deer

• Chronic wasting disease (CWD) in white-tailed deer
− Prion disease (like mad cow) causing fatal degeneration of the brain
− Not much known about the disease and its long-term effects on deer population
• Simulate spread of CWD with an agent-based model (DeerLandscapeDisease)
− Each deer modeled separately with its own behavioral rules (landscapes, empirically-based movement, deer behavior, disease)
− OSG used for CPU
• DLD shows transmission may be a mix of direct (contact) and indirect (environmental) transmission
− Potential for coexistence
− Landscape and social structure play a major part in CWD transmission

Lene Jung Kjaer, Southern Illinois University
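The idea of an agent-based model with both direct and indirect transmission can be illustrated with a toy simulation. This is not the DeerLandscapeDisease model: the ring-shaped landscape, the random-walk movement rule, every parameter value, and the absence of births and deaths are assumptions chosen only to keep the sketch short.

```python
import random


def simulate(n_deer=200, n_cells=25, days=100, p_direct=0.05,
             p_indirect=0.01, shed=0.1, decay=0.98, seed=1):
    """Toy agent-based spread of a disease; returns infected counts per day."""
    rng = random.Random(seed)
    position = [rng.randrange(n_cells) for _ in range(n_deer)]
    infected = [False] * n_deer
    infected[0] = True                      # one index case
    contamination = [0.0] * n_cells         # environmental reservoir per cell
    history = []
    for _ in range(days):
        # Movement: each deer stays put or steps to an adjacent cell on a ring.
        for i in range(n_deer):
            position[i] = (position[i] + rng.choice((-1, 0, 1))) % n_cells
        # Count infected deer in each cell.
        carriers = [0] * n_cells
        for i in range(n_deer):
            if infected[i]:
                carriers[position[i]] += 1
        # Each susceptible deer faces direct (contact) and indirect
        # (environmental) infection risk in its current cell.
        newly_infected = []
        for i in range(n_deer):
            if infected[i]:
                continue
            cell = position[i]
            p_contact = 1 - (1 - p_direct) ** carriers[cell]
            p_env = min(1.0, p_indirect * contamination[cell])
            if rng.random() < p_contact or rng.random() < p_env:
                newly_infected.append(i)
        for i in newly_infected:
            infected[i] = True
        # Infected deer shed into the environment; contamination decays slowly.
        for cell in range(n_cells):
            contamination[cell] = contamination[cell] * decay + shed * carriers[cell]
        history.append(sum(infected))
    return history


history = simulate()
print(history[0], history[-1])
```

Each parameter combination is an independent run, which is why this kind of study maps naturally onto many opportunistic CPU slots.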

Enumeration of all $(C_5, C_5, C_5; n)$ Colorings

• A $(G_1, G_2, \ldots, G_k; n)$ coloring is a coloring of the edges of the complete graph on $n$ vertices with $k$ colors such that the $i$-th color does not contain the graph $G_i$
• $R_3(C_5) = 17$. How many $(C_5, C_5, C_5; 16)$ colorings are there?
− What is their structure?
− How are they different from the canonical construction?
• Start with smaller $C_5$ colorings
− Initial input of 140M $n = 12$ colorings
− Iterated by adding one vertex at a time
• Result: 1,701,746,176 Ramsey 3-colorings avoiding $C_5$, the cycle of length 5, in all 3 colors
− More than 4 (wall time) months of processing
− More than $1.7 \times 10^9$ colorings (around 67 GB of data)

Figure: Canonical construction of a $(C_5, C_5, C_5; 16)$ coloring

David Narváez, Stanislaw Radziszowski, Rochester Institute of Technology
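The "add one vertex at a time" step can be sketched in Python as a brute-force extension: try every way of coloring the edges from a new vertex and keep the colorings that still avoid a monochromatic 5-cycle. This is only an illustration of the definition. The function names are hypothetical, and the actual enumeration almost certainly relied on far more efficient methods (for example, discarding isomorphic colorings), which are not shown here.

```python
from itertools import combinations, permutations, product


def edge(u, v):
    """Canonical (small, large) key for the undirected edge {u, v}."""
    return (u, v) if u < v else (v, u)


def has_monochromatic_c5(n, coloring):
    """True if some 5-cycle in K_n has all five edges in one color.

    `coloring` maps edge(u, v) -> color for every pair of vertices in range(n).
    """
    for verts in combinations(range(n), 5):
        first, rest = verts[0], verts[1:]
        # Fixing the first vertex visits every 5-cycle on these vertices
        # (each one twice, once per direction).
        for order in permutations(rest):
            cycle = (first,) + order
            colors = {coloring[edge(cycle[i], cycle[(i + 1) % 5])]
                      for i in range(5)}
            if len(colors) == 1:
                return True
    return False


def extend_by_one_vertex(n, coloring, k):
    """Yield every C5-free k-coloring of K_{n+1} that extends `coloring`."""
    new_vertex = n
    for choice in product(range(k), repeat=n):
        candidate = dict(coloring)
        for v, color in enumerate(choice):
            candidate[edge(v, new_vertex)] = color
        if not has_monochromatic_c5(n + 1, candidate):
            yield candidate


# Pentagon/pentagram 2-coloring of K_5: both color classes are 5-cycles.
pentagon = {edge(i, j): 0 if (j - i) % 5 in (1, 4) else 1
            for i, j in combinations(range(5), 2)}
# Star at vertex 0 in color 0, remaining K_4 in color 1: no 5-cycle in either.
star = {edge(i, j): 0 if i == 0 else 1
        for i, j in combinations(range(5), 2)}
# Extend the all-color-0 K_4 by one vertex using 2 colors.
k4 = {e: 0 for e in combinations(range(4), 2)}
extensions = list(extend_by_one_vertex(4, k4, 2))
print(has_monochromatic_c5(5, pentagon), has_monochromatic_c5(5, star), len(extensions))
```

In the small example, the new vertex may have at most one color-0 edge into the all-color-0 $K_4$, so 5 of the 16 candidate extensions survive.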

Collider phenomenology on the grid

• LHC final states are very complex and interesting signals are often in corners of phase space
• Precise theoretical prediction of known processes crucial in finding new physics
• OSG resources used for refining modeling of SM processes and reducing theory uncertainties
− gluon fusion Higgs+jets production
− top quark pairs+jets
− Both cases ≤2 jets @NLO and 3 jets @LO
• Uses Sherpa+Rivet+MCFM+OpenLoops
• Each process used O(500k) hours over 1-2 weeks

Stefan Höche, SLAC

Simulation of Standard Model Higgs-Boson Production
[Krauss,Schonherr,SH] arXiv:1401.7971

• Combines NLO QCD calculations for pp → h + 0, 1 & 2-jet plus 3-jet at LO
• Three different functional forms of scale + individual uncertainty estimates
• Resummation uncertainty remains in jet-vetoed region relevant for VBF

Figure (Sherpa MEPS@NLO): Transverse momentum of the H j1 j2 system after VBF cuts, dσ/dp⊥ [pb/GeV] versus p⊥(h j1 j2), for µR = µCKKW, µR = mh, and µR = H′T, with a panel showing the ratio to µR = mh.
Figure: Higgs boson transverse momentum (njet = 0), dσ/dp⊥ [pb/GeV] versus p⊥(H) [GeV], with a ratio panel; shower variations Evol 0/1 and Kin 0/1.

Simulation of Top Quark Pairs
[Krauss,Maierhofer,Pozzorini,Schonherr,Siegert,SH] arXiv:1402.6293

• First matched/merged simulation for tt+2j; full result has tt+0,1,2j@NLO, 3j@LO
• Largely reduced theory uncertainty for both measurement (pT, Njet) and BSM search (HT) observables

Figure (Sherpa+OpenLoops): Total transverse energy, dσ/dHT^tot [pb/GeV] versus HT^tot [GeV], with a panel showing the ratio to MEPS@NLO; legend includes a rescaled MEPS@LO curve (scale factor not recoverable) and S-MC@NLO.
Figure (Sherpa+OpenLoops): Light jet transverse momenta, dσ/dpT [pb/GeV] versus pT (light jet) [GeV], for the 1st, 2nd, and 3rd jet, each with a panel showing the ratio to MEPS@NLO.

Source: Stefan Hoche, Pheno on the Grid, in “Science on the OSG in 2014”, OSG AHM 3/25/15

Science on the open facility

Building the Open Quantum Materials Database
C. Wolverton (Northwestern)

How many compounds in our database are stable?

OQMD database:
• Total 297,099 compounds
• 19,757 T=0K stable
• 16,118 from ICSD
• 3487 “prototype” structures

Each of these cases represents a prediction of a system where new compounds should exist! Many gaps in our current knowledge of phase stability…

Figure: The rate of compound discovery (total and stable) within the ICSD by year (series: All ICSD Structures; Stable in OQMD).

Combatting Concussions using “Virtual Recordings” (MEG)
D. Krieger (Pittsburgh)

The Measurement: Continuously record the extracranial magnetic field while the volunteer performs a cognitive task …

Introducing StashCache

• Caching infrastructure based on SLAC Xrootd server & xrootd protocol.
• Each VO has an origin server.
• Cache servers are placed at several strategic cache locations across the OSG.
• Jobs utilize “nearby” cache, for some definition of nearby.

Figure: OSG Data Federation. Diagram labels: sources OSG-XD, OSG-Connect, IF, and GLOW; OSG Redirector; four caching proxies; jobs; arrows labeled Download, Redirect, and Discovery.
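The two ideas on this slide, picking a "nearby" cache and serving repeated reads from the cache instead of the origin, can be sketched as follows. This is a toy model, not StashCache or XRootD code: great-circle distance stands in for "some definition of nearby", an in-memory dict stands in for the origin server, and the cache names, coordinates, and file path are all hypothetical.

```python
import math


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


class Cache:
    """Read-through cache: fetch from the origin on a miss, then serve locally."""

    def __init__(self, name, location, origin):
        self.name = name
        self.location = location
        self.origin = origin          # dict standing in for a VO's origin server
        self.store = {}
        self.origin_reads = 0

    def read(self, path):
        if path not in self.store:    # cache miss: go back to the origin once
            self.store[path] = self.origin[path]
            self.origin_reads += 1
        return self.store[path]


def nearest_cache(job_location, caches):
    """One possible definition of 'nearby': smallest great-circle distance."""
    return min(caches, key=lambda c: great_circle_km(job_location, c.location))


# Hypothetical origin contents, cache sites, and job location.
origin = {"/user/example/input.dat": b"payload"}
caches = [
    Cache("cache-a", (41.9, -87.6), origin),
    Cache("cache-b", (32.7, -117.2), origin),
]
chosen = nearest_cache((43.1, -89.4), caches)
first = chosen.read("/user/example/input.dat")
second = chosen.read("/user/example/input.dat")
print(chosen.name, chosen.origin_reads)
```

When many jobs at one site read the same input, only the first read reaches the origin; the rest are served by the cache.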

Future growth and technologies

• Open Facility now provides more than 100M CPU hours per year
• Ongoing challenges
− Understanding available resources after LHC Run 2 starts
− Finding appropriate users to provide continued demand
• Developing auxiliary technologies useful for opportunistic users
− Data movement is a key limitation
− Currently inefficient beyond ~10GB/job
• Developing XRootD-based caching system “StashCache”
− Increases effective data movement to ~1TB/job
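For a sense of scale, the 100M CPU hours per year figure can be converted into an average number of continuously busy cores. This is a back-of-the-envelope sketch that assumes a 365-day year and that one CPU hour means one core busy for one hour.

```python
CPU_HOURS_PER_YEAR = 100_000_000      # "more than 100M CPU hours per year"
HOURS_PER_YEAR = 365 * 24             # assumes a non-leap year

# Average number of cores that would have to run around the clock
# to deliver this many CPU hours in one year.
average_busy_cores = CPU_HOURS_PER_YEAR / HOURS_PER_YEAR
print(f"about {average_busy_cores:,.0f} cores busy on average")
```

Under these assumptions the Open Facility corresponds to somewhat more than 11,000 cores running continuously, all of it opportunistic.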

Conclusions

• Opportunistic resources and their consumption continue to grow on the OSG
− OSG Open Facility provides more than 100M CPU hours/year
• A vibrant opportunistic user community co-exists with large (site-owning) VOs
− A wide range of research is conducted (and published) using results obtained using OSG resources
• OSG Consortium committed to supporting these users in addition to large VOs
− Principle of sharing is key to OSG vision and goals
− Enshrined in by-laws: Consortium members recognize that the OSG is a sharing eco-system and strive to maximize the sharing of computing resources, software, and other assets to enable science.

Backup