Data Placement: HPC and Portals
Eli Dart, Science Engagement, Energy Sciences Network (ESnet), Lawrence Berkeley National Laboratory
Panel: HPC in Cloud or Cloud in HPC
SC18, Dallas, TX, November 13, 2018

TRANSCRIPT

Page 1: Data Placement: HPC and Portals - SC18sc18.supercomputing.org/proceedings/panel/panel... · • “Science DMZ” may not look like a discrete entity here – By the time you get

Data Placement: HPC and Portals

Eli Dart, Science Engagement, Energy Sciences Network (ESnet), Lawrence Berkeley National Laboratory

Panel: HPC in Cloud or Cloud in HPC

SC18, Dallas, TX

November 13, 2018

Page 2:

The Importance of Data Placement

• Data placement is a key capability
• Whether running on HPC, a local cluster, or the cloud, data must be accessible to running code
• Older methods of data placement don't scale
• Modern capability is composed of several components
– Architecture
– Tools/Platforms
– Interconnected facilities

11/19/18

Page 3:

Science DMZ in HPC Facility

© 2014, Energy Sciences Network – ESnet Science Engagement ([email protected]) – 11/19/18

[Diagram: Science DMZ in an HPC facility. A border router connects the WAN to a core switch/router; offices sit behind a firewall, while the Data Transfer Nodes, supercomputer, and parallel filesystem attach through front-end switches on a routed path that bypasses the firewall. perfSONAR hosts sit at the border, the core, and the DTNs. The WAN path is high latency; the LAN path is low latency.]

http://fasterdata.es.net/science-dmz/

Page 4:

Building On The Science DMZ

• An enhanced cyberinfrastructure substrate now exists
– Wide area networks (ESnet, GEANT, Internet2, regionals)
– Science DMZs connected to those networks
– DTNs in the Science DMZs

• What does the scientist see?
– The scientist sees a science application
• Data transfer, data portal, data analysis
• The user interface is critical to productivity
– Science applications are the user interface to networks and DMZs

• Large-scale data-intensive science requires that we build larger structures on top of those components
– Platforms scale better than tools
– What does the scientist spend time on: tool integration, or science?


Page 5:

DTN Cluster Performance – HPC Facilities (2017)

[Figure: Petascale DTN Project (November 2017) – map of transfers among four HPC facility DTN clusters, annotated with the rate for each pair in Gbps (min/avg/max over three transfers): 21.2/22.6/24.5, 23.1/33.7/39.7, 26.7/34.7/39.9, 33.2/43.4/50.3, 35.9/39.0/40.7, 29.9/33.1/35.5, 34.6/47.5/56.8, 44.1/46.8/48.4, 41.0/42.2/43.9, 33.0/35.0/37.8, 43.0/50.0/56.3, 55.4/56.7/57.4.]

Endpoints:
– NERSC DTN cluster – Globus endpoint: nersc#dtn, filesystem: /project
– ALCF DTN cluster – Globus endpoint: alcf#dtn_mira, filesystem: /projects
– OLCF DTN cluster – Globus endpoint: olcf#dtn_atlas, filesystem: atlas2
– NCSA DTN cluster – Globus endpoint: ncsa#BlueWaters, filesystem: /scratch

Data set: L380 (November 2017)
– Files: 19,260; directories: 211; other files: 0
– Total bytes: 4,442,781,786,482 (4.4 TB)
– Smallest file: 0 bytes; largest file: 11,313,896,248 bytes (11 GB)
– Size distribution: 1–10 bytes: 7 files; 10–100 bytes: 1 file; 100 B–1 KB: 59 files; 1–10 KB: 3,170 files; 10–100 KB: 1,560 files; 100 KB–1 MB: 2,817 files; 1–10 MB: 3,901 files; 10–100 MB: 3,800 files; 100 MB–1 GB: 2,295 files; 1–10 GB: 1,647 files; 10–100 GB: 3 files

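As a quick sanity check on the L380 data-set statistics above, the per-bucket file counts should add up to the reported total. A small Python sketch, using only the numbers from the slide:

```python
# File-count buckets from the L380 data set slide.
size_distribution = {
    "1-10 B": 7, "10-100 B": 1, "100 B-1 KB": 59, "1-10 KB": 3170,
    "10-100 KB": 1560, "100 KB-1 MB": 2817, "1-10 MB": 3901,
    "10-100 MB": 3800, "100 MB-1 GB": 2295, "1-10 GB": 1647, "10-100 GB": 3,
}
total_files = sum(size_distribution.values())
print(total_files)  # 19260, matching the slide

# Mean file size works out to roughly 230 MB, despite thousands of
# small files -- exactly the mixed workload a DTN has to handle well.
avg_size_mb = 4442781786482 / total_files / 1e6
print(round(avg_size_mb, 1))  # 230.7
```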

Page 6:

Science Data Portals

• Large repositories of scientific data

– Climate data

– Sky surveys (astronomy, cosmology)

– Many others

– Data search, browsing, access

• Many scientific data portals were designed 15+ years ago

– Single-web-server design

– Data browse/search, data access, user awareness all in a single system

– All the data goes through the portal server

• In many cases by design

• E.g. embargo before publication (enforce access control)


Page 7:

Legacy Portal Design

[Diagram: legacy portal. A single portal server behind the enterprise firewall handles the browsing path, the query path, and the data path; 10GE links connect the border router, firewall, and portal server, with the filesystem (data store) behind the portal server. perfSONAR hosts sit at the border and in the enterprise network. Portal server applications: web server, search, database, authentication, data service.]


• Very difficult to improve performance without architectural change
– The software components are all tangled together
– Difficult to put the whole portal in a Science DMZ because of security
– Even if you could put it in a DMZ, many components aren't scalable
• What does architectural change mean?

Page 8:

Next-Generation Portal Leverages Science DMZ

[Diagram: next-generation portal. The portal server (web server, search, database, authentication) stays behind the enterprise firewall and handles only the query/browse path; the data transfer path goes through DTNs in the Science DMZ, including API DTNs whose data access is governed by the portal. The border router connects the WAN to the Science DMZ switch/router; the filesystem (data store) feeds the DTNs. perfSONAR hosts monitor the border, the enterprise network, and the Science DMZ. 10GE links throughout.]


https://peerj.com/articles/cs-144/

Page 9:

JGI Data Portal


Page 10:


Page 11:

Links

– ESnet fasterdata knowledge base: http://fasterdata.es.net/
– Science DMZ paper: http://www.es.net/assets/pubs_presos/sc13sciDMZ-final.pdf
– Science DMZ email list: https://gab.es.net/mailman/listinfo/sciencedmz
– perfSONAR: http://fasterdata.es.net/performance-testing/perfsonar/ and http://www.perfsonar.net


Page 12:

Thanks!

Eli Dart, Energy Sciences Network (ESnet), Lawrence Berkeley National Laboratory

http://my.es.net/

http://www.es.net/

http://fasterdata.es.net/

Page 13:

ESnet – the basic facts: a high-speed international networking facility, optimized for data-intensive science:
• connecting 50 labs, plants, and facilities with >150 networks, universities, and research partners globally
• supporting every science office, and serving as an integral extension of many instruments
• 400Gbps transatlantic extension in production since Dec 2014
• >1.3 Tbps of external connectivity, including high-speed access to commercial partners such as Amazon
• a growing number of university connections to better serve LHC science (and eventually Belle II)
• older than the commercial Internet, and growing roughly twice as fast

Areas of strategic focus: software and science engagement.
• Engagement effort is now 12% of staff
• Software capability is critical to the next-generation network


Page 14:

Extra Slides – Science DMZ Talk


Page 15:

Overview

• Science DMZ Motivation and Introduction

• Science DMZ Architecture

• Performance Monitoring

• Data Transfer Nodes & Applications

• Science DMZ Security

• Wrap Up


Page 16:

Motivation

• Networks are an essential part of data-intensive science
– Connect data sources to data analysis
– Connect collaborators to each other
– Enable machine-consumable interfaces to data and analysis resources (e.g. portals), automation, and scale

• Performance is critical
– Exponential data growth
– Constant human factors
– Data movement and data analysis must keep up

• Effective use of wide area (long-haul) networks by scientists has historically been difficult

Page 17:

The Central Role of the Network

• The very structure of modern science assumes science networks exist: high performance, feature rich, global in scope

• What is “The Network” anyway?
– “The Network” is the set of devices and applications involved in the use of a remote resource
• This is not about supercomputer interconnects
• This is about data flow from experiment to analysis, between facilities, etc.
– User interfaces for “The Network” – portal, data transfer tool, workflow engine
– Therefore, servers and applications must also be considered

• What is important? An ordered list:
1. Correctness
2. Consistency
3. Performance

Page 18:

TCP – Ubiquitous and Fragile

• Networks provide connectivity between hosts – how do hosts see the network?
– From an application’s perspective, the interface to “the other end” is a socket
– Communication is between applications – mostly over TCP

• TCP – the fragile workhorse
– TCP is (for very good reasons) timid – packet loss is interpreted as congestion
– Packet loss in conjunction with latency is a performance killer
– Like it or not, TCP is used for the vast majority of data transfer applications (more than 95% of ESnet traffic is TCP)

Page 19:

A small amount of packet loss makes a huge difference in TCP performance

[Figure: TCP throughput vs. distance (local/LAN, metro area, regional, continental, international) for measured TCP Reno, measured HTCP, theoretical TCP Reno, and a measured no-loss baseline.]

With loss, high performance beyond metro distances is essentially impossible
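The figure’s point can be reproduced from the Mathis et al. TCP Reno throughput bound, rate ≤ MSS / (RTT · √loss). The RTT values below are illustrative assumptions for each distance class, not measurements from the talk:

```python
import math

MSS_BITS = 1460 * 8  # a 1460-byte TCP payload, in bits

def mathis_limit_gbps(rtt_s: float, loss: float) -> float:
    """Upper bound on single-stream TCP Reno throughput (Gbit/s)."""
    return MSS_BITS / (rtt_s * math.sqrt(loss)) / 1e9

# Even a tiny loss rate (0.001%) caps long-distance TCP far below 10G.
for label, rtt in [("LAN", 0.001), ("Metro", 0.005),
                   ("Continental", 0.050), ("International", 0.150)]:
    print(f"{label:13s} {mathis_limit_gbps(rtt, 1e-5):7.3f} Gbit/s")
```

At 1 ms RTT the bound is a few Gbit/s; at 50 ms it falls below 0.1 Gbit/s – the “essentially impossible beyond metro distances” effect in the figure.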


Page 20:

Working With TCP In Practice

• Far easier to support TCP than to fix TCP
– People have been trying to fix TCP for years – with only some success
• RFC 1323
• Buffer autotuning
– Like it or not, we’re stuck with TCP in the general case

• Pragmatically speaking, we must accommodate TCP
– Sufficient bandwidth to avoid congestion
– Zero packet loss
– Verifiable infrastructure
• Networks are complex
• Must be able to locate problems quickly
• A small footprint is a huge win – a small number of devices makes problem isolation tractable
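The buffer-autotuning bullet is about the bandwidth-delay product: a TCP connection needs roughly bandwidth × RTT of socket buffer to keep a long path full. A quick calculation (the 53 ms RTT is the Berkeley-to-Chicago figure quoted later in this deck):

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return int(bandwidth_bps * rtt_s / 8)

# A 10 Gbit/s path at 53 ms RTT needs roughly 63 MiB of buffer --
# far beyond old fixed defaults, which is why autotuning (and
# RFC 1323 window scaling) matter.
print(bdp_bytes(10e9, 0.053) / 2**20)
```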


Page 21:

Putting A Solution Together

• Effective support for TCP-based data transfer
– Design for correct, consistent, high-performance operation
– Design for ease of troubleshooting

• Easy adoption is critical
– Large laboratories and universities have extensive IT deployments
– Drastic change is prohibitively difficult

• Cybersecurity – defensible without compromising performance

• Borrow ideas from traditional network security
– Traditional DMZ
• Separate enclave at the network perimeter (“Demilitarized Zone”)
• Specific location for external-facing services
• Clean separation from the internal network
– Do the same thing for science – the Science DMZ

Page 22:

The Science DMZ Design Pattern

Dedicated systems for data transfer – Data Transfer Node:
• High performance
• Configured specifically for data transfer
• Proper tools

Network architecture – Science DMZ:
• Dedicated network location for high-speed data resources
• Appropriate security
• Easy to deploy – no need to redesign the whole network

Performance testing & measurement – perfSONAR:
• Enables fault isolation
• Verifies correct operation
• Widely deployed in ESnet and other networks, as well as sites and facilities

Page 23:

Abstract or Prototype Deployment

• An add-on to existing network infrastructure
– All that is required is a port on the border router
– Small footprint, pre-production commitment

• Easy to experiment with components and technologies
– DTN prototyping
– perfSONAR testing

• Limited scope makes security policy exceptions easy
– Only allow traffic from partners
– Add-on to production infrastructure – lower risk

Page 24:

Science DMZ Design Pattern (Abstract)

[Diagram: abstract Science DMZ. The border router provides a clean, high-bandwidth WAN path to the Science DMZ switch/router, where a high-performance Data Transfer Node with high-speed storage sits; the site/campus LAN connects through the enterprise border router/firewall, with site/campus access to Science DMZ resources. Per-service security policy control points apply in the DMZ; perfSONAR hosts sit at the border, in the DMZ, and on the LAN. 10GE links throughout.]

Page 25:

Local And Wide Area Data Flows

[Diagram: the same abstract Science DMZ topology, highlighting the high-latency WAN path through the border router and Science DMZ, and the low-latency LAN path within the site/campus network.]

Page 26:

Supercomputer Center Deployment

• High-performance networking is assumed in this environment
– Data flows between systems, between systems and storage, over the wide area, etc.
– A global filesystem often ties resources together
• Portions of this may not run over Ethernet (e.g. InfiniBand)
• Implications for Data Transfer Nodes

• The “Science DMZ” may not look like a discrete entity here
– By the time you get through interconnecting all the resources, you end up with most of the network in the Science DMZ
– This is as it should be – the point is appropriate deployment of tools, configuration, policy control, etc.

• Office networks can look like an afterthought, but they aren’t
– Deployed with appropriate security controls
– Office infrastructure need not be sized for science traffic

Page 27:

Supercomputer Center

[Diagram: supercomputer center network. The border router connects the WAN to a core switch/router; offices sit behind a firewall, while the Data Transfer Nodes, supercomputer, and parallel filesystem attach via front-end switches on a routed path that bypasses the firewall. perfSONAR hosts sit at the border, the core, and the DTNs.]

Page 28:

Supercomputer Center Data Path

[Diagram: the same supercomputer center topology, highlighting the high-latency WAN path to the DTNs and the low-latency LAN path among the DTNs, supercomputer, and parallel filesystem.]

Page 29:

Common Threads

• Two common threads exist in these examples

• Accommodation of TCP
– The wide area portion of data transfers traverses a purpose-built path
– High-performance devices that don’t drop packets

• The ability to test and verify
– When problems arise (and they always will), they can be solved if the infrastructure is built correctly
– A small device count makes it easier to find issues
– Multiple test and measurement hosts provide multiple views of the data path
• perfSONAR nodes at the site and in the WAN
• perfSONAR nodes at the remote site

Page 30:

Overview

• Science DMZ Motivation and Introduction

• Science DMZ Architecture

• Performance Monitoring

• Data Transfer Nodes & Applications

• Science DMZ Security

• Wrap Up


Page 31:

Performance Monitoring

• Everything may function perfectly when it is deployed
• Eventually something is going to break
– Networks and systems are complex
– Bugs, mistakes, …
– Sometimes things just break (this is why we buy support contracts!)
• Must be able to find and fix problems when they occur
• Must be able to find problems in other networks (your network may be fine, but someone else’s problem can impact your users)

• TCP was intentionally designed to hide all transmission errors from the user:
– “As long as the TCPs continue to function properly and the internet system does not become completely partitioned, no transmission errors will affect the users.” (RFC 793, 1981)

Page 32:

Soft Network Failures – Hidden Problems

• Hard failures are well understood
– Link down, system crash, software crash
– Traditional network/system monitoring tools are designed to quickly find hard failures
– Routing protocols reconverge automatically

• Soft failures result in degraded capability
– Connectivity exists
– Performance is impacted
– Typically something in the path is functioning, but not well

• Soft failures are hard to detect with traditional methods
– No obvious single event
– Sometimes no indication at all of any errors

• Independent testing is the only way to reliably find soft failures

Page 33:

Sample Soft Failures

[Figure: two throughput graphs (Gb/s, spanning one month). One shows the gradual failure of an optical line card: normal performance, then steadily degrading performance, then a return to normal after repair. The other shows a performance drop after a router was rebooted with a full route table.]

Page 34:

Testing Infrastructure – perfSONAR

• perfSONAR is:
– A widely deployed test and measurement infrastructure
• ESnet, Internet2, US regional networks, international networks
• Laboratories, supercomputer centers, universities
– A suite of test and measurement tools
– A collaboration that builds and maintains the software tools

• By installing perfSONAR, a site can leverage over 2000 test servers deployed around the world

• perfSONAR is ideal for finding soft failures
– Alerting to the existence of problems
– Fault isolation
– Verification of correct operation

Page 35:

Lookup Service Directory Search: http://stats.es.net/ServicesDirectory/


Page 36:

Overview

• Science DMZ Motivation and Introduction

• Science DMZ Architecture

• Performance Monitoring

• Data Transfer Nodes & Applications

• Science DMZ Security

• Wrap Up


Page 37:

Dedicated Systems – Data Transfer Node

• The DTN is dedicated to data transfer

• Set up specifically for high-performance data movement

– System internals (BIOS, firmware, interrupts, etc.)

– Network stack

– Storage (global filesystem, Fibrechannel, local RAID, etc.)

– High performance tools

– No extraneous software

• Limitation of scope and function is powerful
– No conflicts with configuration for other tasks

– Small application set makes cybersecurity easier


Page 38:

Data Transfer Tools For DTNs

• Parallelism is important
– It is often easier to achieve a given performance level with four parallel connections than with one connection
– Several tools offer parallel transfers, including Globus/GridFTP

• Latency interaction is critical
– Wide area data transfers have much higher latency than LAN transfers
– Many tools and protocols assume a LAN

• Workflow integration is important

• Key tools: Globus Online, HPN-SSH
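A minimal sketch of the multi-stream idea behind GridFTP-style tools: split the byte range into N chunks and move them concurrently. `chunk_ranges` and `copy_chunk` are illustrative names, not any real tool’s API; `copy_chunk` stands in for a per-stream transfer (e.g. one socket’s worth of work).

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_ranges(size: int, streams: int) -> list[tuple[int, int]]:
    """Split [0, size) into `streams` contiguous (offset, length) chunks."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges

def parallel_copy(size: int, streams: int, copy_chunk) -> None:
    """Run one copy_chunk(offset, length) call per stream, concurrently."""
    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(lambda r: copy_chunk(*r), chunk_ranges(size, streams)))
```

With a high bandwidth-delay product, several streams in flight recover throughput that a single TCP connection loses to loss-recovery stalls.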


Page 39:

Data Transfer Tool Comparison

• In addition to the network, using the right data transfer tool is critical
• Data transfer test from Berkeley, CA to Argonne, IL (near Chicago); RTT = 53 ms, network capacity = 10 Gbps

  Tool                  Throughput
  scp                   140 Mbps
  HPN-patched scp       1.2 Gbps
  ftp                   1.4 Gbps
  GridFTP, 4 streams    5.4 Gbps
  GridFTP, 8 streams    6.6 Gbps

Note that getting more than 1 Gbps (125 MB/s) disk-to-disk requires properly engineered storage (RAID, parallel filesystem, etc.)
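To make the table concrete, here is a back-of-envelope conversion of those measured rates into wall-clock time per terabyte, using only the figures above:

```python
# Measured throughput from the Berkeley -> Argonne test (53 ms RTT, 10 Gbps).
rates_gbps = {
    "scp": 0.14,
    "HPN-patched scp": 1.2,
    "ftp": 1.4,
    "GridFTP, 4 streams": 5.4,
    "GridFTP, 8 streams": 6.6,
}

TB_BITS = 8e12  # bits in 1 TB (decimal)
for tool, gbps in rates_gbps.items():
    hours = TB_BITS / (gbps * 1e9) / 3600
    print(f"{tool:20s} {hours:5.2f} h per TB")
```

On the same network, scp needs almost 16 hours per terabyte while 8-stream GridFTP needs about 20 minutes.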


Page 40:

Overview

• Science DMZ Motivation and Introduction

• Science DMZ Architecture

• Performance Monitoring

• Data Transfer Nodes & Applications

• Science DMZ Security

• Wrap Up


Page 41:

Science DMZ Security

• Goal: disentangle security policy and enforcement for science flows from security for business systems

• Rationale
– Science data traffic is simple from a security perspective
– Narrow application set on the Science DMZ
• Data transfer, data streaming packages
• No printers, document readers, web browsers, building control systems, financial databases, staff desktops, etc.
– Security controls that are typically implemented to protect business resources often cause performance problems
• Separation allows each to be optimized

Page 42:

Science DMZ as Security Architecture

• Allows for better segmentation of risks, and more granular application of controls to those segmented risks
– Limit the risk profile for high-performance data transfer applications
– Apply specific controls to data transfer hosts
– Avoid including unnecessary risks and unnecessary controls

• Remove degrees of freedom – focus only on what is necessary
– Easier to secure
– Easier to achieve performance
– Easier to troubleshoot

Page 43:

Performance is a Core Requirement

• Core information security principles
– Confidentiality, Integrity, Availability (CIA)

• In data-intensive science, performance is an additional core mission requirement: CIA → PICA
– The CIA principles are important, but if the performance isn’t there, the science mission fails
– This isn’t about “how much” security you have, but how the security is implemented
– Need to appropriately secure systems without performance compromises

Page 44:

Science DMZ Placement Outside the Enterprise Firewall

• Why? For performance reasons
– Specifically: Science DMZ traffic does not traverse the firewall data plane
– This has nothing to do with whether packet filtering is part of the security enforcement toolkit

• Lots of heartburn over this, especially from the perspective of a conventional firewall manager
– Organizational policy directives can mandate firewalls
– Firewalls are designed to protect converged enterprise networks
– Why would you put critical assets outside the firewall?

• The answer: firewalls are typically a poor fit for high-performance science applications

Page 45:

Firewall Capabilities and Science Traffic

• Firewalls have a lot of sophistication in an enterprise setting
– Application-layer protocol analysis (HTTP, POP, MSRPC, etc.)
– Built-in VPN servers
– User awareness

• Data-intensive science flows don’t match this profile
– Common case – data on filesystem A needs to be on filesystem Z
• The data transfer tool verifies credentials over an encrypted channel
• Then it opens a socket or set of sockets and sends data until done (1 TB, 10 TB, 100 TB, …)
– One workflow can use 10% to 50% or more of a 10G network link

• Do we have to use a firewall?

Page 46:

Firewalls as Access Lists

• What does a firewall admin ask for when asked to allow data transfers?
– The IP address of your host
– The IP address of the remote host
– A port range
– That looks like an ACL to me – I can do that on the router

• No special configuration for advanced protocol analysis – just address/port

• Router ACLs do not drop traffic permitted by policy, while enterprise firewalls can (and often do)
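The bullets above describe what is, in effect, a plain router ACL. A Cisco-IOS-style illustration (the addresses are RFC 5737 documentation-range examples and the port range is arbitrary; neither comes from the talk):

```
! Permit data transfers between a local DTN and one remote DTN,
! in both directions, on an agreed port range.
ip access-list extended DTN-TRANSFERS
 permit tcp host 192.0.2.10 host 198.51.100.20 range 50000 51000
 permit tcp host 198.51.100.20 host 192.0.2.10 range 50000 51000
 deny   ip any any log
```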

Page 47:

Firewall Performance Example

• Observed performance, via perfSONAR, through a firewall
• Observed performance, via perfSONAR, bypassing the firewall
• Almost 20 times slower through the firewall; a huge improvement without it

© 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory; licensed under CC BY-NC-ND 4.0

Page 48:

Security Without Enterprise Firewalls

• Data intensive science traffic interacts poorly with enterprise firewalls• Does this mean we ignore security? NO!– We must protect our systems– We need to find a way to do security that does not prevent us from

getting the science done• Key point – security policies and mechanisms that protect the Science

DMZ should be implemented so that they do not compromise performance

• Traffic permitted by policy should not experience performance impact as a result of the application of policy

• This gets back to network segmentation, and the value that separation provides

Page 49:

Other Security Mechanisms: ACLs And Applications

• Aggressive access lists
  – More useful with project-specific DTNs
  – Exchanging data with a small set of remote collaborators = ACL is fairly easy to manage
  – Large-scale data distribution servers = difficult/time-consuming to handle (but then, the firewall ruleset for such a service would be, too)
• Limitation of the application set
  – Makes it easier to protect
  – Keep unnecessary applications off the DTN (and watch for them anyway using a host IDS – take violations seriously)

Page 50:

Other Security Mechanisms: Network IDS

• Intrusion Detection Systems (IDS)
  – One example is Bro – http://bro-ids.org/
  – Bro is high-performance and battle-tested
    • Bro protects several high-performance national assets
    • Bro can be scaled with clustering: http://www.bro-ids.org/documentation/cluster.html
  – Other IDS solutions are also available

Page 51:

Other Security Mechanisms: Host IDS

• Using a Host IDS is recommended for hosts in a Science DMZ

• Several open-source solutions exist:
  • OSSec: http://www.ossec.net/
  • Rkhunter: http://rkhunter.sourceforge.net (rootkit detection + FIM)
  • chkrootkit: http://chkrootkit.org/
  • Logcheck: http://logcheck.org (log monitoring)
  • Fail2ban: http://www.fail2ban.org/wiki/index.php/Main_Page
  • denyhosts: http://denyhosts.sourceforge.net/

Page 52:

Collaboration Within The Organization

• All stakeholders should collaborate on Science DMZ design, policy, and enforcement

• The security people have to be on board
  – Political cover for security officers
  – If the deployment of a Science DMZ is going to jeopardize the job of the security officer, expect pushback
• The Science DMZ is a strategic asset, and should be understood by the strategic thinkers in the organization
  – Changes in security models
  – Changes in operational models
  – Enhanced ability to compete for funding
  – Increased institutional capability – greater science output

Page 53:

Overview

• Science DMZ Motivation and Introduction

• Science DMZ Architecture

• Performance Monitoring

• Data Transfer Nodes & Applications

• Science DMZ Security

• Wrap Up

Page 54:

Wrap Up

• The Science DMZ design pattern provides a flexible model for supporting high-performance data transfers and workflows
• Key elements:
  – Accommodation of TCP
    • Sufficient bandwidth to avoid congestion
    • Loss-free IP service
  – Location – near the site perimeter if possible
  – Test and measurement
  – Dedicated systems
  – Appropriate security
• Support for advanced capabilities (e.g. SDN) is much easier with a Science DMZ

Page 55:

Science DMZ Applications

Page 56:

Context: Science DMZ Adoption

• DOE National Laboratories
  – Supercomputer centers, LHC sites, experimental facilities
  – Both large and small sites
• NSF CC* programs have funded many Science DMZs
  – Large investments across the US university complex: over $100M
  – Significant strategic importance
• Other US agencies
  – National Institutes of Health
  – US Department of Agriculture
• Outside the USA
  – Australia: https://www.rdsi.edu.au/dashnet
  – Brazil
  – Netherlands
  – UK

Page 57:

Strategic Impacts

• What does this mean?
  – We are in the midst of a significant cyberinfrastructure upgrade
  – Enterprise networks need not be unduly perturbed ☺
• Significantly enhanced capabilities compared to 5 years ago
  – Terabyte-scale data movement is much easier
  – Petabyte-scale data movement possible outside the LHC experiments
    • ~3.1 Gbps = 1 PB/month
    • ~14 Gbps = 1 PB/week
  – Widely-deployed tools are much better (e.g. Globus)
• Metcalfe’s Law of Network Utility
  – Value of a Science DMZ is proportional to the number of DMZs
    • n² or n·log(n) doesn’t matter – the effect is real
  – Cyberinfrastructure value increases as we all upgrade
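The PB-per-month and PB-per-week figures above are easy to sanity-check. A quick Python check of the conversion (assuming 1 PB = 10^15 bytes and a 30-day month):

```python
def gbps_needed(bytes_total: float, seconds: float) -> float:
    """Average rate in gigabits per second to move bytes_total in the given time."""
    return bytes_total * 8 / seconds / 1e9

PB = 1e15
month = 30 * 86400   # seconds in a 30-day month
week = 7 * 86400     # seconds in a week

print(round(gbps_needed(PB, month), 1))  # -> 3.1 (Gbps for 1 PB/month)
print(round(gbps_needed(PB, week), 1))   # -> 13.2 (Gbps for 1 PB/week, ~14 on the slide)
```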

Page 58:

Next Steps – Building On The Science DMZ

• Enhanced cyberinfrastructure substrate now exists
  – Wide area networks (ESnet, GEANT, NRENs, Internet2, Regionals)
  – Science DMZs connected to those networks
  – DTNs in the Science DMZs
• What does the scientist see?
  – The scientist sees a science application
    • Data transfer
    • Data portal
    • Data analysis
  – Science applications are the user interface to networks and DMZs
• Large-scale data-intensive science requires that we build larger structures on top of those components

Page 59:

HPC Centers Matter

• Computing centers are special
  – Centers of excellence / expertise
  – Data repositories
  – Computing for simulation, data analysis
• Really though – the people + cyberinfrastructure combination is special
  – People who know how computers, networking, and storage work
  – Enough resources to make things happen
• Computing facilities are anchors for many collaborations
  – Common pattern: multi-institution team with access to one HPC center
  – Shared data, analysis, simulation platform

Page 60:

Data And HPC: The Petascale DTN Project

• Built on top of the Science DMZ
• Effort to improve data transfer performance between the DOE ASCR HPC facilities at ANL, LBNL, and ORNL, and also NCSA
  – Multiple current and future science projects need to transfer data between HPC facilities
  – Performance goal of 15 gigabits per second (equivalent to 1 PB/week)
  – Realize the performance goal for routine Globus transfers without special tuning
• Reference data set is 4.4 TB of cosmology simulation data
• Use performant, easy-to-use tools with production options on
  – Globus Transfer service (previously Globus Online)
  – Use the GUI just like a user would, with default options
    • E.g. integrity checksums enabled, as they should be
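At the project's 15 Gbps goal the 4.4 TB reference data set moves quickly. A rough back-of-the-envelope check (sustained rate only, ignoring protocol and checksum overhead):

```python
def transfer_minutes(bytes_total: float, gbps: float) -> float:
    """Wall-clock minutes to move bytes_total at a sustained rate of gbps."""
    return bytes_total * 8 / (gbps * 1e9) / 60

dataset = 4442781786482  # L380 reference data set, in bytes (4.4 TB)

print(round(transfer_minutes(dataset, 15), 1))  # -> 39.5 minutes at the 15 Gbps goal
print(round(transfer_minutes(dataset, 50), 1))  # -> 11.8 minutes at 50 Gbps
```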

Page 61:

DTN Cluster Performance – HPC Facilities

Petascale DTN Project, November 2017, L380 data set. Rates are gigabits per second (min/avg/max over three transfers) for the pairwise transfers among the four DTN clusters (diagram):

21.2/22.6/24.5, 23.1/33.7/39.7, 26.7/34.7/39.9, 33.2/43.4/50.3, 35.9/39.0/40.7, 29.9/33.1/35.5, 34.6/47.5/56.8, 44.1/46.8/48.4, 41.0/42.2/43.9, 33.0/35.0/37.8, 43.0/50.0/56.3, 55.4/56.7/57.4 Gbps

Endpoints:
– NERSC DTN cluster – Globus endpoint: nersc#dtn, filesystem: /project
– ALCF DTN cluster – Globus endpoint: alcf#dtn_mira, filesystem: /projects
– OLCF DTN cluster – Globus endpoint: olcf#dtn_atlas, filesystem: atlas2
– NCSA DTN cluster – Globus endpoint: ncsa#BlueWaters, filesystem: /scratch

L380 data set:
– Files: 19260; directories: 211; other files: 0
– Total bytes: 4442781786482 (4.4 TB)
– Smallest file: 0 bytes; largest file: 11313896248 bytes (11 GB)
– Size distribution: 1–10 bytes: 7 files; 10–100 bytes: 1 file; 100–1K: 59 files; 1K–10K: 3170 files; 10K–100K: 1560 files; 100K–1M: 2817 files; 1M–10M: 3901 files; 10M–100M: 3800 files; 100M–1G: 2295 files; 1G–10G: 1647 files; 10G–100G: 3 files

Page 62:

Petascale DTN Lifts All Boats

• The Petascale DTN project benefits all projects which use the HPC facility DTNs
• Modern science data portal architecture
  – Data portals which use the modern architecture benefit from DTN improvements
  – DTN scaling/improvements benefit all data portals which use the same pool
• The Globus API supports this – see Globus World Tour
  – https://www.globusworld.org/tour/

Page 63:

Science Data Portals

• Large repositories of scientific data

– Climate data

– Sky surveys (astronomy, cosmology)

– Many others

– Data search, browsing, access

• Many scientific data portals were designed 15+ years ago

– Single-web-server design

– Data browse/search, data access, user awareness all in a single system

– All the data goes through the portal server

• In many cases by design

• E.g. embargo before publication (enforce access control)

– Better than old command-line FTP, but outdated by today’s standards

Page 64:

Legacy Portal Design

(diagram: WAN → border router → firewall → enterprise network; a 10GE portal server with an attached filesystem (data store) terminates the browsing path, query path, and data path alike; perfSONAR testers sit at the border and near the server; the portal server runs the web server, search, database, authentication, and data service applications)

• Very difficult to improve performance without architectural change
  – Software components all tangled together
  – Difficult to put the whole portal in a Science DMZ because of security
  – Even if you could put it in a DMZ, many components aren’t scalable
• What does architectural change mean?

Page 65:

Next-Generation Portal Leverages Science DMZ

(diagram: WAN → border router → Science DMZ switch/router fronting a cluster of API DTNs – data access governed by the portal – with perfSONAR and a shared filesystem (data store); the portal server sits behind the enterprise firewall and handles only the browsing and query paths, running the web server, search, database, and authentication; the data transfer path runs directly through the Science DMZ DTNs)

https://peerj.com/articles/cs-144/

Page 66:

Put The Data On Dedicated Infrastructure

• We have separated the data handling from the portal logic
• The portal is still its normal self, but enhanced
  – Portal GUI, database, search, etc. all function as they did before
  – A query returns pointers to data objects in the Science DMZ
  – The portal is now freed from ties to the data servers (run it on Amazon if you want!)
• Data handling is separate, and scalable
  – High-performance DTNs in the Science DMZ
  – Scale as much as you need to without modifying the portal software
• Outsource data handling to computing centers
  – Computing centers are set up for large-scale data
  – Let them handle the large-scale data, and let the portal do the orchestration of data placement
• https://peerj.com/articles/cs-144/ – Modern Research Data Portal paper
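The separation can be sketched as a portal whose query API returns pointers (endpoint + path) into the Science DMZ rather than the bytes themselves; the actual transfer is handed off to the DTN layer. The in-memory catalog, the paths, and the destination below are hypothetical (the endpoint names follow the examples elsewhere in these slides):

```python
# Sketch of the modern portal pattern: the portal answers queries with
# pointers to data objects on DTN-served storage; it never moves the bytes.
CATALOG = {  # hypothetical portal catalog
    "ds199.1": {"endpoint": "rda#datashare", "path": "/ds199.1/"},
    "ds313.0": {"endpoint": "rda#datashare", "path": "/ds313.0/"},
}

def query(dataset_id: str) -> dict:
    """Portal query: returns a pointer into the Science DMZ, not the data."""
    return CATALOG[dataset_id]

def make_transfer_request(pointer: dict, dest_endpoint: str, dest_path: str) -> dict:
    """Builds a transfer-job description for the DTN layer (e.g. Globus) to execute."""
    return {"source": (pointer["endpoint"], pointer["path"]),
            "destination": (dest_endpoint, dest_path)}

job = make_transfer_request(query("ds199.1"), "nersc#dtn", "/project/climate/")
assert job["source"] == ("rda#datashare", "/ds199.1/")
```

Because the portal only emits pointers and job descriptions, the DTN pool can be scaled, moved, or upgraded without touching the portal software.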

Page 67:

NCAR RDA Data Portal

• Let’s say I have a nice compute allocation at NERSC – climate science

• Let’s say I need some data from NCAR for my project

• https://rda.ucar.edu/

• Data sets (there are many more, but these are two):

• https://rda.ucar.edu/datasets/ds199.1/

• https://rda.ucar.edu/datasets/ds313.0/

• Download to NERSC (could also do ALCF or NCSA or OLCF)

Pages 68–71: (RDA portal screenshots)

Page 72:

Portal creates a Globus transfer job for us

Page 73:

Submit the transfer job, go about our business

Page 74:

Data Transfer from RDA Portal – Results

Page 75:

NCAR RDA Performance to DOE HPC Facilities

Transfers from the NCAR RDA portal (Globus endpoint rda#datashare) to the DOE HPC facility DTN clusters, as laid out in the diagram:
– nersc#dtn (NERSC): 13.9 Gbps
– olcf#dtn_atlas (OLCF): 16.6 Gbps
– alcf#dtn_mira (ALCF): 11.9 Gbps

• 1.5 TB data set
• 1121 files

Page 76:

Advanced Light Source

Page 77:

ALS Science DMZ implementation

(diagram: the beamline camera & data acquisition system feeds a short-term cache via the beamline switch; beamline workstations connect at 10 Gbps and campus analysis workstations at 1 Gbps; traffic flows through the ALS router and the ALS Science DMZ router to the two LBNL border routers over 100 Gbps links, reaching long-term storage and a supercomputing facility; zones shown are the beamline LAN, the campus LAN, and the Science DMZ / WAN)

Page 78:

Superfacility: Computing, experiments, networks in a single ecosystem

(figure: GISAXS at the beamline feeding HipGISAXS & RMC analysis; slot-die printing of organic photovoltaics. Liu et al., “Fast printing and in situ morphology …”, Adv. Mater. 2015)

Page 79: (figure)

Page 80:

Getting The Data To HPC

• The data source is one end – the HPC facility is the other end

• Connecting facilities together is done by means of a network (e.g. ESnet)

• Several ways to do this from a data perspective
  – Move data to persistent storage before the job starts
    • Move data into the filesystem
    • Move data into a burst buffer
  – Remote data access during the job
    • Remote I/O
    • RDMA

Page 81:

Abstract HPC Facility

(diagram: compute nodes and burst buffer nodes sit on the machine interconnect behind gateway nodes; a DTN cluster fronts the filesystem / object store; the DTNs and the gateway nodes connect the HPC facility network to external networks such as ESnet)

• No facility looks exactly like this
  – Abstract cartoon only
  – Contains the essential elements for reasoning about data ingest
• Gateway nodes and DTNs have external connectivity
  – DTNs are the external interface for the filesystem / object store
  – Gateway nodes provide external connectivity for interior components of the machine
• Different workflows touch different parts, with performance and software stack implications

Page 82:

Data Transfer Via DTNs and Filesystem

(diagram: the data ingest path runs from external networks through the DTN cluster to the filesystem / object store; the compute access path runs from the filesystem to the compute nodes)

• Common data ingest method for HPC facilities
  – Data transferred to the filesystem via DTNs
  – Compute nodes access data from the filesystem at run time
• Persistent storage allows decoupling data ingest from the compute job
• Well-understood workflow with well-developed stacks (Globus, others)
• Filesystem performance may be an issue going forward

Page 83:

Data Transfer Via Burst Buffer

(diagram: the data ingest path runs from external networks through the gateway nodes to the burst buffer nodes; the compute access path runs from the burst buffer to the compute nodes)

• Data transferred to the burst buffer instead of to persistent storage
  – Higher performance than the filesystem
  – No copy on persistent storage unless accomplished after ingest
• High-performance access from compute
• The burst buffer allows decoupling data ingest from the compute job (assuming the facility permits staging to the burst buffer)
• Tools and workflows are less developed
  – Burst buffer Globus endpoint?
  – Automation of the path through gateway nodes? (under current development)

Page 84:

Streaming Access From Compute Job

(diagram: the data path runs from external networks through the gateway nodes directly to the compute nodes; nothing is staged to the filesystem or burst buffer)

• Compute job reads data into memory
  – Access from remote storage via the network
  – No data staging at all before the job runs
  – Increased flexibility: fetch what you need, when you need it, from wherever it is
• No file transfer semantics at all
• The network is in the critical path of job execution
• LHC experiments do this today using a pull model from the job (XRootD)
• Other communities are expressing interest and doing trials (e.g. via RDMA)
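The pull model can be sketched with a toy remote store: the job asks for exactly the byte range it needs at the moment it needs it, so there is no staging step and only the requested bytes cross the network. The `RemoteStore` class is a made-up stand-in for a real remote-access protocol such as XRootD:

```python
class RemoteStore:
    """Toy stand-in for a remote data service reachable over the network."""
    def __init__(self, objects):
        self.objects = objects
        self.bytes_served = 0   # how much actually crossed the "network"

    def read(self, name, offset, length):
        """Serve one byte range of one named object."""
        chunk = self.objects[name][offset:offset + length]
        self.bytes_served += len(chunk)
        return chunk

store = RemoteStore({"sim.dat": bytes(1_000_000)})  # 1 MB remote object

# The compute job pulls only the slice it needs, when it needs it -- no staging.
needed = store.read("sim.dat", offset=500_000, length=4096)

assert len(needed) == 4096
assert store.bytes_served == 4096   # only 4 KB moved, not the whole 1 MB
```

The trade-off the slide notes is visible in the sketch: the fetch happens inside the job, so the network sits in the critical path of execution.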

Page 85:

Long-Term Vision

A high-performance, feature-rich science network ecosystem connecting:
– ESnet (Big Science facilities, DOE labs)
– Internet2 + Regionals (US universities and affiliated institutions)
– International networks (universities and labs in Europe, Asia, the Americas, Australia, etc.)
– Commercial clouds (Amazon, Google, Microsoft, etc.)
– Agency networks (NASA, NOAA, etc.)
– Campus HPC + data

Page 86:

It’s All A Bunch Of Science DMZs

(diagram, shown twice – the second time with labels: the high-performance, feature-rich science network ecosystem interconnects many Science DMZs, each fronted by DTNs with its own data – HPC facilities with parallel filesystems, single-lab experiments, data portals, LHC experiments with experiment data archives, and university computing)

Page 88:

In conclusion – ESnet’s vision:

Scientific progress will be completely unconstrained by the physical location of instruments, people, computational resources, or data.

Page 89:

Links

– ESnet fasterdata knowledge base
  • http://fasterdata.es.net/
– Science DMZ paper
  • http://www.es.net/assets/pubs_presos/sc13sciDMZ-final.pdf
– Science DMZ email list
  • https://gab.es.net/mailman/listinfo/sciencedmz
– perfSONAR
  • http://fasterdata.es.net/performance-testing/perfsonar/
  • http://www.perfsonar.net

Page 90:

Extra Slides

Page 91:

Real-World Example – Using perfSONAR

• Methodology is important
• Segment-to-segment testing is unlikely to be helpful
  – TCP dynamics will be different, and in this case all the pieces do not equal the whole
    • E.g. high throughput on a 1 ms path with high packet loss vs. the same segment in a longer 20 ms path
  – Problem links can test clean over short distances
  – An exception to this is hops that go through a firewall
• Run long-distance tests
  – Run the longest clean test you can, then look for the shortest dirty test that includes the path of the clean test
• In order for this to work, the testers need to be already deployed when you start troubleshooting
  – ESnet has at least one perfSONAR host at each hub location
  – Many (most?) R&E providers in the world have deployed at least one
  – Deployment of test and measurement infrastructure dramatically reduces time to resolution
    • Otherwise problem resolution is burdened by a deployment effort
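The "longest clean test, shortest dirty test" rule reduces to simple set logic over test paths: the suspect area is whatever the shortest failing test covers that the longest passing test did not. A toy sketch (the segment names and results are hypothetical):

```python
# Each test covers an ordered tuple of path segments and is clean or dirty.
tests = {  # hypothetical perfSONAR test results
    ("lab", "esnet", "internet2"): "clean",
    ("lab", "esnet", "internet2", "regional", "campus"): "dirty",
}

longest_clean = max((p for p, r in tests.items() if r == "clean"), key=len)
shortest_dirty = min((p for p, r in tests.items() if r == "dirty"), key=len)

# Suspect area: segments the dirty path exercises that the clean path never did.
suspects = [s for s in shortest_dirty if s not in longest_clean]
print(suspects)  # -> ['regional', 'campus']
```

With testers pre-deployed at each hub, building this comparison takes minutes; without them, each new vantage point is a deployment project.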

Page 92:

A small amount of packet loss makes a huge difference in TCP performance

(figure: measured and theoretical TCP throughput vs. path length – Local (LAN), Metro Area, Regional, Continental, International – comparing measured TCP Reno, measured HTCP, theoretical TCP Reno, and a no-loss measurement. With loss, high performance beyond metro distances is essentially impossible.)

Source: Brian Tierney, ESnet
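The "theoretical (TCP Reno)" curve in figures like this is typically the Mathis model, throughput ≈ (MSS/RTT)·(C/√p). A quick sketch of why loss that is harmless on a LAN is fatal at continental RTTs (the MSS and loss rate are chosen for illustration):

```python
import math

def reno_throughput_gbps(mss_bytes: float, rtt_s: float, loss: float,
                         c: float = 1.22) -> float:
    """Mathis-model upper bound on TCP Reno throughput, in Gbps."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss)) / 1e9

loss = 1e-4  # 0.01% packet loss
mss = 1460   # typical Ethernet MSS, in bytes

print(round(reno_throughput_gbps(mss, 0.0005, loss), 2))  # -> 2.85 (LAN, 0.5 ms RTT)
print(round(reno_throughput_gbps(mss, 0.050, loss), 4))   # -> 0.0285 (continental, 50 ms RTT)
```

Same loss rate, a 100× longer RTT, 100× less throughput – which is exactly why the Science DMZ insists on a loss-free IP service.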

Page 93:

Wide Area Testing – User Problem Statement

(diagram: a national laboratory and a university campus, each with perfSONAR hosts at the border and in the Science DMZ, connected across the WAN over 10GE links; poor performance is reported between the two sites)

Page 94:

Wide Area Testing – Full Context

(diagram: the full path from the lab (~1 ms) across the ESnet path (~30 ms) and the Internet2 path (~15 ms), then a regional path (~2 ms) to the campus (~1 ms); perfSONAR hosts sit at the lab border and Science DMZ, at ESnet and Internet2 hubs, in the regional network, and at the campus border and Science DMZ; links are a mix of 10GE and 100GE; poor performance is observed end to end)

Page 95:

Wide Area Testing – Long Clean Test

(diagram: the longest tests that run clean and fast span roughly 48 ms, from the lab Science DMZ across the ESnet and Internet2 paths to the far edge of the R&E backbone)

Page 96:

Wide Area Testing – Dirty Tests

(diagram: tests of roughly 49 ms that extend the clean 48 ms path into the regional network and campus run dirty and slow, while the shorter tests remain clean and fast)

Page 97:

Wide Area Testing – Problem Localization

(diagram: the slow tests indicate the likely problem area – the portion of the path covered by the dirty 49 ms tests but not by the clean 48 ms tests)

Page 98:

Lessons From The Example

• This testing can be done quickly if perfSONAR is already deployed
• Huge productivity win
  – Reasonable hypothesis developed quickly
  – Probable administrative domain with the problem identified
  – Testing time can be short – an hour or so at most
• Without perfSONAR, cases like this are very challenging
  – Time to resolution measured in months (I’ve worked those cases too)
• In order to be useful for data-intensive science, the network must be fixable quickly, because problems happen in all networks
• The Science DMZ model allows high-performance use of the network, but perfSONAR is necessary to ensure the whole kit functions well

Page 99:

The Science DMZ in 1 Slide

Consists of three key components, all required:

• “Friction-free” network path
  – Highly capable network devices (wire-speed, deep queues)
  – Virtual circuit connectivity option
  – Security policy and enforcement specific to science workflows
  – Located at or near the site perimeter if possible
• Dedicated, high-performance Data Transfer Nodes (DTNs)
  – Hardware, operating system, libraries all optimized for transfer
  – Includes optimized data transfer tools such as Globus Online and GridFTP
• Performance measurement/test node
  – perfSONAR

• To foster adoption, engagement with end users is critical

Details at http://fasterdata.es.net/science-dmz/
