![Page 1: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/1.jpg)
Data Portals
Eli Dart, Network Engineer
ESnet Science Engagement
Lawrence Berkeley National Laboratory
GlobusWorld
Chicago, IL
April 11, 2017
![Page 2: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/2.jpg)
Overview
• Science DMZ and Data Portals
• This assumes you already have a Science DMZ
  – If you don't have one, we can chat about how you might build one
  – If it would be helpful, I can talk to your systems and networking folks
  – Or check out the fasterdata knowledge base:
    • http://fasterdata.es.net/science-dmz/
![Page 3: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/3.jpg)
Science DMZ Design Pattern (Abstract)
[Diagram: labeled elements are the WAN, border router, Science DMZ switch/router, enterprise border router/firewall, site/campus LAN, a high-performance Data Transfer Node with high-speed storage, and three perfSONAR hosts, connected by 10GE links. Callouts: clean, high-bandwidth WAN path; site/campus access to Science DMZ resources; per-service security policy control points.]
3 – ESnet Science Engagement ([email protected]) - 4/15/17 ©2015,EnergySciencesNetwork
![Page 4: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/4.jpg)
HPC Center Data Path
[Diagram: labeled elements are the WAN, border router, core switch/router, firewall, offices, two front-end switches, Data Transfer Nodes, parallel filesystem, supercomputer, and three perfSONAR hosts. Legend: routed; high-latency WAN path; low-latency LAN path.]
![Page 5: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/5.jpg)
Next Steps – Building On The Science DMZ
• Enhanced cyberinfrastructure substrate now exists
  – Wide area networks (ESnet, GEANT, Internet2, Regionals)
  – Science DMZs connected to those networks
  – DTNs in the Science DMZs
• What does the scientist see?
  – Scientist sees a science application
    • Data transfer
    • Data portal
    • Data analysis
  – Science applications are the user interface to networks and DMZs
• Large-scale data-intensive science requires that we build larger structures on top of those components
![Page 6: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/6.jpg)
Science Data Portals
• Large repositories of scientific data
  – Climate data
  – Sky surveys (astronomy, cosmology)
  – Many others
  – Data search, browsing, access
• Many scientific data portals were designed 15+ years ago
  – Single-web-server design
  – Data browse/search, data access, user awareness all in a single system
  – All the data goes through the portal server
    • In many cases by design
    • E.g. embargo before publication (enforce access control)
![Page 7: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/7.jpg)
Legacy Portal Design
[Diagram: labeled elements are the WAN, border router, firewall, enterprise network, portal server, filesystem (data store), and two perfSONAR hosts, connected by 10GE links. Paths shown: browsing path, query path, data path. Portal server applications: web server, search, database, authentication, data service.]
• Very difficult to improve performance without architectural change
  – Software components all tangled together
  – Difficult to put the whole portal in a Science DMZ because of security
  – Even if you could put it in a DMZ, many components aren't scalable
• What does architectural change mean?
![Page 8: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/8.jpg)
Example of Architectural Change – CDN
• Let's look at what Content Delivery Networks did for web applications
• CDNs are a well-deployed design pattern (Netflix, etc.)
• What does a CDN do?
  – Store static content in a separate location from dynamic content
    • Complexity isn't in the static content – it's in the application dynamics
    • Web applications are complex, full-featured, and slow
    • Data service for static content is simple by comparison
  – Separation of application and data service allows each to be optimized
![Page 9: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/9.jpg)
Classical Web Server Model
• Web browser fetches pages from web server
  – All content stored on the web server
  – Web applications run on the web server
  – Web server sends data to client browser over the network
• Perceived client performance changes with network conditions
  – Several problems in the general case
  – Latency increases time to page render
  – Packet loss + latency cause problems for large static objects
[Diagram: a browser on residential broadband reaches a web server at a hosting provider across a transit network. The WEB path is long distance / high latency.]
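The point about packet loss plus latency can be made concrete with the widely used Mathis et al. approximation for steady-state TCP throughput, rate ≲ (MSS / RTT) · (C / √p). The sketch below is an illustration added here, not part of the original slides; the MSS, RTT, and loss-rate values are assumed purely for the example, and C = √(3/2) is the constant commonly used with this model.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on single-stream TCP throughput in bits/s.

    Uses the Mathis et al. model: rate <= (MSS / RTT) * (C / sqrt(p)).
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)  # constant commonly used with this model
    return (mss_bytes * 8.0 / rtt_s) * (c / math.sqrt(loss_rate))


# Assumed example values (not from the slides): same loss rate, different RTT.
MSS_BYTES = 1460
LOSS_RATE = 1e-4
lan_bps = mathis_throughput_bps(MSS_BYTES, rtt_s=0.001, loss_rate=LOSS_RATE)
wan_bps = mathis_throughput_bps(MSS_BYTES, rtt_s=0.080, loss_rate=LOSS_RATE)

print(f"1 ms RTT:  {lan_bps / 1e6:.1f} Mbps")
print(f"80 ms RTT: {wan_bps / 1e6:.1f} Mbps")
```

Under this model the bound scales with 1/RTT, so the same loss rate that is tolerable on a short path becomes a severe limit on a long one, which is why large static objects suffer most over high-latency paths.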
![Page 10: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/10.jpg)
Solution: Place Large Static Objects Near Client
[Diagram: a browser on residential broadband reaches the web server at the hosting provider across a transit network (WEB path, long distance / high latency), while a CDN data server serves DATA over a short-distance / low-latency path.]
• CDN provides static content "close" to client
• Web server still manages complex behavior
• Latency goes down
  – Time to page render goes down
  – Static content performance goes up
• Load on web server goes down (no need to serve static content)
• Significant win for web application performance
![Page 11: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/11.jpg)
Client Simply Sees Increased Performance
• Client doesn't see the CDN as a separate thing
  – Web content is all still viewed in a browser
    • Browser fetches what the page tells it to fetch
    • Different content comes from different places
    • User doesn't know/care
• CDNs provide an architectural solution to a performance problem
  – Not brute-force
  – Work smarter, not harder
[Diagram: two views of a browser reaching content across the 'Net. One shows only the browser and the web server (WEB). The other shows the web server (WEB: rich, slow) alongside a CDN data server (DATA: simple, fast).]
![Page 12: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/12.jpg)
Architectural Examination of Data Portals
• Common data portal functions (most portals have these)
  – Search/query/discovery
  – Data download method for data access
  – GUI for browsing by humans
  – API for machine access – ideally incorporates search/query + download
• Performance pain is primarily in the data handling piece
  – Rapid increase in data scale eclipsed legacy software stack capabilities
  – Portal servers often stuck in enterprise network
• Can we "disassemble" the portal and put the pieces back together better?
  – Use Science DMZ as a platform for the data piece
  – Avoid placing complex software in the Science DMZ
![Page 13: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/13.jpg)
Legacy Portal Design
[Diagram: labeled elements are the WAN, border router, firewall, enterprise network, portal server, filesystem (data store), and two perfSONAR hosts, connected by 10GE links. Paths shown: browsing path, query path, data path. Portal server applications: web server, search, database, authentication, data service.]
![Page 14: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/14.jpg)
Next-Generation Portal Leverages Science DMZ
[Diagram: labeled elements are the WAN, border router, Science DMZ switch/router, firewall, enterprise network, portal server, filesystem (data store), four API DTNs (data access governed by portal), and three perfSONAR hosts, connected by 10GE links. Portal server applications: web server, search, database, authentication. Legend: data transfer path (data path); portal query/browse path (browsing path, query path).]
![Page 15: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/15.jpg)
Put The Data On Dedicated Infrastructure
• We have separated the data handling from the portal logic
• Portal is still its normal self, but enhanced
  – Portal GUI, database, search, etc. all function as they did before
  – Query returns pointers to data objects in the Science DMZ
  – Portal is now freed from ties to the data servers (run it on Amazon if you want!)
• Data handling is separate, and scalable
  – High-performance DTNs in the Science DMZ
  – Scale as much as you need to without modifying the portal software
• Outsource data handling to computing centers
  – Computing centers are set up for large-scale data
  – Let them handle the large-scale data, and let the portal do the orchestration of data placement
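The idea that a query returns pointers to data objects, rather than the data itself, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the design of any particular portal: the `DataPointer` type, the `CATALOG` contents, the endpoint name `example#dtn`, and the paths are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataPointer:
    """Location of a data object; the portal returns this instead of the bytes."""
    endpoint: str    # hypothetical name of the DTN endpoint in the Science DMZ
    path: str        # hypothetical path on the filesystem the DTNs mount
    size_bytes: int


# Hypothetical portal catalog: search metadata only, no bulk data.
CATALOG: Dict[str, DataPointer] = {
    "climate model run A": DataPointer("example#dtn", "/data/climate/run_a.nc", 11_000_000_000),
    "climate model run B": DataPointer("example#dtn", "/data/climate/run_b.nc", 9_500_000_000),
    "sky survey tile 17": DataPointer("example#dtn", "/data/sky/tile_17.fits", 2_000_000_000),
}


def search(query: str) -> List[DataPointer]:
    """Return pointers for matching data sets; the data never transits the portal."""
    needle = query.lower()
    return [ptr for title, ptr in CATALOG.items() if needle in title.lower()]


for ptr in search("climate"):
    print(f"{ptr.endpoint}:{ptr.path} ({ptr.size_bytes} bytes)")
```

Because the portal only hands out locations, the client (or a transfer service acting for it) fetches the objects from the DTNs directly, and the data-serving tier can grow without touching the portal code.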
![Page 16: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/16.jpg)
Scalability Example – Petascale DTN Project
[Diagram: DTN clusters at four sites with transfer rates between them. March 2017, L380 data set.]
Sites (Globus endpoints): ALCF (alcf#dtn_mira), NERSC (nersc#dtn), OLCF (olcf#dtn_atlas), NCSA (ncsa#BlueWaters)
Rates shown (in extraction order; the site pairing is not preserved): 10.0, 17.6, 14.8, 19.3, 17.4, 17.0, 32.4, 25.3, 18.3, 16.3, 24.1, 24.0 Gbps
Data set: L380
Files: 19260
Directories: 211
Other files: 0
Total bytes: 4442781786482 (4.4T bytes)
Smallest file: 0 bytes (0 bytes)
Largest file: 11313896248 bytes (11G bytes)
Size distribution:
  1 - 10 bytes: 7 files
  10 - 100 bytes: 1 files
  100 - 1K bytes: 59 files
  1K - 10K bytes: 3170 files
  10K - 100K bytes: 1560 files
  100K - 1M bytes: 2817 files
  1M - 10M bytes: 3901 files
  10M - 100M bytes: 3800 files
  100M - 1G bytes: 2295 files
  1G - 10G bytes: 1647 files
  10G - 100G bytes: 3 files
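To put the rates in perspective, the arithmetic below estimates how long the L380 data set would take to move at a few of the rates shown on the slide. It is a rough sketch, assuming the rate is sustained for the whole transfer and that Gbps means 10^9 bits per second; per-file overhead for the many small files is ignored.

```python
TOTAL_BYTES = 4_442_781_786_482  # L380 data set size, from the slide


def transfer_seconds(total_bytes: int, rate_gbps: float) -> float:
    """Time to move total_bytes at a sustained rate given in Gbps (1e9 bits/s)."""
    if rate_gbps <= 0:
        raise ValueError("rate_gbps must be positive")
    return total_bytes * 8 / (rate_gbps * 1e9)


# Lowest, a mid-range, and the highest rate shown on the slide.
for rate in (10.0, 17.6, 32.4):
    minutes = transfer_seconds(TOTAL_BYTES, rate) / 60
    print(f"{rate:5.1f} Gbps -> {minutes:6.1f} minutes")
```

At any of these rates the 4.4 TB data set moves in about an hour or less.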
![Page 17: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/17.jpg)
Links and Lists
– ESnet fasterdata knowledge base
  • http://fasterdata.es.net/
– Science DMZ paper
  • http://www.es.net/assets/pubs_presos/sc13sciDMZ-final.pdf
– Science DMZ email list
  • [email protected] with subject "subscribe esnet-sciencedmz"
– perfSONAR
  • http://fasterdata.es.net/performance-testing/perfsonar/
  • http://www.perfsonar.net
– Globus
  • https://www.globus.org/
![Page 18: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/18.jpg)
Thanks!
[email protected] (ESnet)
Lawrence Berkeley National Laboratory
http://fasterdata.es.net/
http://my.es.net/
http://www.es.net/
![Page 19: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/19.jpg)
Extra Slides
![Page 20: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/20.jpg)
DTN Cluster Detail
[Diagram: labeled elements are the WAN, border router, Science DMZ switch/router, firewall, enterprise network, filesystem, four "sealed" DTNs (Globus only, no shell access), cluster head/login nodes (HEAD), cluster compute nodes, and three perfSONAR hosts, connected by 10GE links. Callout: configure as DTN cluster.]
![Page 21: 16 Science DMZ Dart - GlobusWorld · PDF fileScience DMZ Design Pattern (Abstract) 10GE GE 10GE 10GE 10G Router WAN DMZ er l Campus LAN performance e storage Per-service policy points](https://reader034.vdocuments.net/reader034/viewer/2022042800/5a71c0f67f8b9abb538d0e60/html5/thumbnails/21.jpg)
DTN Cluster Design
• Configure all four DTNs as a single Globus endpoint
  – Globus has docs on how to do this
  – https://support.globus.org/entries/71011547-How-do-I-add-multiple-I-O-nodes-to-a-Globus-endpoint-
• Recent options for increased performance
  – Use additional parallel connections
  – Distribute transfers across multiple DTNs (Globus I/O Nodes)
  – Critical – only do this when all DTNs in the endpoint mount the same shared filesystem
• Use the Globus CLI command endpoint-modify
  – Use the --network-use option
  – Adjusts concurrency and parallelism
  – More info at globus.org (http://dev.globus.org/cli/reference/endpoint-modify/)