Experiences with High-bandwidth Networks

Streaming Exascale Data over 100Gbps Networks. Mehmet Balman, Scientific Data Management Group, Computational Research Division, Lawrence Berkeley National Laboratory

Upload: balmanme

Post on 13-Jan-2017


TRANSCRIPT

Page 1: Experiences with High-bandwidth Networks

Streaming Exa-scale Data over 100Gbps Networks

Mehmet Balman

Scientific Data Management Group, Computational Research Division

Lawrence Berkeley National Laboratory

Page 2: Experiences with High-bandwidth Networks

ESG (Earth Systems Grid)

• Over 2,700 sites

• 25,000 users

• IPCC Fifth Assessment Report (AR5): 2PB

• IPCC Fourth Assessment Report (AR4): 35TB

Page 3: Experiences with High-bandwidth Networks

Applications' Perspective

• Increasing the bandwidth is not sufficient by itself; we need careful evaluation of high-bandwidth networks from the applications' perspective.

• Data distribution for climate science

• How can scientific data movement and analysis between geographically disparate supercomputing facilities benefit from high-bandwidth networks?

Page 4: Experiences with High-bandwidth Networks

Climate Data over 100Gbps

• Data volume in climate applications is increasing exponentially.

• An important challenge in managing ever-increasing data sizes in climate science is the large variance in file sizes.

• Climate simulation data consists of a mix of relatively small and large files, with an irregular file size distribution in each dataset.

• Many small files

Page 5: Experiences with High-bandwidth Networks

Keep the data channel full

[Figure: message exchange diagrams labeled FTP and RPC; arrow labels: request a file, send file (each shown twice), request data, send data.]

• Concurrent transfers

• Parallel streams
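The idea of keeping the channel full can be illustrated with a rough model. This is only a sketch under stated assumptions: each file transfer idles for one round trip before data flows, and the file size (10 MB), link rate (10 Gbps), and round-trip time (50 ms) are illustrative values that do not come from the slides. The function names are hypothetical.

```python
import math


def effective_throughput_gbps(file_bytes, link_gbps, rtt_s):
    """Throughput of one request/response transfer that idles one RTT per file."""
    send_s = file_bytes * 8 / (link_gbps * 1e9)  # time spent actually sending
    return file_bytes * 8 / (rtt_s + send_s) / 1e9


def transfers_to_fill_link(file_bytes, link_gbps, rtt_s):
    """Concurrent transfers needed so some transfer is always sending data."""
    send_s = file_bytes * 8 / (link_gbps * 1e9)
    return math.ceil((rtt_s + send_s) / send_s)


# Assumed values for illustration only (not from the slides).
FILE_BYTES = 10_000_000
LINK_GBPS = 10
RTT_S = 0.05

print(f"single transfer: {effective_throughput_gbps(FILE_BYTES, LINK_GBPS, RTT_S):.2f} Gbps")
print(f"transfers needed to hide the stall: {transfers_to_fill_link(FILE_BYTES, LINK_GBPS, RTT_S)}")
```

Under this model the required concurrency grows as files get smaller or the link gets faster, which is one way to read the motivation for concurrent transfers and parallel streams.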

Page 6: Experiences with High-bandwidth Networks

Lots-of-small-files problem! File-centric tools?

• Not necessarily high-speed (same distance): latency is still a problem

[Figure: a 100Gbps pipe and a 10Gbps pipe; arrow labels: request a dataset, send data.]
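A back-of-the-envelope comparison shows why a wider pipe over the same distance does not help much when files are requested one at a time. This sketch assumes strictly sequential transfers with one round trip per request and no pipelining. The dataset shape (10,000 files of 1 MB) and the 50 ms round-trip time are made-up values, not figures from the talk.

```python
def dataset_time_s(n_files, file_bytes, link_gbps, rtt_s, per_file_request=True):
    """Sequential transfer time: payload time plus one RTT per request."""
    send_s = n_files * file_bytes * 8 / (link_gbps * 1e9)
    requests = n_files if per_file_request else 1  # one request for the whole dataset
    return send_s + requests * rtt_s


# Assumed values for illustration only (not from the slides).
N_FILES, FILE_BYTES, RTT_S = 10_000, 1_000_000, 0.05

for per_file in (True, False):
    t10 = dataset_time_s(N_FILES, FILE_BYTES, 10, RTT_S, per_file)
    t100 = dataset_time_s(N_FILES, FILE_BYTES, 100, RTT_S, per_file)
    mode = "request per file" if per_file else "request per dataset"
    print(f"{mode}: 10Gbps {t10:.1f}s, 100Gbps {t100:.1f}s, speedup {t10 / t100:.2f}x")
```

With these assumed numbers the file-by-file case is dominated by round trips, so ten times the bandwidth yields almost no speedup, while a single dataset-level request comes close to the full factor of ten.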

Page 7: Experiences with High-bandwidth Networks

Framework for the Memory-mapped Network Channel

Memory caches are logically mapped between client and server.

Page 8: Experiences with High-bandwidth Networks

Moving climate files efficiently

Page 9: Experiences with High-bandwidth Networks

Advantages of MemzNet

• Decoupling I/O and network operations

• front-end (I/O processing)

• back-end (networking layer)

• Not limited by the characteristics of the file sizes: on-the-fly tar approach, bundling and sending many files together

• Dynamic data channel management: can increase/decrease the parallelism level both in the network communication and in I/O read/write operations, without closing and reopening the data channel connection (as is done in regular FTP variants).
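The decoupling and bundling ideas above can be sketched in a few lines. This is not MemzNet's code: it is a minimal illustration in which in-memory byte strings stand in for files, a bounded queue stands in for the shared block cache, and worker threads stand in for the networking layer and the receiving side. All names (`BLOCK_SIZE`, `front_end`, `back_end`, `transfer`) and the block layout are hypothetical.

```python
import queue
import threading

BLOCK_SIZE = 64  # bytes per block; tiny on purpose for illustration


def front_end(files, blocks):
    """I/O side: pack (name, offset, chunk) records into fixed-size blocks,
    ignoring file boundaries, so small files share a block."""
    block, used = [], 0
    for name, data in files.items():
        offset = 0
        while offset < len(data):
            take = min(BLOCK_SIZE - used, len(data) - offset)
            block.append((name, offset, data[offset:offset + take]))
            used += take
            offset += take
            if used == BLOCK_SIZE:
                blocks.put(block)  # blocks when the queue is full
                block, used = [], 0
    if block:
        blocks.put(block)


def back_end(blocks, received, lock):
    """Network side (simulated): drain blocks and place chunks by offset."""
    while True:
        block = blocks.get()
        if block is None:  # sentinel: no more blocks
            break
        with lock:
            for name, offset, chunk in block:
                buf = received.setdefault(name, bytearray())
                end = offset + len(chunk)
                if len(buf) < end:
                    buf.extend(b"\0" * (end - len(buf)))
                buf[offset:end] = chunk


def transfer(files, n_back_end=2, queue_depth=4):
    """Run one front-end and n_back_end workers over a bounded block queue."""
    blocks = queue.Queue(maxsize=queue_depth)
    received, lock = {}, threading.Lock()
    workers = [threading.Thread(target=back_end, args=(blocks, received, lock))
               for _ in range(n_back_end)]
    for worker in workers:
        worker.start()
    front_end(files, blocks)
    for _ in workers:
        blocks.put(None)
    for worker in workers:
        worker.join()
    return {name: bytes(buf) for name, buf in received.items()}


files = {"a.nc": b"x" * 10, "b.nc": bytes(range(200)), "c.nc": b"yz" * 50}
print(transfer(files) == files)
```

Because the workers only ever see blocks, `n_back_end` can be changed without touching the front-end, which loosely mirrors the claim that parallelism can be adjusted independently on the I/O and network sides.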

Page 10: Experiences with High-bandwidth Networks

ANI 100Gbps testbed

[Figure: ANI Middleware Testbed network diagram (updated December 11, 2011), showing the NERSC and ANL sites joined by the ANI 100G Network. Labeled elements: an ANI 100G Router at each site; hosts nersc-diskpt-1, nersc-diskpt-2, nersc-diskpt-3, nersc-app, anl-mempt-1, anl-mempt-2, anl-mempt-3, anl-app; switches nersc-C2940, anl-C2940, nersc-asw1, anl-asw1; Site Router (nersc-mr2) and ANL Site Router, with links to ESnet. Link labels: 100G, 10G, 4x10GE (MM), 10GE (MM), 1 GE. Host interface labels: eth0, eth2-5.]

Host NICs:

• anl-mempt-1 NICs: 2: 2x10G Myricom

• anl-mempt-2 NICs: 2: 2x10G Myricom

• anl-mempt-3 NICs: 1: 2x10G Myricom; 1: 2x10G Mellanox

• nersc-diskpt-1 NICs: 2: 2x10G Myricom; 1: 4x10G HotLava

• nersc-diskpt-2 NICs: 1: 2x10G Myricom; 1: 2x10G Chelsio; 1: 6x10G HotLava

• nersc-diskpt-3 NICs: 1: 2x10G Myricom; 1: 2x10G Mellanox; 1: 6x10G HotLava

Note: ANI 100G routers and 100G wave available until summer 2012; testbed resources after that are subject to funding availability.

SC11 100Gbps demo

Page 11: Experiences with High-bandwidth Networks

Disadvantage of many TCP Streams

(a) Total throughput vs. the number of concurrent memory-to-memory transfers. (b) Interface traffic, packets per second (blue) and bytes per second, over a single NIC with different numbers of concurrent transfers. Three hosts, each with 4 available NICs, and a total of ten 10Gbps NIC pairs were used to saturate the 100Gbps pipe in the ANI Testbed. Ten data movement jobs, each corresponding to a NIC pair, were started simultaneously at source and destination. Each peak represents a different test: 1, 2, 4, 8, 16, 32, and 64 concurrent streams per job were initiated for 5min intervals (e.g. when the concurrency level is 4, there are 40 streams in total).
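The stream counts in this experiment follow directly from the caption: ten jobs, each running the stated number of concurrent streams. The short sketch below tabulates the totals, plus an idealized equal share of the 100Gbps pipe per stream. The equal-share column is an assumption made for illustration, not a measured result.

```python
JOBS = 10                          # one job per 10Gbps NIC pair (from the caption)
LINK_GBPS = 100                    # capacity of the pipe being saturated
LEVELS = [1, 2, 4, 8, 16, 32, 64]  # concurrent streams per job (from the caption)


def stream_table(jobs=JOBS, link_gbps=LINK_GBPS, levels=LEVELS):
    """Return (streams per job, total streams, idealized Gbps per stream)."""
    rows = []
    for level in levels:
        total = jobs * level
        rows.append((level, total, link_gbps / total))
    return rows


for level, total, share in stream_table():
    print(f"{level:>2} per job -> {total:>3} streams, {share:7.3f} Gbps each if shared equally")
```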

   

Page 12: Experiences with High-bandwidth Networks

Effects of many streams

ANI testbed 100Gbps (10x10G NICs, three hosts): interrupts/CPU vs. the number of concurrent transfers [1, 2, 4, 8, 16, 32, 64 concurrent jobs, 5min intervals]; TCP buffer size is 50MB.

Page 13: Experiences with High-bandwidth Networks

MemzNet's Performance

TCP buffer size is set to 50MB

[Figure: performance plots; labels: MemzNet, GridFTP, SC11 demo, ANI Testbed.]
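The 50MB TCP buffer can be related to link speed through the usual window arithmetic: a single stream cannot exceed its window size divided by the round-trip time. The sketch below works this out, treating 50MB as 50,000,000 bytes. The slides do not state the testbed round-trip time, so the 50 ms used here is purely an assumed value, and the function names are hypothetical.

```python
BUFFER_BYTES = 50 * 1000 * 1000  # 50MB from the slide, taken as decimal megabytes


def max_rtt_for_rate_s(buffer_bytes, rate_gbps):
    """Largest RTT at which a window of buffer_bytes can sustain rate_gbps."""
    return buffer_bytes * 8 / (rate_gbps * 1e9)


def window_limited_gbps(buffer_bytes, rtt_s):
    """Upper bound on single-stream throughput: window / RTT."""
    return buffer_bytes * 8 / rtt_s / 1e9


ASSUMED_RTT_S = 0.05  # hypothetical round-trip time, not from the slides

print(f"RTT budget for 10 Gbps per NIC: {max_rtt_for_rate_s(BUFFER_BYTES, 10) * 1e3:.0f} ms")
print(f"per-stream cap at {ASSUMED_RTT_S * 1e3:.0f} ms RTT: "
      f"{window_limited_gbps(BUFFER_BYTES, ASSUMED_RTT_S):.1f} Gbps")
```

In practice the operating system may clamp or auto-tune socket buffers, so the configured size is an upper limit on the window rather than a guarantee.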

Page 14: Experiences with High-bandwidth Networks

MemzNet's Architecture for data streaming

Page 15: Experiences with High-bandwidth Networks

Acknowledgements

Eric Pouyoul, Yushu Yao, E. Wes Bethel, Burlen Loring, Prabhat, John Shalf, Alex Sim, Brian L. Tierney, Peter Nugent, Zarija Lukic, Patrick Dorn, Evangelos Chaniotakis, John Christman, Chin Guok, Chris Tracy, Lauren Rotman, Jason Lee, Shane Canon, Tina Declerck, Cary Whitney, Ed Holohan, Adam Scovel, Linda Winkler, Jason Hill, Doug Fuller, Susan Hicks, Hank Childs, Mark Howison, Aaron Thomas, John Dugan, Gopal Vaswani