High Performance Cyberinfrastructure Discovery Tools for Data Intensive Research

High Performance Cyberinfrastructure Discovery Tools for Data Intensive Research. Larry Smarr, Prof. Computer Science and Engineering; Director, Calit2 (UC San Diego/UC Irvine)


Post on 01-Feb-2016



TRANSCRIPT

Page 1: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

High Performance Cyberinfrastructure Discovery Tools for Data Intensive Research

Larry Smarr

Prof. Computer Science and Engineering

Director, Calit2 (UC San Diego/UC Irvine)

Page 2: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Abstract

High performance cyberinfrastructure (10Gbps dedicated optical channels end-to-end) enables new levels of discovery for data-intensive research projects. I will use several examples of large data projects drawn from cosmological simulations, ocean observing, and microbial metagenomics. I will discuss why local campus high performance clouds are essential for this sort of work in academia: they augment the remote commercial cloud with high bandwidth, high-I/O fast storage, and large-RAM compute.

Page 3: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Dedicated 10,000Mbps (10Gbps) Supernetworks Enable Remote Visual Analysis of Big Data

Also

NLR 80 x 10Gb Wavelengths

Page 4: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

NSF’s OptIPuter Project: Using Supernetworks to Meet the Needs of Data-Intensive Researchers

OptIPortal: Termination Device for the OptIPuter 10Gbps Backplane

Page 5: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Exploring Cosmology With Supercomputers, Supernetworks, and Supervisualization

• Supercomputer Output
– 148 TB Movie Output (0.25 TB/file)
– 80 TB Diagnostic Dumps (8 TB/file)
• Connected at 10Gbps
– Oak Ridge to ANL to SDSC
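The volumes quoted above can be turned into rough file counts and transfer times. The following is a minimal sketch, not a measurement: it assumes decimal units (1 TB = 10^12 bytes), a fully utilized 10 Gbps link, and no protocol overhead. The function names are illustrative and do not come from the presentation.

```python
# Back-of-the-envelope figures for the data volumes quoted on this slide.
# Assumptions (not from the slide): decimal units (1 TB = 10**12 bytes),
# a fully utilized 10 Gbps link, and no protocol overhead.

TB = 10**12              # bytes per terabyte (decimal)
LINK_BPS = 10 * 10**9    # 10 Gbps link rate in bits per second


def file_count(total_tb: float, tb_per_file: float) -> int:
    """Number of files implied by a total volume and a per-file size."""
    return round(total_tb / tb_per_file)


def transfer_seconds(total_tb: float, link_bps: float = LINK_BPS) -> float:
    """Ideal time to move total_tb terabytes over a link of link_bps bits/s."""
    return total_tb * TB * 8 / link_bps


movie_files = file_count(148, 0.25)          # movie output files
dump_files = file_count(80, 8)               # diagnostic dump files
movie_hours = transfer_seconds(148) / 3600   # ideal hours for movie output
dump_hours = transfer_seconds(80) / 3600     # ideal hours for the dumps

print(f"movie output: {movie_files} files, {movie_hours:.1f} h at 10 Gbps")
print(f"diagnostic dumps: {dump_files} files, {dump_hours:.1f} h at 10 Gbps")
```

Under these idealized assumptions, each data set takes on the order of a day to move at 10 Gbps. Real transfers would likely take longer.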

Science: Norman, Harkness, Paschos, SDSC

Visualization: Insley, ANL; Wagner, SDSC

ANL * Calit2 * LBNL * NICS * ORNL * SDSC

Intergalactic Medium on 2 Billion Light Year Scale

Page 6: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Providing End-to-End 10Gbps Cyberinfrastructure for Petascale End Users

Mike Norman, SDSC

Analyzing Super Data

Figure panels: log of gas temperature; log of gas density

Page 7: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Calit2 Microbial Metagenomics Cluster - Next Generation Optically Linked Science Data Server

512 Processors, ~5 Teraflops

~200 Terabytes Storage

Source: Phil Papadopoulos, SDSC, Calit2

Nearly 4000 Users, Over 75 Countries

Page 8: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Using 10 Gbps Big Data Access and Analysis - Collaboration Between Calit2 and U Washington

Ginger Armbrust’s Diatom Chromosomes

Photo Credit: Alan Decker

Feb. 29, 2008

iHDTV: 1500 Mbits/sec Calit2 to UW Research Channel Over NLR
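For scale, the 1500 Mbits/sec stream can be compared with the capacity of a single 10 Gbps wavelength. This is a minimal sketch that assumes the stream ran over one such wavelength; the slide says only "Over NLR", so that detail is an assumption.

```python
# Compare the quoted iHDTV stream rate with one 10 Gbps wavelength.
# Assumption (not stated on the slide): the stream uses a single
# 10 Gbps wavelength and nothing else shares that wavelength.

STREAM_MBPS = 1500      # iHDTV stream rate quoted on the slide
LINK_MBPS = 10_000      # one 10 Gbps wavelength, in Mbits/sec

utilization = STREAM_MBPS / LINK_MBPS     # fraction of the wavelength used
max_streams = LINK_MBPS // STREAM_MBPS    # whole streams that would fit

print(f"one stream uses {utilization:.0%} of the wavelength")
print(f"at most {max_streams} such streams fit on one wavelength")
```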

Page 9: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

MIT’s Ed DeLong & Darwin Project Team Using OptIPortal to Analyze 10km Coupled Ocean Microbial Simulation

Page 10: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

The NSF-Funded Ocean Observatory Initiative: a Complex System of Systems Cyberinfrastructure

Source: Matthew Arrott, Calit2 Program Manager for OOI CI

Page 11: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Taking Sensornets to the Ocean Floor: Remote Interactive HD Imaging of Deep Sea Vent

Source: John Delaney and Research Channel, U Washington

1 cm.

Page 12: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

NSF OOI is a $400M Program - OOI CI is $34M Part of OOI

Source: Matthew Arrott, Calit2 Program Manager for OOI CI

30-40 Software Engineers Housed at Calit2@UCSD

Page 13: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

OOI CI is Built on National LambdaRail’s and Internet2’s DCN Optical Infrastructure

Source: John Orcutt, Matthew Arrott, SIO/Calit2

Page 14: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

High Definition Video Connected OptIPortals: Virtual Working Spaces for Data Intensive Research

Source: Falko Kuester, Kai Doerr, Calit2; Michael Sims, NASA

Page 15: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

Analyzing Big Data in 3D Stereo: The NexCAVE OptIPortal

Source: Tom DeFanti, Calit2@UCSD

Page 16: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

“Blueprint for the Digital University”: Report of the UCSD Research Cyberinfrastructure Design Team

research.ucsd.edu/documents/rcidt/RCIDTReportFinal2009.pdf

April 24, 2009

Page 17: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

California and Washington Universities Are Testing a 10Gbps Connected Commercial Data Cloud

• Amazon Experiment for Big Data
– Only Available Through CENIC and Pacific NW GigaPOP
• Private 10Gbps Peering Path
– Includes Amazon Computing and Storage Services

Page 18: High Performance Cyberinfrastructure  Discovery Tools  for Data Intensive Research

You Can Download This Presentation at lsmarr.calit2.net