Cyberinfrastructure for Data Intensive Science (DIS): Follow-on panel to DIS session at Internet2/ESCC Joint Techs Conference, Baton Rouge – January 24, 2012

Page 1: Cyberinfrastructure for Data Intensive Science (DIS)

Cyberinfrastructure for Data Intensive Science (DIS)

Follow-on panel to DIS session at Internet2/ESCC Joint Techs Conference

Baton Rouge – January 24, 2012

Page 2: Cyberinfrastructure for Data Intensive Science (DIS)

Joint Techs Winter 2012 Focus

• Data intensive science focus session
  – Input from many groups in the community
    • Multiple science disciplines
    • Multiple infrastructure areas (networks, supercomputers, laboratory environments, mission agencies)
  – Success stories illustrated effective DIS support
• The intent was to integrate the needs, context, and commonalities in a white paper

Page 3: Cyberinfrastructure for Data Intensive Science (DIS)

DIS Focus Area Presenters

• Bill St. Arnaud, Green IT
• Matthew Trunnell, Broad Institute
• Don Middleton, NCAR
• Rich Carlson, DOE Office of Science
• Kevin Thompson, NSF OCI
• Mike Ackerman, NIH NLM
• Gary Jung, LBNL
• Gwen Jacobs, Montana State/Hawai’i
• Ruth Marinshaw, UNC-Chapel Hill
• Eli Dart, ESnet
• Brent Draney, NERSC
• Ron Hutchins, Georgia Tech
• Joe Breen, Utah
• Tad Reynales, Calit2-UCSD
• Jim Bottum, Clemson

DIS Steering Committee: Scott Brim, Eric Boyd, Steve Corbató, Eli Dart, Susan Evett, Kate Mace, Jim Pepin, Dan Schmiedt, Steve Wolff

Page 4: Cyberinfrastructure for Data Intensive Science (DIS)

Joint Techs 2012 – What We Heard

• Need for effective cyberinfrastructure voiced by multiple communities and disciplines
  – Genomics
  – Climate
  – Supercomputer centers
• Success stories outlined the path forward
  – Science DMZ model
  – Effective communication between cyberinfrastructure providers, science disciplines, funding agencies

Page 5: Cyberinfrastructure for Data Intensive Science (DIS)

Rapidly Evolving Context

• Things are moving quickly now
  – NSF CC-NIE call focused on improving campus networks
  – Federal Big Data initiative
• This stuff is for real – it’s not just talk
  – Infrastructure funding
  – Grant funding
• The direction is not in doubt – the only thing to decide is the actions to take
  – Institutions that are aggressive in this space are likely to acquire first-mover advantage
  – The wide area infrastructure is available now
• The need for a white paper has passed

Page 6: Cyberinfrastructure for Data Intensive Science (DIS)

Solutions Required for Research Institutions

• Means by which campuses can connect to science services outside their borders
  – Collaboration
  – Computation
  – Data sources and services
• Support data-intensive collaboration
  – Foster environment for grants, projects
  – Attract new faculty, new programs

• Refresh science infrastructure

Page 7: Cyberinfrastructure for Data Intensive Science (DIS)

Science Infrastructure Refresh

• NSF call: reinvestment in the foundations of data intensive science

• Architecture that has been shown to work: Science DMZ
• In addition to technology, people and processes must be included in the refresh
  – Science programs, infrastructure providers and security officers must all be on board
  – Communication and a common vision are very important
  – Staff need the skills to manage high-performance science flows and the infrastructure to support them

Page 8: Cyberinfrastructure for Data Intensive Science (DIS)

The Science DMZ – Refresher

• The Science DMZ is two things
  – An element of network architecture
  – A model for supporting data-intensive science at a research institution
• Architecture
  – Portion of the network, at or near the site perimeter
  – Devoted exclusively to science support
  – Built with capable hardware
  – Dedicated resources for data transfer, network measurement
  – Appropriate security applied, application set restricted so that security controls, risk, and science mission are all aligned
  – http://fasterdata.es.net/science-dmz/science-dmz-architecture/

Page 9: Cyberinfrastructure for Data Intensive Science (DIS)

The Science DMZ Model

• In general, the Science DMZ model is a framework for cyberinfrastructure
  – Explicitly accommodates science mission
  – Builds in flexibility to adopt tools and technologies for science support
  – Establishes appropriate security infrastructure to both enable and protect science
• Must balance security, usability, and performance
• The science mission is given what it needs to succeed

Page 10: Cyberinfrastructure for Data Intensive Science (DIS)

Integration of Campus with wider infrastructure

• Science DMZ enables a campus to connect local scientists and resources in a frictionless manner to other sites and services
  – Science networks
  – Advanced services
    • Virtual circuit services, network overlays
    • Internet2 Innovation Platform
    • http://fasterdata.es.net/science-dmz/advanced-services/
  – Science DMZ resources at other campuses
    • This is a critical point – remember Metcalfe’s Law
    • Value of a Science DMZ increases as others deploy them

• The data-intensive era is upon us – the infrastructure must evolve to keep pace
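
Editor's illustration (not part of the original slides): the Metcalfe's Law point can be made concrete with a rough count. Metcalfe's Law is usually stated as network value growing roughly with the square of the number of connected nodes. The sketch below assumes, purely for illustration, that "value" can be approximated by the number of distinct site-to-site pairs that can exchange data between Science DMZs. The function names are hypothetical and the model ignores real-world factors such as link capacity or which sites actually collaborate.

```python
def pairwise_paths(n_sites: int) -> int:
    """Count distinct site-to-site pairs among n_sites Science DMZ deployments."""
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    # Each unordered pair of sites is one potential direct science flow.
    return n_sites * (n_sites - 1) // 2


def new_paths_from_joining(n_existing: int) -> int:
    """Count the pairs gained when one more site deploys a Science DMZ."""
    return pairwise_paths(n_existing + 1) - pairwise_paths(n_existing)


if __name__ == "__main__":
    # Print: number of sites, total pairs, pairs added by the next site.
    for n in (2, 5, 10, 50):
        print(n, pairwise_paths(n), new_paths_from_joining(n))
```

Under this simplified model, each additional deployment adds as many new pairings as there are sites already deployed, which is one way to read "value of a Science DMZ increases as others deploy them."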

Page 11: Cyberinfrastructure for Data Intensive Science (DIS)
Page 12: Cyberinfrastructure for Data Intensive Science (DIS)

Conclusions

• The time to act is now
• Lots of movement in this space – dynamic, evolving
• Create a coalition of the willing
  – Set of Universities and National Labs of sufficient critical mass to create transformative environment to support DIS
  – Must create environment to encourage innovation while encouraging coherence to support scientific disciplines scattered across the globe
• Infrastructure pieces are well-understood
  – Hence the NSF call for campus activities
  – Get these deployed now