Open Science Data Cloud (IEEE Cloud 2011)

OCC Open Science Data Cloud (www.opensciencedatacloud.org), Robert Grossman, University of Chicago, Open Cloud Consortium, Open Data Group, July 5, 2011




I’ll describe a new project (the Open Science Data Cloud) and three research questions generated by the project.

Open Science Data Cloud

The OCC is a not-for-profit supporting the scientific community by operating cloud infrastructure.

The OSDC is a hosted distributed facility managed by the OCC that:

• Manages & archives medium and large size datasets.
• Provides computational resources to analyze them.
• Provides networking to share the datasets with your colleagues and with the public.

Proof of Concept (2008 - 2010)
• 4 locations
• 10G networks
• 450+ nodes
• 3000 cores
• 2 PB

Phase 1 (2011 - 2014)
• 6+ locations
• 100G networks
• $1M - $2M hardware/year
• Sept, 2011

Phase 2 (2015 - 2020)
• Build a data center for science.
• Drive the 4th paradigm.

Why Another Cloud Project?

[Figure: variety of analysis (low, medium, wide) plotted against data size (small, medium to large, very large). Infrastructure labels: no infrastructure, general infrastructure, dedicated infrastructure. Plotted regions: scientist with laptop; Open Science Data Cloud; high energy physics, astronomy.]

OSDC Perspective

• Take a long term point of view (think like an underfunded library, not a cloud service provider).

• Manage both the data and the analysis environment.

• Develop open architecture that interoperates with other private and public clouds.

• Operate vendor neutral infrastructure at the scale of a small data center.

Project 1. Bionimbus

www.bionimbus.org (biological data)

Project Matsu 2: An Elastic Cloud For Earth Science Data


matsu.opencloudconsortium.org

Research Questions

1. Develop technology to encapsulate a scientist’s data and analysis tools and to export, save and move these between clouds.

2. Develop protocols, utilities, and applications so that new racks and containers can be added to data clouds with minimal human involvement.

3. Develop technology to support the long term, low cost preservation of data in clouds.