Computational Storage Services (WP7 ForgetIT 1st year review)


Concise Preservation by combining Managed Forgetting and Contextualized Remembering


Simona Rabinovici-Cohen

IBM Research - Haifa

WP 7 Presentation: Computational Storage Services

ForgetIT 1st Review Meeting, April 29-30, 2014

Kaiserslautern, Germany


Objectives of WP and Year 1 Focus

WP Objectives

• Increase the value and outcome of preserved information over time
  – Provide additional incentive for preservation
  – Increase return-on-investment (ROI)
• Transform the generic storage service into a richer service with potentially higher business value and automated preservation processes

Focus of Year 1

• Build a consolidated platform for objects and computational processes (storlets) that will be defined, triggered and executed close to the data
• Utilize the open-source OpenStack Swift for cloud storage

ForgetIT Project GA600826, 1st Review Meeting, Kaiserslautern, April 2014


Role in Preserve-or-Forget Architecture


Achievements in Year 1

• Leveraged PDS and Storlet Engine adding:
  – Adapt Preservation Engine for ForgetIT
  – Rules mechanism
  – Storlets at interface proxy servers and local object servers
  – Multiple programming languages for storlets
  – New storlets: image transformation storlet, fixity storlet, concept detection storlet
• Searchable metadata contributions to OpenStack community
• Integration with whole ForgetIT framework
• Co-chair LTR group in SNIA to develop SIRF


Preservation DataStores (PDS)

• PDS offloads some archiving functionality to:
  – Decrease probability of data loss
  – Simplify the applications
  – Provide improved performance and robustness
• Supports automation of archiving processes
• Provides computational storage via Storlet Engine
• PDS was also the storage infrastructure of the EU research projects CASPAR and ENSURE, with partners: European Space Agency, Maccabi HMO, Tessella, Philips and more


PDS in OAIS Functional Model

• OAIS is the ISO standard reference model for preservation (ISO 14721:2002)
• Provides fundamental ideas, concepts and a reference model for long-term archives
• Archival Information Package (AIP) – a logical structure for the preservation object that needs to be stored to enable future interpretation
• Content Data Object (CDO) – raw data to be preserved

[Figure: OAIS functional model diagram. Labels: AIP, PDS.]


DSpace and PDS


PDS Data Model

Hierarchical data model

Tenant → Aggregation and Tenant → Docket → Object (AIP)

Flexible organization of assets in collections with varied preservation policies (gold, silver, bronze)

Aggregations support dynamic and transparent configuration of data management

[Figure: two example tenants.
Tenant "Peter Stainer": Dockets "Costa Rica 2013" and "Edinburgh"; Aggregations "Business Photos" (silver) and "Private Photos" (gold); Objects (AIP) tagged Metadata:aggregation=Private or Metadata:aggregation=Business.
Tenant "Spielwarenmessen": Docket "Toy Conference 2014"; Aggregation "Press Releases" (gold); two Objects (AIP), each tagged Metadata:aggregation=Press.]
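
The hierarchy on this slide can be sketched as plain data classes. This is an illustration only, not the PDS API: the class and field names (`Tenant`, `Aggregation`, `Docket`, `ArchivalObject`, `policy`, `policy_for`) and the object name `beach.jpg` are hypothetical, and placing that object in the "Costa Rica 2013" docket is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ArchivalObject:
    """One preserved object (AIP). 'aggregation' mirrors Metadata:aggregation=..."""
    name: str
    aggregation: str


@dataclass
class Docket:
    """A named grouping of objects under a tenant."""
    name: str
    objects: List[ArchivalObject] = field(default_factory=list)


@dataclass
class Aggregation:
    """A collection that carries a preservation policy."""
    name: str
    policy: str  # "gold", "silver" or "bronze"


@dataclass
class Tenant:
    name: str
    aggregations: Dict[str, Aggregation] = field(default_factory=dict)
    dockets: Dict[str, Docket] = field(default_factory=dict)

    def policy_for(self, obj: ArchivalObject) -> str:
        """Resolve an object's policy through its aggregation metadata."""
        return self.aggregations[obj.aggregation].policy


tenant = Tenant("Peter Stainer")
tenant.aggregations["Private"] = Aggregation("Private Photos", "gold")
tenant.aggregations["Business"] = Aggregation("Business Photos", "silver")

trip = Docket("Costa Rica 2013")
trip.objects.append(ArchivalObject("beach.jpg", aggregation="Private"))  # hypothetical object
tenant.dockets[trip.name] = trip

print(tenant.policy_for(trip.objects[0]))
```

Because the policy is looked up through the aggregation rather than stored on each object, changing an aggregation's policy takes effect for all of its objects at once, which is one way to read "dynamic and transparent configuration".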


The Need for Computational Object Storage

• “Data is the new Oil”
  – In its raw form, oil has little value
  – Once processed and refined, it helps power the world
• Data deluge of content depots and unstructured data
  – Documents, medical images, photos, videos, etc.
  – The fastest growing type of storage by volume
  – Object storage is ideal for this type of data
• Object storage for content depots generally:
  – Utilizes large bandwidth to serve big data over the WAN
  – Uses server-based storage with underutilized CPUs
• Process and refine the data where it is stored
  – Create a computational object storage with storlets

“Data is the new oil.” Clive Humby


Client Value for Using Storlets

• Reduce bandwidth – reduce the number of bytes transferred over the WAN
  – e.g. Analytics storlet
• Enhance security – reduce exposure of sensitive data
  – e.g. De-identification storlet
• Save costs – consolidate generic functions that can be used by many applications while saving infrastructure at the client side
  – e.g. Curation storlet
• Support compliance – monitor and document the changes to the objects and improve provenance tracking
  – e.g. Transformation storlet


Storlet Engine Architecture


Rules Mechanism

• Enables automatic conditional invocation of storlets
• Explicit storlet activation overrides implicit activation
• Rules are kept as a per-tenant editable object, with specified access control
• Configured by tenant, user, role, container, object, content_type
• Wildcards (“*”) allowed in a rule (high flexibility)
• The first rule that matches the input is activated – prioritized list of rules
• Examples:
  – De-Identification (per role)
  – Transformation (per content type)
  – Fixity (per docket)
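
The first-match behaviour described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the Storlet Engine's actual rule format: the `Rule` class, `select_storlet`, the storlet names and the request values are all hypothetical. Matching uses Python's `fnmatchcase`, which also accepts `?` and `[...]` patterns beyond the `*` wildcard the slide mentions.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# Fields a rule can be configured by, as listed on the slide.
FIELDS = ("tenant", "user", "role", "container", "object", "content_type")


@dataclass
class Rule:
    storlet: str             # storlet to invoke when the rule matches
    pattern: Dict[str, str]  # per-field pattern; fields left out act as "*"

    def matches(self, request: Dict[str, str]) -> bool:
        return all(
            fnmatchcase(request.get(f, ""), self.pattern.get(f, "*"))
            for f in FIELDS
        )


def select_storlet(rules: List[Rule], request: Dict[str, str],
                   explicit: Optional[str] = None) -> Optional[str]:
    """Return the storlet to run, or None if nothing applies."""
    if explicit is not None:      # explicit activation overrides the rules
        return explicit
    for rule in rules:            # prioritized list: first match wins
        if rule.matches(request):
            return rule.storlet
    return None


rules = [
    Rule("deidentification", {"role": "researcher"}),
    Rule("image_transformation", {"content_type": "image/*"}),
]

request = {"tenant": "hospital", "user": "alice", "role": "researcher",
           "container": "scans", "object": "scan1.png",
           "content_type": "image/png"}

print(select_storlet(rules, request))
print(select_storlet(rules, dict(request, role="admin")))
print(select_storlet(rules, request, explicit="fixity"))
```

The request above matches both rules, so the order of the list decides the outcome; swapping the two rules would select the transformation storlet instead.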


Storlets at proxy node and object node

[Figure: Swift cluster diagram. Labels: Virtual IP; L3 switches (10GB Ethernet); L2 rack switches (1GB Ethernet); two proxy nodes; account node (SSD); container node (SSD); two object nodes (HDD). Insets: Swift Proxy Node with proxy service and Storlet Engine; Swift Object Node with object service and Storlet Engine.]


Fixity Storlet
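
A fixity check generally means recomputing a digest over the stored bytes and comparing it with a previously recorded value. The sketch below illustrates that general idea only; it is not the ForgetIT fixity storlet. The function names `compute_digest` and `check_fixity`, the chunk size and the choice of SHA-256 are assumptions.

```python
import hashlib
import io
from typing import BinaryIO


def compute_digest(stream: BinaryIO, algorithm: str = "sha256",
                   chunk_size: int = 65536) -> str:
    """Hash a stream in chunks so large objects need not fit in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def check_fixity(stream: BinaryIO, expected_hex: str,
                 algorithm: str = "sha256") -> bool:
    """Return True if the recomputed digest equals the recorded one."""
    return compute_digest(stream, algorithm) == expected_hex.lower()


data = b"preserved content"                   # stand-in for a stored object
recorded = hashlib.sha256(data).hexdigest()   # digest recorded at ingest time

print(check_fixity(io.BytesIO(data), recorded))
print(check_fixity(io.BytesIO(data + b"!"), recorded))
```

Running such a check next to the data means only a pass or fail result has to cross the network, rather than the whole object.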



Publications

• Papers
  – S. Rabinovici-Cohen, E. Henis, J. Marberg, K. Nagin, “Storlet Engine: Performing Computations in Cloud Storage”, to be submitted
  – S. Rabinovici-Cohen, R. Cummings, S. Fineberg, “Self-contained Information Retention Format For the, to be submitted
• Posters
  – S. Rabinovici-Cohen (IBM), M. Baker (HP), R. Cummings (Antesignanus), S. Fineberg (HP), E. Henis (IBM), “Self-contained Information Retention Format (SIRF) in ForgetIT EU Project”, 6th International Systems and Storage Conference (SYSTOR), 2013
• Other Dissemination Activities
  – The Storage Networking Industry Association (SNIA) published in its March 2013 Newsletter that SNIA Long Term Retention group formed a liaison with ForgetIT
  – The tutorial “Combining SNIA Cloud, Tape and Container Format Technologies for the Long Term Retention of Big Data” is given at several SNIA conferences
• Deliverables
  – D7.1: Foundation of Computational Storage Services
  – D7.2: Computational Storage Services First Release


Thank you for your attention!