© 2012 IBM Corporation
ENSURE: Enabling kNowledge Sustainability, Usability and Recovery for Economic value
Presenter: Michael Factor
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 270000
Enabling kNowledge Sustainability, Usability and Recovery for Economic value
3 USE CASES
Healthcare
Clinical Studies
Financial Services
4 INNOVATIONS
EVALUATE: Cost and Value
AUTOMATE: Preservation Lifecycle
SCALE: using ICT innovations
PROTECT: Content-aware data protection
A 3-year IP project started Feb 2011
www.ensure-fp7.eu
ENSURE: Key Technical Innovations
Evaluate, Automate, Scale, Protect
[Figure labels: Requirements, Access, Deploy, External Events, Flow, Events, Ontology, Cost, Value, Quality, Cloud, Virtual appliance, Anonymization]
Evaluate Cost and Value – Input
Evaluate Cost and Value – Output
Evaluate Cost and Value – Process
Configurator
Economic Performance Engine
Preservation Plan Optimizer
Translation Rules
Quality Engine
Cost/risk Engine
Data Repositories
Vendor   Region   Size   Price
AWS      EU       <1TB   .125/GB
AWS      AP TOK   <1TB   .130/GB
GOOG     --       <1TB   .120/GB
Gold Rule: Two fixity checks every 30 days; two clouds
Silver Rule: One fixity check every 60 days; one cloud
. . .
[Figure labels: Administrator, Requirements, Configuration Selection, (Re)Deploy Solution, ENSURE, Automate]
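As a rough illustration of how the price table and the Gold/Silver rules on this slide could feed a cost estimate, here is a minimal sketch. It is not the ENSURE Configurator or its cost engine. The class and function names are hypothetical, each table row is assumed to count as one "cloud", and the price is simply multiplied by the stored size, because the slide does not state the currency or billing period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    vendor: str
    region: str
    price_per_gb: float  # value from the slide; currency and period are not stated


@dataclass(frozen=True)
class Rule:
    name: str
    fixity_checks: int   # number of fixity checks per interval
    interval_days: int   # length of the interval in days
    clouds: int          # number of clouds that must hold a copy


# Rows of the slide's table (size tier "<1TB" applies to all three).
OFFERS = [
    Offer("AWS", "EU", 0.125),
    Offer("AWS", "AP TOK", 0.130),
    Offer("GOOG", "--", 0.120),
]

RULES = {
    "gold": Rule("Gold", fixity_checks=2, interval_days=30, clouds=2),
    "silver": Rule("Silver", fixity_checks=1, interval_days=60, clouds=1),
}


def cheapest_placement(rule, size_gb, offers=OFFERS):
    """Pick the cheapest `rule.clouds` offers and return (offers, storage cost)."""
    if len(offers) < rule.clouds:
        raise ValueError("not enough offers to satisfy the rule")
    chosen = sorted(offers, key=lambda o: o.price_per_gb)[: rule.clouds]
    cost = sum(o.price_per_gb * size_gb for o in chosen)
    return chosen, cost


def fixity_checks_per_year(rule):
    """Number of fixity checks the rule implies over 365 days."""
    return rule.fixity_checks * 365 / rule.interval_days


# Hypothetical 500 GB collection, which falls in the "<1TB" tier.
gold_offers, gold_cost = cheapest_placement(RULES["gold"], 500)
silver_offers, silver_cost = cheapest_placement(RULES["silver"], 500)
```

A real configurator would also have to price the fixity checks themselves, as well as data transfer and risk; the sketch only shows how rule parameters and repository data can be combined.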
Evaluate cost and value: Preservation Plan Optimizer
• Genetic algorithm generates results based upon engines
• Really n-dimensions
• The user chooses a solution from the Pareto frontier
• No dimension can be improved without degrading at least one other dimension
[Figure: Pareto frontier with axes Quality and Cost; labels COE, QOE]
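The Pareto frontier mentioned above can be illustrated with a small filter over candidate plans. This is only a sketch of the dominance test, not the genetic algorithm the slide refers to. It uses two dimensions (cost, quality) although the slide notes the real problem has n dimensions, and the candidate values are made up for illustration.

```python
def dominates(a, b):
    """Return True if plan `a` dominates plan `b`.

    Each plan is a (cost, quality) tuple: lower cost is better, higher
    quality is better. `a` dominates `b` if it is no worse in both
    dimensions and strictly better in at least one.
    """
    cost_a, quality_a = a
    cost_b, quality_b = b
    no_worse = cost_a <= cost_b and quality_a >= quality_b
    strictly_better = cost_a < cost_b or quality_a > quality_b
    return no_worse and strictly_better


def pareto_frontier(candidates):
    """Keep only the plans that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidate plans as (cost, quality) pairs.
plans = [(60.0, 0.70), (122.5, 0.95), (130.0, 0.90), (65.0, 0.60), (90.0, 0.85)]
frontier = pareto_frontier(plans)
```

Here (130.0, 0.90) is dropped because (122.5, 0.95) is both cheaper and better, and (65.0, 0.60) is dropped because (60.0, 0.70) dominates it. The remaining plans are the trade-offs the user would choose among.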
Automate Preservation Lifecycle: Preservation Data Aware Lifecycle Management (PDALM) Workflow Engine
PDALM: Controls system activities
– Manage workflow of the information being preserved
– Execute preservation plan (built by the Configurator)
– Handle notifications and interaction with the administrator
Example: Workflow for ingest
Automate Preservation Lifecycle: Event engine
Event Engine
• Manages concurrency, priority and impact/severity of events
• Listens for preservation related events
• Notifies relevant ENSURE components
[Figure labels: Configurator, Event Engine, PDALM, Monitored system behavior, Economic, Data/format, Regulatory, Standards, Feeds, Scale]
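One way to picture the listen, prioritize and notify behaviour described on this slide is a severity-ordered queue with subscribers. The sketch below is an assumption-laden illustration, not the ENSURE Event Engine. The class name, method names and event kinds are hypothetical, and it is single-threaded, so it does not address the concurrency aspect.

```python
import heapq
import itertools
from collections import defaultdict


class EventEngine:
    """Toy event engine: queue events by severity, notify subscribers."""

    def __init__(self):
        self._queue = []
        # Tie-breaker so events of equal severity keep arrival order.
        self._counter = itertools.count()
        self._listeners = defaultdict(list)

    def subscribe(self, kind, callback):
        """Register a component callback for one kind of event."""
        self._listeners[kind].append(callback)

    def publish(self, kind, severity, detail):
        """Queue an event; heapq is a min-heap, so severity is negated."""
        heapq.heappush(self._queue, (-severity, next(self._counter), kind, detail))

    def dispatch(self):
        """Deliver queued events, most severe first; return delivery order."""
        delivered = []
        while self._queue:
            neg_severity, _, kind, detail = heapq.heappop(self._queue)
            for callback in self._listeners[kind]:
                callback(kind, -neg_severity, detail)
            delivered.append((kind, -neg_severity))
        return delivered


# Hypothetical usage: a regulatory change outranks a price change.
engine = EventEngine()
seen = []
engine.subscribe("regulatory", lambda kind, severity, detail: seen.append(detail))
engine.publish("economic", 1, "storage price changed")
engine.publish("regulatory", 3, "retention period changed")
order = engine.dispatch()
```

The higher-severity regulatory event is delivered first even though it was published second, and only the component subscribed to that kind is notified.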
Automate preservation lifecycle: ontology update
Select ontology to update
Upload a new version and display potential system impacts
Apply new ontology and update system
Scale: What is a cloud, why is it interesting, and what are the issues?
“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources … that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
– US National Institute of Standards and Technology, Information Technology Laboratory
Benefits
– Cost Savings: economies of scale, utilization improvement and standardization
– Speed and Agility
– Pay-as-you-go for usage
Issues for preservation
– Rich metadata support, e.g., no search
– Differences in security models
– Encryption may limit preservation actions
– Compute near the storage (storlets)
– Logical connections among objects in the same and different clouds
– Standards
[Figure: Cloud Delivery Models. Labels: Private Cloud, Enterprise Data Center, Community Cloud Services, Enterprise A, Enterprise B, Enterprise C, Public Cloud Services, User A, User B, User C, User D, User E]
Scale: Mapping Data to Multiple Clouds
Map OAIS AIPs and the links among AIPs to the cloud data model
Manage objects' inter-relationships and referential integrity
Map objects to one or more clouds
[Figure labels: Cloud A, Cloud B, Protect]
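The referential integrity concern on this slide can be sketched with a small in-memory model of AIPs, their links and the clouds that hold them. This is an illustrative assumption, not the ENSURE data model. The `AIP` class, its fields and the identifiers are hypothetical; only the cloud names come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class AIP:
    aip_id: str
    links: list = field(default_factory=list)   # ids of AIPs this one refers to
    clouds: set = field(default_factory=set)    # clouds holding a copy


def dangling_links(aips):
    """Return (source, target) pairs whose target AIP does not exist."""
    known = {a.aip_id for a in aips}
    return [(a.aip_id, target)
            for a in aips for target in a.links if target not in known]


def cross_cloud_links(aips):
    """Return links whose two endpoints share no cloud."""
    by_id = {a.aip_id: a for a in aips}
    return [(a.aip_id, target)
            for a in aips for target in a.links
            if target in by_id and not (a.clouds & by_id[target].clouds)]


# Hypothetical archive spread over the two clouds shown on the slide.
archive = [
    AIP("study-1", links=["scan-1", "consent-1"], clouds={"Cloud A"}),
    AIP("scan-1", clouds={"Cloud A", "Cloud B"}),
    AIP("report-1", links=["study-1"], clouds={"Cloud B"}),
]
broken = dangling_links(archive)
remote = cross_cloud_links(archive)
```

`broken` lists references that would violate referential integrity, while `remote` lists valid links that can only be followed by going to a different cloud.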
Scale: Accessing Content with a Virtual Appliance (VA)
Request to access content with VA
Instantiate VA
Extract content into VA
Give user access to VA with content
[Figure labels: ENSURE, Compute Cloud, Private Application Library, Storage Cloud]
Content-aware data protection: Masked/Anonymized Data
Data Owner Requirement:
– Data should be anonymized so that it cannot be associated with a specific individual
Example:
– Living people from London who fought in WWII are becoming more and more identifiable
[Figure: Data Owners, Masking Services, Data Receivers; Full data, Masked data. Labels: hospital, bank, factory, Telco, Medical Research, Software testing, Statistical Analysis, Pharma Research]
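The WWII example suggests that a group which was once large enough to hide in can shrink until its members are identifiable. A simple way to detect this is to count how many records share the same combination of identifying attributes, in the spirit of k-anonymity. That technique is brought in here purely for illustration; the slide does not say which method ENSURE's masking services use. The function name, field names and records are hypothetical.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k):
    """Return attribute combinations shared by fewer than k records.

    `quasi_identifiers` are the fields that, taken together, could point
    to an individual even though none is a direct identifier.
    """
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {group: n for group, n in counts.items() if n < k}


# Hypothetical masked records with no names or ids.
records = [
    {"city": "London", "wwii_veteran": True, "diagnosis": "A"},
    {"city": "London", "wwii_veteran": True, "diagnosis": "B"},
    {"city": "London", "wwii_veteran": False, "diagnosis": "A"},
    {"city": "London", "wwii_veteran": False, "diagnosis": "C"},
    {"city": "London", "wwii_veteran": False, "diagnosis": "B"},
]
risky = small_groups(records, ["city", "wwii_veteran"], k=3)
```

With k set to 3, the group of London veterans is flagged because only two records remain in it. Re-running such a check over time is one way a preservation system could notice that previously safe data needs further masking.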
Summary
Architect and build the next generation preservation system, ensuring knowledge is sustained and can be recovered for future value
Key Innovations:
– Evaluate Cost and Value supporting business decisions
– Automate Preservation Lifecycle
– Scale using ICT innovations
– Content-aware data protection
Three use cases to demonstrate future preservation
– Healthcare, clinical trials, and finance
Status
– Initial end-to-end demo of two use cases in the first year
– Emphasis on evolution over time for the second year
www.ensure-fp7.eu
Thank You
Backup