EPrints and the Cloud

Post on 22-Jun-2015

DESCRIPTION

EPrints capabilities in the Cloud; a presentation at the EduServ "Repositories and the Cloud" event. For more info see http://repcloud.eventbrite.com/

TRANSCRIPT

EPrints Cloud Visions

What is EPrints For?

EPrints offers a safe, open and useful place to store, share and manage material in the pursuit of research and educational agendas.

Administrative reporting, collaboration, data sharing, digital profile enhancement, e-learning, e-publishing, e-research, marketing, open access, preservation, publicity, research assessment, research management, scholarly collections

Research Curation, Researcher Support

Researchers’ environment supported by repository

Research data managed by repository

Research community assisted by repository

What is a Repository?

Safe, secure, persistent, managed storage for files

Safe, secure, persistent management of shareable FRBR works

Safe, secure, persistent management of scholarly & scientific working

Leading to…

Science 2.0 / The Fourth Paradigm / Data Intensive Science

The challenge is not cloud computing but cloud thinking

Bio-Diversity

Current EPrints Cloud Capabilities

Amazon Elastic Compute Machine Images (AMIs)

• Small (Single Core / 1.7 GB)
• Large (64 Bit / Quad Core / 7.5 GB)
• Extra Large (64 Bit / 8 Core / 15 GB)

EPrints 3.2 is 64 Bit Enabled

Persistent Database & Storage

Really Excited - Super Fast / Cheap / Easy!

Cloud to Desktop Storage

Data can be stored on multiple storage services

Local disk, SAN, NAS, Honeycomb, Cloud

Researchers can mount repository objects as a networked filesystem

Service usage and preservation risks can be monitored and analysed.

Hybrid Storage In EPrints

A single storage solution has drawbacks.

Cost vs. Speed vs. Reliability

Repositories need to be agile: able to use, and migrate to, new platforms.

Leverage the benefits of each solution without losing control of your digital objects.

Local Disk Storage

• No local bandwidth costs
• Hard to expand
• Locally managed
• High overhead costs
• Requires space and cooling
• Tied closely to the software

Storage Ecosystem

Local Archival Storage

• Specialist
• Expensive to purchase
• Locally managed
• Space and running costs
• Expandable

Cloud Storage

• Scalable
• Externally controlled
• Known costings
• Unclear retention policy
• Re-useable (using simple APIs)
• Global scale

But Clouds Blow Away

Recently:

• Yahoo Briefcase
• XDrive
• AOL Pictures
• HP Upline
• Sony Image Station

Source: Tom Spring - PCWorld

Why use Hybrid Storage?

Use the best features of each storage type

Performance: scaling-up bandwidth

Optimisation: large-file handling, multimedia streaming

Localised delivery: local delivery from the cloud

EPrints Storage Controller

• The storage controller decides where to put a file.

• Rule-based policy defined by XML configuration file

• Large binary files of scientific data (raw machine result data) can be stored on a large-disk (slower access) system and sent to a tape company for long-term storage.

• Processed results can be stored locally and in the cloud ready for rapid delivery to end points.
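The two example policies in the bullets above can be sketched as a small routing function. This is only an illustration of the idea: the backend names, the `kind` labels and the in-memory dictionaries standing in for real storage are all assumptions made for the example, not EPrints code.

```python
from typing import Dict, List

# Hypothetical backends; each one is modelled as an in-memory dict.
BACKENDS: Dict[str, Dict[str, bytes]] = {
    "large_disk": {},  # slower, high-capacity storage for raw data
    "local": {},       # fast local disk
    "cloud": {},       # remote storage close to end users
}


def targets_for(kind: str) -> List[str]:
    """Return the backends that should hold a file of the given kind."""
    if kind == "raw":
        return ["large_disk"]
    if kind == "processed":
        return ["local", "cloud"]
    return ["local"]  # assumed default for anything else


def store(name: str, data: bytes, kind: str) -> List[str]:
    """Write the file to every backend chosen by the policy."""
    chosen = targets_for(kind)
    for backend in chosen:
        BACKENDS[backend][name] = data
    return chosen
```

Calling `store("results.csv", b"...", "processed")` writes one copy to the local backend and one to the cloud backend, which is the "stored locally and in the cloud" case from the slide.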

Architecture Diagram

Controller Ruleset

<choose>
  <when test="datasetid = 'document'">
    <choose>
      <when test="$parent{relation_type} = 'isVolatileVersionOf'">
        <plugin name="Local"/>
      </when>
      <otherwise>
        <plugin name="AmazonS3"/>
      </otherwise>
    </choose>
  </when>
  <otherwise>
    <plugin name="Local"/>
  </otherwise>
</choose>
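Read as nested conditions, the ruleset above keeps volatile document versions and all non-document objects on local storage, and sends every other document to Amazon S3. The same decision can be written as a short function. This is a reading aid rather than how EPrints evaluates the XML; the function name and its parameters are hypothetical.

```python
from typing import Optional


def choose_plugin(datasetid: str, parent_relation_type: Optional[str] = None) -> str:
    """Mirror the <choose>/<when>/<otherwise> structure of the ruleset."""
    if datasetid == "document":
        # Inner <choose>: volatile versions stay on local storage.
        if parent_relation_type == "isVolatileVersionOf":
            return "Local"
        return "AmazonS3"
    # Outer <otherwise>: anything that is not a document stays local.
    return "Local"
```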

EPrints Storage Manager

Amazon S3 Localisation (1)

Amazon S3 Localisation (2)

Preservation Services

Object Classification

Risk Analysis

Mitigation and Migration

EPrints Forthcoming Development

EPrints Cloud Services

Web based repository setup

• Much like getting started with a blog.
• Fill in a form and obtain a repository.
• Coming to EPrints core in next major release.

Enterprise Support for Cloud Solutions

• Full Setup & Configuration
• Global Distribution
• Auto Upgrade & Patching
• Trusted Backup

EPrints 3.2

Plug-ins / Modules

• Everything builds on the core layer
• Major part of v3.2 is strengthening the core and adding more abstraction layers

• Improved data model
• Enhanced data facilities
• Enhanced metadata facilities
• Improved programming & API

EPrints 3.2 Structure

Community Driven Development

There are many abstraction layers.

• Display Manipulation
• Upload Handlers
• Custom Datasets
• Import / Export Plug-ins
• Transcoding Plug-ins
• Database Plug-ins
• Storage Plug-ins

One API

Storage Plug-ins

• Local
• NFS
• Amazon S3
• Sun Cloud Storage Service
• Microsoft Azure
• Any others based on the S3 API… (the last 3 all are)

5-call API (about 30 minutes to write a plug-in)
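To give a feel for how small a five-call storage interface can be, here is a minimal sketch with an in-memory backend. The slides do not name the five calls, so `store`, `retrieve`, `delete`, `exists` and `locate` are assumptions chosen for illustration, and the classes below are not the EPrints plug-in API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class StoragePlugin(ABC):
    """Hypothetical five-call interface; the method names are assumptions."""

    @abstractmethod
    def store(self, key: str, data: bytes) -> None:
        """Write data under the given key."""

    @abstractmethod
    def retrieve(self, key: str) -> bytes:
        """Return the data stored under the key."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove the object if it is present."""

    @abstractmethod
    def exists(self, key: str) -> bool:
        """Report whether the key is present."""

    @abstractmethod
    def locate(self, key: str) -> Optional[str]:
        """Return a locator for the object, or None if it is absent."""


class MemoryStorage(StoragePlugin):
    """Toy backend that keeps objects in a dict, for testing only."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def retrieve(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

    def exists(self, key: str) -> bool:
        return key in self._objects

    def locate(self, key: str) -> Optional[str]:
        return f"memory://{key}" if key in self._objects else None
```

A backend for a remote service would implement the same five methods against that service, leaving the rest of the repository unchanged.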

Our Development Vision

Empower the Community with a simple API

• API in 3.2

Give the community a platform to test their code

Use the Cloud!

Give the community a distribution mechanism

The EPrints Bazaar (beta)

EPrints Bazaar

Similar in concept to Apple’s App Store

Every install of EPrints will have access to the Bazaar

Single click install/uninstall of plug-ins

EPrints Services Approved Plug-ins

• Enterprise support for limited 3rd party plug-ins

Summary

EPrints provides the professional, enterprise level application for resource management

Including cloud support at many levels

• Repository-in-the-cloud
• Storage-in-the-cloud
• Services-in-the-cloud
