Digital Curation or Digital Data? The impact of Services and Federation Phil Lord Newcastle University

Post on 21-Dec-2015



Take Home Messages

• Curation is important for the CARMEN project and neuroinformatics

• To enable repeatability and rerunability, curation of services is as important as curation of data

• To enable federation and autonomy, data release, license and other policies must be represented so that they can be computed over.

Research Challenge

Understanding the brain may be the greatest informatics challenge of the 21st century

Worldwide, >100,000 neuroscientists (~5,000 in the UK) are generating vast amounts of data

Principal experimental data formats:

• molecular (genomic/proteomic)
• neurophysiological (time-series electrical measures of activity)
• anatomical (spatial)
• behavioural

Neuroinformatics concerns how these data are handled and integrated, including the application of computational modelling

Need for Cooperation


The OECD Neuroinformatics Working Group identified the need to work cooperatively in order to achieve major advances

Cooperation will permit:

• development of common processes
• best value from data, including long term curation
• ‘mega-analysis’ of large data sets
• integration of data sets across different scales and different approaches
• interdisciplinary research

CARMEN – Focus on Neural Activity

resolving the ‘neural code’ from the timing of action potential activity


[Figure: activity traces labelled neurone 1, neurone 2 and neurone 3]

• raw voltage signal data collected by patch-clamp and by single- and multi-electrode array recording
• novel optical recording, particularly of the activity dynamics of large networks

• CARMEN is a new e-Science Pilot Project in Neuroinformatics (UK research council funded).

• To create a grid-enabled, real time ‘virtual laboratory’ environment for neurophysiological data

• To develop an extensible ‘toolkit’ for data extraction, analysis and modelling

• To provide a repository for archiving, sharing, integration and discovery of data

• To achieve wide community and commercial engagement in developing and using CARMEN
– CARMEN is a 4 year project: if it is to last longer, it must become financially self-sufficient.

• See http://www.carmen.org.uk

[Figure: CARMEN Active Information Repository Node (CAIRN). Components shown: Data, Metadata, Core Services, Security, Registry, Workflow Enactment Engine, Client Dynamically Deployed Services (Service 1, Service 2, ... Service n) and External Clients.]

Dynamic Service Deployment - Dynasoar

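[Figure: Dynasoar deployment diagram. Labels shown: Client (C), request (req) and response (res), Web Server, WSP, R, Service Repository (SR), Compute Machines (node 1: s2, s5; node 2; node n: s2) and CAIRN. Steps are numbered 1 to 3; step 2 is "service fetch & deploy".]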
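One way to read the "service fetch & deploy" step is as deploy-on-demand: a request names a service, and the service is fetched from the repository and installed on a compute node only if no node already hosts it. The sketch below illustrates that reading under stated assumptions. The class names (`ServiceRepository`, `ComputeNode`, `Provider`) and the "least loaded node" placement rule are hypothetical and are not Dynasoar's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout; this is an illustration, not Dynasoar's API.
Service = Callable[[str], str]


@dataclass
class ServiceRepository:
    """Stores deployable services by name (the 'SR' box in the diagram)."""
    services: Dict[str, Service]

    def fetch(self, name: str) -> Service:
        return self.services[name]


@dataclass
class ComputeNode:
    """A compute machine that hosts zero or more deployed services."""
    name: str
    deployed: Dict[str, Service] = field(default_factory=dict)


@dataclass
class Provider:
    """Routes requests and deploys a service the first time it is needed."""
    repository: ServiceRepository
    nodes: List[ComputeNode]
    log: List[str] = field(default_factory=list)

    def handle(self, service_name: str, request: str) -> str:
        # Prefer a node that already hosts the service.
        for node in self.nodes:
            if service_name in node.deployed:
                self.log.append(f"reuse {service_name} on {node.name}")
                return node.deployed[service_name](request)
        # Otherwise fetch it and deploy on the node hosting the fewest services
        # (an assumed placement rule, chosen only to keep the sketch short).
        target = min(self.nodes, key=lambda n: len(n.deployed))
        target.deployed[service_name] = self.repository.fetch(service_name)
        self.log.append(f"deploy {service_name} on {target.name}")
        return target.deployed[service_name](request)


repo = ServiceRepository({"s2": str.upper, "s5": lambda text: text[::-1]})
provider = Provider(repo, [ComputeNode("node 1"), ComputeNode("node 2")])
first = provider.handle("s2", "spike")   # s2 is deployed on demand
second = provider.handle("s2", "train")  # s2 is reused where it already runs
third = provider.handle("s5", "abc")     # s5 is deployed on the emptier node
print(provider.log)
```

The client only ever names the service it wants; where that service runs is decided behind the provider, which is what lets the set of deployed services change over time.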

Distribution and Federation

Initially, we plan to have two CAIRNs

[Figure: two CAIRN diagrams, each with its own Data, Metadata, Core Services, Security, Registry, Workflow Enactment Engine, Client Dynamically Deployed Services and External Clients.]

Distribution and Federation

[Figure: the CAIRN diagram repeated twenty times, each copy with its own Data, Metadata, Core Services, Security, Registry, Workflow Enactment Engine, Client Dynamically Deployed Services and External Clients.]

• What about digital curation?

• What about digital curation?

Courtesy of Wikipedia

CARMEN’s perspective

• We wish to store data, store its provenance, store its usage.

• We need release policies, we need retention policies, we need to understand ownership.

What do we get from this?

• Replicability: one scientist should be able to repeat another’s experiment, under equivalent conditions, at a different time.

• Rerunability: a scientist should be able to apply an equivalent technique under new circumstances.

• The addition of services into this mix complicates the issue.
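To see why services complicate matters, it helps to record which version of the data and which version of the service each run used, and then ask what differs between two runs. The following is a minimal sketch under assumed names: `RunRecord`, its fields and the version strings are illustrative and do not come from CARMEN.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunRecord:
    """What one analysis run used. Field names are illustrative assumptions."""
    data_version: str
    service_version: str


def what_changed(original: RunRecord, repeat: RunRecord) -> List[str]:
    """Name the inputs that differ between the original run and the repeat."""
    changed = []
    if original.data_version != repeat.data_version:
        changed.append("data")
    if original.service_version != repeat.service_version:
        changed.append("service")
    return changed


original = RunRecord(data_version="recording-v1", service_version="sorter-1.0")
same_again = RunRecord(data_version="recording-v1", service_version="sorter-1.0")
new_everything = RunRecord(data_version="recording-v2", service_version="sorter-1.1")

print(what_changed(original, same_again))      # nothing differs: a replication attempt
print(what_changed(original, new_everything))  # data and service both differ: a rerun
```

When the list is empty, a different result cannot be attributed to changed data or a changed service. When it is not, the comparison only makes sense if both the old data and the old service were curated well enough to say how they differ from the new ones.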

[Figure: replicability and rerunability set against old data and new data]

[Figure: replicability and rerunability set against old and new data and old and new services, annotated with four questions:]

• Is the specification of what happened actually right?
• Has the state of the world advanced since previously?
• Has the world changed, in a comparable way?
• Has the service changed in a comparable way?

• Error-Prone Neuroscientist
• Eager Neuroscientist
• Neuroscientist comparing to existing work
• Tool Builder

So, what is the problem?

• I would like to rerun this experiment and release the results. Can I?
• Is the new data available?
• Is the new data public?
• Does the license allow derived results?
• Who owns the derived results?
– data license
– software license

So, what's the problem?

• Can I compare how new data would have changed the results?
– Is that data available? (New and Old)
– Is that data public? (New and Old) etc…
• Is it embargoed? Will it become public later?
– Do the licenses allow derived results?
– Who owns the derived results?
• The licenses may conflict
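Most of the questions above can be answered mechanically if each dataset carries the relevant facts as metadata. The sketch below assumes a hypothetical `Dataset` record with availability, public-release, embargo and derivative-use fields. None of these names come from CARMEN, and the ownership question is deliberately left out because it is not a simple yes/no property.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Dataset:
    """Release-relevant facts about one dataset (all fields are assumptions)."""
    name: str
    available: bool
    public: bool                    # owner has marked it for public release
    embargo_until: Optional[date]   # None means no embargo
    allows_derivatives: bool


def release_blockers(inputs: List[Dataset], today: date) -> List[str]:
    """Return reasons why derived results cannot be released yet (empty if none)."""
    reasons = []
    for d in inputs:
        if not d.available:
            reasons.append(f"{d.name}: not available")
        if not d.public:
            reasons.append(f"{d.name}: not public")
        elif d.embargo_until is not None and today < d.embargo_until:
            reasons.append(f"{d.name}: embargoed until {d.embargo_until.isoformat()}")
        if not d.allows_derivatives:
            reasons.append(f"{d.name}: license forbids derived results")
    return reasons


old = Dataset("old data", True, True, None, True)
new = Dataset("new data", True, True, date(2030, 1, 1), True)

print(release_blockers([old, new], today=date(2029, 6, 1)))
print(release_blockers([old, new], today=date(2030, 2, 1)))
```

The same inputs give a different answer once the embargo date has passed, which is why the check has to be run at the time of release rather than recorded once.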

[Figure: a single CARMEN Active Information Repository Node, with Data, Metadata, Core Services, Security, Registry, Workflow Enactment Engine, Client Dynamically Deployed Services and External Clients.]

Whose release policy?

[Figure: the CAIRN diagram repeated twenty times, each copy with its own Data, Metadata, Core Services, Security, Registry, Workflow Enactment Engine, Client Dynamically Deployed Services and External Clients.]

Policy Issues

• One of the main purposes of the CAIRN is to hide the distribution.

• What if the CAIRNs have different release policies? What if they have different licenses?

• We cannot inflict these differences on the user.

• Therefore, we must be able to compute over policies

• We must be able to represent justifications back to the users
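As an illustration of computing over policies and reporting a justification, the sketch below combines the release policies of several CAIRNs into one answer for the user. It assumes each node's policy reduces to an embargo length in months and that the strictest policy wins. Both are simplifying assumptions made for this example, and the names `ReleasePolicy` and `effective_embargo` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReleasePolicy:
    """One node's release policy, reduced to an embargo length (an assumption)."""
    node: str
    embargo_months: int


def effective_embargo(policies: List[ReleasePolicy]) -> Tuple[int, str]:
    """Combine policies by taking the strictest, and say which node set it."""
    if not policies:
        raise ValueError("at least one policy is required")
    strictest = max(policies, key=lambda p: p.embargo_months)
    justification = f"{strictest.embargo_months} months, set by {strictest.node}"
    return strictest.embargo_months, justification


policies = [ReleasePolicy("CAIRN A", 6), ReleasePolicy("CAIRN B", 12)]
months, why = effective_embargo(policies)
print(months, "-", why)
```

The user sees one embargo and one reason for it, without needing to know how many nodes contributed data or what each node's policy was.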

An Example: Licensing

• Computationally amenable licenses are available

• Take, for example, Creative Commons
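Creative Commons licenses are built from a small set of elements (BY, NC, ND, SA), which is what makes them easy to reason about in code. The sketch below checks whether works under several such licenses can be combined into one derived result. It is a simplification under stated assumptions: license versions and jurisdictions are ignored, only the six standard combinations of these elements are considered, and the function names are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def elements(code: str) -> FrozenSet[str]:
    """Split a license code such as 'BY-NC-SA' into its elements."""
    return frozenset(code.upper().split("-"))


def derivative_problems(codes: Iterable[str]) -> List[str]:
    """List reasons why the inputs cannot be combined into one derived work."""
    names = [c.upper() for c in codes]
    licenses = [elements(n) for n in names]
    problems = [f"{n} forbids derivatives" for n, lic in zip(names, licenses) if "ND" in lic]

    # A ShareAlike input requires the result to carry the same license,
    # so two different ShareAlike licenses cannot both be satisfied.
    share_alike = {lic for lic in licenses if "SA" in lic}
    if len(share_alike) > 1:
        problems.append("inputs carry different ShareAlike licenses")
    elif len(share_alike) == 1:
        (required,) = share_alike
        # NonCommercial material cannot be released under a license
        # that permits commercial use.
        if "NC" not in required and any("NC" in lic for lic in licenses):
            problems.append("NonCommercial input conflicts with a commercial ShareAlike license")
    return problems


print(derivative_problems(["BY", "BY-SA"]))
print(derivative_problems(["BY-SA", "BY-NC-SA"]))
print(derivative_problems(["BY", "BY-ND"]))
```

An empty list means the sketch found no conflict; each string otherwise doubles as the justification that can be shown to the user.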

Take Home Messages

• Curation is important for the CARMEN project and neuroinformatics

• To enable repeatability and rerunability, curation of services is as important as curation of data

• To enable federation and autonomy, data release, license and other policies must be represented so that they can be computed over.

Acknowledgements

Professor Colin Ingram, Professor Jim Austin, Professor Leslie Smith, Professor Paul Watson, Dr. Stuart Baker, Professor Roman Borisyuk, Dr. Stephen Eglen, Professor Jianfeng Feng, Dr. Kevin Gurney, Dr. Tom Jackson, Dr. Marcus Kaiser, Dr. Phillip Lord, Dr. Paul Overton, Dr. Stefano Panzeri, Dr. Rodrigo Quian Quiroga, Dr. Simon Schultz, Dr. Evelyne Sernagor, Dr. V. Anne Smith, Dr. Tom Smulders, Professor Miles Whittington, Christoph Echtermeyer, Martyn Fletcher, Frank Gibson, Mark Jessop, Dr. Bojian Liang, Juan Martinez-Gomez, Dr. Chris Mountford, Agah Ogungboye, Georgios Pitsilis, Dr. Daniel Swan

University of St Andrews

The University of Sheffield