deroure repo3

Post on 24-May-2015

How Repositories can avoid the Failings of the Grid

David De Roure

IEEE e-Science 2008

“But the Grid is successful!”

So why are there three projects addressing lack of uptake?

Adoption of e-Research Technologies

...and a theme in the e-Science Institute?

How did we get here?!

Early adopter success

Then rollout of infrastructure services

And then wondering where the users are

Heard at another repositories event...

“How do we persuade researchers to populate our repositories?”

Due to the complexity of the software and the backend infrastructural requirements, e-Science projects usually involve large teams managed and developed by research laboratories, large universities or governments.

e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.

What are we really trying to achieve here?

Not just accelerated but new

A. Everyone using the Grid/Repositories?

B. Research advances on an everyday basis that would not have happened otherwise?

How do we move from heroic scientists doing heroic science with heroic infrastructure to everyday scientists doing science they couldn’t do before?

humanists, archaeologists, geographers, musicologists... researchers!

research

It’s the democratisation of e-Research

Jim Downing came up with the idea of “Long Tail Science”... So we are exploring how big science and long-tail science work together to communicate their knowledge. Long-tail science needs its domain repositories - I am not sanguine that IRs can provide the metalayers (search, metadata, domain-specific knowledge, domain data) that are needed for effective discovery and re-use.

Peter Murray-Rust

[Diagram labels: Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; researchers; Local Web Repositories; Graduate Students; Undergraduate Students; Virtual Learning Environment; Certified Experimental Results & Analyses; experimentation; Data, Metadata, Provenance, Workflows, Ontologies; Digital Libraries]

The social process of science 2.0

Everyday researchers doing everyday research

• Not just a specialist few doing heroic science with heroic infrastructure

• Chemists are blogging the lab

• Everyone is mashing up

• Everyday hardware – multicore machines and mobile devices

1

A data-centric perspective, like researchers

• Data is large, rich, complex and real-time

• There is new value in data, through new digital artefacts and through metadata e.g. context, provenance, workflows

• This isn’t “anti-computation” – design interaction around data

2

Collaborative and participatory

• The social process of science revisited in the digital age

• Collaborative tools – blogs and wikis

• e-Science now focuses on publishing as well as consuming

• Scholarly lifecycle perspective

3

Benefitting from the scale of digital science activity to support science

• This is new and powerful!

• Community intelligence

• Review

• Usage informing recommendation

• e.g. OpenWetWare

• e.g. myExperiment

4

Increasingly open

• Preprint servers and institutional repositories

• Open journals

• Open access to data

• Science Commons

• Object Reuse & Exchange

5

Better not Perfect

• The technologies people are using are not perfect

• They are better

• They are easy to use

• They are chosen by scientists

6

Empowering researchers

• The success stories come from the researchers who have learned to use ICT

• Domain ICT experts are delivering the solutions

• Anything that takes away autonomy will be resisted

7

About pervasive computing

• e-Science is about the intersection of the digital and physical worlds

• Sensor networks

• Mobile handheld devices

8

• e-Research is now enabling researchers to do some completely new stuff!

• As the individual pieces become easy to use, researchers can bring them together in new ways and ask new questions

• “The next level”

Onward and Upward

“Standing on the shoulders of giants”

www.w3.org/2007/Talks/www2007-AnsweringScientificQuestions-Ruttenberg.pdf

(Everyday researchers are giants too)

Repositories

• Absolutely key role in future research. So think of a better word!

• Think of a park / reserve / gardens / zoo

– Visitors, rangers, wardens, gardeners, experts, security, volunteers, ...

– Curation by providers, experts and consumers

Repositories

Those 8 Repository points

www.oreilly.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

1. Not just a specialist few doing heroic science with heroic infrastructure – repositories for all!

2. There is new value in data, through new digital artefacts and through metadata e.g. context, provenance, workflows

3. e-Science now focuses on publishing as well as consuming

4. Usage informing recommendation

5. Researchers work with collections - Object Reuse & Exchange

6. They are easy to use

7. Anything that takes away autonomy will be resisted

8. e-Science is about the intersection of the digital and physical worlds (not 1970s library catalogue interfaces)

Curation of process

• Find a process based on what it does, and find copies or similar services usable as alternates.

• Understand how and when it works, how to operate it correctly and predict its performance.

• Know the conditions for use: permissions, licenses, platforms, and costs.

• Judge the benefits of adoption based on its reputation, provenance and validation by peers.

• Estimate the risk of adoption based on its reliability and stability.

• Get assistance for its incorporation into applications and workflows.

Goble & De Roure, Educause Review, Sep/Oct 2008

And we need to curate processes too!
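The curation questions in the list above (what a process does, its conditions of use, its reputation, its reliability) could be held as one record per process. The following is a minimal sketch only: the `ProcessRecord` class, its field names, the `find_alternates` helper and the sample catalogue entries are all hypothetical illustrations, not part of any system named in the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessRecord:
    """Hypothetical curation record for one process (service or workflow)."""
    name: str
    function: str                  # what it does, used for discovery
    licence: str                   # conditions of use
    platforms: List[str] = field(default_factory=list)
    peer_endorsements: int = 0     # reputation / validation by peers
    runs: int = 0                  # observed executions
    failures: int = 0              # observed failed executions

    def reliability(self) -> float:
        """Fraction of observed runs that succeeded (0.0 if never run)."""
        if self.runs == 0:
            return 0.0
        return (self.runs - self.failures) / self.runs


def find_alternates(records, wanted_function):
    """Return records offering the wanted function, most reliable first."""
    matches = [r for r in records if r.function == wanted_function]
    return sorted(matches, key=lambda r: r.reliability(), reverse=True)


# Invented sample data, purely for illustration.
catalogue = [
    ProcessRecord("align-a", "sequence alignment", "GPL", ["linux"], 3, 100, 5),
    ProcessRecord("align-b", "sequence alignment", "BSD", ["linux", "mac"], 1, 40, 10),
    ProcessRecord("plot-c", "plotting", "MIT", ["any"], 0, 10, 0),
]
alternates = find_alternates(catalogue, "sequence alignment")
print([r.name for r in alternates])
```

Ranking by observed reliability is just one possible choice; a real catalogue might weigh peer endorsements, licence or cost instead.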

• To understand where we’re going, look at communities which have been early to embrace new technology.

• e-Science is one. What can we learn?

• Incidentally, so are music and broadcast!

– Vinyl was like books

– Now the process is digital from the studio through to playback on an iPod

– People create content

– People publish content

– Has the business adapted?

Transformation is already underway

Note to Reader. The next slides are not intended to be anti-grid. Everyone working on Grid is doing great work.

Don’t think rollout of technologies...

Think roll-in of researchers...

Mass Use by Researchers

Knowledge co-production vs Service Delivery!

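[Diagram: N components on each side, joined directly by N² arrows]

Without middleware we need lots of bits of software to join things together

[Diagram: N components on each side, joined through one middleware by 2N arrows]

With middleware there are fewer arrows!

Middleware?

[Diagram: N components on each side, joined through several middlewares; the number of arrows is a polynomial involving N1, N2 and M]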

But this is what happened. Now the picture with lots of thin arrows isn’t quite so scary!
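The arrow counts in the three pictures can be checked with a small calculation. This is a sketch under assumptions the slides do not state: N1 and N2 are taken to be the number of components on each side, M the number of middlewares, and every middleware is assumed to need one connector to every component, giving M × (N1 + N2). The slides only say "polynomial involving N1, N2 and M", so that formula and the function names are illustrative guesses.

```python
def direct_links(n1, n2):
    """Every component on one side joined directly to every one on the other."""
    return n1 * n2


def single_middleware_links(n1, n2):
    """Each component needs just one connector, to the shared middleware."""
    return n1 + n2


def multi_middleware_links(n1, n2, m):
    """Assumed model: each of m middlewares connects to every component."""
    return m * (n1 + n2)


n = 10
print("direct:", direct_links(n, n))                        # N squared
print("one middleware:", single_middleware_links(n, n))     # 2N
print("five middlewares:", multi_middleware_links(n, n, 5))
```

Under this assumed model, five competing middlewares for ten components per side need as many connectors as joining everything directly, which is one way to read the remark that the picture with lots of thin arrows is no longer so scary.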

Grid

use Web 2.0 here

Grid, cloud, HPC

Web is being embraced for usability and programmability e.g. mashups

And Grid is trying to come to terms with multicore and clouds!

How would this repository ecosystem self-organise to support Research 2.0?

Imagine EPrints/DSpace/Fedora isn’t something you download and run on a local server

Imagine instead that you just go to the cloud and make one*

Would there be institutional repositories?

A Thought Experiment

* (Actually you can!)

Tension between data being “out on the Web” (user view) or in an institutional machine room (provider view)

What is the curator view?

Issues perceived differently for metadata servers and data servers

Is it a wave or is it a particle?

web

Linked Data

1. Understand what the users will need by going on the journey together

2. Be open-minded: are we solving the right problem? (Don’t forget curation of process!)

3. Don’t create artificial distinctions from the Web

4. Beware standards as a barrier to adoption

5. Think cloud, outside the institutional box: imagine the repository factory

6. Think of a new name for repositories!

How Repositories can avoid Failing like the Grid

Contact

David De Roure

dder@ecs.soton.ac.uk

Thanks

Carole Goble

Jeremy Frey

Simon Coles

Peter Murray-Rust
