Page 1: Exploring  ‘Workspaces’

Exploring ‘Workspaces’

Tom Visser, SARA compute and networking services, Amsterdam

Garching Workshop 21st September 2010

Page 2: Exploring  ‘Workspaces’

• Background
• Overview of cases
• Technical possibilities
• Opportunities and risks
• Expected results
• Proposed approach

Page 3: Exploring  ‘Workspaces’

The CLARIN-NL connection
• Seeking to create an infrastructure for language resources
• Providing access to tools and technologies
• CLARIN-NL and BiG Grid are exploring possibilities
• The WHOLE pipeline
– Creating
– Curation
– Collecting
– DO SCIENCE
– Depositing

Page 4: Exploring  ‘Workspaces’

Already
• SARA has developed a client implementation of a Persistent Identifier Service (HANDLE) and has become an EPIC consortium member
• Instance of service currently hosted at SARA
• BiG Grid / SURFNET pilot with short-lived credential service
• Activities with Computational Linguistics (e.g. Named Entity Recognition) & forthcoming Computational Humanities institute (KNAW)
• Series of workshops to find a common ground between BiG Grid and the CLARIN infrastructure

Page 5: Exploring  ‘Workspaces’
Page 6: Exploring  ‘Workspaces’
Page 7: Exploring  ‘Workspaces’
Page 8: Exploring  ‘Workspaces’

Questions of today
• When is a user workspace service?
• Why do we need user workspaces?
• What are their characteristics in a distributed environment?
• How do we support processing chains in distributed environments driven by community environments?
• Are there generic frameworks for the execution of distributed processing chains and deployment of web-services?

Page 9: Exploring  ‘Workspaces’

Core problems
• Where to store
• How to store
• How to access
• How to foster collaboration amongst people
• How to support: Data discovery, exploration and exploitation
• How to realize such a service
• What SLA / service description / responsibilities

Page 10: Exploring  ‘Workspaces’

What it should be
• A temporary storage place (days, weeks, years)
– Global home / global scratch
– A ‘logical mount point’
• Accessible by web services
• Meaningfully accessible by a human
• Autonomy to communities
– Instantiate
– Content
– Control
• Identifiable
• Store digital objects and metadata
• Journaling (register interactions)
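The last three characteristics (identifiable, objects with metadata, journaling) could be modelled roughly as follows. This is only an illustrative sketch, not the service discussed in the talk: the names `DigitalObject`, `Workspace`, `store` and `fetch` are hypothetical, and a random UUID stands in for a persistent identifier such as a Handle.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass
class DigitalObject:
    """A stored item: payload plus free-form metadata (hypothetical model)."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    # Assumption: a UUID stands in for a persistent identifier (e.g. a Handle).
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Workspace:
    """A named 'logical mount point' for one community (hypothetical)."""
    name: str
    objects: dict = field(default_factory=dict)
    journal: list = field(default_factory=list)

    def _log(self, user, action, identifier):
        # Journaling: register every interaction with a UTC timestamp.
        stamp = datetime.now(timezone.utc)
        self.journal.append((stamp, user, action, identifier))

    def store(self, user, obj):
        """Store an object and return its identifier."""
        self.objects[obj.identifier] = obj
        self._log(user, "store", obj.identifier)
        return obj.identifier

    def fetch(self, user, identifier):
        """Return a stored object; the attempt is journaled first."""
        self._log(user, "fetch", identifier)
        return self.objects[identifier]
```

Keeping the journal next to the objects means every interaction is recorded by the same component that serves the data; a real service might instead write the journal to separate, append-only storage.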

Page 11: Exploring  ‘Workspaces’

• Create
• Read
• Write
• Update
• Grant access to (Authorization)
• List contents
• Search contents
– Adopting & offering known best practices and services in the ecosystem
• …
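The operations listed above can be sketched as a small in-memory interface. Everything here is an assumption made for illustration: the class `WorkspaceStore`, its method names, the owner-only granting rule, and the choice to fold Write and Update into a single `update` method are not taken from the talk.

```python
class WorkspaceStore:
    """In-memory sketch of the listed operations (all names hypothetical)."""

    def __init__(self, owner):
        self.owner = owner
        self._items = {}          # item name -> text content
        self._readers = {owner}   # users allowed to read, list and search
        self._writers = {owner}   # users allowed to create and update

    def _check(self, user, allowed):
        if user not in allowed:
            raise PermissionError(f"{user} is not authorized")

    def grant(self, granting_user, user, write=False):
        # Assumption: only the workspace owner may grant access.
        if granting_user != self.owner:
            raise PermissionError("only the owner can grant access")
        self._readers.add(user)
        if write:
            self._writers.add(user)

    def create(self, user, name, content):
        self._check(user, self._writers)
        if name in self._items:
            raise FileExistsError(name)
        self._items[name] = content

    def read(self, user, name):
        self._check(user, self._readers)
        return self._items[name]

    def update(self, user, name, content):
        # Write and Update are folded into one operation in this sketch.
        self._check(user, self._writers)
        if name not in self._items:
            raise KeyError(name)
        self._items[name] = content

    def list_contents(self, user):
        self._check(user, self._readers)
        return sorted(self._items)

    def search(self, user, term):
        # Naive substring search over item contents.
        self._check(user, self._readers)
        return sorted(n for n, c in self._items.items() if term in c)
```

A short usage example: after `ws = WorkspaceStore("alice")` and `ws.create("alice", "notes.txt", "hello world")`, calling `ws.grant("alice", "bob")` lets `bob` read, list and search, while `ws.update("bob", ...)` still raises `PermissionError` until write access is granted.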

Page 12: Exploring  ‘Workspaces’

Considered technical possibilities

• iRODS
• Cloud platform (SNIA/CDMI)
• HADOOP implementation
• AMAZON S3 / OpenCloud / Azure /

Page 13: Exploring  ‘Workspaces’

Risks and opportunities
• Creating something that is only generic - specific
• Looking uphill, but what will you know when you’ve climbed the hill
• Knowledge of the community
• Epistemological problems
• Bootstrapping
• Trust
• Process focus: we are starting a small-scale pilot within 1 month, with short iterations, keeping everyone involved.

Page 14: Exploring  ‘Workspaces’

Approach: BiG Grid and Dutch partners

• Many interesting addressable cases
– Keyword extraction from Dutch audio and film institute
– MPI video repository annotations
– City of Den Haag government proceedings: minutes and video alignment (feature extraction)
– OCR & Machine learning on Dutch handwriting
• Expected results
– Common understanding of a workspace service
– Bootstrap implementation vertically crossing all layers

Page 15: Exploring  ‘Workspaces’

• When is a user workspace service?
– When it is used and has become an indispensable tool
• Why do we need user workspaces?
– To be able to flexibly work with data
– Initiate collaborations
– Have a trustable storage resource available
• What are their characteristics in a distributed environment?
– Clear core functionality, many service providers, integration with identity providers
• How do we support processing chains in distributed environments driven by community environments?
– By having open, known, and easily accessible services
• Are there generic frameworks for the execution of distributed processing chains and deployment of web-services?
– Yes!

Page 16: Exploring  ‘Workspaces’

THANK YOU

