CLARIN-NL

Jan Odijk

CLARIN-NL Kick-off Meeting

Utrecht, 27 May 2009

• CLARIN Goals

• CLARIN-NL Goals+Activities

• Phasing

• Work Plan for 2009

• Governance

• Budget & Funding

• Caveats

Overview

• Research Infrastructure for HSS researchers
  – Design
  – Construction
  – Validation
  – Exploitation

eScience working environment

CLARIN Goals

• Research Infrastructure includes
  – Language data that are needed for carrying out research
  – Software operating on data for
    • Enriching them
    • Accessing them for exploration
    • Selecting information from them
    • Visualizing (selections of) data
  – Metadata and views on metadata so that the existence and properties of resources are easily determined
• All distributed, web-based and accessible
  • in an interoperable way

CLARIN Goals

• So that
  – Researchers can work from anywhere
  – Researchers can create ‘virtual collections’ by combining parts of existing data sources
  – Researchers can apply software in the form of ‘web services’ operating in combination
• User-friendly, i.e. no technical skills required

CLARIN Goals
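
The idea of a ‘virtual collection’ on this slide can be pictured as a named list of pointers into existing data sources, rather than a copy of the data. The following is only a minimal sketch under that assumption; the class names, fields and the identifiers `corpus-A` and `corpus-B` are hypothetical and are not taken from any CLARIN specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ResourcePart:
    """A pointer to part of an existing data source (hypothetical model)."""
    source: str    # hypothetical identifier of the data source
    selector: str  # hypothetical description of the selected part


@dataclass
class VirtualCollection:
    """A named combination of parts; the underlying data are not copied."""
    name: str
    parts: List[ResourcePart] = field(default_factory=list)

    def add(self, source: str, selector: str) -> None:
        # Add a part once; adding the same part again has no effect.
        part = ResourcePart(source, selector)
        if part not in self.parts:
            self.parts.append(part)

    def sources(self) -> List[str]:
        # Sorted list of the distinct data sources the collection draws on.
        return sorted({p.source for p in self.parts})


# Hypothetical usage: combine parts of two existing sources.
collection = VirtualCollection("dialect-study")
collection.add("corpus-A", "recordings 1-20")
collection.add("corpus-B", "all transcripts")
collection.add("corpus-A", "recordings 1-20")  # duplicate, ignored
print(collection.sources())
```

Storing references instead of data matches the slide's point that the resources stay distributed and are combined only virtually.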

• Make the community aware

• Enable the community to use it
  – Dissemination activities
  – Training Sessions
  – Educational Programme

CLARIN Goals

• Focus on humanities (esp. linguistics)

• NL-Line

• EU-Line

CLARIN-NL Goals

• Ensure compatibility of CLARIN-prep specifications with needs of the national research community by
  – Making an inventory and analyzing national needs
  – Broad NL participation in standards definition
  – Conducting validation projects to establish the requirements
• Build and exploit the NL part of CLARIN as a best practice example for other CLARIN partners
• Act as a world class service centre in two domains
• Set up CLARIN-NL national coordination point and support office

CLARIN-NL Goals (NL-Line)

• technical prototype infrastructure as specified by CLARIN-prep
  – building the grid-based structure
  – providing the generic services
  – operating and validating the prototype
• data infrastructure
  – surveying existing data and specifying the essential data set
  – constructing the essential set by means of conversion or (if needed) new digitization actions, and in parallel developing tools to facilitate this process for other resources of the same type
  – agreeing on representation standards within CLARIN-prep
  – validating the prototype on the basis of concrete usage cases
  – arranging for NL-specific IPR issues, especially related to existing data with IPR restrictions

CLARIN-NL Activities (NL-Line)

• language technology service infrastructure
  – specifying a set of essential tools and services
  – constructing the essential set by means of encapsulation or (if needed) porting or building
  – agreeing on interoperability standards
• establishing user needs
  – surveying current practice in the Netherlands
  – carrying out pilots and demonstrators with humanities researchers, preferably in an international setting
  – establishing governance procedures for eliciting, prioritising and selecting extensions and improvements

CLARIN-NL Activities (NL-Line)

• creation and operation of two centres of expertise

• creation and operation of dissemination, education and awareness facilities

• setting up and operating a national coordination point and developing a business model that guarantees the long-term sustainability of the CLARIN infrastructure

CLARIN-NL Activities (NL-Line)

• Consolidate NL’s leading position

• Seamless transition from preparatory phase to construction phase

• NL a main hub in the European infrastructure

• Position CLARIN-NL outside of EU

CLARIN-NL Goals (EU-Line)

• implement and host the governance structure recommended by CLARIN-prep

• set up and host a main CLARIN Office

• set up and host a main European CLARIN Technical Centre to build and maintain the technical infrastructure

CLARIN-NL Activities (EU-Line)

• set up and host the central CLARIN Coordination point for
  – development and maintenance of standards
  – harmonization of IPR issues
  – education, dissemination and promotion
• set up an international example infrastructure, e.g. with
  – Germany (MPG/MPI-link)
  – Flanders & South Africa (via the Dutch Language Union)

• maintain close connections with other relevant players (EU and non-EU)

CLARIN-NL Activities (EU-Line)

1. 2009-2010 Preparation Phase

2. 2011-2012 Construction Phase

3. 2013-2014 Operation Phase

Phases

• Technical
  – Requirements (from actual use cases)
  – Specification, design (user-centered)
  – Initial prototype
• Organisational
  – Relation with other EU-countries
  – Role of NL in the CLARIN infrastructure
  – Governance structure

Preparation Phase

• Technical
  – Implementation
  – Initial Operation
  – Extensive testing
    • Validation
    • Evaluation by user groups
• Organisational
  – Strategy for sustainability after project end

Construction Phase

• Technical
  – Exploitation of the infrastructure
  – Extending the infrastructure with new data / tools / services should be normal business
  – Maintenance and updating for new developments / new requests
• Organisational
  – Implement sustainable structure

Operation Phase

• Technical
  – Plan for prototype building and general services
  – Initial steps in its implementation
  – Active participation in the first network of centres on a European scale (MPI, INL, Meertens, …)
• Data
  – Open Call for projects on data conversion and curation
  – Support for influencing determination of CLARIN standards and best practices
  – Support for evaluating CMDI metadata proposal
  – Support for the inclusion of all existing NL data and tools in CLARIN data and tool repositories.

Work plan for 2009

• Tools & Services & User Needs
  – Open call for Demonstration projects
  – Inventory of user needs to get to a specification of a set of essential tools and services
• National Coordination Point
  – The national coordination point will be set up.
  – The governance structure will be implemented.

Work plan for 2009

• Education, Training and Awareness
  – Plan being elaborated
  – Events
    • Launching Event, Brokerage activities
    • Possibly IPR-event (in conjunction with STEVIN) with publishers and content-owners
    • …
  – Support for events related to CLARIN-NL or CLARIN (workshops, conferences, master classes, …)
  – Website and Wiki
  – Newsletter

Work plan for 2009

• Executive Board
  – Executes
• Board
  – Decides
• National Advisory Panel
  – Advises (national perspective)
• International Advisory Panel
  – Advises (international perspective)
• Partners
  – Carry out subprojects

Governance

• Jan Odijk (UIL-OTS)
  – director
• Hans Bennis (Meertens)
  – chair & representative of the users
• Peter Wittenburg (MPI)
  – technical director
• Arjan van Hessen (Twente/UIL-OTS)
  – dissemination, training, education

Executive Board - Composition

• ‘Senior researchers and experts in the field and in governance’

• Members
  – Geert Booij (LUCL, Leiden)
  – Lou Boves (CLST, RU Nijmegen)
  – Peter Doorn (DANS, Den Haag)
  – Martin Everaert (UIL-OTS, Utrecht)
  – Jaap van den Herik (TiCC, Tilburg)
  – Aafke Hulk (ACLC, Amsterdam)
  – John Nerbonne (CLCG, Groningen)

Board - Composition

• Observers / Advisors
  – CLARIN Europe Coordinator
    • Steven Krauwer
  – CLARIN-NL Programme Director
    • Jan Odijk

Board - Composition

• ‘representatives from Linguistics and the Humanities’
• Members
  – Willem Adelaar (LUCL, Leiden)
  – Sjef Barbiers (Meertens, Amsterdam)
  – Jeannine Beeken (INL, Leiden)
  – Antal van den Bosch (TiCC, Tilburg)
  – Gosse Bouma (CLCG, Groningen)
  – Hennie Brugman (B&G, Hilversum)
  – Karina van Dalen-Oskam (Huygens, Den Haag)
  – Paul Doorenbosch (KB, Den Haag)
  – Willemijn Heeren (Twente)

NAP – Composition (1)

• Members (cont.)
  – Kees Hengeveld (ACLC, Amsterdam)
  – Marc Kemps-Snijders (MPI, Nijmegen)
  – Nelleke Oostdijk (CLST, RU Nijmegen)
  – Reinier Salverda (FA, Leeuwarden)
  – Ted Sanders (UIL-OTS, Utrecht)
  – Piek Vossen (VU, Amsterdam)
  – Paul Wouters (VKS, Amsterdam)
  – Joris van Zundert (Huygens, Den Haag)

NAP – Composition (2)

• prominent and experienced researchers in the relevant fields outside NL

• Members
  – To be determined

IAP - Composition

• Universiteit van Utrecht (UU)
  – Utrecht institute of Linguistics OTS (UIL-OTS)
  – Landelijke Onderzoeksschool Taalkunde (LOT)
• Max Planck Instituut voor Psycholinguïstiek (MPI)
• KNAW
  – Meertens Instituut
  – Huygensinstituut
  – Data Archiving and Networked Services (DANS)
  – Fryske Akademy (FA)
  – Internationaal Instituut voor Sociale Geschiedenis (IISG)
• Digitale Bibliotheek voor de Nederlandse Letteren (DBNL)
• Instituut voor Nederlandse Lexicologie (INL)

Partners - Composition

• Radboud Universiteit Nijmegen (RU)
  – Centre for Language and Speech Technology (CLST)
  – Centre for Language Studies (CLS)
• Universiteit van Amsterdam (UvA)
  – Amsterdam Center for Language and Communication (ACLC)
  – Intelligent Systems Lab Amsterdam (Universiteit van Amsterdam)
• Universiteit van Groningen
  – Center for Language and Cognition (CLCG)
• Universiteit van Leiden
  – Centre for Linguistics (LUCL)

Partners - Composition

• Universiteit van Tilburg
  – Tilburg Centre for Creative Computing (TiCC)
• Universiteit van Twente
  – Human Media Interaction Group (HMI)
• Vrije Universiteit Amsterdam (VU)
• Katholiek Documentatie Centrum (KDC)
• Veteraneninstituut
• Kennisinstituut Sociale en Psychische Gevolgen van Oorlog, Vervolging en Geweld (COGIS)
• Internationaal Informatiecentrum en Archief voor de Vrouwenbeweging (IIAV)
• Koninklijke Bibliotheek (KB)

Partners - Composition

• Funding by NWO
• Budget: 9.01 M Euro
• 2009: 1.35 M Euro
• 2010: 1.35 M Euro
• 2011: 1.90 M Euro
• 2012: 1.90 M Euro
• 2013: 1.25 M Euro
• 2014: 1.25 M Euro

Budget & Funding
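
As a quick arithmetic check, the yearly figures on this slide can be grouped by the three phases (2009-2010, 2011-2012, 2013-2014) and summed. The sketch below uses only numbers from the slides; the variable and function names are illustrative. The yearly figures add up to 9.00 M Euro, which is 0.01 M Euro below the stated budget of 9.01 M Euro. The slides do not explain the gap; rounding of the yearly figures is one possible cause, but that is an assumption.

```python
# Yearly budget figures from the slide, in units of 10,000 euro
# (integers avoid floating-point rounding noise).
YEARLY_BUDGET = {2009: 135, 2010: 135, 2011: 190, 2012: 190, 2013: 125, 2014: 125}
STATED_TOTAL = 901  # 9.01 M Euro, as stated on the slide

# Phase boundaries (first year, last year) from the "Phases" slide.
PHASES = {
    "Preparation": (2009, 2010),
    "Construction": (2011, 2012),
    "Operation": (2013, 2014),
}


def phase_totals(budget, phases):
    """Sum the yearly budget over each phase's inclusive year range."""
    return {
        name: sum(budget[year] for year in range(start, end + 1))
        for name, (start, end) in phases.items()
    }


def as_meuro(units):
    """Format an amount given in units of 10,000 euro as millions of euro."""
    return f"{units / 100:.2f} M Euro"


totals = phase_totals(YEARLY_BUDGET, PHASES)
yearly_sum = sum(YEARLY_BUDGET.values())

for name, units in totals.items():
    print(f"{name}: {as_meuro(units)}")
print("Sum of yearly figures:", as_meuro(yearly_sum))
print("Stated total:", as_meuro(STATED_TOTAL))
print("Difference:", as_meuro(STATED_TOTAL - yearly_sum))
```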

• Partners must sign a cooperation agreement that will be circulated soon!

• NWO still has to send its official commitment letter…

Caveats

Thanks for your Attention

Do NOT Go beyond this point

• eScience:
  – computationally intensive science
  – carried out in highly distributed network
  – ±that uses huge data sets requiring grid computing
  – ±using technologies enabling distributed collaboration

Definitions
