Working with data.open.ac.uk, the Linked Data Platform of the Open University

Upload: mathieu-daquin

Post on 11-May-2015

DESCRIPTION

Presentation of the Linked Data work realised at the Open University to the IT developer's forum - 10/05/2011

TRANSCRIPT

Page 1: Working with data.open.ac.uk, the Linked Data Platform of the Open University

Working with data.open.ac.uk, the linked data platform of the OU

Mathieu d’Aquin and the LUCERO team

@mdaquin

Knowledge Media Institute, the Open University

LUCERO project

lucero-project.info – data.open.ac.uk

Page 2

Linked Data

• A set of principles and technologies for a Web of Data

– Put the “raw” data online in a standard, web-enabled representation (RDF)

– Make the data Web addressable (URIs)

– Link with other data

Page 3

Graph (up to date)

Page 4

So, Linked Data for the OU?

ORO

Archive of Course Material

Library’s Catalogue of Digital Content

OpenLearn Content

A/V Material, Podcasts, iTunes U

Data from Research Outputs

BBC

DBPedia

DBLP

RAE

geonames

data.gov.uk

Currently: OU public data sit in different systems and are hard for users to discover, obtain and integrate.

Exposed as linked data, our data interlink with each other and with the external world, and become part of the “global data space” on the Web.

Page 5

Why is it important?

• The OU was the first university to expose its data as linked data: http://data.open.ac.uk

• Now widely recognized as a critical step forward for the HE sector in the UK (and worldwide)

– Favors transparency and reuse of data, both externally and internally

– Reduces the cost of dealing with our own public data: integration and reuse by design

– Enables new kinds of applications, and makes the ones that are already feasible more cost-effective

• At least 3 other UK universities have now followed our example:

– http://data.online.lincoln.ac.uk/, http://data.ox.ac.uk/, http://data.southampton.ac.uk/

– And others in other countries are setting up similar initiatives

Page 6

“if you are working in an IT department within a University you better read this report, as soon your department will need to be making these same decisions.”

David Flanders, JISCExpo Programme Manager, http://code.google.com/p/jiscexpo/wiki/luceroproject#Site_Visit_Report

Page 7

The data.open.ac.uk Stack

Technical infrastructure

Organizational infrastructure

Institutional repository data

Research Data (Arts)

Applications

Page 8

data.open.ac.uk

Page 9

Technological principle: Everything has a URI

• Examples:

– http://data.open.ac.uk/course/m366 – the course M366

– http://data.open.ac.uk/oro/21166 – an article in ORO

– http://data.open.ac.uk/page/person/ext-911ee9dfa3db572830b00bd8a9983e39 – a person, who authored the article above

– http://xmlns.com/foaf/0.1/Person – the type person

– http://purl.org/dc/terms/creator – the property that links an author to an article
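Because the article, the person and the property each have a URI, the statement "this person authored this article" can be written as a single triple. A minimal Python sketch, assuming the triple is serialized as one N-Triples line; the helper name `to_ntriple` is made up for illustration and handles URI terms only:

```python
# Illustrative only: one RDF statement built from the URIs listed on this slide.
ARTICLE = "http://data.open.ac.uk/oro/21166"
CREATOR = "http://purl.org/dc/terms/creator"
PERSON = "http://data.open.ac.uk/page/person/ext-911ee9dfa3db572830b00bd8a9983e39"


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialize a triple whose three terms are all URIs as one N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# The article is the subject; dc:creator points from the article to its author.
print(to_ntriple(ARTICLE, CREATOR, PERSON))
```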

Page 10

Technological principle: Content negotiation

Accept: text/html

Accept: application/rdf+xml

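Content negotiation means the same URI is requested with different `Accept` headers. A minimal sketch using only the Python standard library; `build_request` is a hypothetical helper, and the actual network call is left commented out, so nothing here checks how the server responds:

```python
import urllib.request

RESOURCE = "http://data.open.ac.uk/oro/9719"


def build_request(uri: str, media_type: str) -> urllib.request.Request:
    """Prepare a GET request that asks for one specific representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


html_request = build_request(RESOURCE, "text/html")            # human-readable page
rdf_request = build_request(RESOURCE, "application/rdf+xml")   # machine-readable RDF

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(rdf_request) as response:
#     print(response.headers.get("Content-Type"))
```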

Page 11

RDF

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
    <label xmlns="http://www.w3.org/2000/01/rdf-schema#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</label>
    <authorList xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://data.open.ac.uk/oro/9719#authors"/>
    <title xmlns="http://purl.org/dc/terms/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</title>
    <abstract xmlns="http://purl.org/ontology/bibo/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers against the glycosylated form of MUC1 are described, along with their use in treatment and diagnosis of conditions associated with elevated production of MUC1.</abstract>
    <isPartOf xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/oro/repository"/>
    <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/peerReviewed"/>
    <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/published"/>
    <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-07bcb3718cb0de7883dc7b8fde7e283d"/>
    <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
    <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-7c8b5252e28115f91640559c2fe64ca3"/>
    <date xmlns="http://purl.org/dc/terms/">2007-11-15</date>
    <rdf:type rdf:resource="http://purl.org/ontology/bibo/Article"/>
    <rdf:type rdf:resource="http://purl.org/ontology/bibo/Patent"/>
  </rdf:Description>
</rdf:RDF>
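Since the RDF/XML description of oro/9719 is plain XML, a few fields can be pulled out with a generic XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of that description. It only works for this flat layout; RDF/XML allows many equivalent serializations, so a real application would rather use an RDF library. The function name `describe` is made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT_NS = "http://purl.org/dc/terms/"

# Shortened copy of the oro/9719 description (title, one creator and date only).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
    <title xmlns="http://purl.org/dc/terms/">Aptamers directed to MUC1</title>
    <creator xmlns="http://purl.org/dc/terms/"
             rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
    <date xmlns="http://purl.org/dc/terms/">2007-11-15</date>
  </rdf:Description>
</rdf:RDF>"""


def describe(rdf_xml: str) -> dict:
    """Extract URI, title, creators and date from the first rdf:Description."""
    root = ET.fromstring(rdf_xml)
    node = root.find(f"{{{RDF_NS}}}Description")
    creators = node.findall(f"{{{DCT_NS}}}creator")
    return {
        "uri": node.get(f"{{{RDF_NS}}}about"),
        "title": node.findtext(f"{{{DCT_NS}}}title"),
        "creators": [c.get(f"{{{RDF_NS}}}resource") for c in creators],
        "date": node.findtext(f"{{{DCT_NS}}}date"),
    }


info = describe(SAMPLE)
print(info["title"], info["date"])
```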

Page 12

By the way…

• On Study at the OU:

– http://data.open.ac.uk/course/m366

– If HTML is requested, goes to http://www3.open.ac.uk/study/undergraduate/course/m366.htm

– Try http://www3.open.ac.uk/study/undergraduate/course/m366.rdf

Page 13

Technological principle: link… also to external datasets

Using URIs makes pieces of data directly addressable and linkable on the Web, independently of where the data is:

– http://data.open.ac.uk/course/m366 isAvailableIn http://sws.geonames.org/458258/ (Republic of Latvia)

– http://data.open.ac.uk/organization/the_open_university sameAs http://education.data.gov.uk/doc/school/133849

– http://data.open.ac.uk/location/building/mbbn (Berrill Building North) postcode http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA

And others can link to our data…
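The three links above can be written as plain (subject, predicate, object) tuples, which makes it easy to see which external datasets they reach. A minimal Python sketch: the full predicate URIs for isAvailableIn and postcode are the ones used in the queries elsewhere in this deck, while mapping "sameAs" to owl:sameAs is an assumption, and `external_hosts` is a hypothetical helper.

```python
from urllib.parse import urlparse

# Assumption: the slide only says "sameAs"; owl:sameAs is the usual property.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

TRIPLES = [
    ("http://data.open.ac.uk/course/m366",
     "http://data.open.ac.uk/saou/ontology#isAvailableIn",
     "http://sws.geonames.org/458258/"),
    ("http://data.open.ac.uk/organization/the_open_university",
     OWL_SAME_AS,
     "http://education.data.gov.uk/doc/school/133849"),
    ("http://data.open.ac.uk/location/building/mbbn",
     "http://data.ordnancesurvey.co.uk/ontology/postcode/postcode",
     "http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA"),
]


def external_hosts(triples, home="data.open.ac.uk"):
    """Return the sorted hosts of all object URIs that live outside `home`."""
    hosts = {urlparse(obj).netloc for _, _, obj in triples}
    return sorted(hosts - {home})


print(external_hosts(TRIPLES))
```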

Page 14

SPARQL

• The “SQL” of RDF and linked data

• Fits the graph data model of RDF

– Select [variables: ?x ?name, etc.]

– From [graph, or all graphs if nothing]

– Where [triple patterns and filters]

– Order by, limit, offset, etc.

• SPARQL protocol: simply based on HTTP

– A SPARQL endpoint is a URL that takes a “query” parameter

– And returns results in the SPARQL XML format

– See http://data.open.ac.uk
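Since the protocol is just HTTP with a “query” parameter, building a request URL only needs URL encoding. A minimal sketch with the Python standard library, assuming the endpoint http://data.open.ac.uk/query used by the example queries in this deck; `sparql_url` is a hypothetical helper and no request is actually sent:

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

ENDPOINT = "http://data.open.ac.uk/query"  # endpoint URL as it appears in the example queries


def sparql_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL carrying the SPARQL text in the "query" parameter."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    return endpoint + "?" + urlencode({"query": query}, quote_via=quote)


QUERY = "select distinct ?s where {?s ?p ?o} limit 10"
url = sparql_url(QUERY)
print(url)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```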

Page 15

SPARQL: example queries

Courses available in Nigeria

select distinct ?course where {
  ?course <http://data.open.ac.uk/saou/ontology#isAvailableIn> <http://sws.geonames.org/2328926/>.
  ?course a <http://purl.org/vocab/aiiso/schema#Module>
}

http://data.open.ac.uk/query?query=select%20distinct%20%3Fcourse%20where%20{%3Fcourse%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23isAvailableIn%3E%20%3Chttp%3A%2F%2Fsws.geonames.org%2F2328926%2F%3E.%20%3Fcourse%20a%20%3Chttp%3A%2F%2Fpurl.org%2Fvocab%2Faiiso%2Fschema%23Module%3E}
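A query like this comes back in the SPARQL Query Results XML Format, which can be read with any XML parser. A minimal Python sketch: the sample document is hand-written for illustration and is not actual output of the endpoint, and `parse_rows` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SR = "{http://www.w3.org/2005/sparql-results#}"

# Hand-written sample in the SPARQL results XML format; not real endpoint output.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="course"/></head>
  <results>
    <result>
      <binding name="course"><uri>http://data.open.ac.uk/course/m366</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_rows(results_xml: str) -> list:
    """Turn each <result> into a dict mapping variable name to its value."""
    root = ET.fromstring(results_xml)
    rows = []
    for result in root.iter(SR + "result"):
        row = {}
        for binding in result.findall(SR + "binding"):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


for row in parse_rows(SAMPLE):
    print(row["course"])
```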

Page 17

SPARQL: example queries

Video podcasts related to postgraduate courses in computing

http://data.open.ac.uk/query?query=select%20%3Fx%20%3Ft%0Awhere%20{%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fsubject%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Ftopic%2Fcomputing%3E.%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23courseLevel%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23postgraduate%3E.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FrelatesToCourse%3E%20%3Fc.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Ftitle%3E%20%3Ft.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%0A}&limit=0

select ?x ?t where {
  ?c <http://purl.org/dc/terms/subject> <http://data.open.ac.uk/topic/computing>.
  ?c <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate>.
  ?x <http://data.open.ac.uk/podcast/ontology/relatesToCourse> ?c.
  ?x <http://purl.org/dc/terms/title> ?t.
  ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast>
}

Page 18

SPARQL: example queries

Things related to “earthquake”

http://data.open.ac.uk/query?query=select%20%3Fc%20%3Fdesc%20where%7B%0A%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fdescription%3E%20%3Fdesc%20.%0A%7B%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fopenlearn%2Fontology%2FOpenLearnUnit%3E%7D%0AUNION%0A%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%7D%7D%0AFILTER%20regex(str(%3Fdesc)%2C%20%22earthquake%22%2C%20%22i%22%20)%0A%7D&limit=0

select ?c ?desc where {
  ?c <http://purl.org/dc/terms/description> ?desc .
  {
    { ?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/openlearn/ontology/OpenLearnUnit> }
    UNION
    { ?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast> }
  }
  FILTER regex(str(?desc), "earthquake", "i")
}
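The same query shape works for any keyword. A minimal Python sketch of a query builder, assuming the keyword is meant to be used as a case-insensitive regular expression exactly as in the query above; `keyword_query` is a hypothetical helper, and `a` is the standard SPARQL shorthand for rdf:type:

```python
TEMPLATE = """select ?c ?desc where {
  ?c <http://purl.org/dc/terms/description> ?desc .
  {
    { ?c a <http://data.open.ac.uk/openlearn/ontology/OpenLearnUnit> }
    UNION
    { ?c a <http://data.open.ac.uk/podcast/ontology/VideoPodcast> }
  }
  FILTER regex(str(?desc), "%s", "i")
}"""


def keyword_query(keyword: str) -> str:
    """Build the 'things related to <keyword>' query.

    Backslashes and double quotes are escaped so the keyword cannot
    break out of the SPARQL string literal.
    """
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return TEMPLATE % escaped


print(keyword_query("earthquake"))
```

The keyword is still interpreted as a regular expression by the endpoint, so characters such as `.` or `*` keep their regex meaning.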

Page 19

[Diagram: the data.open.ac.uk processing pipeline]

Pipeline stages: Collect, Extract, Link, Store, Expose

Planning + Logging: Scheduler, Ontologies

Components: RSS Updater, XML Updater, RSS Extractor, RDF Extractor, RDF Cleaner, Entity Name System, Triple Store, Index, Search, SPARQL endpoint, Web Server

Rules: cleaning rules, URI creation rules, URL redirection rules

Data flow labels: RSS feed; new items, obsolete items; RDF file (add), RDF file (delete); Delete (1), Add (2)

Datasets: each dataset; Lib, courses, loc; ORO, podcast

Legend: generic process, dataset-specific process
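The diagram numbers the two triple-store operations "Delete (1)" and "Add (2)". A minimal Python sketch of why that order matters, assuming the store is modeled as a set of triples and that an updated item appears in both the delete file and the add file. The real triple store and file formats are not shown on the slide, and the item and its triples are made up for illustration.

```python
def apply_update(store: set, to_delete: set, to_add: set) -> set:
    """Apply one update cycle: deletions first (1), then additions (2)."""
    remaining = store - to_delete   # step 1: drop triples of obsolete items
    return remaining | to_add       # step 2: load triples of new items


# Hypothetical item whose title changed between two runs of the updater.
OLD_TITLE = ("ex:item1", "dc:title", "Old title")
ITEM_TYPE = ("ex:item1", "rdf:type", "ex:Podcast")
NEW_TITLE = ("ex:item1", "dc:title", "New title")

store = {OLD_TITLE, ITEM_TYPE}
store = apply_update(
    store,
    to_delete={OLD_TITLE, ITEM_TYPE},   # everything known about the old version
    to_add={ITEM_TYPE, NEW_TITLE},      # everything known about the new version
)
print(sorted(store))
```

With the opposite order, the unchanged type triple would be added and then deleted again, so deleting first keeps triples that appear in both files.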

Page 20

Method for exposing a dataset

Initial Meeting with Data Owner

- Identify data

- Get sample data

- Identify Copyright Issues

- Identify possible links

- Identify users and usage

Data Modeling sessions

Lucero Core Team

Data Owner

Lucero KMi Team

Lucero members

- Find reusable ontologies

- Map onto the data

- Identify uncovered parts

- Define URI Scheme

Data Modeling Validation

Lucero Core Team

Data Owner

Development of Extractor

URI Creation Rules

Definition

Deployment

Lucero KMi Team

Page 21

Datasets

• Already “officially” in place:

– ORO: more than 18,000 publications from OU researchers

– Podcasts: 2,500 audio and video tracks from podcast.open.ac.uk, linked to the related courses

– Study at the OU: more than 600 live module descriptions

– OpenLearn: more than 550 units of course material

– KMi Staff and Planet newsletter

• Currently being processed:

– OU Buildings in MK and regional centers

– Library Catalogue

– YouTube channel

– Old Courses

– “Reading Experience Database” project

– People Profiles

Page 22

Screenshot of the dataset page

Page 23

Building applications with Linked Data

• Everything is based on HTTP/XML

– In principle, just need a Web connection…

• Libraries available in many languages to manipulate RDF data

– Java: Jena (http://openjena.org/)

– PHP: ARC2 (https://github.com/semsol/arc2)

– Python: RDFLib (http://www.rdflib.net/)

– …

Page 24

Example: Accessing data.open.ac.uk with PHP/ARC2

<?php
include_once("arc2/ARC2.php");

// Declare the SPARQL endpoint
$config = array('remote_store_endpoint' => 'http://data.open.ac.uk/query');
$store = ARC2::getRemoteStore($config);

// Execute a SPARQL query: every distinct postcode value in the data
$postcodesq = 'select distinct ?p where {[] <http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> ?p.}';
$rows = $store->query($postcodesq, 'rows');

// Display the results, one per line
foreach ($rows as $row) {
    echo $row['p'] . "<br/>";
}

Page 25

Applications

• For education

– Mobile podcast explorer, podcast explorer on TV

– OU Building Map, OU location tracker (cf. foursquare)

– OU Expert Search

– Connecting courses/OpenLearn to relevant podcasts

– OU Course Profile Facebook app using list of courses, “Study Buddy” app connecting Facebook users to relevant courses

• For research

– Display connections in a research community

– Research Data/Impact Analysis

– Connecting research datasets to external data

Page 26

Page 27

Example application: Link OpenLearn to relevant course/podcasts

Page 28

Example Application: keep track of location, meetings, tutorials, at the OU

Page 29

Example application: Expert Search using publication information and connecting to contact information within the OU

Page 30

Example application: Explore information about a person in the “Reading Experience Database” based on data provided by DBPedia (Linked Data version of Wikipedia)

New ways to look at humanities research data

Page 31

Example application: exploring research communities

Page 32

The future

• More data… always more data

• More links, especially to external entities

– BBC

– Government agencies

– Other universities

• More applications:

– Integration into main OU websites (e.g., study at the OU)

– Integration into common OU applications (people profile, Facebook course profile, etc.)

– Support for common OU processes (REF audit, course recommendation, providing resources to AL and lecturers)

• Connecting to other universities

– Many other universities in the UK and abroad are making the move to linked data (see linkeduniversities.org)

– Linked data has the potential to create connections across institutions, a data-based network of higher education course providers

Page 33

Conclusion

• Linked data is more than an emerging, academic trend.

• data.open.ac.uk and linked data in general are fast becoming very valuable resources for developers, internally and externally

• We are very proud to have been the first university to really deploy a linked data platform

• Needs to sustain and evolve as a core service at the OU…

• … and as a key component of the Web of University Linked Data

Page 34

Thank You

Carlo Allocca (Dev)

Mathieu d’Aquin (PD)

Salman Elahi ((Ex)-Dev)

Enrico Motta (SGP)

Andriy Nikolov (linking)

Jane Whild (Admin)

Fouad Zablith (Dev)

Library Specialists

Owen Stephens (PM)

Richard Nurse ((ex-)PM)

Non Scantlebury

Arts Specialists

Suzanne Duncanson-Hunter

John Wolfe

Paul Lawrence

Stuart Brown

Data Owners

KMi

OU Library

Com./StudentComp.Services

Arts