working with data.open.ac.uk, the linked data platform of the open university
DESCRIPTION
Presentation of the Linked Data work realised at the Open University to the IT developer's forum - 10/05/2011
TRANSCRIPT
Working with data.open.ac.uk,
the linked data platform of the OU
Mathieu d’Aquin and the LUCERO team
@mdaquin
Knowledge Media Institute, the Open University
LUCERO project
lucero-project.info – data.open.ac.uk
Linked Data
• A set of principles and technologies for a Web of Data
– Put the “raw” data online in a standard, web-enabled representation (RDF)
– Make the data Web addressable (URIs)
– Link with other data
Graph (up to date)
So Linked Data for the OU?
ORO
Archive of Course Material
Library’s Catalogue of Digital Content
OpenLearn Content
A/V Material, Podcasts, iTunesU
Data from Research Outputs
BBC
DBPedia
DBLP
RAE
geonames
data.gov.uk
Currently: OU public data sit in different systems, which makes them hard for users to discover, obtain and integrate.
Exposed as linked data, our data interlink with each other and with the external world, and become part of the “global data space” on the Web
Why is it important?
• The OU was the first university to expose its data as linked data: http://data.open.ac.uk
• Now widely recognized as a critical step forward for the HE sector in the UK (and worldwide)
– Favors transparency and reuse of data, both externally and internally
– Reduces the cost of dealing with our own public data: integration and reuse by design
– Enables new kinds of applications, and makes the ones that are already feasible more cost effective
• At least 3 other UK universities have now followed our example:
– http://data.online.lincoln.ac.uk/, http://data.ox.ac.uk/, http://data.southampton.ac.uk/
– And others in other countries are setting up similar initiatives
“if you are working in an IT department within a University you better read this report, as soon your department will need to be making these same decisions.” David Flanders,
JISCExpo Programme Manager, http://code.google.com/p/jiscexpo/wiki/luceroproject#Site_Visit_Report
The data.open.ac.uk Stack
Technical infrastructure
Organizational infrastructure
Institutional repository data
Research Data (Arts)
Applications
data.open.ac.uk
Technological principle: Everything has a URI
• Examples:
– http://data.open.ac.uk/course/m366 – the course M366
– http://data.open.ac.uk/oro/21166 – an article in ORO
– http://data.open.ac.uk/page/person/ext-911ee9dfa3db572830b00bd8a9983e39 – a person, who authored the article above
– http://xmlns.com/foaf/0.1/Person – the type “person”
– http://purl.org/dc/terms/creator – the property that links an article to its author
Technological principle: Content negotiation
Accept: text/html
Accept: application/rdf+xml
RDF
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
  <label xmlns="http://www.w3.org/2000/01/rdf-schema#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</label>
  <authorList xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://data.open.ac.uk/oro/9719#authors"/>
  <title xmlns="http://purl.org/dc/terms/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</title>
  <abstract xmlns="http://purl.org/ontology/bibo/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers against the glycosylated form of MUC1 are described, along with their use in treatment and diagnosis of conditions associated with elevated production of MUC1.</abstract>
  <isPartOf xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/oro/repository"/>
  <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/peerReviewed"/>
  <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/published"/>
  <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-07bcb3718cb0de7883dc7b8fde7e283d"/>
  <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
  <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-7c8b5252e28115f91640559c2fe64ca3"/>
  <date xmlns="http://purl.org/dc/terms/">2007-11-15</date>
  <rdf:type rdf:resource="http://purl.org/ontology/bibo/Article"/>
  <rdf:type rdf:resource="http://purl.org/ontology/bibo/Patent"/>
</rdf:Description>
</rdf:RDF>
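Content negotiation means the same URI can be requested with different Accept headers. A minimal sketch in Python (standard library only) of how a client might prepare both requests for the ORO record above. The helper name negotiated_request is hypothetical, and the requests are built but not sent, so the sketch runs offline.

```python
from urllib.request import Request

# URI of the ORO record shown on the slide.
RESOURCE = "http://data.open.ac.uk/oro/9719"


def negotiated_request(uri, media_type):
    """Build (but do not send) a GET request asking for one representation.

    Hypothetical helper: the only thing that changes between the HTML and
    the RDF request is the value of the Accept header.
    """
    return Request(uri, headers={"Accept": media_type})


# Same URI, two representations.
html_request = negotiated_request(RESOURCE, "text/html")
rdf_request = negotiated_request(RESOURCE, "application/rdf+xml")

for req in (html_request, rdf_request):
    print(req.full_url, "->", req.get_header("Accept"))
```

To actually fetch a representation, either request can be passed to urllib.request.urlopen().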
By the way…
• On Study at the OU:
– http://data.open.ac.uk/course/m366
– if HTML is requested, goes to http://www3.open.ac.uk/study/undergraduate/course/m366.htm
– Try http://www3.open.ac.uk/study/undergraduate/course/m366.rdf
Technological principle: link… also to external datasets
Using URIs makes pieces of data directly addressable and linkable on the Web, independently of where the data is:
– http://data.open.ac.uk/course/m366 isAvailableIn http://sws.geonames.org/458258/ (Republic of Latvia)
– http://data.open.ac.uk/organization/the_open_university sameAs http://education.data.gov.uk/doc/school/133849
– http://data.open.ac.uk/location/building/mbbn (Berrill Building North) postcode http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA
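Each of these links is a triple: subject, predicate, object. A rough sketch in Python of the three links as plain tuples, with a check of which hosts they point to. The predicates are kept as the short labels used on the slide rather than full URIs, and the function names are hypothetical.

```python
from urllib.parse import urlparse

# The three links from the slide as (subject, predicate, object).
# Predicates are the slide's short labels, not the full property URIs.
TRIPLES = [
    ("http://data.open.ac.uk/course/m366",
     "isAvailableIn",
     "http://sws.geonames.org/458258/"),
    ("http://data.open.ac.uk/organization/the_open_university",
     "sameAs",
     "http://education.data.gov.uk/doc/school/133849"),
    ("http://data.open.ac.uk/location/building/mbbn",
     "postcode",
     "http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA"),
]


def external_links(triples, home="data.open.ac.uk"):
    """Return the triples whose object is hosted somewhere other than `home`."""
    return [t for t in triples if urlparse(t[2]).netloc != home]


def hosts_linked_to(triples):
    """Return the sorted set of hosts that the objects point to."""
    return sorted({urlparse(obj).netloc for _, _, obj in triples})


print(hosts_linked_to(external_links(TRIPLES)))
```

Because the objects are ordinary URIs, following a link is just another HTTP request to another dataset.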
And others can link to our data…
SPARQL
• The “SQL” of RDF and linked data
• Fits the graph data model of RDF
– Select [variables: ?x ?name, etc.]
– From [graph, or all graphs if nothing]
– Where [triple patterns and filters]
– Order by, limit, offset, etc.
• SPARQL protocol: simply based on HTTP
– A SPARQL endpoint is a URL that takes a “query” parameter
– And returns results in the SPARQL XML format
– See http://data.open.ac.uk
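Since the protocol is just a URL with a “query” parameter, a request can be assembled with nothing more than URL encoding. A small sketch in Python, assuming the endpoint is http://data.open.ac.uk/query (the address used in the PHP example in this deck). The query text and the helper name sparql_url are illustrative only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint address as used elsewhere in this deck.
ENDPOINT = "http://data.open.ac.uk/query"


def sparql_url(query, endpoint=ENDPOINT):
    """Return the GET URL that carries `query` in the "query" parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Illustrative query: list a few of the types used in the data.
query = "select distinct ?type where { ?x a ?type } limit 10"
url = sparql_url(query)
print(url)

# Decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == query)
```

Passing the resulting URL to urllib.request.urlopen() would send the query. That step is left out so the sketch runs offline.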
SPARQL: example queries
Courses available in Nigeria
select distinct ?course where {
  ?course <http://data.open.ac.uk/saou/ontology#isAvailableIn> <http://sws.geonames.org/2328926/> .
  ?course a <http://purl.org/vocab/aiiso/schema#Module>
}
http://data.open.ac.uk/query?query=select%20distinct%20%3Fcourse%20where%20{%3Fcourse%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23isAvailableIn%3E%20%3Chttp%3A%2F%2Fsws.geonames.org%2F2328926%2F%3E.%20%3Fcourse%20a%20%3Chttp%3A%2F%2Fpurl.org%2Fvocab%2Faiiso%2Fschema%23Module%3E}
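The endpoint answers such a query in the SPARQL XML results format. A minimal sketch in Python of reading that format with the standard library. The SAMPLE document is hand-written for illustration, with made-up course URIs; it is not real output from data.open.ac.uk.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written sample response with hypothetical course URIs.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="course"/></head>
  <results>
    <result>
      <binding name="course"><uri>http://data.open.ac.uk/course/example1</uri></binding>
    </result>
    <result>
      <binding name="course"><uri>http://data.open.ac.uk/course/example2</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_rows(xml_text):
    """Turn a SPARQL XML result document into a list of {variable: value} dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


for row in parse_rows(SAMPLE):
    print(row["course"])
```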
SPARQL: example queries
Video podcasts related to postgraduate courses in computing
http://data.open.ac.uk/query?query=select%20%3Fx%20%3Ft%0Awhere%20{%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fsubject%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Ftopic%2Fcomputing%3E.%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23courseLevel%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23postgraduate%3E.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FrelatesToCourse%3E%20%3Fc.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Ftitle%3E%20%3Ft.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%0A}&limit=0
select ?x ?t where {
  ?c <http://purl.org/dc/terms/subject> <http://data.open.ac.uk/topic/computing> .
  ?c <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate> .
  ?x <http://data.open.ac.uk/podcast/ontology/relatesToCourse> ?c .
  ?x <http://purl.org/dc/terms/title> ?t .
  ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast>
}
SPARQL: example queries
Things related to “earthquake”
http://data.open.ac.uk/query?query=select%20%3Fc%20%3Fdesc%20where%7B%0A%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fdescription%3E%20%3Fdesc%20.%0A%7B%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fopenlearn%2Fontology%2FOpenLearnUnit%3E%7D%0AUNION%0A%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%7D%7D%0AFILTER%20regex(str(%3Fdesc)%2C%20%22earthquake%22%2C%20%22i%22%20)%0A%7D&limit=0
select ?c ?desc where {
  ?c <http://purl.org/dc/terms/description> ?desc .
  {
    { ?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/openlearn/ontology/OpenLearnUnit> }
    UNION
    { ?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast> }
  }
  FILTER regex(str(?desc), "earthquake", "i")
}
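To see what this query asks for, it can help to evaluate the same conditions over a few in-memory records. A rough sketch in Python: the ITEMS list and its example.org URIs are invented stand-ins for the triple store, and related_to is a hypothetical name. The UNION becomes a test against two allowed types, and the FILTER with the "i" flag becomes a case-insensitive regular expression search.

```python
import re

OPENLEARN_UNIT = "http://data.open.ac.uk/openlearn/ontology/OpenLearnUnit"
VIDEO_PODCAST = "http://data.open.ac.uk/podcast/ontology/VideoPodcast"

# Invented records standing in for the triple store (not real OU data).
ITEMS = [
    {"uri": "http://example.org/unit/1", "type": OPENLEARN_UNIT,
     "description": "Earthquakes and plate tectonics"},
    {"uri": "http://example.org/podcast/2", "type": VIDEO_PODCAST,
     "description": "Measuring an earthquake with a seismometer"},
    {"uri": "http://example.org/podcast/3", "type": VIDEO_PODCAST,
     "description": "Introduction to volcanoes"},
    {"uri": "http://example.org/other/4", "type": "http://example.org/Other",
     "description": "Earthquake news"},
]


def related_to(keyword, items):
    """Mimic the query: type is one of two classes AND description matches."""
    pattern = re.compile(keyword, re.IGNORECASE)   # the "i" flag of regex()
    wanted_types = {OPENLEARN_UNIT, VIDEO_PODCAST}  # the two UNION branches
    return [
        (item["uri"], item["description"])
        for item in items
        if item["type"] in wanted_types and pattern.search(item["description"])
    ]


for uri, desc in related_to("earthquake", ITEMS):
    print(uri, "-", desc)
```

The fourth record mentions earthquakes but has neither of the two types, so it is left out, just as it would be by the UNION in the query.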
[Figure: architecture of the data.open.ac.uk pipeline. Stages: Collect, Extract, Link, Store, Expose, with Planning + Logging (Scheduler, Ontologies) across them. Components shown: RSS Updater, XML Updater, RSS Extractor, RDF Extractor, RDF Cleaner (cleaning rules), Entity Name System (URI creation rules), Triple Store (Delete (1), Add (2)), Index Search, SPARQL endpoint, Web Server (URL redirection rules). Inputs: each dataset (Lib, courses, loc; ORO, podcast), RSS feed with new items and obsolete items. Intermediate outputs: RDF file (add), RDF file (delete). Legend: generic process, dataset-specific process.]
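The diagram labels the triple store update as “Delete (1)” then “Add (2)”, fed by an RDF file (delete) and an RDF file (add). One plausible reading, sketched here in Python with an in-memory set standing in for the triple store: obsolete triples are removed first, then new ones are added. The function name apply_update and the sample triples are hypothetical, and this is not the LUCERO implementation.

```python
def apply_update(store, to_delete, to_add):
    """Apply one update: remove obsolete triples first (1), then add new ones (2).

    `store`, `to_delete` and `to_add` are iterables of (subject, predicate,
    object) tuples. A new set is returned; the input is left untouched.
    """
    updated = set(store)
    updated -= set(to_delete)   # step (1): delete
    updated |= set(to_add)      # step (2): add
    return updated


# Hypothetical example: an item whose title changed between two runs.
old = ("http://example.org/item/1", "title", "Old title")
new = ("http://example.org/item/1", "title", "New title")
kept = ("http://example.org/item/2", "title", "Unchanged")

store = {old, kept}
store = apply_update(store, to_delete=[old], to_add=[new])
print(sorted(store))
```

Deleting before adding means a triple that appears in both files ends up present in the store after the update.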
Method for exposing a dataset
Initial Meeting with Data Owner
– Identify data
– Get sample data
– Identify copyright issues
– Identify possible links
– Identify users and usage
Data Modeling sessions
Lucero Core Team
Data Owner
Lucero KMi Team
Lucero members
– Find reusable ontologies
– Map onto the data
– Identify uncovered parts
– Define URI scheme
Data Modeling Validation
Lucero Core Team
Data Owner
Development of Extractor
URI Creation Rules
Definition
Deployment
Lucero KMi Team
Datasets
• Already “officially” in place:
– ORO: more than 18,000 publications from OU researchers
– Podcasts: 2,500 audio and video tracks from podcast.open.ac.uk, linked to the related courses
– Study at the OU: more than 600 live module descriptions
– OpenLearn: more than 550 units of course material
– KMi Staff and Planet newsletter
• Currently being processed:
– OU buildings in MK and regional centres
– Library Catalogue
– YouTube channel
– Old Courses
– “Reading Experience Database” project
– People Profiles
Screenshot of the dataset page
Building applications with Linked Data
• Everything is based on HTTP/XML
– In principle, just need a Web connection…
• Libraries available in many languages to manipulate RDF data
– Java: Jena (http://openjena.org/)
– PHP: ARC2 (https://github.com/semsol/arc2)
– Python: RDFLib (http://www.rdflib.net/)
– …
Example: Accessing data.open.ac.uk with PHP/ARC2
<?php
include_once("arc2/ARC2.php");

// Declare the SPARQL endpoint
$config = array('remote_store_endpoint' => 'http://data.open.ac.uk/query');
$store = ARC2::getRemoteStore($config);

// Execute a SPARQL query: every distinct postcode in the data
$postcodesq = 'select distinct ?p where {[] <http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> ?p.}';
$rows = $store->query($postcodesq, 'rows');

// Display the results, one postcode per line
foreach ($rows as $row) {
    echo $row['p'] . "<br/>";
}
Applications
• For education
– Mobile podcast explorer, podcast explorer on TV
– OU Building Map, OU location tracker (cf. foursquare)
– OU Expert Search
– Connecting courses/OpenLearn to relevant podcasts
– OU Course Profile Facebook app using list of courses, “Study Buddy” app connecting Facebook users to relevant courses
• For research
– Display connections in a research community
– Research data/impact analysis
– Connecting research datasets to external data
Example application: Link OpenLearn to relevant course/podcasts
Example Application: keep track of location, meetings, tutorials, at the OU
Example application: Expert Search using publication information and connecting to contact information within the OU
Example application: Explore information about a person in the “Reading Experience Database” based on data provided by DBPedia (Linked Data version of Wikipedia)
New ways to look at humanities research data
Example application: exploring research communities
The future
• More data… always more data
• More links, especially to external entities
– BBC
– Government agencies
– Other universities
• More applications:
– Integration into main OU websites (e.g., Study at the OU)
– Integration into common OU applications (people profile, Facebook course profile, etc.)
– Support for common OU processes (REF audit, course recommendation, providing resources to AL and lecturers)
• Connecting to other universities
– Many other universities in the UK and abroad are making the move to linked data (see linkeduniversities.org)
– Linked data has the potential to create connections across institutions, a data-based network of higher education course providers
Conclusion
• Linked data is more than an emerging, academic trend.
• data.open.ac.uk and linked data in general are fast becoming very valuable resources for developers, internally and externally
• We are very proud to have been the first university to really deploy a linked data platform
• It needs to be sustained and to evolve as a core service at the OU…
• … and as a key component of the Web of University Linked Data
Thank You
Carlo Allocca (Dev)
Mathieu d’Aquin (PD)
Salman Elahi ((Ex)-Dev)
Enrico Motta (SGP)
Andriy Nikolov (linking)
Jane Whild (Admin)
Fouad Zablith (Dev)
Library Specialists
Owen Stephens (PM)
Richard Nurse ((ex-)PM)
Non Scantlebury
Arts Specialists
Suzanne Duncanson-Hunter
John Wolfe
Paul Lawrence
Stuart Brown
Data Owners
KMi
OU Library
Com./Student Comp. Services
Arts