Putting Linked Data to Use in a Large Higher-Education Organisation
Presented at the "Interacting with Linked Data" workshop co-located with ESWC 2012
Mathieu d’Aquin
Knowledge Media Institute (KMi), The Open University, UK
@mdaquin
Motivation
• Many works focus on the publication of linked data
• But what do we do once it is published?
• We have built a full linked data platform for our university (the Open University, data.open.ac.uk)
• And built a lot of applications to demonstrate what we could do with it
• What do we learn from getting people to, unknowingly, use linked data?
• What experience can we reuse for the development of interactive tools relying on linked data?
Linked Data at the Open University: data.open.ac.uk
Linked Data at the Open University
• Course information:
– 580 modules / description of the course, information about the levels and number of credits associated with it, topics, and conditions of enrolment.
• Research publications:
– 16,000 academic articles / information about authors, dates, abstract and venue of the publication.
• Podcasts:
– 2220 video podcasts and 1500 audio podcasts / short description, topics, link to a representative image and to a transcript if available, information about the course the podcast might relate to and license information regarding the content of the podcast.
• Open Educational Resources:
– 640 OpenLearn Units / short description, topics, tags used to annotate the resource, its language, the course it might relate to, and the license that applies to the content.
• YouTube videos:
– 900 videos / short description of the video, tags that were used to annotate the video, collection it might be part of and link to the related course if relevant.
• University buildings:
– 100 buildings / address, a picture of the building and the sub-divisions of the building into floors and spaces.
• Library catalogue:
– 12,000 books / topics, authors, publisher and ISBN, as well as the related course.
• Others…
Applications
Resource Discovery
Mobile and Personal Semantics
Research Exploration
Social
Application 1: Study at the OU
That’s where linked data is
Application 1: What we learned
• From the users’ perspective
– Useful functionality can be very simple
– Combining information from different sources
– Transparent/seamless
• From the developers’ perspective
– Development time: from months to minutes
– Interacting directly with the data, rather than multiple different systems
– Lack of awareness of Semantic Web technologies
– Correspondence with other, more common technologies (e.g., SQL and relational DBs) is misleading
– Performance: a large number of SPARQL queries is not easy to handle. It requires caching of pre-canned queries, which contradicts the idea of open and unexpected reuse
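The caching of pre-canned queries mentioned in the last point could look roughly like the sketch below. It is an illustration only: the class name `PrecannedQueryCache`, the injected `run_query` function and the time-to-live value are assumptions, not details of the data.open.ac.uk platform. The function that would talk to a SPARQL endpoint is passed in, so the example runs without a network.

```python
import time
from typing import Any, Callable, Dict, Tuple


class PrecannedQueryCache:
    """Cache results for a fixed set of named ("pre-canned") SPARQL queries."""

    def __init__(self, run_query: Callable[[str], Any], queries: Dict[str, str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._run_query = run_query    # hypothetical: sends a query to an endpoint
        self._queries = dict(queries)  # query name -> SPARQL text
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, name: str) -> Any:
        # Only queries registered in advance can be served.
        if name not in self._queries:
            raise KeyError(f"unknown pre-canned query: {name}")
        now = self._clock()
        cached = self._store.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]           # still fresh: no call to the endpoint
        result = self._run_query(self._queries[name])
        self._store[name] = (now, result)
        return result
```

The `KeyError` for unregistered queries is the restriction that sits uneasily with open and unexpected reuse: anything outside the prepared set is not served by the cache.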
Looked at it in rage for hours… just didn’t think it wouldn’t give me an error if I misspelled the name of a property
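The behaviour described in this quote follows from how triple patterns are matched: a pattern naming a property that never occurs in the data simply matches nothing, where a relational database would reject an unknown column. The sketch below imitates this with a small in-memory list of triples. The course URIs and the helper names `match` and `unknown_predicates` are invented for illustration.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data; the course URIs are invented for illustration.
TRIPLES: List[Triple] = [
    ("http://example.org/course/x1", "http://purl.org/dc/terms/title", "Course X1"),
    ("http://example.org/course/x2", "http://purl.org/dc/terms/title", "Course X2"),
]


def match(triples: Iterable[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None is a wildcard, like a variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def unknown_predicates(triples: Iterable[Triple], used: Iterable[str]) -> Set[str]:
    """Predicates used in a query that never occur in the data (possible typos)."""
    known = {t[1] for t in triples}
    return set(used) - known


GOOD = "http://purl.org/dc/terms/title"
TYPO = "http://purl.org/dc/terms/titel"  # misspelled on purpose

titles = match(TRIPLES, p=GOOD)   # matches both courses
nothing = match(TRIPLES, p=TYPO)  # empty list, and no error is raised
suspects = unknown_predicates(TRIPLES, [GOOD, TYPO])  # flags only the typo
```

A check like `unknown_predicates` is one possible safeguard a development tool could offer; it is a suggestion here, not something described in the slides.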
Application 2: Supporting the REF
• Combining public and semi-private data
• Read/write
Application 2: What we learned
• From the users’ perspective
– No additional or duplicated output required from users: reusing what was collected in multiple systems
– Again, transparent/seamless technology
– Still some confusion related to consistency across systems/representations
– Assumptions are hard to conform with when data is drawn from multiple systems with “unwritten conventions”
• From the developers’ perspective
– Again, rapid development
– Extensibility and flexibility
– SPARQL Query / SPARQL Update duo very powerful for lightweight interfaces (even client side)
– Dealing with incomplete data is tricky (we don’t know when it is incomplete)
– No “meta-properties” of the data (e.g., all IDs are unique and non-redundant)
– Assumptions made are specific to the application, not generic
– Where is the problem? In the application, the linked data, or the original data?
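To make the SPARQL Query / SPARQL Update pairing more concrete, here is a minimal sketch of how a lightweight read/write interface might prepare both kinds of request, following the URL-encoded POST forms of the SPARQL 1.1 Protocol. The endpoint URLs, the function names and the example property are assumptions for illustration. Requests are only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoints; not taken from the slides.
QUERY_ENDPOINT = "http://example.org/sparql"
UPDATE_ENDPOINT = "http://example.org/sparql/update"


def build_query_request(query: str, endpoint: str = QUERY_ENDPOINT) -> Request:
    """Build (but do not send) a read request: the 'query' form parameter via POST."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(endpoint, data=body, method="POST", headers={
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    })


def build_update_request(update: str, endpoint: str = UPDATE_ENDPOINT) -> Request:
    """Build (but do not send) a write request: the 'update' form parameter via POST."""
    body = urlencode({"update": update}).encode("utf-8")
    return Request(endpoint, data=body, method="POST", headers={
        "Content-Type": "application/x-www-form-urlencoded",
    })


def add_field_update(subject: str, prop: str, value: str) -> str:
    """SPARQL Update text adding one literal-valued property to a resource.

    Only the literal is escaped; the two URIs are assumed to be trusted input.
    """
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return f'INSERT DATA {{ <{subject}> <{prop}> "{escaped}" . }}'
```

Under these assumptions, adding a new field amounts to issuing one more `INSERT DATA` with a new property, with no schema migration step, which is one way to read the extensibility point above.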
Really? This uses linked data? I thought we bought it from some company…
Can you add a new field?
Application 3: Research communities
Generic vs Specific Interface
Application 3: What we learned
• From the users’ perspective
– Generic: more knowledge = more functionalities
– Generic: homogeneous interface to heterogeneous data
– Generic: more demanding for users
– Application-driven vs data-driven navigation
– Specific interface allows for more complexity
• From the developers’ perspective
– Generic is harder: can’t make assumptions related to the specific data/application
– Specific is less customisable/extensible: adding new features requires custom code
Shouldn’t that be here in that case?
Application 4: The OU in the media
Academics in “Arts and Humanities” most often involved with the media (in number of news items)
Topics most commonly mentioned by news outlets owned by the BBC (in number of news items)
Application 4: What we learned
• From the users’ perspective
– Easily understandable outputs: embeddable charts
– Customisable: build a dynamic dashboard in minutes
– Benefits of linked data: bring in external data that can be jointly queried with your own
• From the developers’ perspective
– Requires a good understanding of the data and the technology (especially SPARQL)
– Generic component to build specific interfaces (best of both worlds?)
– But again cannot rely on application/data specific assumptions (meta-properties regarding redundancy, completeness, etc.)
I would like this chart for my blog… What do you mean by “give me 3 minutes”?
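A generic component of the kind described for this application could, for instance, turn the results of any SPARQL SELECT query into a label/value series that a charting library can embed. The sketch below assumes results in the standard SPARQL 1.1 Query Results JSON layout. The function name `chart_series`, the variable names `topic` and `items`, and the sample figures are invented for illustration and are not data from the slides.

```python
from typing import Any, Dict, List, Tuple


def chart_series(results: Dict[str, Any], label_var: str,
                 value_var: str) -> List[Tuple[str, float]]:
    """Turn SPARQL JSON SELECT results into (label, value) pairs, largest first."""
    series = []
    for binding in results["results"]["bindings"]:
        # A variable can be unbound in a row; skip such rows rather than guess.
        if label_var not in binding or value_var not in binding:
            continue
        series.append((binding[label_var]["value"],
                       float(binding[value_var]["value"])))
    return sorted(series, key=lambda pair: pair[1], reverse=True)


# Invented sample, shaped like a SPARQL JSON result set.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
SAMPLE = {
    "head": {"vars": ["topic", "items"]},
    "results": {"bindings": [
        {"topic": {"type": "literal", "value": "History"},
         "items": {"type": "literal", "datatype": XSD_INTEGER, "value": "12"}},
        {"topic": {"type": "literal", "value": "Science"},
         "items": {"type": "literal", "datatype": XSD_INTEGER, "value": "30"}},
        {"topic": {"type": "literal", "value": "No count available"}},
    ]},
}

series = chart_series(SAMPLE, "topic", "items")
```

Because the function only depends on the two variable names, the same component can serve different queries. It still cannot know whether the counts are complete or free of duplicates, which is the limitation noted in the last point of the list above.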
Discussion
• Linked data should be hidden from the users
– Obvious? Yes… but is it really happening?
– Requires some aspects of the data to be present, e.g. human-readable labels
– Many applications of linked data are still linked data applications
– Higher-level concepts, such as data integration from multiple sources, are harder to hide
• Generic vs Specific
– Reuse of software components is good
– But it forces the adoption of a specific form of interaction, which is driven by the technicalities and the data
– Trade-off to be found: generic + customisable
• Openness and flexibility
– … are not always easy to deal with
– Building interfaces for the unknown
– No assumption can be made on the data regarding redundancy and completeness
– Need for meta-properties that can guide the building of applications (see what is applicable)
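One possible reading of “meta-properties” is a set of declared assumptions about the data, such as “IDs are unique” or “every resource has this property”, that an application can check before relying on them. The sketch below is an interpretation under that assumption, using a plain list of triples. The function names and the idea of checking at load time are illustrative and not something the slides specify.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def duplicated_ids(triples: Iterable[Triple], id_property: str) -> Dict[str, Set[str]]:
    """ID values attached to more than one subject (breaks 'IDs are unique')."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for s, p, o in triples:
        if p == id_property:
            owners[o].add(s)
    return {value: subjects for value, subjects in owners.items()
            if len(subjects) > 1}


def subjects_missing(triples: Iterable[Triple], required_property: str) -> Set[str]:
    """Subjects with no value for a property: a hint that the data is incomplete."""
    triples = list(triples)  # allow two passes over any iterable
    subjects = {s for s, _, _ in triples}
    having = {s for s, p, _ in triples if p == required_property}
    return subjects - having


# Invented example data.
ID = "http://example.org/vocab/id"
NAME = "http://example.org/vocab/name"
DATA: List[Triple] = [
    ("http://example.org/a", ID, "42"),
    ("http://example.org/b", ID, "42"),   # same ID on a second resource
    ("http://example.org/a", NAME, "A"),
]

clashes = duplicated_ids(DATA, ID)
unnamed = subjects_missing(DATA, NAME)
```

A missing value here only means the property is absent from this dataset, not that it does not exist, so such checks flag places where an application-specific assumption may not hold and leave the decision to the application.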
Conclusion
• Applications in a large organisation used to more common technologies raise challenges that help in understanding the common pitfalls of interacting with linked data
• Important to share experiences in addition to techniques/tools
• To build better systems and approaches for interaction
Thank you!
Any questions?
Images (others are mine)
• Broadcast: http://commons.wikimedia.org/wiki/File:Ibaraki_Broadcast_System_headquater01.jpg
• Don’t know: http://commons.wikimedia.org/wiki/File:I_Don%27t_Know_ANY_of_This!.jpg
• Development: http://commons.wikimedia.org/wiki/File:Applications-development.svg
• Learning: http://www.flickr.com/photos/vivacomopuder/3122401239/
• Course / degree: http://commons.wikimedia.org/wiki/File:Degree.svg
• Article : http://commons.wikimedia.org/wiki/File:Articles.JPG
• Open Learning: http://commons.wikimedia.org/wiki/File:Colearn_-_learning_together.jpg
• Youtube: http://commons.wikimedia.org/wiki/File:Logo_YouTube_por_Hernando.svg
• Open University building : http://www.flickr.com/photos/rattyfied/3011643690/
• Library: http://commons.wikimedia.org/wiki/File:SteacieLibrary.jpg