Linked Data and Open Educational Resources -
towards a symbiotic Relationship
Stefan Dietze
- 6th tele-TASK Symposium 2012 -
Introduction
⇒ http://www.l3s.de/
⇒ http://kmi.open.ac.uk/
Research areas
• Semantic Web & Linked Data, data & knowledge integration (mapping, classification, interlinking)
• Application domains: education/TEL, Web archiving, …
Projects & activities
• EU funded research projects:
• (Linked) Web data & education
• “Linked Learning” and “LALD” workshops (e.g. LILE2012@WWW2012)
• http://linkededucation.org & http://linkeduniversities.org
• More information: http://purl.org/dietze
Stefan Dietze 2tele-TASK Symposium 2012
[Diagram labels: RecSys; collab./content-based RecSys; TEL; Open Educational Resources; RecSys for TEL; Learning Analytics]
• TEL environments & recommender systems depend on the availability of data
  • “In the lab”: data for evaluation
  • “In the wild”: real-world TEL applications
• Quantity, quality (e.g. accessibility, interoperability) and reusability of Web data (in particular about OER) are crucial
Educational Web data
State
• Vast Open Educational Resource (OER) metadata collections (e.g. OpenCourseware, OpenLearn, Merlot, ARIADNE)
• Usually exposed via APIs/services
• Competing Web interfaces (e.g. SQI, OAI-PMH, SOAP, REST)
• Competing metadata standards (e.g. IEEE LOM, ADL SCORM, DC…)
• Competing exchange formats and serialisations (e.g. JSON, RDF, XML)
• Fragmented use of taxonomies
Issues
• Heterogeneity & lack of interoperability
• Lack of take-up
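To make the "competing Web interfaces" point concrete, here is a minimal sketch of how a client would address just one of them, OAI-PMH. The `ListRecords` verb and the `oai_dc` metadata prefix are part of the OAI-PMH protocol; the endpoint URL and the helper name `list_records_url` are invented for illustration. A SQI, SOAP or REST repository would each need a differently shaped client.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; real repositories publish their own base URL.
BASE_URL = "http://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL)
print(url)
```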
Web-scale exploration of educational resources and data?
[Diagram labels: RecSys; collab./content-based RecSys; TEL; Open Educational Resources; RecSys for TEL; Learning Analytics; Semantic Web; Linked Open Data; Educational Linked Data]
(Linked) Open Data
(c) Paul Miller
Linked (Open) Data – “Semantic Web done right”
• Vision: well-connected graph of open Web data
• W3C standards (RDF, SPARQL) to expose data, URIs to interlink datasets
• => vast cloud of interconnected datasets
• Crossing all sorts of domains
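As a rough illustration of the bullets above, the sketch below models two tiny datasets as sets of (subject, predicate, object) triples and follows a shared URI from one into the other. The example.org URIs and the sample statements are invented; the DBpedia URI only follows DBpedia's naming pattern and has not been checked. A real deployment would use an RDF store queried with SPARQL rather than Python sets.

```python
# Minimal sketch: two datasets as sets of (subject, predicate, object) triples.
# All example.org URIs and the sample statements are hypothetical.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

oer_dataset = {
    ("http://example.org/oer/42", DCT_SUBJECT, "http://example.org/topic/hpv"),
    ("http://example.org/topic/hpv", OWL_SAME_AS,
     "http://dbpedia.org/resource/Human_papillomavirus"),
}

reference_dataset = {
    ("http://dbpedia.org/resource/Human_papillomavirus", RDFS_LABEL,
     "Human papillomavirus"),
}


def objects(dataset, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?o)."""
    return sorted(o for s, p, o in dataset if s == subject and p == predicate)


def labels_for_resource(resource):
    """Follow subject and sameAs links from the OER dataset into the reference dataset."""
    labels = []
    for topic in objects(oer_dataset, resource, DCT_SUBJECT):
        for target in objects(oer_dataset, topic, OWL_SAME_AS):
            labels.extend(objects(reference_dataset, target, RDFS_LABEL))
    return labels


print(labels_for_resource("http://example.org/oer/42"))
```

The shared URI is what lets the second dataset contribute information about a resource it never mentions directly.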
(Linked) Open Data
Domain                   Number of datasets   Triples           %
Media                    25                   1,841,852,061     5.82 %
Geographic               31                   6,145,532,484     19.43 %
Government               49                   13,315,009,400    42.09 %
Publications             87                   2,950,720,693     9.33 %
Cross-domain             41                   4,184,635,715     13.23 %
Life sciences            41                   3,036,336,004     9.60 %
User-generated content   20                   134,127,413       0.42 %
Total                    295                  31,634,213,770
Source: http://lod-cloud.net/state, September 2011
(Linked) Open Data for Education
Datasets which might enhance (informal) learning
• Publications & literature: ACM, PubMed, DBLP (L3S), OpenLibrary
• Domain-specific knowledge & resources: Bioportal for Life Sciences, historic artefacts in Europeana, Geonames
• Cross-domain knowledge: DBpedia, Freebase, …
• Media resource metadata: BBC, Flickr, …
Explicitly educational datasets and schemas
• University Linked Data: e.g. The Open University UK (http://data.open.ac.uk), Southampton University, University of Münster (DE), http://education.data.gov.uk
• OER Linked Data: mEducator Linked ER (http://ckan.net/package/meducator), Open Learn LD
• Schemas: Learning Resource Metadata Initiative (LRMI, http://www.lrmi.net/), mEducator Educational Resources schema (http://purl.org/meducator/ns)
=> see http://linkededucation.org & http://linkeduniversities.org
(Linked) Open Data for Education
Applications of educational LOD (e.g. from past projects & LILE2012)
• Web search of educational courses/OER (“educational graph”)
• Game-based learning & automatic generation of assessment items from LOD
• Enrichment of learning resources (facilitating more exploratory learning approaches) …
http://metamorphosis.med.duth.gr/
Challenges
Web-scale TEL data integration
• data quality (ambiguity, richness, …)
• data heterogeneity (semantic)
• data interlinking
Web-scale TEL data exploitation
Applications/tools exploiting TEL Web data for recommendation/exploration:
• scalability, robustness
• licensing and legal issues
mEducator
• EC-funded eContentPlus Best Practice Network (BPN)
• May 2009 – May 2012 (3 years duration)
• 14 partners
=> http://www.meducator.net
Challenges & approach
1. Improving OER metadata interoperability and Web-wide search by applying LOD principles…
2. …WHILE exploiting existing OER metadata and infrastructures
Open Educational Resources ? Linked Data
Data/services integration & retrieval/search APIs
Application context: biomedical education
=> http://metamorphosis.med.duth.gr/
Metamorphosis+ Tailored (L)CMS plugins
=> http://www.meducator3.net/
Approach: educational service integration
• SmartLink: Linked Data registry of (educational) datasets/stores and their APIs
• Discovery and lifting of educational data out of heterogeneous repositories
• Transformation of heterogeneous data formats (XML, JSON, ...) and schemas (e.g. IEEE LOM, Dublin Core) into RDF (prerequisite for LOD compliance)
⇒ http://ckan.net/package/smartlink & http://purl.org/smartlink
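The "lifting" step can be pictured as follows. This is only a sketch under stated assumptions, not SmartLink's actual implementation: the two sample records, their field names, the example.org base URI and the helper names are all invented, and the literal escaping borrows JSON string escaping as an approximation of N-Triples escaping. Only the Dublin Core namespace is a real identifier.

```python
import json
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
BASE = "http://example.org/resource/"  # hypothetical base URI

# Hypothetical records, shaped as two different repositories might return them.
JSON_RECORD = '{"id": "r1", "title": "Cervical cancer screening", "lang": "en"}'
XML_RECORD = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/" id="r2">'
    "<dc:title>ECG basics</dc:title><dc:language>en</dc:language>"
    "</record>"
)


def to_ntriples(subject, fields):
    """Serialise {property URI: literal} pairs as N-Triples-style lines."""
    lines = []
    for prop, value in sorted(fields.items()):
        # JSON string escaping is used here as an approximation.
        literal = json.dumps(value, ensure_ascii=False)
        lines.append(f"<{subject}> <{prop}> {literal} .")
    return lines


def lift_json(text, base):
    """Map the JSON record's proprietary keys onto Dublin Core properties."""
    data = json.loads(text)
    fields = {DC + "title": data["title"], DC + "language": data["lang"]}
    return to_ntriples(base + data["id"], fields)


def lift_xml(text, base):
    """Map the XML record's Dublin Core elements onto the same properties."""
    root = ET.fromstring(text)
    fields = {
        DC + "title": root.findtext(f"{{{DC}}}title"),
        DC + "language": root.findtext(f"{{{DC}}}language"),
    }
    return to_ntriples(base + root.get("id"), fields)


triples = lift_json(JSON_RECORD, BASE) + lift_xml(XML_RECORD, BASE)
print("\n".join(triples))
```

After lifting, both records share one vocabulary and one data model, whatever format the source repository used.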
Approach: educational data integration (Linked Educational Resources)
• Issue: often poorly structured metadata, free text and proprietary taxonomies
• Goal: improve the lifted (RDF) data with public LOD vocabularies; tighter interlinking to provide a coherent and well-connected graph of educational data (across disparate stores)
• Approach:
  1) Data enrichment (via DBpedia, Freebase, BioPortal)
  2) Clustering (structural as well as linguistic) to identify correlating resources
⇒ http://linkededucation.org/meducator
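The enrichment step (1) can be sketched as replacing free-text terms with references to shared LOD entities. The lookup table below is a hypothetical stand-in for an entity-linking service over DBpedia, Freebase or BioPortal; the URIs follow DBpedia's naming pattern but are illustrative, and real named-entity recognition and disambiguation is considerably more involved than whole-word matching.

```python
import re

# Hypothetical lookup table standing in for an entity-linking service;
# the URIs follow DBpedia's naming pattern only.
GAZETTEER = {
    "cervical cancer": "http://dbpedia.org/resource/Cervical_cancer",
    "pap smear": "http://dbpedia.org/resource/Pap_smear",
    "ecg": "http://dbpedia.org/resource/Electrocardiography",
}


def enrich(free_text):
    """Return sorted entity URIs whose surface form occurs as whole words."""
    normalised = " ".join(free_text.lower().split())
    return sorted(
        uri
        for term, uri in GAZETTEER.items()
        if re.search(r"\b" + re.escape(term) + r"\b", normalised)
    )


description = "Lecture on Pap smear screening for cervical cancer"
print(enrich(description))
```

Resources annotated with the same entity URI become comparable across stores, which is what the clustering step (2) relies on.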
(1) Enrichment: automated via DBpedia & Freebase
Semi-structured RDF description of educational resource
NER & disambiguation, e.g. via
(1) Enrichment: semi-automated
Example: OER annotation in MetaMorphosis+ (http://metamorphosis.med.duth.gr/)
BioPortal (http://bioportal.bioontology.org/): access to 324 ontologies and over 5 million entities
1. User-specified term during learning resource annotation
2. Suggested entities
3. Selected entities from BioPortal used to describe discipline and keywords of the resource
(2) Structural clustering of related resources
DBpedia references used most frequently to describe the “subject” of particular educational resources
Number of resources per DBpedia reference/enrichment (subject) in mEducator dataset

DBpedia reference        Resources
Cervical_cancer          59
Screening                31
Cervical                 29
Hpv                      29
Oxygenation              26
Childhood                22
differential_diagnosis   19
Knowledge                18
Learning                 17
decision_making          16
Training                 15
Lecture                  15
Risk                     15
hpv_infection            15
Fear                     15
pap_smear                15
Abnormal                 14
Ventilation              14
Ecg                      14
Clustering of resources graph (blue nodes: resources, green nodes: enrichments)
Cluster of educational resources relating to “cervical cancer” subject
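One simple reading of structural clustering is sketched below: resources and their enrichments form a bipartite graph, and resources that are connected through shared enrichments fall into the same cluster. The slides do not state the exact algorithm, so the connected-components approach is an assumption, and the annotations are invented sample data, not the mEducator dataset.

```python
from collections import defaultdict

# Hypothetical resource -> DBpedia subject annotations (not the mEducator data).
ANNOTATIONS = {
    "res1": {"Cervical_cancer", "Screening"},
    "res2": {"Cervical_cancer", "Hpv"},
    "res3": {"Ecg"},
    "res4": {"Hpv", "Screening"},
}


def resources_per_subject(annotations):
    """Invert the mapping: enrichment -> set of resources annotated with it."""
    index = defaultdict(set)
    for resource, subjects in annotations.items():
        for subject in subjects:
            index[subject].add(resource)
    return index


def clusters(annotations):
    """Connected components of the bipartite resource/enrichment graph."""
    index = resources_per_subject(annotations)
    seen, result = set(), []
    for start in sorted(annotations):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            resource = stack.pop()
            if resource in component:
                continue
            component.add(resource)
            # Visit every resource that shares an enrichment with this one.
            for subject in annotations[resource]:
                stack.extend(index[subject] - component)
        seen |= component
        result.append(sorted(component))
    return result


counts = {s: len(r) for s, r in resources_per_subject(ANNOTATIONS).items()}
print(counts)
print(clusters(ANNOTATIONS))
```

The `counts` mapping corresponds to the "number of resources per DBpedia reference" statistic, computed here on the sample data only.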
(2) Clustering (similarity-based, linguistic)
Vector-based similarity computation based on:
1) Data indexing => document-term matrix (term frequencies in given resource metadata)
2) Creation of similarity matrices => similarity values between resources
3) Clustering (based on similarity thresholds)
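The three steps above can be sketched as follows. The slides only say "vector-based similarity", so cosine similarity over raw term frequencies and the greedy threshold grouping are assumptions, and the metadata snippets are invented rather than taken from the mEducator dataset.

```python
import math
from collections import Counter

# Hypothetical metadata snippets, not taken from the mEducator dataset.
DOCS = {
    "r1": "cervical cancer screening lecture",
    "r2": "screening for cervical cancer",
    "r3": "ecg interpretation training",
}


def term_vector(text):
    """Step 1: one row of the document-term matrix (term frequencies)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_matrix(docs):
    """Step 2: pairwise similarity values between all resources."""
    vectors = {name: term_vector(text) for name, text in docs.items()}
    return {(x, y): cosine(vectors[x], vectors[y]) for x in vectors for y in vectors}


def cluster(docs, threshold):
    """Step 3: greedy grouping; join the first cluster whose seed is similar enough."""
    sims = similarity_matrix(docs)
    groups = []
    for name in sorted(docs):
        for group in groups:
            if sims[(name, group[0])] >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


print(cluster(DOCS, threshold=0.5))
```

Raising the threshold yields more, tighter clusters; lowering it merges loosely related resources.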
Exploratory search enabled via clustering
Example: search results of OER in MetaMorphosis+ (http://metamorphosis.med.duth.gr/)
• Educational resources retrieved based on particular user query
• Related resources (ranked)
Data so far: SmartLink/mEducator in LOD cloud
SmartLink (http://ckan.net/package/smartlink)
• > 2000 triples so far
• > 300 links to iServe
• APIs used by several applications
mEducator (http://ckan.net/package/meducator)
• > 35000 triples so far
• > 1000 links to DBpedia & BioPortal ontologies
• APIs used by 4 applications
Educational Web data: open issues
Motivation
• Quality and quantity of (educational) Web data constantly improving
• Exploitation of Web data lacking scale and often limited to few, mostly isolated datasets
Linking Web Data for Education Project – Open Challenge in Web-scale Data Integration
• EC Support Action, start November 2012, coordinated by L3S
  => http://linkedup-project.eu
• Goals
  • Push forward adoption of Web data/Linked Data in educational context
  • Drive technological advancement of Web data integration technologies (applications, IR technologies, recommender systems)
• Approach
  • Open data competition; open education as big data scenario
LinkedUp in a nutshell
[Diagram labels: Personal data; Web data; LinkedUp submission data]
Web data
• Linked Open Data (30+ billion statements)
• General Web data (OAI-PMH feeds, web metadata etc.)
• …
Applications and tools
• TEL environments and applications
• Data integration tools: storage, analytics, mining, integration, mapping
• …
Educational data & resources
• OER metadata
• OpenLearn
• OpenCourseware
• Ariadne
• iTunesU
• EU project results
Network of supporting organisations
(see 3.2 Spreading excellence, exploiting results, disseminating knowledge)
LinkedUp Support Actions
• Dissemination (events, training)
• Data sharing initiatives
• Community building & clustering
• Technology transfer
• Cash prize awards & consulting
LinkedUp Challenge Environment
• LinkedUp Evaluation Framework
• Methods and Test Cases
• LinkedUp Data Testbed
• Competitor ranking list
3 stages of the LinkedUp competition (participation criteria)
Stage 1 - Initialisation
• Lowest requirements level for participation
Stage 2
• Initial prototypes and mockups, use of data testbed required
• 10 to 20 projects are expected
Stage 3
• Medium requirements level for participation
• Working prototypes, minimum amount of data sources, clear target user group
• 5 to 10 projects are expected
Stage 4
• Deployment in real-world use cases
• Sustainable technologies, reaching out to critical amount of users
• 3 to 5 projects are expected
Challenge provides support:
• Financial awards
• Legal & technical guidance
• Data & use cases
LinkedUp consortium
Web data integration & TEL & Open Data dissemination
L3S Research Center, Leibniz University, DE
• Leading institute in Web science & data technologies as well as technology-enhanced learning
• Coordinator and leader of LinkedUp Challenge WP
CELSTEC, The Open University, NL
• R&D institute in educational technologies and part of the largest distance university in the Netherlands
Elsevier, NL
• Leading scientific & educational publisher
• Innovative research on the future of publishing & extensive experience in data competitions
The Open Knowledge Foundation, UK
• Not-for-profit organisation to promote open knowledge and data; global network
• Host of key events (OKCon) and platforms (e.g. CKAN)
KMI, The Open University, UK
• Leading R&D institute in areas related to LinkedUp
• World’s largest distance university (over 200,000 students)
Exact Learning Solutions, IT
• SME in educational technologies and services with long-standing experience in (EC-funded) R&D projects
Stefan Dietze, tele-TASK Symposium 2012, 18/09/12
LinkedUp: exploitation, dissemination, sustainability
Persistent “LinkedUp Network” (extensible community of industrial and academic institutions)
• SURF NL (NL)
• Université Fribourg, eXascale Infolab Group (CH)
• Democritus University of Thrace (GR)
• AKSW, Universität Leipzig (DE)
• Aristotle University of Thessaloniki (GR)
• CNR Institute for Educational Technologies (IT)
• Clam Messina Service and Research Centre (IT)
• Eurix (IT)
• Ontology Engineering Group (OEG), UPM (ESP)
International (outside Europe)
• Commonwealth of Learning, COL (CA)
• Athabasca University (CA)
Next steps
Ongoing preparations to enable quick start (1 November 2012)
• Challenge design, community & clusters
• Challenge kickoff: initial calls expected by February 2013 (http://www.linkedup-project.eu)
Participate!
• As challenge participant
  • Submission of innovative application/tool tackling one or more of the challenge goals
  • LinkedUp offers: financial, technical and legal support
• As associated partner
  • Participate as evaluation panelist, use case or data contributor & benefit from access to large network of organisations in Linked Data and TEL
  • Take advantage of innovative research results (LinkedUp challenge submissions, evaluation framework)
  • Promote your own data and tools
Linked Data & TEL – a symbiotic relationship!
RecSys for TEL; Learning Analytics
• Improving in terms of scalability, accuracy, performance etc.
• Challenge: availability and accessibility of diverse, high-quality, interoperable data
• => requires data
Educational Linked Data
• Wealth of relevant data available, improving in terms of quantity and quality
• Challenge: exploration and recommendation in large-scale distributed data
• => requires scalable IR/RecSys mechanisms
Thank you!
http://purl.org/dietze
http://linkededucation.org
http://linkedup-project.eu
Some upcoming events
• Knowledge Extraction and Consolidation from Social Media (KECSM2012), workshop at ISWC2012, http://blogs.ecs.soton.ac.uk/knowledgeextraction/