ksim keizer 2010-10-19
The role of Thesauri and Standard Vocabularies in linking data: AGROVOC, UNBIS, EUROVOC
A proposal for collaboration between agencies
Dr. Johannes Keizer
FAO of the United Nations
Office of Knowledge Exchange, Research and Extension
Knowledge and Capacity for Development
dr johannes keizer - FAO of the United Nations - knowledge and capacity for development
UN, KSIM meeting, New York, 2010-10-19
The Development of the Internet
Open World Mindset
“Closed” (“normal”) IT environments
Data sources carefully controlled.
Data formats “custom-defined” for an application.
Linked data based on an “open world mindset”
Integrating data from the open Web
Systems designed to incorporate new information incrementally
By design, tolerance of incomplete information
The Linked Data Universe: http://www.linkeddata.org (July 2009)
The Linked Data Universe: http://www.linkeddata.org (July 2010)
Example: BBC Wildlife Finder
Humboldt Squid page, pulled together from a diversity of Linked Data sources
Animal Diversity Web: Nocturnal way of life
BBC TV Documentary
BBC News item
Wikipedia
RDF: a grammar for the language of data
ResourceA relatedTo ResourceB
ResourceA describedBy "Some text"
1. Describe resources using interrelated “statements” (“triples”).
2. Use URIs (unique, globally managed identifiers) as the “words” of statements.
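The two statements on this slide can be written down as subject/predicate/object triples. The following is a minimal sketch in plain Python, not a real RDF library; the `http://example.org/` namespace and the `Triple` and `objects_of` names are assumptions made for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; real data would use URIs its publisher controls.
EX = "http://example.org/"

statements = [
    # ResourceA relatedTo ResourceB (the object is another resource)
    Triple(EX + "ResourceA", EX + "relatedTo", EX + "ResourceB"),
    # ResourceA describedBy "Some text" (the object is a literal)
    Triple(EX + "ResourceA", EX + "describedBy", "Some text"),
]


def objects_of(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


related = objects_of(statements, EX + "ResourceA", EX + "relatedTo")
```

Because every "word" is a URI, a statement made elsewhere about `http://example.org/ResourceB` refers to the same thing without further agreement between the two publishers.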
• http://www.w3.org/2007/Talks/0221-Bangalore-IH/
RDF as a common format for merging data
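As a rough illustration of why a common triple format makes merging easy: if two sources use the same URI for the same thing, merging is a set union. The sketch below assumes a hypothetical document URI (`http://example.org/doc/1`) and uses plain `skos:prefLabel` and Dublin Core properties for brevity; only the AGROVOC maize URI comes from these slides.

```python
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
DC_SUBJECT = "http://purl.org/dc/terms/subject"
DC_TITLE = "http://purl.org/dc/terms/title"

MAIZE = "http://aims.fao.org/aos/agrovoc/c_12332"  # AGROVOC concept URI
DOC = "http://example.org/doc/1"                   # hypothetical document URI

# Source 1: a vocabulary that knows the label of the concept.
vocabulary = {(MAIZE, SKOS_PREF_LABEL, "maize")}

# Source 2: a catalogue that indexes a document with the concept URI.
catalogue = {
    (DOC, DC_SUBJECT, MAIZE),
    (DOC, DC_TITLE, "A report on maize"),
}


def subject_labels(graph, document):
    """Labels of all concepts the document is indexed with."""
    subjects = {o for s, p, o in graph if s == document and p == DC_SUBJECT}
    return sorted(o for s, p, o in graph
                  if s in subjects and p == SKOS_PREF_LABEL)


# Merging two triple sets is a plain set union; the shared URI joins them.
merged = vocabulary | catalogue
```

On the catalogue alone `subject_labels` finds nothing; on the merged set it returns the label contributed by the vocabulary.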
Role of thesauri/concept schemes
Born as tools to assure consistency in the indexing of library collections
Thesauri were based on “terms”, but terms already represented concepts implicitly
Hierarchical and associative relationships represented generic ontological domain knowledge
Candidate building blocks for the semantic web
...from thesaurus to ontologies...
AGROVOC today
Around 30,000 concepts
600,000 labels in around 20 languages
One-stop shop for terminological knowledge related to agriculture in general
A knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)
A concept/term/string-based system
Concepts may be organized in multiple categories.
Semantic Relationships
Concept to Concept
isA (hierarchy), isPestOf, hasPest
Concept to Term
has_lexicalization (links concepts to their lexical realizations)
Term to Term
isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation
Term to String
hasSpellingVariant, hasSingular
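The four relationship groups above can be read as a small typing scheme: each relationship connects a fixed pair of levels (concept, term, string). The sketch below encodes that reading; the relationship names are those listed on the slide, while the `Level` enum and the `is_valid` helper are assumptions made for illustration.

```python
from enum import Enum


class Level(Enum):
    CONCEPT = "concept"
    TERM = "term"
    STRING = "string"


# Each relationship maps to the (source level, target level) pair it connects.
RELATIONS = {
    "isA": (Level.CONCEPT, Level.CONCEPT),
    "isPestOf": (Level.CONCEPT, Level.CONCEPT),
    "hasPest": (Level.CONCEPT, Level.CONCEPT),
    "has_lexicalization": (Level.CONCEPT, Level.TERM),
    "isSynonymOf": (Level.TERM, Level.TERM),
    "isTranslationOf": (Level.TERM, Level.TERM),
    "hasAcronym": (Level.TERM, Level.TERM),
    "hasAbbreviation": (Level.TERM, Level.TERM),
    "hasSpellingVariant": (Level.TERM, Level.STRING),
    "hasSingular": (Level.TERM, Level.STRING),
}


def is_valid(relation, source, target):
    """True if the relationship is known and connects these two levels."""
    return RELATIONS.get(relation) == (source, target)
```

A check like `is_valid("isSynonymOf", Level.CONCEPT, Level.TERM)` is rejected, because synonymy is defined between terms, not between a concept and a term.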
AGROVOC conceptual model, in SKOS-XL
[Figure: concepts (numbered 12332, 8171, 1474 and 6211 in the diagram) are typed as SKOS Concept via rdf:type and linked to each other by skos:broader. Concepts point to SKOS Label resources through skosxl:prefLabel and skosxl:altLabel; the labels carry literal forms (“maize”, “corn”, “maïs” (fr)) and are related to each other by has_synonym and has_translation. Concepts are attached to the AGROVOC ConceptScheme by skos:inScheme and skos:topConceptOf; further schemes in FAO are attached by skos:inScheme.]
http://www.w3.org/2004/02/skos/
SKOS-XL output
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
URI of AGROVOC concept
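To show how such output might be consumed, here is a minimal sketch that reads a fragment like the one above with Python's standard XML parser. It assumes the fragment is wrapped in an `rdf:RDF` root element with namespace declarations (the slide shows only the inner descriptions), and it only handles this particular `rdf:Description` layout; a dedicated RDF parser would be the more robust choice. The helper name `resources_of_type` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"

# Assumed wrapper: an rdf:RDF root with namespace declarations.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
    <literalForm xmlns="{SKOSXL}" xml:lang="en">subjects</literalForm>
    <rdf:type rdf:resource="{SKOSXL}Label"/>
  </rdf:Description>
</rdf:RDF>"""


def resources_of_type(root, type_uri):
    """URIs of all described resources that carry the given rdf:type."""
    found = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        types = [t.get(f"{{{RDF}}}resource")
                 for t in desc.findall(f"{{{RDF}}}type")]
        if type_uri in types:
            found.append(desc.get(f"{{{RDF}}}about"))
    return found


root = ET.fromstring(DOC)
concepts = resources_of_type(root, SKOS + "Concept")

# Map each label resource URI to its literal form.
labels = {
    desc.get(f"{{{RDF}}}about"): form.text
    for desc in root.findall(f"{{{RDF}}}Description")
    for form in desc.findall(f"{{{SKOSXL}}}literalForm")
}
```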
The concept scheme workbench
Linking vocabularies
AGROVOC | EUROVOC | UNBIS | Relationship
http://aims.fao.org/aos/agrovoc/c_207 | http://eurovoc.europa.eu/219055 | agroforestry | skos:exactMatch / owl:sameAs
http://aims.fao.org/aos/agrovoc/c_4826 | http://eurovoc.europa.eu/220018 | MILK | skos:exactMatch / owl:sameAs
http://aims.fao.org/aos/agrovoc/c_12332 | http://eurovoc.europa.eu/219871 | MAIZE | skos:exactMatch / owl:sameAs
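The mappings on this slide can be held in a small lookup structure. The sketch below treats each AGROVOC/EUROVOC pair as a `skos:exactMatch` link and indexes it in both directions, since `skos:exactMatch` is a symmetric property; the URI pairs are taken from the slide, and the helper names `build_index` and `equivalents` are assumptions.

```python
from collections import defaultdict

# (AGROVOC URI, EUROVOC URI) pairs as listed on the slide.
MAPPINGS = [
    ("http://aims.fao.org/aos/agrovoc/c_207", "http://eurovoc.europa.eu/219055"),
    ("http://aims.fao.org/aos/agrovoc/c_4826", "http://eurovoc.europa.eu/220018"),
    ("http://aims.fao.org/aos/agrovoc/c_12332", "http://eurovoc.europa.eu/219871"),
]


def build_index(pairs):
    """Index every mapping in both directions (exactMatch is symmetric)."""
    index = defaultdict(set)
    for left, right in pairs:
        index[left].add(right)
        index[right].add(left)
    return index


def equivalents(index, uri):
    """Sorted URIs declared equivalent to the given one; empty if unmapped."""
    return sorted(index.get(uri, set()))


index = build_index(MAPPINGS)
maize_in_agrovoc = equivalents(index, "http://eurovoc.europa.eu/219871")
```

With such an index, a search that starts from a EUROVOC concept can be extended to records indexed with the matching AGROVOC concept, and the other way round.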
http://agris.fao.org/agris-search/search/display.do?f=2004/ZA/ZA04002.xml;ZA2004000049
http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
Linking data through common URIs
AGROVOC: http://aims.fao.org/aos/agrovoc/c_12332, skosxl:literalForm "Maize"
Eurovoc: http://eurovoc.europa.eu/219871, skosxl:literalForm "Maize"
UNBIS: skosxl:literalForm "Maize"
http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871 (owl:sameAs/exactMatch)
Documents shown in the figure:
http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
What are we doing with unstructured data?
• We have enormous amounts of unstructured material
• Most of the documents we produce are still semantically unstructured
• Human cataloguing and indexing is becoming ever rarer
• We need machines to do automatic semantic markup of text
• If machines are trained and based on concept schemes, they are able to do so
AgroTagger
• Identifies concepts in unstructured texts
• Uses AGROVOC as a controlled vocabulary
• Prototype under testing with excellent results (entire repository of ICARDA indexed)
• In future, will produce structured RDF files that can be used to link data, like “OpenCalais”
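The idea of concept identification against a controlled vocabulary can be sketched with a simple dictionary lookup. This is not AgroTagger's actual method, which these slides do not describe; it is a deliberately naive illustration. The concept URIs are the AGROVOC URIs used elsewhere in this talk, and treating "corn" as a synonym label of the maize concept is an assumption.

```python
import re

MAIZE = "http://aims.fao.org/aos/agrovoc/c_12332"
MILK = "http://aims.fao.org/aos/agrovoc/c_4826"
AGROFORESTRY = "http://aims.fao.org/aos/agrovoc/c_207"

# Lower-cased label -> concept URI. "corn" as a synonym is an assumption.
VOCABULARY = {
    "maize": MAIZE,
    "corn": MAIZE,
    "milk": MILK,
    "agroforestry": AGROFORESTRY,
}


def tag(text, vocabulary):
    """Count how often each vocabulary concept is mentioned in the text.

    Naive single-word matching: no stemming, no multi-word labels,
    no disambiguation.
    """
    counts = {}
    for token in re.findall(r"[a-z]+", text.lower()):
        uri = vocabulary.get(token)
        if uri is not None:
            counts[uri] = counts.get(uri, 0) + 1
    return counts


result = tag("Maize and milk production; corn yields rose.", VOCABULARY)
```

Because the output is a set of concept URIs rather than free-text keywords, it can be written out as RDF statements that link the document to the vocabulary.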
Live demo: semantic markup
http://viewer.opencalais.com/
http://agropedialabs.iitk.ac.in/Tagger/Agrotagger_text.php
Collaboration: some points about what we need to do and what we could do together
01: Open Archives + Linked Open Data
Our agencies have a wealth of important information
We should publish it as fast as possible as “Linked Open Data” and create links among our data
Metadata from databases and vocabularies can be published without major investment and with little delay.
Our data need to become reference points in the linked data environment.
02: Concept schemes!
A SKOS-XL model to transform complex multilingual thesauri into concept schemes and publish them as LOD
A cutting-edge workbench to enrich and maintain the concept schemes/vocabularies
Semantic interoperability! Mapping!
03: Semantic Technologies!
Development of production-level machine indexing to replace human indexing of agency publications
Adapting AgroTagger for UNBIS
Methodologies to adapt the system to any agency thesaurus and document corpus
Web services to access the semantic markup engines
Customization of search engines
Possible Steps
Working group with interested colleagues from different agencies
Discussion forum to elaborate a project proposal (can be hosted on aims.fao.org)
Workshop in spring to discuss and decide details
Thank You!
Trying out the workbench
A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/
Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/
Latest stable release, version 1.0 (read only, for visitors with view-only privileges): http://202.73.13.50:55481/agrovocv10i/
…and more: http://aims.fao.org