Metadata and semantic web

Metadata and semantic web. Arto Vitikka, Arctic Centre, University of Lapland, www.arcticcentre.org

Post on 12-Jan-2016

DESCRIPTION

Metadata and semantic web. Arto Vitikka, Arctic Centre, University of Lapland, www.arcticcentre.org. Contents: Metadata (introduction to metadata, metadata on scientific data, Open Archives Initiative, examples); Semantic web tools and technologies in Finland.

TRANSCRIPT

Page 1: Metadata and semantic web

WWW.ARCTICCENTRE.ORG

Metadata and semantic web

Arto Vitikka

Arctic Centre

University of Lapland

www.arcticcentre.org

Page 2: Metadata and semantic web

Contents

Metadata

• introduction to metadata

• metadata on scientific data

• Open Archives Initiative

• examples

Semantic web tools and technologies in Finland

• introduction to semantic web technology

• development work in Finland

• examples

Page 3: Metadata and semantic web

Introduction to metadata

Sources used here:

• Introduction to Metadata, Online Edition, version 3.0, Tony Gill, Anne J. Gilliland, Maureen Whalen, and Mary S. Woodley, edited by Murtha Baca. http://www.getty.edu/research/conducting_research/standards/intrometadata/

• Wikipedia

• Data about data

• Used in several domains: research, geographical information systems, libraries and social media (tags in Flickr, Del.icio.us)

Page 4: Metadata and semantic web

Primary Functions of Metadata

• Organization and description. A primary function of metadata is the description and ordering of original objects or items in a repository or collection, as well as of the information objects relating to the originals

• Creation, multiversioning and reuse of information objects. Multiple versions of the same object may be created for preservation, research, exhibit and dissemination. Administrative and descriptive metadata should be included by the creator or digitizer, especially if reuse is envisaged.

• Searching and retrieval. Good descriptive metadata is essential to users’ ability to find and retrieve relevant metadata and information objects.

• Validation. Metadata helps users ascertain the authoritativeness and trustworthiness of the information.

Page 5: Metadata and semantic web

Primary Functions of Metadata /2

• Utilization and preservation. Metadata on information objects related to user annotations, rights tracking, and version control may be created. Digital objects also need to be subject to a continuous preservation regime and undergo processes such as refreshing, migration, and integrity checking to ensure their continued availability and to document any changes that might have occurred to the information object during preservation processes.

• Disposition. Metadata is a key component in documenting the disposition (e.g., accessioning, deaccessioning) of original objects and items in a repository, as well as of the information objects relating to those originals.

Page 6: Metadata and semantic web

Benefits of structured metadata

The more highly structured an information object is, the more that structure can be exploited for searching, manipulation, and interrelating with other information objects and systems. Structured metadata then:

• certifies the authenticity and degree of completeness of the content;

• establishes and documents the context of the content;

• identifies and exploits the structural relationships that exist within and between information objects;

• provides a range of intellectual access points for an increasingly diverse range of users; and

• enables new services to be built by integrating and reusing existing information sources.

Page 7: Metadata and semantic web

The Open Archives Initiative, Protocol for Metadata Harvesting - OAI-PMH

• The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.

• The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. The essence of the open archives approach is to enable access to Web-accessible material through interoperable repositories for metadata sharing, publishing and archiving.

• The OAI-PMH gives data providers a simple technical option for making their metadata available to services, based on the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language).

Page 8: Metadata and semantic web

Definitions

• Data Provider: maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata.

• Service Provider: issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services. A Service Provider in this manner is "harvesting" the metadata exposed by Data Providers

• Harvesting: refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store.
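
The roles defined above can be illustrated with a minimal Python sketch of what a Service Provider does: build a `ListRecords` request URL and read Dublin Core titles out of the XML reply. The verb, the `metadataPrefix` and `from` arguments, and the two XML namespaces come from the OAI-PMH and Dublin Core specifications. The base URL, the sample response and the function names are hypothetical and only serve the illustration; no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the URL a Service Provider would request from a Data Provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical, heavily shortened response used instead of a live repository.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2010-03-03</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Snow depth measurements</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_records(xml_text):
    """Extract the OAI identifier and Dublin Core titles of each record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        identifier = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in rec.iter(f"{{{DC_NS}}}title")]
        records.append({"identifier": identifier, "titles": titles})
    return records


if __name__ == "__main__":
    print(build_list_records_url("https://repository.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also have to follow resumption tokens and handle error responses, which this sketch leaves out.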

Page 9: Metadata and semantic web

Services and applications

• The metadata that is harvested may be in any format that is agreed by a community. Dublin Core is specified to provide a basic level of interoperability.

• Thus, metadata from many sources can be gathered together in one database, and services can be provided based on this centrally harvested, or "aggregated" data.

• The link between this metadata and the related content is not defined by the OAI protocol.
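
As a rough illustration of the aggregation described above, the following Python sketch merges harvested records from several repositories into one in-memory store. The record layout (`identifier`, `datestamp`, `title`), the provider names and the rule of keeping the record with the newest datestamp are assumptions made for this example, not something the OAI protocol prescribes.

```python
def aggregate(harvests):
    """Merge harvest batches into one store keyed by record identifier.

    `harvests` is a list of (provider_name, records) pairs. When the same
    identifier appears more than once, the record with the latest datestamp
    wins. Datestamps are assumed to be ISO 8601 strings of equal granularity,
    so plain string comparison orders them correctly.
    """
    store = {}
    for provider, records in harvests:
        for rec in records:
            key = rec["identifier"]
            current = store.get(key)
            if current is None or rec["datestamp"] > current["datestamp"]:
                # Remember which repository the record was harvested from.
                store[key] = {**rec, "provider": provider}
    return store


if __name__ == "__main__":
    # Hypothetical harvest results from two repositories.
    batches = [
        ("repo-a", [{"identifier": "oai:a.example.org:1",
                     "datestamp": "2010-01-01", "title": "Draft title"}]),
        ("repo-a", [{"identifier": "oai:a.example.org:1",
                     "datestamp": "2010-03-03", "title": "Final title"}]),
        ("repo-b", [{"identifier": "oai:b.example.org:7",
                     "datestamp": "2009-05-05", "title": "Other record"}]),
    ]
    for key, rec in aggregate(batches).items():
        print(key, rec["provider"], rec["title"])
```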

Page 10: Metadata and semantic web

Services and applications / 2

• OAI-PMH does not provide search across this data; it simply makes it possible to bring the data together in one place. In order to provide services, the harvesting approach must be combined with other mechanisms.

• OAI-PMH is technically very simple, but building coherent services that meet user requirements remains complex.

• A number of software systems support the OAI-PMH: Fedora, GNU EPrints, Open Journal Systems, DSpace, DigiTool and MetaLib among others.
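
Since search has to come from a mechanism outside OAI-PMH, one possible such mechanism is a small inverted index built over the harvested records. The Python sketch below is only an illustration under assumed record fields (`identifier`, `title`, `description`); production services would normally rely on a dedicated search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(records):
    """Map each token to the set of record identifiers that contain it."""
    index = defaultdict(set)
    for rec in records:
        for field in ("title", "description"):
            for token in tokenize(rec.get(field, "")):
                index[token].add(rec["identifier"])
    return index


def search(index, query):
    """Return identifiers of records containing every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    # Hypothetical harvested records.
    harvested = [
        {"identifier": "oai:a.example.org:1",
         "title": "Snow depth measurements in Lapland",
         "description": "Daily snow depth"},
        {"identifier": "oai:b.example.org:7",
         "title": "Reindeer herding in Lapland"},
    ]
    idx = build_index(harvested)
    print(search(idx, "snow lapland"))
```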

Page 11: Metadata and semantic web

Open Archives in Finland

• Doria contains digital collections of Finnish universities and polytechnics.

• The University of Lapland is now starting to implement an open archives system, integrated into Doria.

• Work starts with master's theses and will later extend to the publications of the staff and the Lapland University Press.

• https://oa.doria.fi/

Page 12: Metadata and semantic web

Examples

• Map of OA repositories: http://maps.repository66.org/

• Registry of Open Access Repositories - http://roar.eprints.org/

• The aim of ROAR is to promote the development of open access by providing timely information about the growth and status of repositories throughout the world.

• Could an Arctic Open Archives application serve UArctic and the Arctic science community?

Page 13: Metadata and semantic web

More information

Sources used here and more information:

• Open Archives Forum - http://www.oaforum.org/tutorial/

• The Open Archives Initiative Protocol for Metadata Harvesting - http://www.openarchives.org/pmh/

Page 14: Metadata and semantic web

Metadata on research data

• Description of research data

• Answers the questions: who, what, where, when, how, and how to obtain the data

• International metadata standard:
– Directory Interchange Format (DIF)
– used in the Global Change Master Directory

• Required, Highly recommended and Recommended fields
– title, parameters, data center, summary, personnel, instrument, resolution, temporal and spatial coverage, etc.
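
To make the idea of required and recommended fields concrete, here is a small Python sketch that checks a research data description for missing fields. The field names are taken from the list on this slide, but which of them count as required and which as recommended is an assumption made for illustration; the actual split is defined in the DIF specification and is not reproduced here.

```python
# Assumed split for illustration only; consult the DIF specification
# for the real required / highly recommended / recommended levels.
REQUIRED = ("title", "parameters", "data_center", "summary")
RECOMMENDED = ("personnel", "instrument", "resolution",
               "temporal_coverage", "spatial_coverage")


def check_record(record):
    """Report which required and recommended fields are empty or missing."""
    missing_required = [f for f in REQUIRED if not record.get(f)]
    missing_recommended = [f for f in RECOMMENDED if not record.get(f)]
    return {
        "valid": not missing_required,
        "missing_required": missing_required,
        "missing_recommended": missing_recommended,
    }


if __name__ == "__main__":
    # Hypothetical data description.
    description = {
        "title": "Snow depth measurements",
        "parameters": ["snow depth"],
        "data_center": "Example Data Centre",
        "summary": "Daily snow depth at one station.",
        "temporal_coverage": "2008-2009",
    }
    print(check_record(description))
```

A check of this kind could run when a researcher submits a description, so that incomplete records are flagged before they reach a portal.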

Page 15: Metadata and semantic web

Data portals

• Global Change Master Directory (GCMD)
– maintained by NASA
– Earth science data sets and services relevant to global change
– more than 30 000 descriptions of data and services
– gcmd.nasa.gov

• Antarctic Master Directory
– part of GCMD
– about 6 400 data descriptions (3.3.2010)
– national Antarctic data portals

• IPY Metadata Portal
– part of GCMD
– 363 descriptions (3.3.2010)

Page 16: Metadata and semantic web

Benefits of metadata

• Facilitate access to data and maximise the use of data

• Avoid duplication of research and data collection

• Improve efficiency of scientific data management

• Facilitate new research through access to existing scientific data

• Improve cooperation and interoperability between disciplines

• Data may be valued more than the immediate publications it has generated

• Scientists cannot be expected to know how their data may be used in the future

Page 17: Metadata and semantic web

Semantic web

• The Semantic web - the Internet of meanings - is the next generation of the Internet.

• The idea of the semantic web is to make content understandable for machines by binding it to some formal and meaningful description.

• Enables user communities to put machine-understandable contents on the web which can be shared and processed both by automated tools and people.

• Integration and reuse of the information in new unforeseeable applications and domains is possible.

Page 18: Metadata and semantic web

Semantic web / 2

• Ontologies are the infrastructure of the semantic web.

• Ontologies serve to make metadata understandable by computers: they define the way descriptive terms are interrelated and used in a given domain of interest.

• The semantic web concept makes finding the correct data and information more effective, while also helping to ensure the validity of the information and enabling language independence.

• For example, does "Nokia" refer to the town, rubber boots, car tires, the Nokia company or a Nokia phone?

• Or an annotation on ‘Paris’ in a web page tells the computer explicitly that in this context the information is about the town of Paris, Texas, US.
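
The disambiguation examples above can be sketched with a handful of RDF-style triples held in plain Python tuples. The `rdfs:label` and `rdf:type` property URIs are the standard W3C ones; every `http://example.org/...` identifier, as well as the two helper functions, is hypothetical and stands in for whatever an actual ontology would define.

```python
EX = "http://example.org/id/"  # hypothetical namespace for this sketch
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# (subject, predicate, object) statements: one label, several meanings.
TRIPLES = [
    (EX + "nokia-town", RDFS_LABEL, "Nokia"),
    (EX + "nokia-town", RDF_TYPE, EX + "Town"),
    (EX + "nokia-company", RDFS_LABEL, "Nokia"),
    (EX + "nokia-company", RDF_TYPE, EX + "Company"),
    (EX + "paris-texas", RDFS_LABEL, "Paris"),
    (EX + "paris-texas", RDF_TYPE, EX + "Town"),
]


def resources_labelled(label):
    """All resources that carry the given human-readable label."""
    return sorted(s for s, p, o in TRIPLES if p == RDFS_LABEL and o == label)


def types_of(resource):
    """The classes a resource is declared to belong to."""
    return sorted(o for s, p, o in TRIPLES if s == resource and p == RDF_TYPE)


if __name__ == "__main__":
    # The string "Nokia" is ambiguous; the identifiers are not.
    for res in resources_labelled("Nokia"):
        print(res, types_of(res))
```

A page annotated with the identifier rather than the bare word leaves no doubt about which Nokia or which Paris is meant.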

Page 19: Metadata and semantic web

Semantic web / 3

• The development of the Semantic Web started about ten years ago.

• The European Commission has funded related research and development projects.

• The Semantic Computing Research Group at Aalto University has run Semantic Web technology development projects for several years.

• A variety of Semantic Web infrastructure services, such as the Finnish Ontology Library Service, and open source semantic tools for creating applications are available.

• We are now at a stage where the Semantic Web is moving from being a vision to becoming reality.

Page 20: Metadata and semantic web

Semantic web in Finland

Services

Finnish Ontology Library Service ONKI

http://www.yso.fi/

The ONKI service contains Finnish and international ontologies, vocabularies and thesauri needed for publishing your content cost-efficiently on the Semantic Web. Ontologies are conceptual models identifying the concepts of a domain. They contain machine "understandable" descriptions of the relations between the concepts.
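
One way to picture machine "understandable" relations between concepts is a broader/narrower hierarchy that a search application can use to expand a query. The Python sketch below uses a tiny made-up hierarchy; the concepts and the function name are illustrative assumptions and are not taken from ONKI or from any published ontology.

```python
# Hypothetical "concept -> broader concept" relations.
BROADER = {
    "reindeer": "deer",
    "deer": "mammals",
    "arctic fox": "foxes",
    "foxes": "mammals",
}


def narrower_closure(concept):
    """Return the concept together with all of its narrower concepts."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parent in BROADER.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


if __name__ == "__main__":
    # A search for "mammals" can also match content indexed with "reindeer".
    print(sorted(narrower_closure("mammals")))
```

With such relations available, content indexed with a specific term can still be found by a user who searches with a more general one.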

Page 21: Metadata and semantic web

Semantic web in Finland /2

• Finnish General Upper Ontology (YSO) with ca. 20 000 concepts

• Besides the general ontology there are several special ontologies.

• Ontologies have been created either from existing vocabularies or from scratch.

• The Finnish General Upper Ontology has been made available to users (ontology developers, content indexers, information search) by setting up an ontology server and providing applications for integrating the ontology into existing content management systems.

• http://www.seco.tkk.fi/ontologies/

Page 22: Metadata and semantic web

Page 23: Metadata and semantic web

Semantic web open source tools

Semantic Portal Building Tools

• Lightweight multifaceted search engine for RDF data

• Browser-based semantic annotation tool

• Tool for Creating Semantic View-Based Search and Browsing Portals

• Generic View-Based RDF Search Engine

• A tool for creating static web sites based on semantic content

Semantic Information Extraction

• A framework for automatic annotation

• Automatic Information Retrieval Ontologically

Page 24: Metadata and semantic web

Ontology services

• National Ontology Service ONKI

• Ontology repository

• Ontology server for publishing vocabularies

• Ontology Service for Geographical Data

• Ontology Service for Finding People and Organizations

• Ontology-based Annotation Assistant

Page 25: Metadata and semantic web

Applications

CultureSampo

• Semantic web portal and a publication channel for Finnish cultural heritage.

• Content comes from over 20 different Finnish museums, libraries, archives and other sources, as well as from the Getty Foundation and Wikipedia.

• The system aggregates cross-domain content of various kinds including artifacts, paintings, sculpture, drawings, abstract art, novels, comics, web pages, folklore and runes of different kinds, fictive persons and places, folk music, photos, persons, organizations, historical events, videos, buildings, and cultural sites.

Page 26: Metadata and semantic web

Applications / 2

TerveSuomi - HealthFinland - Demo

• Metadata, ontology, and service infrastructure - based on W3C semantic web recommendations, a domain-specific metadata schema (Dublin Core application), and a set of ontologies and services provided within the National Ontology Service.

• Semantic content creation process - for producing semantically annotated contents, based on the shared metadata model and ontologies.

• Semantic portal HealthFinland - material is published via a semantic portal that creates a single national entry-point for health information, health promotion and health-related news.
– The information is collected from a diverse group of sources including expert organizations, governmental institutes and NGOs.
– A quality control process

Page 27: Metadata and semantic web

Conclusions

• Metadata enables the creation of new intelligent web services

• Reuse and integration of information

• Standards and tools exist

• Open source tools are not necessarily free of cost

• Building services still requires a lot of work and good funding

• Tools to build services for the arctic communities

Page 28: Metadata and semantic web

Kiitos paljon!

Tack så mycket!

Thank You!