MESMUSES methodology Lessons learned and open issues… Alain Michard Florence, June 2003

MESMUSES methodology

Lessons learned and open issues…

Alain Michard

Florence, June 2003

MESMUSES broad vision

As for several other projects, the Semantic Web (SW) is all about semantic interoperability

Sharing machine-readable terminologies and classification schemes

Science and culture are collective and international

Semantic Web methodology should be highly relevant for managing and sharing scientific and cultural information

Some key S&T issues in the Project

Model: is RDFS / OWL-Lite adequate?

Schema authoring: method and tools needed!

Metadata: where does it come from?

Automatic indexing: experiments with a categorizer

The basic SW model

[Figure: the basic SW model, in three layers]

Schema: classes Dwelling, Person, Artefact and House, Artist, Artwork; properties Lives-in, Owner, Produces, Create

Surrogates: descriptions linked by Creates and Lives-in, for example this catalogue record:

Type: printed text, monograph
Author(s): Zola, Émile (1840-1902)
Title(s): L'assommoir [printed text] / by Emile Zola
Edition: 50th ed.
Publication: Paris: G. Charpentier, 1878
Physical description: 111-569 p.
Record no.: FRBNF35963044

Real-world entities
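The layers of the figure can be sketched as plain subject-predicate-object triples. This is a minimal illustration, not the project's implementation. The subclass links (House under Dwelling, Artist under Person, Artwork under Artefact) are assumed from the figure layout. The identifiers `zola` and `assommoir`, the property names `type`, `name`, `title` and `record`, and the choice to type the novel as an Artwork are hypothetical. Only the record values come from the catalogue notice.

```python
# Schema layer: class hierarchy (subclass links are an assumption based on the figure layout).
SUBCLASS_OF = {"House": "Dwelling", "Artist": "Person", "Artwork": "Artefact"}

# Surrogate layer: triples describing real-world entities (identifiers are hypothetical).
TRIPLES = [
    ("zola", "type", "Artist"),
    ("zola", "name", "Zola, Émile (1840-1902)"),
    ("assommoir", "type", "Artwork"),
    ("assommoir", "title", "L'assommoir"),
    ("assommoir", "record", "FRBNF35963044"),
    ("zola", "Create", "assommoir"),
]


def superclasses(cls):
    """Return cls followed by all of its ancestors in the schema."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def instances_of(cls):
    """Find resources typed with cls or with any of its subclasses."""
    return sorted(s for s, p, o in TRIPLES if p == "type" and cls in superclasses(o))


print(instances_of("Person"))    # ['zola']
print(instances_of("Artefact"))  # ['assommoir']
```

A query against a general class (Person) finds a surrogate typed with a more specific one (Artist), which is the kind of interoperability a shared schema is meant to provide.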

Model and Schema Language

Typed attributes are needed: XML-Schema types; derived types (e.g. Celsius temperature, Gregorian date); enumerated types, thesauri

Time-stamping

Cardinality constraints

Explicit transitivity of properties (e.g. geographic inclusion)
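To illustrate why explicit transitivity matters, here is a minimal sketch of what an application has to compute itself when the schema language cannot declare a property transitive. The `LOCATED_IN` facts and the function name are hypothetical sample material, not part of the project.

```python
# Direct geographic inclusion facts (hypothetical sample data).
LOCATED_IN = {
    ("Florence", "Tuscany"),
    ("Tuscany", "Italy"),
    ("Italy", "Europe"),
}


def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


closure = transitive_closure(LOCATED_IN)
print(("Florence", "Europe") in closure)  # True
```

If the property could be declared transitive in the schema, a query for resources located in Europe could return those indexed with Florence without this application-side step.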

Schema authoring issues (1)

Find the right level of abstraction: is "Glucid" a class or an instance? Or is it sometimes a class and sometimes an instance?

Avoid the "KR" (knowledge representation) attitude and practices! It's all about indexing resources with shared terminologies, not about representing human knowledge!


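[Figure: schema diagram; labels translated from French, originals in parentheses]

Classes: Process (Processus), Elementary process (Processus élémentaire), Complex process (Processus complexe), Structure, Cell (Cellule), Molecule (Molécule), Organism (Organisme), Apparatus (Appareil), Organ (Organe), Tissue (Tissus), System (Système), GTANS Major theme (Grande Thématique)

Properties: is-regulated-by (est-régulé-par), is-explained-by (est-expliquée-par), is-carried-out-by (est-réalisé-par), requires (nécessite), triggers (déclenche), is-documented-by (est-documentée-par, twice), consists-of (est-constitué-de, twice), consumes (consomme), transforms (transforme), produces (produit), involves (implique), eliminates (élimine)

Hierarchy links: ISA (three occurrences)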


Authoring tools are badly needed: graphical representation of the schema; zooming on sub-graphs (hierarchies); versioning

Consider using a UML authoring environment?

Established methodology and tutorials are needed

Creating Surrogates

Data extraction and fusion from structured sources: R-DB, XML-DB, LDAP

Updating: when? Should not create duplicates!

Detect cross-references: authority lists; thesauri; lexical distance???
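The duplicate-avoidance step could be sketched as follows: look a name up in an authority list first, and fall back to a lexical similarity measure only when the list has no entry. This is a minimal sketch under stated assumptions. The `AUTHORITY` table, the function names and the 0.85 threshold are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever lexical distance a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical authority list: canonical form -> known variant spellings.
AUTHORITY = {
    "Zola, Émile (1840-1902)": {"Emile Zola", "Zola, Emile", "Zola, Émile"},
}


def canonical(name):
    """Return the canonical form of name if the authority list knows it."""
    for canon, variants in AUTHORITY.items():
        if name == canon or name in variants:
            return canon
    return None


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]; 1.0 means identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_match(name, existing, threshold=0.85):
    """Return the existing surrogate label that name duplicates, or None."""
    canon = canonical(name)
    if canon is not None:
        for label in existing:
            if canonical(label) == canon:
                return label
    # Fallback: lexical similarity (the threshold is an arbitrary assumption).
    best = max(existing, key=lambda label: similarity(name, label), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return best
    return None


existing = ["Zola, Émile (1840-1902)", "Hugo, Victor"]
print(find_match("Emile Zola", existing))         # Zola, Émile (1840-1902)
print(find_match("Hugo, Victor.", existing))      # Hugo, Victor
print(find_match("Balzac, Honoré de", existing))  # None
```

The question marks after "lexical distance" are worth keeping in mind: a purely lexical fallback can merge distinct entities with similar names, so the authority list is consulted first.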

Automatic Categorization

Automatic indexing: by extracting metadata from resources; by automatic categorization

Define hierarchies of "concepts" inside the schema

Seeding with representative documents

Machine learning to create categorizers

Pros: enriched search functionality

Cons: hierarchies of categories are static; adding a category may change the categorizers of the others
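Seeding categories with representative documents can be illustrated with a toy nearest-centroid categorizer over bag-of-words vectors. This is only a sketch: the categorizer used in the project is not described here, and the seed texts, category names and function names below are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train(seeds):
    """Map each category to the summed term vector of its seed documents."""
    return {cat: sum((vectorize(d) for d in docs), Counter()) for cat, docs in seeds.items()}


def categorize(text, model):
    """Assign text to the category whose seed vector is most similar."""
    vec = vectorize(text)
    return max(model, key=lambda cat: cosine(vec, model[cat]))


# Hypothetical seed documents for two categories.
seeds = {
    "Biology": ["cell tissue organ organism", "molecule cell process"],
    "Art": ["artist artwork painting museum"],
}
doc = "molecule process reaction"
print(categorize(doc, train(seeds)))  # Biology

# Adding a category changes how existing material is classified.
seeds["Chemistry"] = ["molecule reaction process compound"]
print(categorize(doc, train(seeds)))  # Chemistry
```

In this toy model the vectors of the existing categories stay the same and only the decision shifts. With categorizers trained to discriminate each category against the others, the existing models would generally have to be retrained as well, which is the cost the slide points to.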

Bottom-line…

RDFS schema authoring may be more difficult than E-R modelling

Debates on syntactic features are irrelevant

Should be grounded on real-world implementations and testbeds

A new query language (e.g. RQL) is not a high priority

We have not addressed the "logical rules" layer

Semantic Web vs. Community Webs
