
Digitisation & Metadata Overview

Thierry Bouche

Cellule MathDoc & institut Fourier, Grenoble

MSRI workshop, April th , Berkeley

Outline

Generalities
  Definition
  Metadata in a digitisation project

The NUMDAM example
  File names and file systems
  Persistent URL
  Internal metadata
  Exposed metadata

Sharing metadata
  Motivations
  NUMDAM experiences
  NUMDAM OAI server
  A proposed DML infrastructure

Definition

What is metadata?

Descriptive material concerning your data.

Additional data that was not there in the first place.

Anything you’re interested in, that is not what you consider being ‘data’.

Is OCR (meta)data?

Metadata in a digitisation project

There are three sets of metadata:

Internal
  Administrative metadata.
  Production metadata generated at scan time.
  Archiving metadata for the long term.
  Rights metadata.

Exposed
  The subset of all the collected or added metadata that you expose to your users.
  This is what allows you to build the pieces of your user interface.
  Although ‘exposed’, some of it may be hidden from the user.

Exported
  The description of your collections that you’re eager to see used elsewhere (mostly because you expect better visibility, i.e. links pointing to your resources).
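
As a minimal sketch of the three sets described above, one can picture a single internal record from which the exposed and exported views are derived by selecting fields. Every field name and value below is hypothetical; the slides do not prescribe which fields belong to which set.

```python
# Hypothetical internal record: all names and values are illustrative only.
INTERNAL_RECORD = {
    "identifier": "EXAMPLE_ITEM",
    "title": "An example article",
    "first_page": "1",
    "scan_date": "2005-01-01",        # production metadata (made-up value)
    "archive_location": "shelf-42",   # archiving metadata (made-up value)
    "rights_holder": "Example Publisher",
}

# Assumed choice of which fields feed the user interface and which are
# handed to other services.
EXPOSED_FIELDS = ("identifier", "title", "first_page", "rights_holder")
EXPORTED_FIELDS = ("identifier", "title")


def select(record, fields):
    """Return a copy of the record restricted to the given field names."""
    return {name: record[name] for name in fields if name in record}


exposed = select(INTERNAL_RECORD, EXPOSED_FIELDS)
exported = select(INTERNAL_RECORD, EXPORTED_FIELDS)
```

Keeping one internal record and deriving the other two views from it means the exposed and exported descriptions cannot drift apart from what was actually collected.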

The NUMDAM example

File names and file systems

Persistent URLs

The NUMDAM DTD:
  Internal metadata
  Exposed metadata

File names and file systems

One highly important issue for archiving is a consistent naming and organising system for files.

If everything else is lost or unreadable, the file system should be as self-explanatory as conceivable.

Redundancy is not a problem.

It is preferable to dumb numbers whose meaning depends on external resources that may vanish.

NUMDAM: File names and file systems

The journal article model in NUMDAM is the following:

Journal identifier (acronym).

Year of publication.

Series.

Volume.

Issue (identifies the paper volume).

First page.

Article order if not alone on that page.

ASENS/ASENS____/ASENS______/

These are the directories on CD-ROM as well as at archive.numdam.org.
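
The following sketch shows how such a redundant, self-explanatory directory path could be assembled from the fields of the article model. It assumes the fields are joined with underscores in the order listed, which agrees with the number of underscores visible in the directory names above but is otherwise an assumption; the function names and the sample values are hypothetical and do not describe a real NUMDAM item.

```python
from pathlib import PurePosixPath


def volume_id(journal, year, series, volume, issue):
    """Join the volume-level fields; an empty field leaves an empty slot."""
    return "_".join([journal, year, series, volume, issue])


def article_id(journal, year, series, volume, issue, first_page, order):
    """Extend the volume identifier with the article-level fields."""
    prefix = volume_id(journal, year, series, volume, issue)
    return "_".join([prefix, first_page, order])


def article_directory(journal, year, series, volume, issue, first_page, order):
    """Build journal/volume/article, repeating the prefix at every level."""
    return (
        PurePosixPath(journal)
        / volume_id(journal, year, series, volume, issue)
        / article_id(journal, year, series, volume, issue, first_page, order)
    )


# Made-up values, with an empty issue field.
example = article_directory("ASENS", "1900", "3", "17", "", "5", "0")
print(example)
```

Because each level repeats the full prefix, a directory that gets separated from its parents still says which journal, volume and article it belongs to.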

NUMDAM: File system for archiving

ASENS/
  Volume directories.
  Global info for the journal.

ASENS/ASENS____/
  Article directories.
  Monochrome TIFF files for all the inner pages.
  Special TIFF files (grey or colour).
  Cover TIFF pages.
  XML metadata for the volume.

ASENS/ASENS____/ASENS______/
  Multipage TIFF file.
  XML OCR file.
  PDF and DjVu file.

NUMDAM: File system for posting

ASENS/
  Global info for the journal.

ASENS/ASENS____/
  Article directories.
  First cover JPEG page.

ASENS/ASENS____/ASENS______/
  User PDF and DjVu file.
  HTML abstract.

NUMDAM: Persistent URL

Each item has a unique identifier.

Each item has a persistent URL which is built as simply as possible from its identifier.

By prefixing, you get another universal identifier.

http://www.numdam.org/item?id=ASENS______
oai:numdam.org:ASENS______
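
The prefixing rule can be sketched in a few lines. The two prefixes are taken from the slide; the function names are hypothetical, and the sample identifier is the placeholder exactly as it appears above (its digits are missing in this transcript).

```python
from urllib.parse import quote

# Prefixes as shown on the slide.
URL_PREFIX = "http://www.numdam.org/item?id="
OAI_PREFIX = "oai:numdam.org:"


def persistent_url(item_id):
    """Build the item URL by prefixing the (percent-encoded) identifier."""
    return URL_PREFIX + quote(item_id, safe="")


def oai_identifier(item_id):
    """Build the OAI identifier by prefixing the bare identifier."""
    return OAI_PREFIX + item_id


def item_id_from_oai(oai_id):
    """Recover the bare identifier from an OAI identifier."""
    if not oai_id.startswith(OAI_PREFIX):
        raise ValueError("not an identifier with the expected prefix")
    return oai_id[len(OAI_PREFIX):]


placeholder = "ASENS______"
print(persistent_url(placeholder))
print(oai_identifier(placeholder))
```

Since both forms are pure functions of the identifier, nothing beyond the identifier itself has to be stored in order to link to an item or to harvest it.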

NUMDAM: Internal metadata

The NUMDAM DTD: Volume record

Identifier (volume level)

Journal
  ISSN
  Acronym

Series

Volume
  Number
  Year

Issue

Editor (of a specific issue)

Publisher (rights owner)

Pages (number)

Digitisation infos
  Physical identifier (unused)
  Creation date
  Special (anything special)
  Note (anything worth noticing)

The NUMDAM DTD: Special items

Type (title page, summary, index, ads, special, art, ...)

First page (arabic number, many prefixes to deal with printer’s idiosyncrasies...)

Last page

Note

Monopage files list

NUMDAM: Exposed metadata

The NUMDAM DTD: Articles

Identifier

First page

Last page

Order (in the volume)

Order (on the first page)

Author
  Last name
  First name
  Affiliation
  Email

Contributor (translator, redactor, ...)

Title (repeatable with a lang attribute)

Main language

Relations
  Is corrected by (and vice versa)
  Is completed by (and vice versa)

Note

Monopage files list

OCR file
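
To make the article record more concrete, here is a minimal sketch that serialises a few of the fields above as XML with Python's standard library. The element and attribute names are hypothetical, since the actual NUMDAM DTD is not reproduced in these slides; only the repeatable title with a lang attribute and the "is corrected by" relation follow the list above. All sample values are made up.

```python
import xml.etree.ElementTree as ET


def build_article(identifier, first_page, last_page, authors, titles,
                  language, corrected_by=()):
    """Build an article element; all tag names are illustrative assumptions."""
    article = ET.Element("article", {"id": identifier})
    ET.SubElement(article, "first-page").text = first_page
    ET.SubElement(article, "last-page").text = last_page
    for last_name, first_name in authors:
        author = ET.SubElement(article, "author")
        ET.SubElement(author, "last-name").text = last_name
        ET.SubElement(author, "first-name").text = first_name
    # The title is repeatable: one element per language.
    for lang, text in titles:
        ET.SubElement(article, "title", {"lang": lang}).text = text
    ET.SubElement(article, "language").text = language
    relations = ET.SubElement(article, "relations")
    for target in corrected_by:
        ET.SubElement(relations, "is-corrected-by", {"ref": target})
    return article


record = build_article(
    "EXAMPLE_1", "1", "10",
    authors=[("Doe", "Jane")],
    titles=[("fr", "Un titre"), ("en", "A title")],
    language="fr",
    corrected_by=["EXAMPLE_2"],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The "and vice versa" in the list would correspond to a matching inverse relation on the record of EXAMPLE_2, which this sketch does not generate.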

Page 50: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

NUMDAM: Exposed metadata

The NUMDAM DTD: ArticlesIdentifier

First page

Last page

Order (in the volume)

Order (on the first page)Author

Last nameFirst nameAffiliationEmail

Contributor (translator,redactor, . . . )

Title (repeatable with a langattribute)

Main languageRelations

Is corrected by(and vice versa)Is completed by(and vice versa)

Note

Monopage files list

OCR file

Page 51: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

NUMDAM: Exposed metadata

The NUMDAM DTD: ArticlesIdentifier

First page

Last page

Order (in the volume)

Order (on the first page)Author

Last nameFirst nameAffiliationEmail

Contributor (translator,redactor, . . . )

Title (repeatable with a langattribute)

Main languageRelations

Is corrected by(and vice versa)Is completed by(and vice versa)

Note

Monopage files list

OCR file

Page 52: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

NUMDAM: Exposed metadata

The NUMDAM DTD: ArticlesIdentifier

First page

Last page

Order (in the volume)

Order (on the first page)Author

Last nameFirst nameAffiliationEmail

Contributor (translator,redactor, . . . )

Title (repeatable with a langattribute)

Main languageRelations

Is corrected by(and vice versa)Is completed by(and vice versa)

Note

Monopage files list

OCR file

Page 53: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

NUMDAM: Exposed metadata

The NUMDAM DTD: ArticlesIdentifier

First page

Last page

Order (in the volume)

Order (on the first page)Author

Last nameFirst nameAffiliationEmail

Contributor (translator,redactor, . . . )

Title (repeatable with a langattribute)

Main languageRelations

Is corrected by(and vice versa)Is completed by(and vice versa)

Note

Monopage files list

OCR file

Page 54: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

NUMDAM: Exposed metadata

The NUMDAM DTD: ArticlesIdentifier

First page

Last page

Order (in the volume)

Order (on the first page)Author

Last nameFirst nameAffiliationEmail

Contributor (translator,redactor, . . . )

Title (repeatable with a langattribute)

Main languageRelations

Is corrected by(and vice versa)Is completed by(and vice versa)

Note

Monopage files list

OCR file

NUMDAM: Exposed metadata

The NUMDAM DTD: Article’s bibliographies

Bibliography
    bibitem
        Author (first name, last name)
        Title
        Year
        First page

[Continued]
Optional tags
    Journal
    Publisher
    Location
    Collection
    ...

Sharing metadata

There are three circles of metadata:

Complete: everything you know about your collections.
    Preliminary catalogue, captured at scan time, or added later.
    Reserved for insiders of the project.

Published: the subset that feeds your website.
    Typically under copyright and protected from batch downloads.
    Serves your users with specific features.

Public: the subset that you give away freely.
    Should be accurate enough to be useful.
    Should be simple enough to be usable at all.

The borderlines between those circles are political or economic choices. Not so easy to agree upon?

Motivations

You want every possible user to know that your collections are available.

You can’t force them to come to your site and use your interface.

Working mathematicians with a subscription use MathSciNet or Zentralblatt.

Historians use hand-made lists or the Jahrbuch.

Everybody relies upon rumour, fame or chance.

They’ll fall back on googling anyway.

Conjectures

The distance between you and your possible user might be infinite if you rely on your fame and he relies on his trusted sources of information.

It is however most probable that the distance amounts to one or two (a kind of Erdős distance).

Solutions

Give away your basic catalogue for free.
Harvesting and updating must be automated.
Make it as detailed as you can afford to.
Never forget the “forget” (XML/XSLT) functor!
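The “forget” functor can be read as a projection: start from the richest record you hold and drop everything the target audience should not or need not see. The slide names XML/XSLT as the tool; the sketch below shows the same idea with Python's standard library instead. The element names and the sample record are invented for illustration and do not come from the NUMDAM schemas.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical whitelist: the fields assumed to be safe to give away.
PUBLIC_FIELDS = {"identifier", "author", "title", "first_page", "last_page"}


def forget(record: ET.Element, keep: set) -> ET.Element:
    """Return a new element holding copies of the children listed in `keep`."""
    public = ET.Element(record.tag, dict(record.attrib))
    for child in record:
        if child.tag in keep:
            public.append(copy.deepcopy(child))
    return public


# A made-up "complete" record, including internal-only fields.
complete = ET.fromstring(
    "<article>"
    "<identifier>EXAMPLE_1</identifier>"
    "<author>Doe, Jane</author>"
    "<title>An example title</title>"
    "<first_page>1</first_page>"
    "<last_page>20</last_page>"
    "<scan_operator>internal note</scan_operator>"
    "<ocr_file>EXAMPLE_1.txt</ocr_file>"
    "</article>"
)

public = forget(complete, PUBLIC_FIELDS)
public_tags = [child.tag for child in public]
```

Because the public record is always derived from the complete one, regenerating it after each update is a mechanical step, which fits the requirement that harvesting and updating be automated.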

NUMDAM experiences

On top of our web server (EDBM), Claude Goutorbe has built an OAI server.

It was integrated by tens of harvesters overnight. Among them were internet search engines like Yahoo, MSN, Excite.

At Google’s request, we built a “cloaked” interface so that they could index the full OCRed text of articles.

Compare the results for “Théorie unitaire du champ physique” in Google and Yahoo!

Usage statistics have more than doubled in one month.

Searched expressions tend to show that there is not so much noise.

NUMDAM’s OAI server

Sits on top of our web server (EDBM), thus only provides a subset of our published metadata.

Conforms to OAI-PMH ..
Provides one set per journal, two metadata formats:
    Dublin Core (mandatory)
    minidml (experimental)

Available metadata formats:
http://www.numdam.org/oai?verb=ListMetadataFormats

List of available sets:
http://www.numdam.org/oai?verb=ListSets

Retrieves all the records from one set in the given metadata format:
http://www.numdam.org/oai?verb=ListRecords&metadataPrefix=minidml&set=aug
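The three requests above all follow the same pattern: a base URL, a `verb`, and the arguments that verb takes. A minimal Python sketch of a URL builder follows. The helper name `oai_url` is made up for this example; the base URL, verbs, and argument values are the ones shown on the slide. The sketch only builds the URLs and does not contact the server, whose current address and formats may differ.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.numdam.org/oai"  # base URL as given on the slide


def oai_url(verb: str, **params: str) -> str:
    """Build an OAI-PMH request URL: the verb first, then its arguments."""
    query = {"verb": verb, **params}
    return f"{BASE_URL}?{urlencode(query)}"


formats_url = oai_url("ListMetadataFormats")
sets_url = oai_url("ListSets")
# One journal set ("aug" on the slide) in the experimental minidml format.
records_url = oai_url("ListRecords", metadataPrefix="minidml", set="aug")
```

A harvester would loop over the set names returned by `ListSets` and call `ListRecords` once per set, which is how the "one set per journal" layout is meant to be used.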

The minidml schema

The ‘Article’ element in minidml:Author RFlat: Einstein, AlbertShould be complex

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 96: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title RLanguage attributeShould have explicit Contentencoding (TEX, HTML,XML+MathML, RAW. . . )

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 97: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract RLanguage attributeShould have explicit Contentencoding (TEX, HTML,XML+MathML, RAW. . . )

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 98: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords RLanguage attribute

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 99: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 100: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 101: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

CitationReference string as printed

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 102: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

PagesFlat: -Should have First, Last (withnumeration system?), Count

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 103: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

RightsShould be standardised!

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 104: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid RWith service attribute

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 105: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier RWith scheme attribute

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 106: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format RFull text’s mime types ?Could have Byte size and URL

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Page 107: Digitisation & Metadata Overviebouche/Slides/metadata... · Digitisation & Metadata Overview Thierry Bouche Cellule MathDoc & institut Fourier, Grenoble MSRI workshop, April th ,

Generalities The NUMDAM example Sharing metadata

The minidml schema

The ‘Article’ element in minidml:Author R

Title R

Abstract R

Keywords R

Main language

MSC []

Citation

Pages

Rights

Reviewid R

Identifier R

Format R

ISSN

Abbrev

Jtitle

Jhome

Series

Volume

Issue

Date

Publisher

Provider

Abbrev: journal abbreviation

Jtitle: journal title

Jhome: journal URL

These three journal fields should be merged.

Series: goes with Journal or Citation?

Date (= Year?): could be merged into Citation?
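
The remark that the journal fields should be merged can be illustrated with a small sketch. It assumes a record is held as a plain Python dict keyed by the minidml field names; the nested `Journal` key, the helper name `merge_journal_fields` and all sample values are hypothetical and not part of minidml.

```python
# Sketch only: the nested "Journal" key and the sample values are assumptions.
JOURNAL_FIELDS = ("Abbrev", "Jtitle", "Jhome")


def merge_journal_fields(article: dict) -> dict:
    """Return a copy of a flat record with the journal fields grouped together."""
    merged = {key: value for key, value in article.items()
              if key not in JOURNAL_FIELDS}
    journal = {key: article[key] for key in JOURNAL_FIELDS if key in article}
    if journal:
        merged["Journal"] = journal
    return merged


# Made-up record, for illustration only.
flat = {
    "Title": ["An example title"],
    "Abbrev": "Ex. J. Math.",
    "Jtitle": "Example Journal of Mathematics",
    "Jhome": "https://journal.example.org/",
    "Volume": "12",
}
nested = merge_journal_fields(flat)
print(nested["Journal"]["Jtitle"])
```

Grouping the three fields keeps everything that describes the journal in one place, so a harvester does not have to guess from element names which values belong together.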

A proposed DML infrastructure

Each digitisation centre should set up:

A unique identifier for each item.

An associated persistent URL.

Structured basic metadata served through OAI.

Then anyone can build an integrated access portal to whatever collection.
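
The three requirements can be sketched from a portal builder's point of view. The snippet below only builds URLs: one persistent URL per item identifier, and OAI-PMH requests using the protocol's standard `verb`, `identifier` and `metadataPrefix` arguments. Both base URLs, the identifier `ITEM_0001` and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoints, used for illustration only.
OAI_BASE_URL = "https://oai.example.org/oai"
PERSISTENT_BASE_URL = "https://dml.example.org/item"


def persistent_url(item_id: str) -> str:
    """Map a unique item identifier to its persistent URL."""
    return f"{PERSISTENT_BASE_URL}/{item_id}"


def oai_request_url(verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{OAI_BASE_URL}?{query}"


item_id = "ITEM_0001"  # made-up identifier
print(persistent_url(item_id))
# Harvest every record in the mandatory Dublin Core format.
print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
# Fetch a single record by its OAI identifier.
print(oai_request_url("GetRecord",
                      identifier=f"oai:example.org:{item_id}",
                      metadataPrefix="oai_dc"))
```

A portal that knows only the base URL of each centre can harvest the metadata with `ListRecords` and link every hit back to the provider through the persistent URL.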

Towards an xdml schema

General principles. Two main objectives:

Provide a basic catalogue so that people can add it to an existing database and set up links.

Make it possible to match an existing reference against your catalogue to decide whether it’s already there or not.

Thus: accuracy, simplicity and versatility.

Don’t rely on heuristics to parse your data. Make it clean.

Be explicit about every decision you made (text and math encodings, e.g.).

Prefer complex, well-structured fields over a flat structure with misleading element names.

But keep in mind that conversion from and to legacy formats (bibtex...) should be possible.
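
The second objective, matching an existing reference against a catalogue, becomes a plain lookup once the fields are explicit. The sketch below assumes each record already carries separate `journal`, `year`, `volume` and `first_page` fields; these field names, the choice of key and the sample records are illustrative assumptions, not part of any xdml draft.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class ReferenceKey(NamedTuple):
    journal: str
    year: int
    volume: str
    first_page: str


def reference_key(record: dict) -> ReferenceKey:
    """Build a comparison key from explicit, structured fields."""
    return ReferenceKey(
        journal=record["journal"].strip().casefold(),
        year=int(record["year"]),
        volume=str(record["volume"]).strip(),
        first_page=str(record["first_page"]).strip(),
    )


def build_index(catalogue: Iterable[dict]) -> Dict[ReferenceKey, str]:
    """Index catalogue records by their comparison key."""
    return {reference_key(record): record["id"] for record in catalogue}


def find_in_catalogue(index: Dict[ReferenceKey, str],
                      reference: dict) -> Optional[str]:
    """Return the catalogue identifier of a reference, or None if absent."""
    return index.get(reference_key(reference))


# Made-up catalogue and reference, for illustration only.
catalogue = [
    {"id": "ITEM_0001", "journal": "Example Journal of Mathematics",
     "year": 2004, "volume": "12", "first_page": "1"},
]
index = build_index(catalogue)
reference = {"journal": "example journal of mathematics ",
             "year": "2004", "volume": 12, "first_page": 1}
print(find_in_catalogue(index, reference))
```

No citation string is parsed anywhere: the match relies only on fields that were entered separately, which is the point of avoiding heuristics.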

An xdml schema

The basic item types should follow bibtex legacy (bold=mandatory):

Article (author, title, journal, year, volume, number, pages, month, note)

Book (author or editor, title, publisher, year, volume, series, address, edition, month, note)

Proceedings (title, year, editor, publisher, organization, address, month, note)

Inbook (author or editor, title, chapter, pages, publisher, year, volume, series, address, edition, month, note)

Inproceedings (author, title, booktitle, year, editor, pages, organization, publisher, address, month, note)

Thesis (author, title, school, year, address, month, note)

Unpublished (author, title, note, month, year)

Misc (author, title, howpublished, month, year, note)
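
Because the item types mirror bibtex, exporting a record back to that legacy format can stay almost mechanical. The sketch below assumes a record is a dict of field names to strings. The mandatory set for `article` is the one standard BibTeX uses, and applying it to xdml is an assumption; the entry key and sample values are made up.

```python
# Required fields as in standard BibTeX for @article; whether xdml would
# mandate the same set is an assumption.
REQUIRED_FIELDS = {"article": ("author", "title", "journal", "year")}


def to_bibtex(entry_type: str, key: str, fields: dict) -> str:
    """Serialise one record as a BibTeX entry, checking mandatory fields."""
    missing = [name for name in REQUIRED_FIELDS.get(entry_type, ())
               if not fields.get(name)]
    if missing:
        raise ValueError(f"missing mandatory fields: {', '.join(missing)}")
    lines = [f"@{entry_type}{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


# Made-up record, for illustration only.
entry = to_bibtex("article", "example2004", {
    "author": "Doe, Jane",
    "title": "An example title",
    "journal": "Example Journal of Mathematics",
    "year": "2004",
    "volume": "12",
    "pages": "1--20",
})
print(entry)
```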

A tentative xdml specification

A core set for every item:

Creator R (Given, Middle, Von, Last, Sort, String, URI, attr: author, editor, translator, redactor)

Title R (attr: lang, text-encoding, math-encoding)

Year

Main language

Abstract R (attr: lang, text-encoding, math-encoding)

Keywords R (attr: lang, text-encoding, math-encoding)

MSC R (attr: revision)

Review R (attr: service)

Identifier R (attr: scheme)

Format R (Mime-Type, URL, Byte-size)

Provider (Name, Location, URL, Policy)

Rights

Citation (depends on item type)
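
One possible XML rendering of part of this core set is sketched below with Python's standard `xml.etree.ElementTree`. The slides give no concrete syntax, so the element names, the lower-case spelling, the `type` attribute on the creator and every sample value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Attributes recording the encoding decisions of a text field
# (the attribute values are an assumed vocabulary).
TEXT_ATTRS = {"lang": "en", "text-encoding": "utf-8", "math-encoding": "tex"}

item = ET.Element("item", {"type": "article"})

# Creator with structured name parts plus explicit sort and display strings.
creator = ET.SubElement(item, "creator", {"type": "author"})
ET.SubElement(creator, "given").text = "Jane"
ET.SubElement(creator, "last").text = "Doe"
ET.SubElement(creator, "sort").text = "Doe, Jane"
ET.SubElement(creator, "string").text = "Jane Doe"

ET.SubElement(item, "title", TEXT_ATTRS).text = "An example title about $x^2$"
ET.SubElement(item, "year").text = "2004"
ET.SubElement(item, "msc", {"revision": "2000"}).text = "00A05"
ET.SubElement(item, "identifier", {"scheme": "local"}).text = "ITEM_0001"

# Format groups the MIME type, location and size of one full-text file.
fmt = ET.SubElement(item, "format")
ET.SubElement(fmt, "mime-type").text = "application/pdf"
ET.SubElement(fmt, "url").text = "https://dml.example.org/item/ITEM_0001.pdf"
ET.SubElement(fmt, "byte-size").text = "123456"

xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

Stating `text-encoding` and `math-encoding` on each text field means a consumer never has to guess how a title or abstract was encoded.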

An xdml schema

Example: Creator, type=author

Name string (Charles de La Vallée Poussin)

Given name (Charles)

Middle name ( )

Von name (de)

Last name (La Vallée Poussin)

Sort name (La Vallée Poussin, Charles de)

URL (serving as kind of identifier)?
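
The name parts above can be held in a small data structure, sketched here as a Python dataclass. The class, its attribute names and the two derivation rules are assumptions that happen to reproduce this example. The slides list the sort name and name string as fields of their own, so in practice they would be stored explicitly rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Creator:
    given: str
    last: str
    middle: str = ""
    von: str = ""
    role: str = "author"

    def name_string(self) -> str:
        """Display form: given, middle, von and last name in reading order."""
        parts = (self.given, self.middle, self.von, self.last)
        return " ".join(part for part in parts if part)

    def sort_name(self) -> str:
        """Sorting form: last name first, then the remaining parts."""
        rest = " ".join(part for part in (self.given, self.middle, self.von)
                        if part)
        return f"{self.last}, {rest}" if rest else self.last


creator = Creator(given="Charles", von="de", last="La Vallée Poussin")
print(creator.name_string())
print(creator.sort_name())
```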

Example: Citation, type=article

Citation string
Journal
  Title
  Series
  Abbreviation
  Identifier R (attr: issn, uri)
  Home
Volume
Issue
Pages
  Collation
  First (attr: sysnum, ...)
  Last (attr: sysnum, ...)
  Count

Author (String) + Title + Citation (String) = full printed reference.
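
The composition rule above can be sketched as a small Python function. The function name, the ", " separator and the closing full stop are assumptions, since the slide only states which three strings make up the reference. The sample values are invented placeholders.

```python
def printed_reference(author_string: str, title: str, citation_string: str) -> str:
    """Join the three display strings into one printed reference."""
    parts = [author_string.strip(), title.strip(), citation_string.strip()]
    # Assumed layout: non-empty parts separated by ", " and closed by ".".
    body = ", ".join(p for p in parts if p)
    return body if body.endswith(".") else body + "."


# Invented placeholder values, not a real reference.
reference = printed_reference(
    "A. Author",
    "A title",
    "J. Example 1 (1900), p. 1-10",
)
print(reference)  # A. Author, A title, J. Example 1 (1900), p. 1-10.
```

Because the strings are stored ready for display, a consumer can print a faithful reference without understanding the structured Journal, Volume, Issue and Pages elements.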

Example: Citation, type=book

Citation string
Publisher
  Name
  Location
  Home
Pages
  Collation
  Count
Collection
  Title
  Abbreviation
  Identifier R (attr: issn, uri)
  Home
Number

Author (String) + Title + Citation (String) = full printed reference.
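
As a rough illustration of how part of a book citation might be serialised, the sketch below builds an XML fragment with Python's standard xml.etree.ElementTree module. All element and attribute names (citation, string, publisher, name, location, pages, count) are hypothetical spellings of the fields listed above, not an xdml schema, and the sample values are invented.

```python
import xml.etree.ElementTree as ET


def book_citation(citation_string, publisher_name, publisher_location, page_count):
    """Build a <citation type="book"> element (hypothetical element names)."""
    citation = ET.Element("citation", {"type": "book"})
    ET.SubElement(citation, "string").text = citation_string
    publisher = ET.SubElement(citation, "publisher")
    ET.SubElement(publisher, "name").text = publisher_name
    ET.SubElement(publisher, "location").text = publisher_location
    pages = ET.SubElement(citation, "pages")
    ET.SubElement(pages, "count").text = str(page_count)
    return citation


# Invented placeholder values, not a real book.
element = book_citation("Example Press, Sometown, 1900", "Example Press", "Sometown", 250)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

The display string and the structured fields sit side by side in the same element, so a harvester can use whichever level of detail it understands.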
