data.bnf.fr
Françoise Leresche
Bibliothèque nationale de France
Bibliographic and Digital Information Department
Standards and Models Unit
Library Models and Standards and Their Availability in the Semantic Web Workshop
LIDA 2012
Exposing heterogeneous data more efficiently, using Semantic Web tools
Bibliothèque nationale de France
Millions of resources of any kind (books, scores, maps, still images, audiovisual materials…)
Published items / manuscripts and archival resources
Made available on the Web through digitization
Different logics
Published resources: bibliographic description (ISBD) and MARC format
Archival resources: archival description (tree structure) and XML-EAD format
Digitized resources: simple Dublin Core format
Catalogues
BnF Catalogue général: 15 million bibliographic records; legal deposit of French publications
BnF Archives et Manuscrits (BAM): tens of thousands of manuscripts and archival resources; mediaeval manuscripts, authors' archives
Gallica: 1.5 million digital objects; the largest digital library in the French-speaking world
Gallica content by type (chart; shares shown: 2%, 24%, 19%, 54%)
297,000 books; 816,000 newspaper and journal issues
364,000 images; 32,000 maps
13,000 manuscripts; 1,500 sound recordings
5,700 music scores
Resources
Often hidden in the "Deep Web," i.e.: buried in catalogues scattered among various addresses
Web practices
Fewer people come in through home pages; they access site pages directly
Keyword searches associated with the final content
e.g.: a book title
Navigating links on blogs, sites, networks
Organizing information
Gathering contents, links, and services
Pages for human reading
Data for machine reading
Some tools are at hand: authority records, description standards, perennial identifiers
The project
Data clustering: published resources, archival resources, digital resources
About authors, works, and topics
Providing HTML pages
Providing RDF structured, open, W3C standards compliant data
For what purposes?
To propose services: beyond warehouses, for any audience
To open data and make it free
To provide reliable links
The project
Links for authors, works, and topics
Links to books, manuscripts, images…
disseminated to any audience through user-friendly HTML pages
HTML pages
In order to:
provide content
create links to resources
lead users to services
HTML pages
Clustering on one page all the information relating to
an author, a work, a topic
from heterogeneous formats, using Semantic Web technologies
Semantic Web
Combining a traditional Web interface with a Semantic Web approach
Gathering information around a concept
In addition to resources, an address and meaningful machine-usable data
RDF chosen as resource description framework
Elaboration on data
Relying on authority records, and persistent identifiers
Period of reflection on standards
Defining a model
Selecting vocabularies
Data alignment and integration
How does it work?
BnF Catalogue général
Digitized resources
Human-friendly Web pages
BnF Archives et Manuscrits
Machine-friendly data
Clustering - Alignment
Functional principles
Sources: Catalogue général, BnF Archives et manuscrits, Gallica, Mandragore, virtual exhibitions
Other resources: Wikipedia, etc.
Extracting - Transforming
Clustering - Aligning
Publication - Control
Web interface / Linked Data
Page concepts leading to other libraries
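The clustering step in this workflow can be pictured with a minimal sketch. It assumes that every extracted record already carries the ARK of the authority record it is linked to. The record layout, the sample titles, and the function name `cluster_by_authority` are hypothetical illustrations, not the data.bnf.fr implementation.

```python
from collections import defaultdict

# Hypothetical, already extracted and transformed records.
# Titles are placeholders; only the ARK values come from this presentation.
RECORDS = [
    {"source": "Catalogue général", "authority": "ark:/12148/cb118905823", "title": "sample catalogue record"},
    {"source": "Gallica", "authority": "ark:/12148/cb118905823", "title": "sample digitized document"},
    {"source": "BnF Archives et Manuscrits", "authority": "ark:/12148/cb118905823", "title": "sample archival resource"},
    {"source": "Gallica", "authority": "ark:/12148/cb11992684q", "title": "another digitized document"},
]


def cluster_by_authority(records):
    """Group records from heterogeneous sources under one key per authority record."""
    pages = defaultdict(lambda: defaultdict(list))
    for record in records:
        pages[record["authority"]][record["source"]].append(record["title"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {authority: dict(by_source) for authority, by_source in pages.items()}


pages = cluster_by_authority(RECORDS)
for authority, by_source in pages.items():
    print(authority, sorted(by_source))
```

Each key of `pages` stands for one future page; the persistent identifier of the authority record is what makes the grouping possible across formats.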
Authority records
ID card for an entity (person, corporate body, work…)
the preferred form
variant forms (references)
notes
sources
links to other records
an identifier (at the BnF, ARK identifier)
Authority record for a work associated with an author
Preferred access point: title of work
Link to bibliographic records
Link to the authority record for the creator of the work
Links to other works (whole/part relationships)
Persistent identifier
Persistent identifiers
Persistent identifier: to recognize a resource uniquely over the long term
Stable reference: naming, finding, citing
Two objectives: access, preservation
Persistent identifiers: ARK
http://catalogue.bnf.fr/ark:/12148/cb11992684q/UNIMARC
Name Mapping Authority Host (NMAH): catalogue.bnf.fr
Label: ark:/
Name Assigning Authority Number (NAAN): 12148
Name (assigned by the Name Assigning Authority): cb11992684q
Qualifier (interpreted by the Name Mapping Authority): UNIMARC
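As an illustration of this anatomy, here is a minimal sketch that splits an ARK URL into its components using only the Python standard library. The `Ark` type and the `parse_ark` function are hypothetical names introduced for this example; the sketch handles only the simple shape shown on the slide, not every form an ARK can take.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class Ark(NamedTuple):
    nmah: str                 # Name Mapping Authority Host
    naan: str                 # Name Assigning Authority Number
    name: str                 # name assigned by the Name Assigning Authority
    qualifier: Optional[str]  # interpreted by the Name Mapping Authority


def parse_ark(url: str) -> Ark:
    """Split a URL of the form http://host/ark:/NAAN/Name[/Qualifier]."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    label = "ark:/"
    if not path.startswith(label):
        raise ValueError("no ARK label found in: " + url)
    naan, _, rest = path[len(label):].partition("/")
    name, _, qualifier = rest.partition("/")
    if not naan or not name:
        raise ValueError("incomplete ARK in: " + url)
    return Ark(parts.netloc, naan, name, qualifier or None)


ark = parse_ark("http://catalogue.bnf.fr/ark:/12148/cb11992684q/UNIMARC")
print(ark)
```

Because the host is kept separate from the NAAN and the name, the same identifier can be recognized when it is served from a different address, which is the point of separating naming from mapping.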
FRBR model
Work: creation (at the top level of abstraction)
e.g.: Tales of the Grotesque and Arabesque, E. A. Poe
Expression: version under which a Work is available (type of content, language, contribution)
e.g.: French translation by Baudelaire
Manifestation: publication (production of the book: publisher, printer)
e.g.: Ed. Paris : M. Lévy frères, 1875. xxxi-330 p.
Item: physical copy
e.g.: digitized copy ark:/12148/bpt6k2029183
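The four levels can be sketched as a chain of small data classes, each pointing to the level above it. The class and field names below are hypothetical and chosen for this illustration only; the values are the examples given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str
    creator: str


@dataclass(frozen=True)
class Expression:
    work: Work
    description: str


@dataclass(frozen=True)
class Manifestation:
    expression: Expression
    publication: str


@dataclass(frozen=True)
class Item:
    manifestation: Manifestation
    identifier: str


work = Work("Tales of the Grotesque and Arabesque", "E. A. Poe")
expression = Expression(work, "French translation by Baudelaire")
manifestation = Manifestation(expression, "Paris : M. Lévy frères, 1875. xxxi-330 p.")
item = Item(manifestation, "ark:/12148/bpt6k2029183")


def path_to_work(copy: Item) -> list:
    """Walk from a physical or digitized copy up to the abstract Work."""
    m = copy.manifestation
    e = m.expression
    return [copy.identifier, m.publication, e.description, e.work.title]


print(path_to_work(item))
```

Starting from any copy, the chain leads back to a single Work, which is what allows the various versions of that Work to be displayed together.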
Why FRBR?
Organized display
of the various versions of a Work
of the various productions of a Person
Better matches to Web users' various queries
Why FRBR?
To interrelate persons and works at the right level of contribution
Creator of a work: author, composer…
Contributor to an edition: translator, author of a foreword…
Producer of a publication: publisher, printer, distributor
Name associated with a single exemplar: owner, annotator…
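One way to picture this is a lookup from role to FRBR level. The table below is a hypothetical sketch built from the roles listed above; placing "contributor to an edition" at the Expression level is an assumption, based on the FRBR example where a translation is an Expression.

```python
# Role-to-level table; an illustrative assumption, not a BnF data structure.
ROLE_LEVELS = {
    "author": "Work",
    "composer": "Work",
    "translator": "Expression",
    "author of a foreword": "Expression",
    "publisher": "Manifestation",
    "printer": "Manifestation",
    "distributor": "Manifestation",
    "owner": "Item",
    "annotator": "Item",
}


def level_for_role(role):
    """Return the FRBR level a role attaches to, or None if the role is unknown."""
    return ROLE_LEVELS.get(role.strip().lower())


def group_by_level(contributions):
    """Group (name, role) pairs by the FRBR level of the role."""
    grouped = {}
    for name, role in contributions:
        grouped.setdefault(level_for_role(role), []).append(name)
    return grouped


contributions = [("Poe", "author"), ("Baudelaire", "translator"), ("M. Lévy frères", "publisher")]
print(group_by_level(contributions))
```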
Vocabularies
No "records," only data
for concepts: SKOS
for authors: FOAF
for works: RDA
implementation of the FRBR model
http://www.rdatoolkit.org/backgroundfiles/RelationshipsOverview_10_9_09.pdf
definition of the roles associated with the various types of contribution
[Diagram: BnF data sources (BnF Authorities, virtual exhibitions) and alignments with external data. Credit: Romain Wenz, BnF-IBN]
Complex ontology
"Author" page
"Work" page
RDF data
http://data.bnf.fr/11890582/charles_baudelaire/rdf.xml
http://data.bnf.fr/11947965/charles_baudelaire_les_fleurs_du_mal/rdf.xml
RDF data
<rdf:Description rdf:about="http://data.bnf.fr/ark:/12148/cb118905823">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept" />
  <rdfs:seeAlso rdf:resource="http://catalogue.bnf.fr/ark:/12148/cb118905823" />
  <foaf:focus rdf:resource="http://data.bnf.fr/ark:/12148/cb118905823#foaf:Person" />
  <skos:editorialNote>DBF. - GDEL. - Laffont Bompiani, Auteurs</skos:editorialNote>
  <skos:editorialNote>Baudelaire / Isabelle Viéville-Degeorges, impr. 2011</skos:editorialNote>
  <skos:editorialNote>Les Fleurs du mal / Charles Baudelaire, 1869</skos:editorialNote>
  <skos:editorialNote>BN Cat. gén.. - BN Cat. gén. 1960-1969. - BN Cat. gén. suppl.</skos:editorialNote>
  <owl:sameAs rdf:resource="http://viaf.org/viaf/17218730" />
  <skos:prefLabel>Charles Baudelaire (1821-1867)</skos:prefLabel>
  <skos:altLabel>Baudelaire- Dufays (1821-1867)</skos:altLabel>
  <skos:altLabel>Baudelaire-Dufays (1821-1867)</skos:altLabel>
  <skos:altLabel>Pierre Dufaÿs (1821-1867)</skos:altLabel>
  <skos:altLabel>Baudelaire (1821-1867)</skos:altLabel>
  <skos:altLabel>Charles Pierre Baudelaire (1821-1867)</skos:altLabel>
</rdf:Description>
Authority records are regarded as concepts: SKOS
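To show how such a SKOS description can be read by a machine, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The excerpt on the slide has no root element, so the `rdf:RDF` wrapper and its namespace declarations are added here as an assumption, and only a few of the properties are kept. The function name `read_concept` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF = "{" + NS["rdf"] + "}"

# Shortened excerpt; the rdf:RDF wrapper is an assumption added for well-formedness.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="http://data.bnf.fr/ark:/12148/cb118905823">
    <owl:sameAs rdf:resource="http://viaf.org/viaf/17218730" />
    <skos:prefLabel>Charles Baudelaire (1821-1867)</skos:prefLabel>
    <skos:altLabel>Baudelaire-Dufays (1821-1867)</skos:altLabel>
    <skos:altLabel>Pierre Dufaÿs (1821-1867)</skos:altLabel>
  </rdf:Description>
</rdf:RDF>"""


def read_concept(xml_text):
    """Extract the identifier, labels and alignments of the first description."""
    root = ET.fromstring(xml_text)
    desc = root.find("rdf:Description", NS)
    return {
        "uri": desc.get(RDF + "about"),
        "pref_label": desc.findtext("skos:prefLabel", namespaces=NS),
        "alt_labels": [e.text for e in desc.findall("skos:altLabel", NS)],
        "same_as": [e.get(RDF + "resource") for e in desc.findall("owl:sameAs", NS)],
    }


concept = read_concept(RDF_XML)
print(concept["pref_label"], concept["same_as"])
```

A dedicated RDF library would be the usual choice for real data; plain XML parsing is used here only to keep the sketch free of external dependencies.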
RDF data
Identifying the person: FOAF
<rdf:Description rdf:about="http://data.bnf.fr/ark:/12148/cb118905823#foaf:Person">
  <foaf:name>Charles Baudelaire</foaf:name>
  <rdagroup2elements:fieldOfActivityOfThePerson>Littératures</rdagroup2elements:fieldOfActivityOfThePerson>
  <rdagroup2elements:fieldOfActivityOfThePerson rdf:resource="http://dewey.info/class/800/" />
  <foaf:familyName>Baudelaire</foaf:familyName>
  <foaf:givenName>Charles</foaf:givenName>
  <rdagroup2elements:placeOfDeath>Paris</rdagroup2elements:placeOfDeath>
  <dc:contributor rdf:resource="http://data.bnf.fr/ark:/12148/cb13911358q#frbr:Work" />
  <dc:contributor rdf:resource="http://data.bnf.fr/ark:/12148/cb157080678#frbr:Work" />
  <dc:contributor rdf:resource="http://data.bnf.fr/ark:/12148/cb16527546n#frbr:Work" />
  <owl:sameAs rdf:resource="http://dbpedia.org/resource/Charles_Baudelaire" />
  <rdagroup2elements:languageOfThePerson rdf:resource="http://id.loc.gov/vocabulary/iso639-2/fre" />
  <foaf:birthday>04-09</foaf:birthday>
  <rdagroup2elements:placeOfBirth>Paris</rdagroup2elements:placeOfBirth>
  <foaf:depiction rdf:resource="http://upload.wikimedia.org/wikipedia/commons/thumb/5/52/Baudelaire_crop.jpg/200px-Baudelaire_crop.jpg" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k96352b.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/btv1b6903323t.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k61826b.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k118051q.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k629005.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k8843k.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k5839355j.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k22909f.thumbnail" />
  <foaf:depiction rdf:resource="http://gallica.bnf.fr/ark:/12148/bpt6k1092623.thumbnail" />
  <rdagroup2elements:dateOfDeath>31-08-1867</rdagroup2elements:dateOfDeath>
  <rdagroup2elements:dateOfBirth>09-04-1821</rdagroup2elements:dateOfBirth>
  <dc:title xml:lang="fr">Charles Baudelaire</dc:title>
  <foaf:page rdf:resource="http://data.bnf.fr/11890582/charles_baudelaire/" />
  <rdagroup2elements:biographicalInformation>Poète. - Critique d'art. - Traducteur d'Edgar Allan Poe. - Signa ses premières oeuvres de son patronyme et du nom de sa mère Caroline Archenbaut-Dufays</rdagroup2elements:biographicalInformation>
  <foaf:gender>male</foaf:gender>
  <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person" />
  <dc:date>1821-1867</dc:date>
  <xfoaf:nationality rdf:resource="http://id.loc.gov/vocabulary/countries/fr" />
</rdf:Description>
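The date literals in this description use two different layouts, which matters to anyone reusing the data. The sketch below assumes that the RDA dates are written day-month-year (the value 31-08-1867 can only be read that way) and that `foaf:birthday` is month-day. The helper names `parse_rda_date` and `age_at` are hypothetical.

```python
from datetime import date, datetime


def parse_rda_date(value: str) -> date:
    """Parse a date written as DD-MM-YYYY (assumed layout of the RDA literals)."""
    return datetime.strptime(value, "%d-%m-%Y").date()


def age_at(birth: date, on: date) -> int:
    """Number of full years elapsed between birth and the given day."""
    before_birthday = (on.month, on.day) < (birth.month, birth.day)
    return on.year - birth.year - int(before_birthday)


birth = parse_rda_date("09-04-1821")
death = parse_rda_date("31-08-1867")

# foaf:birthday carries the same day as month-day, without the year.
foaf_birthday = f"{birth.month:02d}-{birth.day:02d}"
print(foaf_birthday, age_at(birth, death))
```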
RDF data
Baudelaire as translator [contributor]
Poe, Edgar Allan. Histoires grotesques et sérieuses
<rdf:Description rdf:about="http://catalogue.bnf.fr/ark:/12148/cb32536208w#frbr:Expression">
  <dc:contributor rdf:resource="http://data.bnf.fr/ark:/12148/cb118905823#foaf:Person" />
  <marcrel:trl rdf:resource="http://data.bnf.fr/ark:/12148/cb118905823#foaf:Person" />
</rdf:Description>
Baudelaire as author [creator]
Baudelaire, Charles. Le spleen de Paris
<rdf:Description rdf:about="http://catalogue.bnf.fr/ark:/12148/cb120434510#frbr:Work">
  <dc:creator rdf:resource="http://data.bnf.fr/ark:/12148/cb118905823#foaf:Person" />
</rdf:Description>
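These two descriptions can be read as subject-predicate-object statements. The sketch below writes them as plain tuples and groups the predicates by the FRBR level named in the subject's fragment. The tuple representation and the function `roles_by_level` are hypothetical simplifications; prefixed names are kept as strings rather than expanded to full URIs.

```python
PERSON = "http://data.bnf.fr/ark:/12148/cb118905823#foaf:Person"

# The three statements from the RDF excerpts, as (subject, predicate, object).
TRIPLES = [
    ("http://catalogue.bnf.fr/ark:/12148/cb32536208w#frbr:Expression", "dc:contributor", PERSON),
    ("http://catalogue.bnf.fr/ark:/12148/cb32536208w#frbr:Expression", "marcrel:trl", PERSON),
    ("http://catalogue.bnf.fr/ark:/12148/cb120434510#frbr:Work", "dc:creator", PERSON),
]


def roles_by_level(triples, person):
    """Collect, per FRBR level, the predicates that link a resource to the person."""
    result = {}
    for subject, predicate, obj in triples:
        if obj != person:
            continue
        # The level is taken from the '#frbr:...' fragment of the subject URI;
        # a subject without that fragment is used as its own key.
        level = subject.rsplit("#frbr:", 1)[-1]
        result.setdefault(level, set()).add(predicate)
    return result


roles = roles_by_level(TRIPLES, PERSON)
print(roles)
```

The same person thus appears as translator at the Expression level and as creator at the Work level, without any ambiguity between the two roles.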
Benefits
Visibility
Being on users' path
Finding resources without knowing of them
Knowledge enhancement
Simple pages and large audience
Creating links
Enabling data reuse
Benefits
For the BnF
Experimentation: FRBRizing catalogue records
Future developments for the catalogue
New mode of disseminating bibliographic information
Showing up in the Linked Data world
Presence in the Linked Data cloud
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.html
Thank you!