From MARC-XML to JSON-LD
A New Invenio Data Model for Bibliographic Objects and Beyond

Johnny Mariéthoz
HEG-Genève, 2016/08/04
RERO, Réseau des bibliothèques de Suisse occidentale

RERO DOC

RERO DOC: the RERO Invenio Instance
- RERO digital library
- started in 2004
- 35000 documents
- 215000 print media issues
- 44 institutions
- content: heritage and scholarly documents
- based on Invenio 1.x with patches

RERO Customizations
- Web design
- 1st page design with content information
- purpose of Elasticsearch
- hierarchical facets navigation
- search results highlighting
- press page
- multilingual full text search
- HTML templates introduction
- document viewer Multivio - an e-lib.ch project (http://www.multivio.org)
- visitor statistics page
- 1st JSON-LD/schema.org version
Thanks to the Invenio team.

RERO DOC before 2013

RERO DOC Home Page

RERO DOC Search Results Page

RERO DOC Digitized Press Page and Multivio

RERO DOC Visitor Statistics Page

RERO DOC: New Challenges
- software maintenance (over Invenio versions)
- new submission interface with data import capabilities
- new services: REST API
- Invenio 3
- Linked Open Data at RERO (http://data.rero.ch)
  - at the center of our future data model
  - RERO DOC as a proof of concept
  - focus on internal and external data linking via a large use of identifiers (ORCID, etc.)
  - authority records
  - to be applied also to the Union Catalog
=> Invenio 3 with a new data model!

New Software, New Data Model?
Why Not MARC?

Is MARC Too Old?
"By 1971, MARC formats had become the national standard for dissemination of bibliographic data in the United States" (Wikipedia)
1972  C programming language is released (http://computerhistory.org)
1989  Python project is started (Wikipedia)
1989  Berners-Lee, Tim. "Information Management: A Proposal" (Wikipedia)
1990  HTML, URL, HTTP (Wikipedia)
1994  HTML 1.0 (Wikipedia)
1994  Netscape 1.0 (Wikipedia)
1998  Google (Wikipedia)
2000  Python 2.0 is released (Wikipedia)
2002  First CDSWare release (Wikipedia)
2002  http://json.org started (Wikipedia)
2005, 2006  JSON is used by Yahoo and Google (Wikipedia)
2006  First Invenio release (Wikipedia)
2014  RFC 7159 became the main reference for JSON's Internet uses (Wikipedia)

What Has Changed at the Data Level?
- THE WEB
- data handling has largely improved with new programming languages
- Web 2.0 applications (client-server, services, etc.) with a lot of interactions
- emergence of Linked Open Data: everyone wants to connect to your data
- more and more exchange formats, driven by Zotero, OAI-PMH, social networks, search engines, etc.
- developers spend their time converting the data

MARC Was Designed for the Machines of the 70s!
What About Modern Machines?

Object Oriented Data Model
Person
  + id: int
  + first name: string
  + last name: string
BibRecord
  + id: int
  + title: string
  + authors: list
Relations: BookRecord is a BibRecord; Author is a Person; BibRecord has a Author

MARC Format
Field values of the example record:
  1234
  From MARC to JSON
  Avram, Henriette
  Crockford, Douglas

Computer Data Structures

Base Types (value)
    title = "From MARC to JSON"
    _id = 1234
    value = 12.3

List (array)
    authors = ["Henriette Avram", "Douglas Crockford"]

Dictionary (object)
    author = {"lastname": "Avram", "firstname": "Henriette"}

All Together / Output Format
    {
      "id": 1234,
      "title": "From MARC to JSON",
      "authors": [
        {"lastname": "Avram", "firstname": "Henriette"},
        {"lastname": "Crockford", "firstname": "Douglas"}
      ]
    }

The JSON Format

Interesting Features
- simple: value, array, object
- easy to
  - read and write
  - share between client and server (Python, JavaScript)
  - share (REST API)
  - work with existing libraries (Elasticsearch, PostgreSQL)
- can represent any kind of object (comments, notes, tags, libraries, collections, etc.)
- supported by many programming languages
- human readable (debug, understand)
- widely used on the Web
- (too?) flexible

Missing Features
- standard naming (creators, authors, etc.)
- data validation
- clear format description: human- and machine-readable
=> JSON Schema

JSON Schema

The Concept
- Data (JSON) + Schema (JSON) -> Validation: ingestion, quality control
- Schema (JSON) + Editor Config (JSON) -> Editor: Schema Form (JavaScript) -> Web editor with validation

JSON Schema Advantages
- describes your existing data format
- clear, human- and machine-readable documentation
- complete structural validation, useful for
  - automated testing
  - validating client-submitted data

Example

Person Schema
    {
      "$schema": "http://json-schema.org/schema#",
      "id": "/schemas/person-v1.0.0.json",
      "title": "Person",
      "description": "A Physical Person",
      "type": "object",
      "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {
          "description": "Age in years",
          "type": "integer",
          "minimum": 18
        }
      },
      "required": ["firstName", "lastName"]
    }

Valid Person Data
    {"firstName": "Henriette", "lastName": "Avram", "age": 55}

Invalid Person Data (the required firstName is missing and age is below the minimum of 18)
    {"lastName": "Avram", "age": 10}

JSON Editor - Angular Form Editor

Person Schema: identical to the Person Schema of the previous example.

Editor Configuration
    [
      {"key": "firstName", "placeholder": "please enter..."},
      {"key": "lastName", "placeholder": "please enter..."},
      "age",
      {"key": "comment", "type": "textarea", "placeholder": "Make a comment"},
      {"type": "submit", "style": "btn-info", "title": "Submit"}
    ]

Live examples (Editor, Form Data):
- http://schemaform.io/examples/bootstrap-example.html#/44e5c966452dac5f5e19
- http://schemaform.io/examples/bootstrap-example.html#/64552997a5e41ac53322
- http://schemaform.io/examples/bootstrap-example.html#/5a32e594e552fa45ec14

Exporting your Data: JSON-LD

The Concept
Local JSON + @context (mapping) -> JSON-LD -> RDF (RDFa, RDF/XML, N3, Turtle)

Your Data in the LOD Cloud
Invenio Instance -> JSON-LD -> LOD Cloud

JSON Editor

Book
    {
      "recid": "1234",
      "title": "From Marc to JSON",
      "authors": [
        {"name": "Crockford, Douglas 1955-"},
        {"uri": "http://viaf.org/viaf/18236820"}
      ]
    }

@context
    "@context": {
      "dc": "http://purl.org/dc/elements/1.1/",
      "dct": "http://purl.org/dc/terms/",
      "@base": "http://doc.rero.ch/record/",
      "recid": "@id",
      "uri": "@id",
      "name": "@value",
      "title": "dct:title",
      "authors": "dc:creator"
    }

JSON-LD
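To make the role of the @context concrete, here is a minimal sketch in Python of how the Book record's local keys could be resolved to full IRIs. It is a deliberately simplified illustration, not the JSON-LD expansion algorithm: it only handles string-valued term definitions, prefix:suffix compact IRIs, keyword aliases, and @base. The function names `expand_term` and `expand` are hypothetical; a real application would more likely rely on a JSON-LD processor library.

```python
import json
from urllib.parse import urljoin

# The @context and Book record from the slide above.
CONTEXT = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "@base": "http://doc.rero.ch/record/",
    "recid": "@id",
    "uri": "@id",
    "name": "@value",
    "title": "dct:title",
    "authors": "dc:creator",
}

BOOK = {
    "recid": "1234",
    "title": "From Marc to JSON",
    "authors": [
        {"name": "Crockford, Douglas 1955-"},
        {"uri": "http://viaf.org/viaf/18236820"},
    ],
}


def expand_term(term, context):
    """Resolve a local key to a JSON-LD keyword or a full IRI (simplified)."""
    target = context.get(term, term)
    if target.startswith("@"):
        return target  # keyword alias such as "@id" or "@value"
    prefix, sep, suffix = target.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix  # compact IRI, e.g. "dct:title"
    return target


def expand(node, context):
    """Recursively rewrite keys; resolve "@id" values against @base."""
    if isinstance(node, list):
        return [expand(item, context) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        name = expand_term(key, context)
        if name == "@id":
            out[name] = urljoin(context.get("@base", ""), value)
        else:
            out[name] = expand(value, context)
    return out


if __name__ == "__main__":
    print(json.dumps(expand(BOOK, CONTEXT), indent=2))
```

In this sketch the record identifier "1234" becomes http://doc.rero.ch/record/1234 through @base, while the VIAF URI is already absolute and is left unchanged. The local JSON itself never has to change; only the context carries the mapping.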
JSON-LD playground link for the Book example:
http://json-ld.org/playground/#startTab=tab-compacted&json-ld=%7B%22%40context%22%3A%7B%22dc%22%3A%22http%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%22%2C%22dct%22%3A%22http%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%22%2C%22%40base%22%3A%22http%3A%2F%2Fdoc.rero.ch%2Frecord%2F%22%2C%22recid%22%3A%22%40id%22%2C%22uri%22%3A%22%40id%22%2C%22name%22%3A%22%40value%22%2C%22title%22%3A%22dct%3Atitle%22%2C%22authors%22%3A%22dc%3Acreator%22%7D%2C%22recid%22%3A%221234%22%2C%22title%22%3A%22From%20Marc%20to%20Json%22%2C%22authors%22%3A%5B%7B%22name%22%3A%22Crockford%2C%20Douglas%201955-%22%7D%2C%7B%22uri%22%3A%22http%3A%2F%2Fviaf.org%2Fviaf%2F18236820%22%7D%5D%7D&context=%7B%22%40context%22%3A%7B%22dc%22%3A%22http%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%22%2C%22dct%22%3A%22http%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%22%2C%22%40base%22%3A%22http%3A%2F%2Fdoc.rero.ch%2Frecord%2F%22%2C%22recid%22%3A%22%40id%22%2C%22uri%22%3A%22%40id%22%2C%22name%22%3A%22%40value%22%2C%22title%22%3A%22dct%3Atitle%22%2C%22authors%22%3A%22dc%3Acreator%22%7D%7D

Summary
- JSON Data
  - simple
  - powerful
  - portable
- JSON-Schema Framework
  - validation
  - HTML form generation
- JSON-LD Mapping
  - lightweight data exchange
And MARC? Full forward/backward compatibility: MARC <-> JSON

A New Data Model Based on JSON

The Current Model
Diagram components, in the order they appear:
- Core Library: Internal Representation
- Outputs: JSON-LD, schema.org (Google), RERO-LD, HTML/XML, Frontend, Scholar, OpenGraph, unAPI (Zotero), Facebook, Twitter, OAI-PMH server, ASCII, BibTeX, REST API, JSON
- Indexer, Storage
- Inputs: Submission Interface; External Source (MARC21 via Z39.50, XMLMARC via OAI-PMH)
- Format and conversion labels: MARC, Python code, Python / XSLT, complex library

The New Data Model
Diagram components, in the order they appear:
- Core Library: Internal Representation
- Outputs: JSON-LD, schema.org (Google), RERO-LD, HTML/XML, Frontend, Scholar, OpenGraph, unAPI (Zotero), Facebook, Twitter, OAI-PMH server, ASCII, BibTeX, REST API, JSON
- Indexer, Storage
- Inputs: Submission Interface; External Source (MARC21 via Z39.50, XMLMARC via OAI-PMH)
- Format and conversion labels: MARC, JSON, Python code, none, HTML template, @context, schemaform, JSON, easy, dojson

Conclusion
- Invenio 3 opens new perspectives
- JSON is the obvious choice for the Web
- still MARC compatible
- data conversion is more affordable, more robust and easier to maintain
- developers may focus on new developments
- librarians may take full control of data modeling and exchange by learning JSON-Schema and JSON-LD

References
- RERO DOC: http://doc.rero.ch
- Invenio: http://invenio-software.org/
- JSON-LD: http://json-ld.org/
- JSON Schema: http://json-schema.org/
- RERO LOD: http://data.rero.ch
- Elasticsearch: https://www.elastic.co
- Angular Form Editor: http://schemaform.io/
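The MARC-to-JSON direction that the talk relies on can be illustrated with a short Python sketch using only the standard library. The talk names dojson for this step; the code below does not use it and is only a hand-written stand-in. The sample MARC-XML record is an assumption: the slide's markup did not survive, so the field values are placed in the usual MARC 21 fields (001 control number, 245$a title, 100$a and 700$a personal names). The helper names `subfields`, `split_name` and `marc_to_json` are hypothetical.

```python
import xml.etree.ElementTree as ET

# MARC-XML namespace used by the MARC21 "slim" schema.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Assumed sample record: field values from the slides, tags chosen here.
MARC_XML = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">1234</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Avram, Henriette</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">From MARC to JSON</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Crockford, Douglas</subfield>
  </datafield>
</record>
""".strip()


def subfields(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text.strip() for el in record.findall(path, NS) if el.text]


def split_name(heading):
    """Split a 'Lastname, Firstname' heading into its two parts."""
    lastname, _, firstname = heading.partition(",")
    return {"lastname": lastname.strip(), "firstname": firstname.strip()}


def marc_to_json(xml_text):
    """Build the JSON structure shown in the slides from one MARC-XML record."""
    record = ET.fromstring(xml_text)
    control = record.find("marc:controlfield[@tag='001']", NS)
    titles = subfields(record, "245", "a")
    names = subfields(record, "100", "a") + subfields(record, "700", "a")
    return {
        "id": int(control.text),
        "title": titles[0] if titles else None,
        "authors": [split_name(name) for name in names],
    }


if __name__ == "__main__":
    print(marc_to_json(MARC_XML))
```

The resulting dictionary has the same shape as the "All Together" JSON example from the Computer Data Structures slides, which is what makes it directly usable for indexing, validation against a JSON Schema, and JSON-LD export.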