using architectures for semantic interoperability to create journal clubs for emergency response

27
Form 836 (7/06) LA-UR- Approved for public release; distribution is unlimited. Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLC for the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396. By acceptance of this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratory strongly supports academic freedom and a researcher’s right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness. Title: Author(s): Intended for: 09-02970 Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response James E. Powell Linn Marks Collins Mark L. B. Martinez 6th International Conference on Information Systems for Crisis Response and Management Special Session on Solutions for Information Overload May 10-13, 2009 Göteborg, Sweden

Upload: james-powell

Post on 11-May-2015

1.384 views

Category:

Technology


1 download

DESCRIPTION

In certain types of _slow burn_ emergencies, careful accumulation and evaluation of information can offer a crucial advantage. The SARS outbreak in the first decade of the 21st century was such an event, and ad hoc journal clubs played a critical role in assisting scientific and technical responders in identifying and developing various strategies for halting what could have become a dangerous pandemic. This paper describes a process for leveraging emerging semantic web and digital library architectures and standards to (1) create a focused collection of bibliographic metadata, (2) extract semantic information, (3) convert it to the Resource Description Framework /Extensible Markup Language (RDF/XML), and (4) integrate it so that scientific and technical responders can share and explore critical information in the collections. systems.

TRANSCRIPT

Page 1: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Form 836 (7/06)

LA-UR-Approved for public release;distribution is unlimited.

Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLCfor the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396. By acceptanceof this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce thepublished form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requeststhat the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos NationalLaboratory strongly supports academic freedom and a researcher’s right to publish; as an institution, however, the Laboratory does notendorse the viewpoint of a publication or guarantee its technical correctness.

Title:

Author(s):

Intended for:

09-02970

Using Architectures for Semantic Interoperability to CreateJournal Clubs for Emergency Response

James E. PowellLinn Marks CollinsMark L. B. Martinez

6th International Conference on Information Systems forCrisis Response and ManagementSpecial Session on Solutions for Information OverloadMay 10-13, 2009Göteborg, Sweden

Page 2: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 1

Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

James E. Powell, Linn Marks Collins, and Mark L.B. Martinez

{jepowell, linn, mlbm}@lanl.gov

Knowledge Systems and Human Factors Team

Research Library, Los Alamos National Laboratory

Page 3: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 2U N C L A S S I F I E D

Bibliographic Metadata to Social, Semantic CollectionMetadata In

Social Semantic Repository Out

Page 4: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 3U N C L A S S I F I E D

SARS event(Severe Acute Respiratory Syndrome)

First cases in 2002 in Asia

Spread to 29 countries

8,000+ cases, nearly 800 deaths

Significant economic toll

Chaotic and secretive response

Took months to identify cause

Science played a critical role

But essentially, we're still here because we got lucky.

Page 5: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 4U N C L A S S I F I E D

How we (re-)defined journal clubTraditional journal club• Regular face-to-face meetings• Reader summaries, supervised discussion• Short reading list• In support of research, clinical, academic activities

Our version: social, topical collections• Metadata for a few hundred to a few thousand selected papers (mini-digital libraries)• Various mechanisms to explore collection• Social tools for calling participants attention to particular papers• Interoperability with other collaboration frameworks

Page 6: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 5U N C L A S S I F I E D

Digital Library - “thought in cold storage”*

What is a digital library?• Collection of content – e-journal articles, e-books, etc.• Metadata to expose that content – the “archive”• Persistent identifiers• Search engine(s) to explore metadata• Citation counts enhance the utility of metadata• Metadata export tools

Name Domain SizeACM Digital Library Computer Science >54,000PubMed Life sciences and biomedicine >17,000,000Web of Science Broad scientific coverage >38,000,000Scopus Broad scientific coverage >33,000,000arXiv Physical sciences >44,000Oppie Broad scientific coverage >94,000,000NSDL Broad scientific coverage >2,000,000

Some Examples:

Page 7: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 6U N C L A S S I F I E D

Research in Progress at theLos Alamos National Laboratory, Los Alamos, NM, US Making digital library content relevant in emergencies

• Is deeper knowledge of value in this situation?• What sources would be useful?

Develop strategy for creating a focused, or targeted, collection— Topic specific harvest or— results of a search against a larger digital library

• Match focused collections with people – (expert user or librarian(s) + technology)

Normalize, import and provide access to the data• Map the data to RDF expose using tools for exploring moderately

large highly relevant collections, and• Make the content social

Page 8: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 7U N C L A S S I F I E D

Components of the SystemPerform query or harvest content

Topic harvester automatically maps content to RDF/XML• Augments content with georeference and foaf data

RDF repository hosts the semantic content

Middleware –• REST web services which expose standards compliant output in response to queries

against RDF repository• SPARQL query templates at the core• Depending on service, output is GraphML, KML, GDF, etc.

Rendering layer –• Combines output with a viewing app (e.g. google maps view, graph applet view)

RDF Digital Library –• Search and social linking services – ties it all together

Page 9: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 8U N C L A S S I F I E D

From query to RDF repositoryBibliographic metadata search service that returns results as XML

Metadata mapped to RDF/XML representations (using Dublin Core, FOAF ontologies)

Each unique author assigned a UUID

Social network of co-authors, author matching via UUID, FOAF

Georeference augmentation via placenames found in MARC bibliographic fields – geonames.org lookup

Option to perform additional augmentation by submitted abstracts to OpenCalais

Page 10: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 9U N C L A S S I F I E D

Harvest to RDFOAI-PMH – Open Archives Initiative – Protocol for Metadata Harvesting• Data provider shares metadata• Service provider harvests metadata, aggregates, and exposes it• OAI-PMH enables retrieval of bibliographic metadata from remote repositories• Protocol allows for limited set of verbs for exploring a target repository,

including: ListMetadataFormats, ListSets, ListRecords• http://www.oaforum.org/tutorial/

Then, map and augment as with search results• Map metadata to triples • Represent authorship networks • Augment with georeference information• Store content in RDF repository

Page 11: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 10U N C L A S S I F I E D

Embracing Lossy80% solution needs to be okay• Need quick turnaround• Accept that metadata mapping is not an exact science• Handle mismatched or omitted elements gracefully• Cope with metadata lacking sufficient granularity or type information• Downgrade metadata when necessary, e.g. MARC to Dublin Core• Normalize and extract maximum value from available data, both explicit and implicit

Original element RDF property nameidentifier dc:identifiertitle dc:titlecreator, contributor dc:creator

foaf:namedate dc:datesubject dc:subject

geo:placenamedescription dc:description

Mapping Metadata

Page 12: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 11U N C L A S S I F I E D

AugmentationMake explicit what is implicit: social networks• Assign a UUID to each author• Associate authors and co-authors using FOAF• Match author names across publications, within query results/harvested set

Enhance what's there• Find placenames in metadata• Ask geonames service for coordinates

Leverage knowledge extraction services• OpenCalais can identify placenames in full text• Also identifies, exposes other concepts and RDF links

Page 13: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 12U N C L A S S I F I E D

Pre- and post-augmentation results for sample queries

Query Original Post-processingsituation # records 6652 4609situation # georefs 5 445infoviz # records 6807 4636infoviz # georefs 7 194humanitarian # records 1626 1312humanitarian # georefs 86 455sars # records 4674 2937sars # georefs 369

situation: ("situation awareness")

infoviz: ("information visualization")

humanitarian: ("humanitarian assistance" OR "disaster relief")

sars: (“sars” and “coronavirus”)

Page 14: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 13U N C L A S S I F I E D

Mapping process summary

Page 15: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 14U N C L A S S I F I E D

Core ontologies used in mapped dataDublin Core (xmlns:dc="http://purl.org/dc/elements/1.1/")• Metadata about publications

— <object> dc:title “SARS: How a Global Epidemic was Stopped”.— <object> dc:identifier “92-9061-213-4”.

FOAF (xmlns:foaf="http://xmlns.com/foaf/0.1/")• Properties describing people

— <object> foaf:name “James Powell”.— uuid foaf:knows uuid.

Geonames (xmlns:geo="http://www.geonames.org/ontology#”)• Properties describing places

— <object> geo:name “Ohio, United States”.— “Ohio, United States” geo:lat “40.5”.— “Ohio, United States” geo:long “-82.5”.

Page 16: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 15U N C L A S S I F I E D

Social (Author) Awareness ToolQuery layer

/srdf?q=Visualization&repo=cr&format=graphml

• Output could be used with any visualization tool that supports GraphML markup

Rendering layer/sat?q=Visualization&repo=cr

• Uses GUESS Java visualization Applet• Combines Java applet with output from

above REST web service

SPARQL querySELECT ?title ?name ?selfuuid ?knowsuuidWHERE { ?y <http://purl.org/dc/elements/1.1/#title>

?title. ...

Page 17: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 16U N C L A S S I F I E D

Geographic Awareness ToolQuery layer

/grdf?query=avian+flu&repo=influenza&format=kml

• KML output could be used with Google Earth

Rendering layer/gat?query=avian+flu&repo=influenza

• Combines Java applet with output from above REST web service

• uses Google Maps API

SPARQL querySELECT ?title ?name ?selfuuid ?knowsuuid WHERE { ?y

<http://purl.org/dc/elements/1.1/#title> ?title. ?y <http://purl.org/dc/elements/1.1/#creator> ?selfuuid. ?selfuuid <http://xmlns.com/foaf/0.1/#name> ?name.

...

Page 18: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 17U N C L A S S I F I E D

RDF Digital Library Interface

Page 19: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 18U N C L A S S I F I E D

Enabling Journal Club style interactions

Practitioners reviewing and discussing a set of published journal articles

Intersects nicely with social linking capabilities • Tagging – user provided keywords related to a paper• Rating – a user's personal ranking of the paper on a simple numeric scale• Comment - a user's thoughts about a particular paper

Technical Issues• Unique identifier ensures social data is connected to appropriate object• User identity, and overlap with authorship• Users may be interacting all over the place – how can we be interoperable with other

collaboration spaces?

Page 20: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 19U N C L A S S I F I E D

Use SIOCSIOC stands for Semantically Interlinked Online Communities• enables harvesting and aggregation of social linking data • from disparate social networking sites

We use SIOC properties to describe users of our social linking tools, as well as some aspects of the content they input when they annotate a particular record.

We also use SIOC types module's Comment property

A Java servlet converts social linking form input into triples, and uses the openrdf API to insert the triples into the triplestore

Social linking – totally native RDF, no intermediary –

Immediately accessible to aggregation services

Page 21: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 20U N C L A S S I F I E D

A User Comment in SIOC<sioct:Comment><sioc:about rdf:resource="oai:casi.ntrs.nasa.gov:20000081109"/>

<dc:title>Pele</dc:title><dcterms:created>2009.04.08 03:41:26 MDT</dcterms:created>

<sioc:has_container rdf:resource="uuid:1234-1234-1234"/>

<sioc:has_creator><sioc:User rdf:about="http://rdfsearch/queryRdf/search.jsp#jpowell">

<rdfs:seeAlso rdf:resource="http://rdfsearch/queryRdf/showComments?userid=jpowell"/></sioc:User>

</sioc:has_creator><sioc:content>Does Pele know about Io (or for that matter, Jupiter?)</sioc:content>

Page 22: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 21U N C L A S S I F I E D

Next up, user tagsUsers will be able to tag content

Tags will receive similar treatment as Comments, • stored as triples, • using various ontologies including Dublin Core, FOAF, SIOC• applicable ontologies include SCOT (Social Semantic Cloud of Tags), and/or

MOAT (Meaning Of A Tag)

Tags will link to DBPedia concepts

RDF Linking will greatly enhance the utility of the tags, • The tag “infectious disease” would be associated with

http://dbpedia.org/page/Infectious_disease• Reuse of “infectious disease” will point to other library content

Our tags could be aggregated with tags from other collaboration spaces

Page 23: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 22U N C L A S S I F I E D

SummaryDisease outbreaks can be disastrous • but diseases often emerge slowly, sometimes crossing species barriers• appear in one region, then move with populations• are caused by a pathogen that may eventually be identified• may be treatable, corralled by quarantine, curtailed by hygiene, halted by vaccines

Science can prevent or reduce casualties from, and, if we're lucky, reduce or eliminate the risk of future outbreaks of disease• but only if scientists have access to topical, vetted, high quality information• and tools to explore and share this knowledge

Our semantic digital library solution has normalized collections of enhanced metadata, organized by topic (search), or source (harvest) with• Middleware layers that enable exploration of the data• Rendering layers that make results usable (e.g. via visualization tools, overlay

results onto map) • Social linking capabilities for collaboration and interoperability

Page 24: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 23U N C L A S S I F I E D

Questions?

Page 25: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 24U N C L A S S I F I E D

Awareness tools for E-SOS

Our content exploration tools started out as a suite of Just In Time Information Retrieval Tools (JITIR)• leverage user's context• proactively perform tasks in support of user's main activity• manage user's attention appropriately

“Context” is content authoring – e.g. blog post, forum entry, email message

Input text is “intercepted” and parsed for tokens that may be used to automatically construct queries

Query is stop word deleted – boolean linked significant terms, or results of an intermediate call to a term extraction service (Calais, Yahoo)

Query is submitted to query middleware tool

Results are returned when available, using appropriate rendering layer

Page 26: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 25U N C L A S S I F I E D

Page 27: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

Slide 26U N C L A S S I F I E D

Excerpt of possible tag RDF/XML

<sioc:about rdf:resource="info:lanl-repo/biosis/PREV198273075669"/><moat:Tag rdf:about="http://tags.moat-project.org/tag/Infectious_disease"><moat:name><![CDATA[infectious disease]]></moat:name><moat:hasMeaning><moat:Meaning>

<moat:meaningURI rdf:resource="http://dbpedia.org/page/Infectious_disease"/><foaf:maker rdf:resource="http://rdfsearch/queryRdf/search.jsp#jpowell"/><foaf:maker rdf:resource="http://example.org/user/foaf/1"/>

</moat:Meaning></moat:has_meaning>

...More info: http://moat-project.org/