Page 1: Daniel Schober on behalf of DebugIT Community

Slide 1 IDO-WS 2010 Daniel Schober

Daniel Schober on behalf of DebugIT Community

Semantic integration of antibiotics resistance patterns

Slide 2 IDO-WS 2010 Daniel Schober

Healthcare Context

Slide 3 IDO-WS 2010 Daniel Schober

A need for ‚IT-biotics‘

• DebugIT
– Detecting and Eliminating Bacteria UsinG Information Technology
– Using ‘semantic linked data’ to exploit distributed clinical data
• Acquire new knowledge
– Through advanced data mining
• Apply knowledge in decision support
– E.g. prescription choice
• Apply knowledge in monitoring
– Analyze current & predict future trends
– Discover patient safety patterns

Slide 4 IDO-WS 2010 Daniel Schober

The DebugIT project: general architecture

[Figure: architecture diagram. Activities: 1-Collect Data, 2-Learn, 3-Store Knowledge, 4-Apply. Repositories: Clinical Data Repository (aggregated, distributed) and Knowledge Repositories (aggregated, distributed). Further components: Clinical Information System, Analysis, Reasoning engine. Flows: clinical data, Knowledge.]

Slide 5 IDO-WS 2010 Daniel Schober

Using Ontologies in DebugIT

• Provide common semantic identifiers
– Allow crosstalk within Interoperability Platform
– SPARQL query to express research question
– Provide formal meaning exploitable by logical & rule-based reasoners
• Integrate access to heterogeneous CIS
– Normalization via terminologies and text mining

Slide 6 IDO-WS 2010 Daniel Schober

Data normalisation & ontology mapping (annotation)

Raw clinical data
– different encodings
– different languages

[Figure: raw clinical data passes through Ontologies, Text Mining and De-identification to become refined clinical data.]

Refined clinical data
– uniform format & semantics
– anonymized

Slide 7 IDO-WS 2010 Daniel Schober

Data integration architecture

• ETL (2) populates local RDTBs in DMZ layer
• D2R conversion (3) allows SPARQL integration (4) via Ontologies (DCO, OO)
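
The conversion step (3) can be pictured as turning each relational row into subject-predicate-object triples. The following is only a minimal Python sketch of that idea, not D2R's actual mapping language or API. The base URI, the column names and the helper `row_to_triples` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

BASE = "http://example.org/cdr/resource/"  # hypothetical base URI
DDO = "ddo:"  # prefix as used on the slides; the column names below are illustrative


def row_to_triples(table: str, key: str, row: Dict[str, Optional[str]]) -> List[Triple]:
    """Turn one relational row into triples: one subject per row, one triple per column."""
    subject = f"<{BASE}{table}/{row[key]}>"
    triples = [(subject, "a", f"{DDO}{table}")]
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is already encoded in the subject URI; NULLs yield no triple
        triples.append((subject, f"{DDO}{column}", f'"{value}"'))
    return triples


# Made-up example row, not taken from any hospital database.
example_row = {"id": "1", "hasSampleType": "102866000", "hasResultDate": None}
for triple in row_to_triples("Culture", "id", example_row):
    print(" ".join(triple), ".")
```

Once every site exposes its rows in this shape, one SPARQL query can be evaluated against all of them, which is what makes the integration step (4) possible.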

Slide 8 IDO-WS 2010 Daniel Schober

Linking data values to ontologies (via CVs)

1. Text mining links CIS data values to CVs
2. Create SKOS mappings from CV to Ontology (DCO)

Controlled vocabulary (CV)       Domain
SNOMED CT findings               Diseases
UniProt NEWT taxonomy            Bacteria
WHO ATC codes                    Drugs, antibiotics
Foundational Model of Anatomy    Human anatomy
…                                …
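
The second step amounts to a lookup from a CV code to an ontology class. A minimal Python sketch of such a lookup follows. The ATC codes and the DCO class names both occur in the queries later in this deck, but pairing them in this table, and the names `SKOS_EXACT_MATCH` and `to_ontology_class`, are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical excerpt of a CV-to-DCO mapping table (stands in for skos:exactMatch links).
SKOS_EXACT_MATCH = {
    ("atc", "J01EA01"): "dco:Trimethoprim",
    ("atc", "J01EE01"): "dco:SulfamethoxazoleAndTrimethoprim",
}


def to_ontology_class(vocabulary: str, notation: str) -> Optional[str]:
    """Return the mapped ontology class for a CV code, or None if no mapping exists."""
    return SKOS_EXACT_MATCH.get((vocabulary, notation))


print(to_ontology_class("atc", "J01EA01"))
```

Returning `None` for unmapped codes keeps gaps in the mapping visible instead of silently guessing a class.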

Slide 9 IDO-WS 2010 Daniel Schober

Ontology Layers within DebugIT

1 DebugIT Core Ontology (DCO)
- Clinical domain of infectious diseases
- OWL-DL

30 Operational ontologies (OO)
- Implementation, module crosstalk, data mining
- query building, statistics, analysis, evidences, maths, units, …
- OWL-Full

7 Data Definition Ontologies (DDO)
- Describing hospital specific CIS Data model

Slide 10 IDO-WS 2010 Daniel Schober

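‚female patient‘ in different ontology layers

[Figure: the concept ‚female patient‘ represented in different ontology layers, contrasting layers describing data with layers describing the real world (independent of data).]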

Slide 11 IDO-WS 2010 Daniel Schober

Steps for solving a clinical analysis question

Clinician states clinical analysis question in natural language

1. Clinical Researcher
– clinical analysis query via QueryBuilder & SPARQL OOs & DCO
2. Data Miners
– data set queries for each targeted CDR via SPARQL DDOs
3. Data Manager
– maintains N3 rule set to convert instances from the endpoint-specific DDO to OO & DCO
4. Data Miners
– aggregate data set SPARQL result graphs in DCO using the needed conversion rule sets
– perform clinical analysis, e.g. using/creating N3 rules using OOs, DCO
– formalize the clinical analysis result, using OOs & DCO
5. Clinical Researcher
– validates result & presents it to Clinician who validates result.

Slide 12 IDO-WS 2010 Daniel Schober

Clinical Analysis SPARQL Query (construct)

“What percentage of Escherichia coli cases, cultured from urine samples, is resistant to the combination of trimethoprim/sulfamethoxazole (TMP/SMX) or trimethoprim in the period 2006-2010?”

CONSTRUCT {
  ?percentage
    quex:percentageOf ?total;
    quex:percentageThat ?part;
    quex:hasValue ?percentageValue;
    quex:hasUnit units:percent.

  ?total rdfs:subClassOf cao:EColi, [
    a owl:Restriction;
    owl:onProperty cao:culturedFrom;
    owl:someValuesFrom [
      rdfs:subClassOf dco:UrineSample;
      a owl:Restriction;
      owl:onProperty biotop:outcomeOf;
      owl:someValuesFrom [
        rdfs:subClassOf dco:UrineSampleCollection;
        a owl:Restriction;
        owl:onProperty event:during;
        owl:hasValue [
          dco:hasStartDateTime "2006-01-01T00:00:00"^^xsd:dateTime;
          dco:hasEndDateTime "2010-12-31T23:59:59"^^xsd:dateTime]]]].

  ?part rdfs:subClassOf ?total, [
    a owl:Restriction;
    owl:onProperty cao:resistantTo;
    owl:someValuesFrom [
      owl:unionOf (dco:Trimethoprim dco:SulfamethoxazoleAndTrimethoprim)]]
}

Slide 13 IDO-WS 2010 Daniel Schober

Clinical Analysis SPARQL Query (where)

WHERE {
  ?percentage
    quex:percentageOf ?total;
    quex:percentageThat ?part;
    quex:hasValue ?percentageValue;
    quex:hasUnit units:percent.

  ?total rdfs:subClassOf cao:EColi, [
    a owl:Restriction;
    owl:onProperty cao:culturedFrom;
    owl:someValuesFrom [
      rdfs:subClassOf dco:UrineSample;
      a owl:Restriction;
      owl:onProperty biotop:outcomeOf;
      owl:someValuesFrom [
        rdfs:subClassOf dco:UrineSampleCollection;
        a owl:Restriction;
        owl:onProperty event:during;
        owl:hasValue [
          dco:hasStartDateTime "2006-01-01T00:00:00"^^xsd:dateTime;
          dco:hasEndDateTime "2010-12-31T23:59:59"^^xsd:dateTime]]]].

  ?part rdfs:subClassOf ?total, [
    a owl:Restriction;
    owl:onProperty cao:resistantTo;
    owl:someValuesFrom [
      owl:unionOf (dco:Trimethoprim dco:SulfamethoxazoleAndTrimethoprim)]]
}
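
The quantity this query asks for can be sketched in plain Python: `?total` corresponds to the E. coli urine cases in 2006-2010, `?part` to the subset resistant to at least one of the two drugs. This is a sketch under stated assumptions, not the DebugIT implementation. The `Case` record, its field names and the function `resistance_percentage` are hypothetical, and reducing the date interval to a year is a simplification.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

TARGET_DRUGS = frozenset({"Trimethoprim", "SulfamethoxazoleAndTrimethoprim"})


@dataclass(frozen=True)
class Case:
    """Hypothetical flattened view of one susceptibility test result."""
    organism: str
    sample: str
    year: int
    resistant_to: FrozenSet[str]


def resistance_percentage(cases: Iterable[Case]) -> Optional[float]:
    """Percentage of E. coli urine cases (2006-2010) resistant to any target drug."""
    total = [
        c for c in cases
        if c.organism == "EColi" and c.sample == "UrineSample" and 2006 <= c.year <= 2010
    ]
    if not total:
        return None  # no matching cases, so the percentage is undefined
    part = [c for c in total if c.resistant_to & TARGET_DRUGS]
    return 100.0 * len(part) / len(total)
```

The union over both drugs mirrors `owl:unionOf` in the query: a case counts once even if it is resistant to both.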

Slide 14 IDO-WS 2010 Daniel Schober

Data set SPARQL query (for HUG-DDO)

CONSTRUCT {
  ?antibiogram a ddo:Antibiogram;
    ddo:hasCulture ?culturing;
    ddo:hasIdentifiedBacterium [ddo:hasBacteriumCode "562"^^biosko:uniProtTaxonomyDT];
    ddo:hasTestedDrug [ddo:hasDrugCode ?atc];
    ddo:hasOutcome ?antibiogramResult.

  ?culturing ddo:hasSampleType ?sampleType;
    ddo:hasResultDate ?resultDate
}
WHERE {
  ?antibiogram a ddo:Antibiogram;
    ddo:hasCulture ?culturing;
    ddo:hasIdentifiedBacterium [ddo:hasBacteriumCode "562"^^biosko:uniProtTaxonomyDT];
    ddo:hasTestedDrug [ddo:hasDrugCode ?atc];
    ddo:hasOutcome ?antibiogramResult.

  ?culturing ddo:hasSampleType ?sampleType;
    ddo:hasResultDate ?resultDate.

  FILTER (?atc = "J01EA01"^^clisko:atc20090101DT || ?atc = "J01EE01"^^clisko:atc20090101DT)
  FILTER ("2006-01-01T00:00:00"^^xsd:dateTime < ?resultDate && ?resultDate < "2010-12-31T23:59:59"^^xsd:dateTime)
  FILTER (?sampleType = "102866000"^^clisko:sct20080731DT) # to be changed to 122575003 for "Urine specimen"
}
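
The three FILTER clauses are a simple conjunction of tests on each row. As a rough Python equivalent (a sketch only: the function name `matches` and the flat argument list are hypothetical, while the codes and dates are copied from the query):

```python
from datetime import datetime

ATC_CODES = {"J01EA01", "J01EE01"}          # the two ATC codes named in the first FILTER
START = datetime(2006, 1, 1, 0, 0, 0)       # lower bound, exclusive as in the query
END = datetime(2010, 12, 31, 23, 59, 59)    # upper bound, exclusive as in the query
SAMPLE_TYPE = "102866000"                   # SNOMED CT code used in the third FILTER


def matches(atc: str, result_date: datetime, sample_type: str) -> bool:
    """True if a row passes all three FILTER conditions of the data set query."""
    return (
        atc in ATC_CODES
        and START < result_date < END
        and sample_type == SAMPLE_TYPE
    )
```

Both date comparisons are strict, as written in the query, so a result stamped exactly at either boundary is excluded.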

Slide 15 IDO-WS 2010 Daniel Schober

DDO to DCO mapping via N3 rules

MAPPING FROM HUG-ddo:Culture

TO dco:BacterialCultureProcedure

{ ?culturing ddo:hasSampleType ?sample.

?Sample skos:exactMatch [skos:notation ?sample]}

=>

{ ?culturing biotop:precededBy [a dco:SampleCollection; biotop:hasOutcome [a ?Sample]]}.
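
To show what this rule derives, here is a small Python sketch that applies the same pattern to a list of triples. It is not an N3 reasoner and not the project's rule engine. The function `apply_sample_rule`, the blank-node naming and the example mapping from a SNOMED CT notation to `dco:UrineSample` are assumptions for illustration.

```python
from itertools import count
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def apply_sample_rule(triples: List[Triple], notation_to_class: Dict[str, str]) -> List[Triple]:
    """For each ddo:hasSampleType value with a known mapping, derive the DCO-level triples.

    notation_to_class stands in for the skos:exactMatch / skos:notation lookup in the rule body.
    """
    fresh = count(1)  # generator of blank-node labels
    derived: List[Triple] = []
    for subject, predicate, obj in triples:
        if predicate != "ddo:hasSampleType" or obj not in notation_to_class:
            continue  # rule body does not match this triple
        collection = f"_:b{next(fresh)}"
        outcome = f"_:b{next(fresh)}"
        derived += [
            (subject, "biotop:precededBy", collection),
            (collection, "a", "dco:SampleCollection"),
            (collection, "biotop:hasOutcome", outcome),
            (outcome, "a", notation_to_class[obj]),
        ]
    return derived


# Illustrative input: the pairing of this notation with dco:UrineSample is assumed.
facts = [(":culture1", "ddo:hasSampleType", "122575003")]
for triple in apply_sample_rule(facts, {"122575003": "dco:UrineSample"}):
    print(" ".join(triple), ".")
```

Each matching culture yields two fresh blank nodes, corresponding to the two bracketed nodes in the rule head.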

Slide 16 IDO-WS 2010 Daniel Schober

Cross-site integrated SPARQL result

2 instances out of a total result set of 1764

<https://babar.unige.ch:8443/cdr/resource/Culture/100320>
  a dco:AntimicrobialSusceptibilityTest, dco:BacterialAntibiogramAnalysis, dco:BacterialCultureProcedure;
  :hasOutcome
    [:encodes [:qualityLocated [a :SpeciesEscherichiaColiValueRegion]]],
    [:encodes [:qualityLocated [a dco:Sensitive]]];
  :hasParticipant [a dco:SulfamethoxazoleAndTrimethoprim];
  dco:hasResultDateTime "2006-11-03T09:57:00"^^xsd:dateTime.

<https://lincoln.imt.liu.se:8443/d2r-server/resource/culture/7219>
  a dco:AntimicrobialSusceptibilityTest, dco:BacterialAntibiogramAnalysis, dco:BacterialCultureProcedure;
  :hasOutcome
    [:encodes [:qualityLocated [a :SpeciesEscherichiaColiValueRegion]]],
    [:encodes [:qualityLocated [a dco:Sensitive]]];
  :hasParticipant [a dco:Trimethoprim];
  :precededBy [a dco:SampleCollection; :hasOutcome "abnormal urine"];
  dco:hasResultDateTime "2008-10-16T00:00:00"^^xsd:dateTime.

Slide 17 IDO-WS 2010 Daniel Schober

DCO design principles

• OWL-DL
– Reasoner for autoclassification & consistency checks during OE
– Reasoner infers multiple parenthood
• Reusing BioTop
– Ensure a rigid modeling view
– Provides reusable constraints (bridges to all TLO)
• Concepts harvested from
– Hospital CDR schemata
– Competency questions from clinical use case
• Data-driven bottom up
– Domain terminologies in use
  • Via UMLS or OLS
  • Ontology modularisation tools (A. Rector)
  • HL7 v3 based

Slide 18 IDO-WS 2010 Daniel Schober

DCO content (statistics)

Ontology elements & axioms       Overall   DCO    BioTop
Classes                          1311      1014   375
Object Properties (relations)    78        3      74
Datatype Properties              11        10     0
Subclass Axioms                  1494      1050   444
Equivalent Class Axioms          197       98     99
Disjoint Axioms                  76        1      75

Slide 19 IDO-WS 2010 Daniel Schober

A tripartite granular disease model (SDP pattern)

Slide 20 IDO-WS 2010 Daniel Schober

Inference of new facts (BloodSample is a BodyLiquidSample)

[Figure: a logics reasoner takes the stated facts (class definitions for BodyLiquid, BodyLiquidSample and BloodSample, with the asserted hierarchy as a flat list) and produces the inferred hierarchy (more structure).]
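
The kind of inference named in the slide title can be imitated with a toy structural check in Python. The class definitions shown on the original slide are not preserved in this transcript, so the definitions below, the property name `derivedFrom` and the asserted link from Blood to BodyLiquid are all assumptions. A real OWL-DL reasoner does far more than this comparison.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical asserted hierarchy: child -> parent.
ASSERTED_SUBCLASS: Dict[str, str] = {"Blood": "BodyLiquid"}

# Hypothetical definitions of the form: Class = genus AND (property SOME filler).
DEFINITIONS: Dict[str, Tuple[str, str, str]] = {
    "BodyLiquidSample": ("Sample", "derivedFrom", "BodyLiquid"),
    "BloodSample": ("Sample", "derivedFrom", "Blood"),
}


def is_subclass(child: Optional[str], parent: str) -> bool:
    """True if child equals parent or reaches it by following asserted links upward."""
    while child is not None:
        if child == parent:
            return True
        child = ASSERTED_SUBCLASS.get(child)
    return False


def inferred_subsumptions() -> List[Tuple[str, str]]:
    """Pairs (sub, sup) where sub's definition is at least as specific as sup's."""
    result = []
    for sub, (genus1, prop1, filler1) in DEFINITIONS.items():
        for sup, (genus2, prop2, filler2) in DEFINITIONS.items():
            if sub == sup or prop1 != prop2:
                continue
            if is_subclass(genus1, genus2) and is_subclass(filler1, filler2):
                result.append((sub, sup))
    return result


print(inferred_subsumptions())
```

Neither sample class is asserted under the other; the subsumption follows only from the definitions plus the Blood/BodyLiquid link, which is the step from a flat asserted list to a structured inferred hierarchy.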

Slide 21 IDO-WS 2010 Daniel Schober

•Use CNL for Ontology Evaluation

Slide 22 IDO-WS 2010 Daniel Schober

Next steps

• Enhance coverage
• Refinement of DCO structure
– Addressing drug dosages & disease therapies
– Use Rector's SNOMED CT modularisation algorithm to extract relevant SNOMED CT IDs from DCO-provided seed list
• Publish and distribute
– E.g. on BioPortal

Slide 23 IDO-WS 2010 Daniel Schober

DCO evaluation

• Ultimate overall evaluation
• Can clinicians
– run the overall system?
– build queries and understand results?
• Can data miners
– create data set results?
– do data mining and formalize quality criteria for results?
• DCO internal evaluation
• Fitness for use tested by ability to answer CQ
• Evaluate validity of assertions by
– Reasoners
– Graphical and textual representations to domain experts
– Serialization of modules into Constrained Natural Languages (CNL)

Slide 24 IDO-WS 2010 Daniel Schober

(Preliminary) Conclusion

• Semantically rich application ontologies
• Successive Query formalisations are complex … but approach scales over space & time
• Used in practice
– Practical SPARQL query building
– Data integration across 7 EU Hospitals
• DL-reasoning helps ontology engineering
– DL limitation justified for smaller ontologies
– For larger models use rule-based reasoning
• As data is dirty we need to cope with errors arising

Slide 25 IDO-WS 2010 Daniel Schober

Resources & Acknowledgements

Resources
• DebugIT project
– http://www.DebugIT.eu
• Ontology sources
– http://purl.org/imbi/dco/dco
• TermBrowser
– http://www.imbi.uni-freiburg.de/~schober/dco_owlDoc/

Acknowledgements
• Hans Cools, Martin Boeker, Kristof Depraetere, Douglas Teodoro, Remy Choquet, Stefan Schulz, Ilinca Tudose, Maren Kechel, Giovanni Mels, Dirk Coalert, Dimitris Iakovidis, the DebugIT team
• Funded by grant agreement ICT-2007.5.2-217139

Slide 26 IDO-WS 2010 Daniel Schober

In the Hospital kitchen I was approached by a member of the feared ‘Antibiotics Resistance’ …