HIKM2010 - Query Resolution for Biology and Medicine


Post on 02-Jul-2015


Page 1: HIKM2010 - Query Resolution for Biology and Medicine

Customisable Query Resolution in Biology and Medicine

Peter Ansell

Microsoft Queensland University of Technology eResearch Centre

[email protected]


Brisbane Health Informatics and Knowledge Management Workshop 21 Jan 2010


Outline

● What data is out there

● Current data formats

● RDF-based system

● Biology and Medicine case study


Current data formats

● FASTA

● EMBL

● GFF

● BSML

● GenBank

● Many other formats, including custom XML


Linked Data

1) Use URIs as names for things

2) Use HTTP URIs so that people can look up those names.

3) When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

4) Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html
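As a rough illustration of rules 2 and 3, the sketch below builds an HTTP request for a URI and asks for RDF through content negotiation. The helper name `build_rdf_request` and the choice of `application/rdf+xml` as the preferred media type are assumptions made for illustration, not something stated on the slide. The request is only constructed here, not sent.

```python
from urllib.request import Request


def build_rdf_request(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Build (but do not send) a GET request that asks the server for RDF.

    Hypothetical helper: the Accept header is standard HTTP content
    negotiation, but whether a given server honours it is up to that server.
    """
    return Request(uri, headers={"Accept": media_type})


# One of the URIs used later in the demonstration slides.
request = build_rdf_request("http://bio2rdf.org/geneid:4129")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; that step is left out so the sketch runs without network access.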


Bio2RDF distributed queries

● Assign namespaces to providers and create URIs based on the namespace

● Just using RDF is not enough; the URIs have to be transparent enough to be used and referenced

● Query across the relevant providers for a given user's query

● Aggregate all results into a single RDF document and return it to the user
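The idea of routing a query by namespace and merging the answers can be sketched in a few lines. This is a minimal in-memory model, not the Bio2RDF implementation: the providers are stand-in functions rather than remote endpoints, the names `provider_a`, `provider_b`, `PROVIDERS_BY_NAMESPACE` and `resolve` are hypothetical, and the returned triples are placeholders rather than real data.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
Provider = Callable[[str, str], Iterable[Triple]]

HOST = "http://bio2rdf.org/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def make_uri(namespace: str, identifier: str) -> str:
    """Create a URI from a namespace and an identifier within it."""
    return f"{HOST}{namespace}:{identifier}"


def provider_a(namespace: str, identifier: str) -> List[Triple]:
    # Hypothetical provider returning placeholder data.
    return [(make_uri(namespace, identifier), RDFS_LABEL, "placeholder label")]


def provider_b(namespace: str, identifier: str) -> List[Triple]:
    # Hypothetical provider; its first triple duplicates provider_a's.
    uri = make_uri(namespace, identifier)
    return [
        (uri, RDFS_LABEL, "placeholder label"),
        (uri, RDFS_SEE_ALSO, "http://example.org/placeholder"),
    ]


# Assumed routing table: which providers are relevant for which namespace.
PROVIDERS_BY_NAMESPACE: Dict[str, List[Provider]] = {
    "go": [provider_a, provider_b],
    "geneid": [provider_b],
}


def resolve(namespace: str, identifier: str) -> Set[Triple]:
    """Query every relevant provider and merge the results into one set."""
    merged: Set[Triple] = set()
    for provider in PROVIDERS_BY_NAMESPACE.get(namespace, []):
        merged.update(provider(namespace, identifier))
    return merged


for triple in sorted(resolve("go", "0000345")):
    print(triple)
```

Using a set of triples as the merged result means statements returned by more than one provider appear only once, which mirrors how merging RDF graphs collapses identical triples.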


Bio2RDF workflow

Resolved URI: http://bio2rdf.org/label/go:0000345

Host name: http://bio2rdf.org/

Query: label/go:0000345

Regular expression: label/([\w-]+):(.+)

http://bio2rdf.org/query:labelsearch

http://bio2rdf.org/query:labelsearchforgo
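The resolution steps on this slide can be sketched as follows, using the regular expression exactly as given. How a query type is tied to a namespace is not spelled out in the transcript, so the `QUERY_TYPES` table (a query type applies either to any namespace or to a listed set) and the function names are assumptions made for illustration.

```python
import re
from typing import List, Optional, Set, Tuple

HOST = "http://bio2rdf.org/"

# Regular expression from the slide: group 1 is the namespace,
# group 2 is the identifier within that namespace.
LABEL_PATTERN = re.compile(r"label/([\w-]+):(.+)")

# Assumed mapping: None means "applies to any namespace".
QUERY_TYPES: List[Tuple[str, Optional[Set[str]]]] = [
    ("http://bio2rdf.org/query:labelsearch", None),
    ("http://bio2rdf.org/query:labelsearchforgo", {"go"}),
]


def split_uri(uri: str) -> Tuple[str, str]:
    """Split a resolved URI into the host name and the query part."""
    if not uri.startswith(HOST):
        raise ValueError(f"URI is not under {HOST}: {uri}")
    return HOST, uri[len(HOST):]


def parse_label_query(query: str) -> Optional[Tuple[str, str]]:
    """Return (namespace, identifier) if the query is a label query."""
    match = LABEL_PATTERN.fullmatch(query)
    if match is None:
        return None
    return match.group(1), match.group(2)


def select_query_types(namespace: str) -> List[str]:
    """Pick the query types whose namespace restriction allows this namespace."""
    return [
        query_uri
        for query_uri, namespaces in QUERY_TYPES
        if namespaces is None or namespace in namespaces
    ]


host, query = split_uri("http://bio2rdf.org/label/go:0000345")
parsed = parse_label_query(query)
if parsed is not None:
    namespace, identifier = parsed
    print(namespace, identifier, select_query_types(namespace))
```

Under these assumptions, a label query in the `go` namespace selects both query types listed on the slide, while any other namespace selects only the generic `labelsearch`.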


Demo background

● The background for this hypothetical demonstration is a patient who has not been responding well to a particular drug, Isocarboxazid, as a treatment for their depression

● The goal is to determine what information is available to a doctor when changing the treatment


Genomics demo

● http://bio2rdf.org/drugbank_drugs:DB01247

– Isocarboxazid

● http://bio2rdf.org/drugbank_targets:3939

● http://bio2rdf.org/hgnc:6834

● http://bio2rdf.org/geneid:4129

– MAOB

● http://bio2rdf.org/pubmed:10653595

– Localisation of MAOA and MAOB in pancreas, thyroid and adrenal glands


Drug effects demo

● http://bio2rdf.org/links/drugbank_drugs:DB01247

● http://bio2rdf.org/drugbank_druginteractions:DB00176_DB01247

– Possible adverse effects with Fluvoxamine

● http://bio2rdf.org/sider_drugs:3759

● http://bio2rdf.org/sider_sideeffects:C0027813

– Known possible side effect of Neuritis


Alternative drugs demo

● http://bio2rdf.org/drugbank_targets:3939

● http://bio2rdf.org/pfam:PF01593

– Amino oxidase protein family

● http://bio2rdf.org/drugbank_targets:3041

– Similar protein, L-amino-acid oxidase

● http://bio2rdf.org/drugbank_drugs:DB03147

– Drug for similar protein, Flavin-Adenine Dinucleotide


Private and public data

● Private information could be provided using current or future access models

● Public information can be linked so that the links from private patient or clinical information to the wider set of biological and chemical databases are explicit


Conclusion

● Many large distributed data sources

● Single interface, RDF

● Distribute queries efficiently across the endpoints

● Allow for private data to remain private, but be linked out to public information