SPO: An Ontology for Describing Host-pathogen Interactions Inherent to Streptococcus pneumoniae Infections

Talk by Cátia Vaz

INESC-ID / ISEL-IPL

Joint work with Alexandre Francisco, Susana Vinga, Pedro Reis, Ana Teresa Freitas

and Pneumopath Consortium

Host-pathogen Interactions

Over the past twenty years, the study of infection has tended to consider individual virulence factors or host factors.

To find new targets for diagnosis and treatment, it is important to take more than one factor into account.

It is important to study the host-pathogen interactions during infection by Streptococcus pneumoniae.

This is one of the main goals of the Pneumopath project.

Host-pathogen Interactions

The infection can be determined by multiple attributes of both host and pathogen. It is important to take into account:

epidemiological and genomic characterization of pneumococcal strains;

the results from experiments that evaluate host or pneumococcal responses to infection or different environmental challenges;

the results from experiments that identify host genetic susceptibility factors.

Data

The data needed to describe host-pathogen interactions during infection by Streptococcus pneumoniae include:

characterization of pneumococcal strains;

typing information;

data from in vitro and in vivo experiments with animals and cell models.

Some of these data are scattered across numerous information systems and repositories, each with its own terminologies, identifier schemes, and data formats.

Data Integration

We therefore need a common understanding of the concepts that describe host-pneumococcal interactions, and thus we need to:

Define a vocabulary: the concepts and relations.

Define the semantic interconnections: the relations between concepts and relations.

∴ Define an ontology!

Modeling Information

We have defined a model based on the information.

We have defined the concepts and their relations.

This allows more data sharing and interoperability.

We express data using knowledge representation languages.

We define a model that is less coupled to technology, which makes data integration simpler.

The model is very adaptive.

When new concepts and new relations appear, it is only necessary to add that new information, which avoids data migration processes.
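The point about avoiding data migration can be illustrated with a minimal sketch. It assumes facts are stored as (subject, predicate, object) triples in a plain Python set; the identifiers `exp1`, `strainA`, and `geo1` are hypothetical, and the slides do not say which storage technology the project used.

```python
# A toy triple store: each fact is a (subject, predicate, object) tuple.
# The identifiers exp1, strainA and geo1 are hypothetical examples.
triples = {
    ("exp1", "is-a", "GrowthExperiment"),
    ("exp1", "hasBacteria", "strainA"),
}


def objects(store, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in store if s == subject and p == predicate}


# A relation that was not used before is introduced later.
# Existing facts stay untouched; we only add triples using the new predicate.
triples.add(("strainA", "hasGeographicInformation", "geo1"))

print(objects(triples, "exp1", "hasBacteria"))
print(objects(triples, "strainA", "hasGeographicInformation"))
```

In a fixed-column table the same change would usually mean altering the schema; here the earlier triples are simply left as they were.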

Modeling Information

Thus, semantic annotation and interoperability become an absolute necessity for the integration of such diverse biomolecular data.

[Diagram: experiments]

Classes: ExperimentsWorkpackage, Experiment, AnimalExperiment, Assay, GrowthExperiment, Participant, Institution, GeographicInformation.

"is-a" links: AnimalExperiment, Assay, and GrowthExperiment are each an Experiment.

Relations: hasWorkpackage, hasParticipant, belongsToInstitution, hasGeographicInformation.

Attributes: Name, Date, RawFile, Address, Country, CountryISOCode, Region.
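As a rough sketch of what the "is-a" links in this diagram provide, the snippet below encodes them as a dictionary and answers the question "which records are experiments of any kind?". The dictionary encoding and the instance names `exp1`, `exp2`, and `p1` are assumptions made for illustration, and the reading that the three specialised classes are subclasses of Experiment is inferred from the diagram labels.

```python
# Subclass ("is-a") links as read from the diagram labels.
# Encoding them as a child -> parent dict is an assumption of this sketch.
IS_A = {
    "AnimalExperiment": "Experiment",
    "Assay": "Experiment",
    "GrowthExperiment": "Experiment",
}

# Hypothetical instances and the class each one was recorded under.
INSTANCES = {
    "exp1": "AnimalExperiment",
    "exp2": "GrowthExperiment",
    "p1": "Participant",
}


def is_a(cls, ancestor):
    """Follow is-a links upward until `ancestor` is found or the chain ends."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False


def instances_of(ancestor):
    """Names of all instances whose class is `ancestor` or one of its subclasses."""
    return sorted(name for name, cls in INSTANCES.items() if is_a(cls, ancestor))


print(instances_of("Experiment"))
```

A query against the general class picks up animal and growth experiments alike, without listing each specialised class by hand.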

Common Understanding...

For instance, in experiments with animals:

And in growth experiments:

Two different time series?

Common Understanding...

And with the same measurements in the same kind of experiment:

different partners...

Experiment Measurements

[Diagram: measurements]

Classes: Measurement, AssayMeasurement, TemporalMeasurement, PainScore, GrowthCount, TimeOfDeath, Age, BacteriaBatchDose.

Links: seven "is-a" (subclass) links.

Attributes: Value, ValueUnit, Medium, Time, TimeUnit, Name, StandardDeviation, Method, Type.
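One reason for recording Time together with TimeUnit is that time series from different partners become comparable. The sketch below is only an illustration under stated assumptions: it assumes that Value, ValueUnit, Time, and TimeUnit belong to TemporalMeasurement and that GrowthCount is a kind of TemporalMeasurement, neither of which can be confirmed from the flattened diagram. The unit table and the "CFU/ml" sample values are hypothetical.

```python
from dataclasses import dataclass

# Conversion factors to hours. The supported units are an assumption.
HOURS_PER_UNIT = {"min": 1 / 60, "h": 1.0, "day": 24.0}


@dataclass(frozen=True)
class TemporalMeasurement:
    # Attribute placement on this class is assumed, not taken from the slides.
    value: float
    value_unit: str
    time: float
    time_unit: str

    def time_in_hours(self) -> float:
        """Express the time point in a single common unit."""
        return self.time * HOURS_PER_UNIT[self.time_unit]


@dataclass(frozen=True)
class GrowthCount(TemporalMeasurement):
    """Assumed here to be a kind of TemporalMeasurement."""


# Hypothetical series from two partners, one recorded in days, one in hours.
partner_a = [GrowthCount(1e5, "CFU/ml", 2, "day"), GrowthCount(1e3, "CFU/ml", 0, "day")]
partner_b = [GrowthCount(5e4, "CFU/ml", 24, "h")]

# Once every point carries its unit, the two series merge on one time axis.
merged = sorted(partner_a + partner_b, key=lambda m: m.time_in_hours())
hours = [m.time_in_hours() for m in merged]
print(hours)
```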

Organisms

[Diagram: organisms]

Classes: Organism, Animal, Microorganism, Bacteria, Human, Mouse, Species, TimeOfDeath, Age, GeographicInformation.

Links: four "is-a" (subclass) links.

Relations: belongsToSpecies, hasAge, hasTimeOfDeath, hasGeographicInformation.

Attributes: Name, Strain, Gender, Carriage, ClinicalDisease, RiskFactors.

Isolate

[Diagram: isolates]

Classes: Isolate, Origin, Environment, Host, Animal, TypingInformation, GeographicInformation, Species.

Relations: hasGeographicInformation, belongsToSpecies, hasTypingInformation.

Attributes: Contigs, Name, GenesFile, OtherName, ProteinsFile, Strain, WholeGenomeSequenced.

Typon concepts: http://www.phyloviz.net/typon/

Experiment with Animals

[Diagram: animal experiments]

Classes: AnimalExperiment, GrowthCount, PainScore, BacteriaBatchDose, Bacteria, Animal, AnimalGroup.

Relations: hasBacteria, hasAnimalGroup, hasAnimal, hasGrowthCount, hasPainScore, hasBacteriaBatchDose.

Attributes: Route, Survival, Strain.

Experiments of Growth

[Diagram: growth experiments]

Classes: GrowthExperiment, GrowthCount, Bacteria.

Relations: hasGrowthCount, hasBacteria.

Attributes: SugarConcentration, TypeOfSugar.

Assays

[Diagram: assays]

Classes: Assay, AssayMeasurement, Bacteria.

Relations: hasAssayMeasurement, hasBacteria.

Attributes: CellType, MOI.

How did this work in practice?

We collected data files from the partners in the project.

We held meetings with partners to understand the concepts involved in their data.

Their data were spread across several spreadsheets, with different formats.

But...

How did this work in practice?

But: partners were not familiar with knowledge representation.

Moreover: they had difficulty understanding how the columns and rows of their spreadsheets transform into triples.

How did we work on this?

How did this work in practice?

We held meetings with partners to understand the concepts involved in their data.

We discussed the common concepts among them and the relations between them.

We developed an ontology.

We showed the partners a "tabular" view of the ontology.

We refined the ontology until we reached general agreement among partners.

We transformed the data according to this new model and integrated it for further analysis.
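The step that partners found hard to picture, spreadsheet rows and columns becoming triples, can be sketched as follows. This is an illustration only, not the project's actual pipeline: the CSV content, the column names, the column-to-property mapping, and the choice of the animal as the subject of hasPainScore are all assumptions.

```python
import csv
import io

# Hypothetical partner spreadsheet exported as CSV. Column names are invented.
SHEET = """animal,time_h,pain_score
m1,0,1
m1,24,3
"""

# Hypothetical mapping from spreadsheet columns to ontology attributes.
COLUMN_TO_PROPERTY = {"time_h": "Time", "pain_score": "Value"}


def rows_to_triples(text):
    """Turn each spreadsheet row into one PainScore node plus its facts."""
    triples = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text))):
        node = f"painScore{i}"  # one measurement node per row
        triples.append((node, "is-a", "PainScore"))
        # Assumption: the animal is the subject of hasPainScore.
        triples.append((row["animal"], "hasPainScore", node))
        # Each mapped column becomes one (node, property, cell value) triple.
        for column, prop in COLUMN_TO_PROPERTY.items():
            triples.append((node, prop, row[column]))
    return triples


result = rows_to_triples(SHEET)
for triple in result:
    print(triple)
```

Each row yields one node, and each mapped cell yields one triple about that node, so a partner with different column names only needs a different mapping table.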

Final Remarks

SPO was developed in the context of a large research project.

It describes knowledge in this field.

It allows validation and aggregation of existing knowledge, which is essential for data integration.

We are continuing to improve and generalize this ontology to describe more aspects of host-pathogen interactions.

Thanks