NIH WORKSHOP: INFORMATICS FOR DATA AND RESOURCE DISCOVERY IN ADDICTION RESEARCH
July 8, 2010
Case Study 5 (NEMO): Informatics tools to support theoretical and practical integration of human neuroscience data
Gwen Frishkoff, Ph.D.
Psychology & Neuroscience, Georgia State University
NeuroInformatics Center, University of Oregon
http://nemo.nic.uoregon.edu
What the computer scientist says…
Should we write out the data to XML or RDF triples? And do you plan to use ontology rules to do complex reasoning or just use SQL to query the data?
What the neuroscientist hears…
Blah blah blah blah blah…data… blah blah
blah? And blah blah blah…the data?
GOALS FOR THIS TUTORIAL
• What is an ontology & what’s it for?
– Why bother? (Case Study: Classification of EEG/ERP data)
– What are some “best practices” in ontology design & implementation?
• What is RDF & what’s it for?
– How does RDF represent information?
– How is it used to link data to ontologies?
– How can ontology-based annotation be used to support classification of data?
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies
The problem (pattern classification)
The methods & tools: ontologies, RDF database
Proof of concept (a worked example)
2-min Primer on EEG/ERP Methods
EEGs (“brainwaves,” or fluctuations in brain electrical potentials) are recorded by placing two or more electrodes on the scalp surface.
256-channel Geodesic Sensor Net ~5,000 ms
Event-related potentials (ERP)
ERPs (“event-related potentials”) are the result of averaging across multiple segments of EEG, time-locking to an event of interest.
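The averaging step described above can be sketched in a few lines. This is a minimal illustration for a single channel, not the NEMO pipeline: the function name `average_erp`, its parameters, and the toy signal are all assumptions made for the example.

```python
from statistics import fmean


def average_erp(eeg, event_samples, window):
    """Average fixed-length EEG segments time-locked to each event.

    eeg           -- voltages for one channel, one value per sample
    event_samples -- sample indices at which the event of interest occurred
    window        -- number of samples to keep from each event onset
    """
    # Cut one segment per event, skipping events too close to the end of the recording.
    segments = [eeg[s:s + window] for s in event_samples if s + window <= len(eeg)]
    if not segments:
        raise ValueError("no complete segments to average")
    # Average across segments, sample by sample.
    return [fmean(samples) for samples in zip(*segments)]


# Toy single-channel recording (hypothetical values) with events at samples 1 and 5.
toy_eeg = [0, 1, 2, 3, 0, 3, 4, 5]
erp = average_erp(toy_eeg, event_samples=[1, 5], window=3)
```

Activity that is not time-locked to the event tends to cancel out across segments, which is why the average exposes the event-related pattern.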
ERP Patterns (“Components”)
ERP patterns are characterized by 3 dimensions (Donchin & Duncan-Johnson, 1977):
1. TIME: peak latency, duration (WHEN in time)
2. SPACE: scalp “topography” (WHERE on scalp)
3. FUNCTION: sensitivity to experiment factors
120 ms
Brain Electrophysiology (EEG/ERP): The promise (Biomarkers of addiction?)
• Tried and true method for noninvasive functional brain mapping
• Millisecond temporal resolution
• Direct measure of neuronal activity
• Portable and inexpensive
• Recent innovations give new windows into rich, multi-dimensional patterns
– More spatial info (high-density EEG)
– More temporal & spectral info (JTF, etc.)
– Multimodal integration & joint recordings of EEG and fMRI
– Specificity of different patterns beyond “reduction in P300” amplitude…
Brain Electrophysiology (EEG/ERP): The challenge
• An embarrassment of riches
– A wealth of data
– A plethora of methods
• A lack of integration
– How to compare patterns across studies, labs?
– How to do valid meta-analyses in ERP research?
• A need for robust pattern classification
– Bottom-up (data-driven) methods
– Top-down (knowledge-driven) methods
[Figure: ERP waveforms labeled with peak latencies of 410 ms, 450 ms, and 330 ms]
A lack of standardization
Will the “real” N400 please step forward?
Hypothetical Database Query: Show me all the N400 patterns in data set X.
Neural ElectroMagnetic Ontologies (NEMO)
The driving goal is to develop methods and tools to support cross-lab, cross-experiment integration of EEG and MEG data.
We bring a set of methods & tools to bear on this goal:
• A set of formal (OWL) ontologies for representation of EEG/MEG and ERP/ERF data
• A suite of tools for ontology-based annotation and analysis of EEG and ERP data
• An RDF database that stores annotated data from our NEMO ERP consortium and supports ERP pattern classification via SPARQL queries
The challenge (EEG pattern classification)
The methods & tools: ontologies, RDF database
Proof of concept (a worked example)
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies
What’s an ontology & what’s it for?
“Highly semantically structured”
What does this mean & what does it buy us?
Ontologies for high-level, explicit representation of domain knowledge
theoretical integration*
*NOTE: We can record pattern definitions from the literature in the ontology without committing to the truth of these records now and forever
Science evolves… So do ontologies!!
Maryann: “Avoid ontology wars…”
Ontology design principles (based on OBO Foundry recommendations)
1. Factor the domain to generate modular (“orthogonal”) ontologies that can be reused and integrated for other projects
2. Reuse existing ontologies (esp. foundational concepts) to define basic (low-level) concepts
3. Validate definitions of high-level concepts using bottom-up (data-driven) as well as top-down (knowledge-driven) methods
4. Collaborate with a community of experts on the design and testing of ontology-based tools for data representation and analysis
Factoring the ERP domain
• TIME
• SPACE
• FUNCTION: modulation of pattern features (time, space, amplitude) under different experiment conditions
Overview: NEMO Ontologies
– NEMO core modules:
• NEMO_spatial
• NEMO_temporal
• NEMO_functional
• NEMO_ERP
• NEMO_data
– NEMO backend:
• NEMO_relations
• NEMO_imports
• NEMO_deprecated
• NEMO_annotation_properties
ERP spatial subdomain
Reuse in dev’t of NEMO Spatial
BFO (Basic Formal Ontology): “UPPER ONTOLOGY”
FMA (Foundational Model of Anatomy): “MIDLEVEL ONTOLOGY”
ERP temporal subdomain
Early (“exogenous”) vs. Late (“endogenous”) ERP patterns
EARLY: ~0-150 ms after event (e.g., stimulus onset)
MID-LATENCY: ~151-500 ms after event (e.g., stimulus onset)
LATE: 501 ms or more after event (e.g., stimulus onset)
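As a rough illustration, the three latency windows can be written as a small lookup. The function name `latency_window` is hypothetical, and because the slide gives the boundaries only approximately ("~"), the handling of values between the stated limits (for example 150.5 ms) is an assumption.

```python
def latency_window(peak_latency_ms):
    """Map a peak latency (ms after the event) to a coarse temporal window.

    Boundaries follow the slide: ~0-150 ms EARLY, ~151-500 ms MID-LATENCY,
    501 ms or more LATE. Treating anything above 150 and up to 500 as
    MID-LATENCY is an assumption, since the slide boundaries are approximate.
    """
    if peak_latency_ms < 0:
        raise ValueError("peak latency must be measured after the event")
    if peak_latency_ms <= 150:
        return "EARLY"
    if peak_latency_ms <= 500:
        return "MID-LATENCY"
    return "LATE"
```

Under these windows, a pattern peaking at 410 ms would be labeled MID-LATENCY.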
Collaboration in dev’t of NEMO ERP
NEMO Functional Ontology
Angela Laird
BrainMap
Jessica Turner
BIRN (now part of Neurolex)
CogPO
http://brainmap.org/scribe/index.html
Reconstituting the ERP domain…
Frishkoff, Frank, et al., 2007
Validation through application of NEMO ontologies in modeling ERP data
The challenge (EEG pattern classification)
The methods & tools: ontologies, RDF database
Proof of concept (a worked example)
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies
Ontologies for high-level, explicit representation of domain knowledge
theoretical integration
RDF to support principled mark-up of data for meta-analysis
practical integration
NEMO International Language & Literacy Consortium
Tim Curran, University of Colorado
Kerry Kilborn, University of Glasgow
Dennis Molfese, University of Louisville
Chuck Perfetti, University of Pittsburgh
John Connolly, McMaster University
Formed in 2007
Annotating EEG/ERP data
Pattern Labels = Functional attributes + Temporal attributes + Spatial attributes
Robert M. Frank
Concepts coded in OWL: NEMO ontology
Data coded in RDF: NEMO database
HOW?
Annotating Data in RDF
• Data Annotation
– The process of marking up or “tagging” data with meaningful symbols; tags may come from an ontology linked to a URI
• URI (Uniform Resource Identifier)
– A compact sequence of characters that identifies an abstract or physical resource (typically located on the Web)
• RDF (Resource Description Framework)
– RDF is a directed, labeled graph (data model) for representing information (typically on the Web)
*See Glossary (http://www.seiservices.com/nida/1014080/ReadingRoom.aspx)
Recall: The goal is to formulate pattern definitions, use them to classify data, and ultimately revise them based on meta-analysis results
Observed Pattern = “N400” iff
Event type is onset of meaningful stimulus (e.g., word) AND
Peak latency is between 300 and 500 ms AND
Scalp region of interest (ROI) is centroparietal AND
Polarity over ROI is negative (<0)
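The four-part definition can be read as a simple predicate over a summary record of one observed pattern. The sketch below is only illustrative: the class `ErpObservation`, its field names, and the string codes for event type and ROI are assumptions, not NEMO identifiers.

```python
from dataclasses import dataclass


@dataclass
class ErpObservation:
    """Summary record for one observed ERP pattern (hypothetical field names)."""
    event_type: str           # e.g. "meaningful_stimulus_onset"
    peak_latency_ms: float    # time of maximal amplitude after the event
    roi: str                  # scalp region of interest
    mean_amplitude_uv: float  # mean amplitude over the ROI, in microvolts


def is_n400(obs):
    """Return True when all four criteria of the N400 definition hold."""
    return (
        obs.event_type == "meaningful_stimulus_onset"
        and 300 <= obs.peak_latency_ms <= 500
        and obs.roi == "centroparietal"
        and obs.mean_amplitude_uv < 0  # negative polarity over the ROI
    )


# Toy observation with invented values, for illustration only.
candidate = ErpObservation("meaningful_stimulus_onset", 410.0, "centroparietal", -2.5)
```

Changing any single criterion (for example, a peak latency of 120 ms) makes the predicate fail, which is what allows the definition to be revised later without touching the data.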
Typical tabular representation of summary ERP data
Peak latency measurement
ERP observation (pattern extracted from “raw” ERP data)
The “RDF Triple”
In RDF form: <001> <type> <NEMO_0000093>
Subject – Predicate – Object
In natural language:
The data represented in row A is an instance of (“is a”) some ERP pattern.
That is, measurements (cells) are “about” ERP patterns (rows).
In graph form:
RDF Triple #2
In RDF form: <002> <type> <NEMO_0745000>
Subject – Predicate – Object
In natural language:
The data represented in cell Z (row A, column 1) is an instance of (“is a”) a peak latency temporal measurement (i.e., the time at which the pattern is of maximal amplitude)
RDF Triple #3
This graph represents an assertion, expressed in RDF: <002> <is_peak_latency_measurement_of> <001>
The data represented in cell Z is a temporal property of the ERP pattern represented in row A
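One way to make the three triples concrete is to hold them as (subject, predicate, object) tuples and print them in N-Triples syntax. This is a sketch in plain Python, with the measurement (002) as the subject of the third triple. The namespace `http://example.org/nemo#` is a placeholder assumption, not the real NEMO namespace, and the helpers `uri`, `to_ntriples`, and `objects` are hypothetical.

```python
BASE = "http://example.org/nemo#"  # placeholder namespace (assumption)
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(local_name):
    """Expand a short name such as '001' into a full URI string."""
    return BASE + local_name


# The three triples from the slides: row A is 001, cell Z is 002.
triples = [
    (uri("001"), RDF_TYPE, uri("NEMO_0000093")),                      # 001 is an ERP pattern
    (uri("002"), RDF_TYPE, uri("NEMO_0745000")),                      # 002 is a peak latency measurement
    (uri("002"), uri("is_peak_latency_measurement_of"), uri("001")),  # 002 is about 001
]


def to_ntriples(triples):
    """Serialize URI-only triples, one '<s> <p> <o> .' statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def objects(triples, subject, predicate):
    """Follow one labeled edge of the graph: all objects for a subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(to_ntriples(triples))
```

Because every statement has the same three-part shape, new annotations can be added as extra triples without changing any table layout.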
The challenge (EEG pattern classification)
The methods & tools: ontologies, RDF database
Proof of concept (a worked example)
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies
Formulating pattern rules in the ontology
First, we write the rule in semi-natural language:
IF (1) 001 type ERP_spatiotemporal_pattern
• and (2) 002 type peak_latency_measurement_datum
• and (3) 002 is_peak_latency_measurement_of 001
• and (4) 002 has_numeric_value X
• and (5) 500 >= X >= 300 (X has datatype decimal)
THEN (6) 001 type N400_pattern
(in reality, there are spatial, temporal, & functional criteria…)
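To show how conditions (1) through (5) lead to conclusion (6), the rule can be applied by hand to a small set of triples. This is a toy sketch in plain Python, not how the OWL/RDF tooling evaluates rules: the function `infer_n400` is hypothetical, and the observations 003/004 and both latency values are invented for illustration.

```python
from decimal import Decimal

# Toy triple store using the rule's own term names; all values are invented.
triples = {
    ("001", "type", "ERP_spatiotemporal_pattern"),
    ("002", "type", "peak_latency_measurement_datum"),
    ("002", "is_peak_latency_measurement_of", "001"),
    ("002", "has_numeric_value", Decimal("410")),
    ("003", "type", "ERP_spatiotemporal_pattern"),
    ("004", "type", "peak_latency_measurement_datum"),
    ("004", "is_peak_latency_measurement_of", "003"),
    ("004", "has_numeric_value", Decimal("120")),
}


def infer_n400(triples):
    """Return the new 'type N400_pattern' triples licensed by the rule."""
    inferred = set()
    for measurement, predicate, pattern in triples:
        # Condition (3): the measurement is a peak latency measurement of the pattern.
        if predicate != "is_peak_latency_measurement_of":
            continue
        # Conditions (1) and (2): both ends of the link have the required types.
        if (pattern, "type", "ERP_spatiotemporal_pattern") not in triples:
            continue
        if (measurement, "type", "peak_latency_measurement_datum") not in triples:
            continue
        # Conditions (4) and (5): the numeric value X lies between 300 and 500.
        for subject, value_predicate, value in triples:
            if (subject == measurement and value_predicate == "has_numeric_value"
                    and Decimal("300") <= value <= Decimal("500")):
                inferred.add((pattern, "type", "N400_pattern"))  # conclusion (6)
    return inferred
```

With this toy data only 001 is classified, because the latency attached to 003 falls outside the 300-500 ms window.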
Translating the rule into OWL/RDF
Next, we convert the rule to a SPARQL query by replacing natural language terms with the corresponding URIs (tags) from the NEMO ontology:
• type → rdf:type
• ERP_spatiotemporal_pattern → NEMO_0000093
• peak_latency_measurement → NEMO_0745000
• is_measurement_of → NEMO_9278000
• has_numeric_value → NEMO_7943000
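The slides do not reproduce the query text, so the following is only a sketch of what the substitution might produce. It builds a SPARQL string from the term-to-ID mapping above; the `nemo:` prefix URI is a placeholder assumption, and the function name `build_n400_query` and the variable names in the query are hypothetical.

```python
NEMO_NS = "http://example.org/nemo#"  # placeholder namespace (assumption)

# Term-to-ID mapping taken from the slide.
TERM_TO_ID = {
    "ERP_spatiotemporal_pattern": "NEMO_0000093",
    "peak_latency_measurement": "NEMO_0745000",
    "is_measurement_of": "NEMO_9278000",
    "has_numeric_value": "NEMO_7943000",
}


def build_n400_query(min_ms=300, max_ms=500):
    """Return a SPARQL SELECT query for patterns whose peak latency is in range."""
    t = {name: f"nemo:{ident}" for name, ident in TERM_TO_ID.items()}
    return f"""PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX nemo: <{NEMO_NS}>
SELECT ?pattern ?latency
WHERE {{
  ?pattern rdf:type {t['ERP_spatiotemporal_pattern']} .
  ?measurement rdf:type {t['peak_latency_measurement']} .
  ?measurement {t['is_measurement_of']} ?pattern .
  ?measurement {t['has_numeric_value']} ?latency .
  FILTER (?latency >= {min_ms} && ?latency <= {max_ms})
}}"""


print(build_n400_query())
```

Each line of the WHERE clause corresponds to one numbered condition of the rule, and the FILTER expresses the latency bounds.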
Executing the query
Finally, we load Virtuoso’s SPARQL interface (http://nemo.nic.uoregon.edu:8890/sparql), paste the query into the Query textbox, and click Run Query.
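The same query could in principle be submitted programmatically rather than through the web form. The sketch below only builds the HTTP request, following the standard SPARQL protocol convention of a `query` parameter on a GET request; it does not send anything. The helper name `make_sparql_request` is hypothetical, and whether the endpoint shown on the slide is still reachable is not something assumed here.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://nemo.nic.uoregon.edu:8890/sparql"  # URL from the slide; may no longer be live


def make_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON results format via content negotiation.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = make_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
```

Passing the request to `urllib.request.urlopen` would execute it against a live endpoint; that step is left out so the sketch runs without network access.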
…and Virtuoso returns the following results (for example):
As a result, we can deduce that ERP observations 0002, 0003, 0004, 0006, and 0140 are N400 pattern instances… QED
Take-home message from CARMEN project:
“Raw data remains static; metadata evolves.”
(note this implies that the ontology also evolves!)
“Data integrity is preserved; the science has room to develop”
NEMO Database Design
NOW YOU SHOULD KNOW…
• What is an ontology & what’s it for?
– Why bother?
– What are some “best practices” in ontology design & implementation?
• What is RDF & what’s it for?
– How does RDF represent information?
– How is it used to link data to ontologies?
– How can ontology-based annotation be used to support classification of data?
Acknowledgments
Funding from the National Institutes of Health (NIBIB), R01-MH084812 (Dou, Frishkoff, Malony)
NEMO Ontology Task Force
Robert M. Frank (NIC)
Dejing Dou (CIS)
Paea LePendu (CIS)
Haishan Liu (CIS)
Allen Malony (NIC, CIS)
Snezana Nikolic (PSY, GSU)
NEMO EEG/MEG Data Consortium
Tim Curran (U. Colorado)
Dennis Molfese (U. Louisville)
John Connolly (McMaster U.)
Kerry Kilborn (Glasgow U.)
Charles Perfetti (U. Pittsburgh)
Special thanks to:
Maryann Martone & associates (NIF)
Jessica Turner (cogPO)
Angela Laird (BrainMap)
Scott Makeig & Jeff Grethe (EEGLAB/HeadIT)
www.nemo.nic.uoregon.edu