![Page 1: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/1.jpg)
How to make ImmPort data fit for secondary use
Barry Smith
http://ontology.buffalo.edu/smith
![Page 2: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/2.jpg)
Goals of ImmPort
• Accelerate a more collaborative and coordinated research environment
• Create an integrated database that broadens the usefulness of scientific data
• Advance the pace and quality of scientific discovery
• Integrate relevant data sets from participating laboratories, public and government databases, and private data sources
• Promote rapid availability of important findings
• Provide analysis tools to advance immunological research
![Page 3: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/3.jpg)
Improve immunology research through enhanced
• Collaboration
• Coordination
• Discoverability
• Integration
• Analyzability
Hypothesis: all of these ends will be promoted by describing ImmPort data using terms from shared, high-quality ontologies
![Page 4: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/4.jpg)
ImmPort data is already being tagged with ontology terms
For example
• where data is prepared to meet FDA requirements
• where data is published to meet NIH mandates for reusability
• in the post-submission phase, where data is analyzed by third parties
But this tagging is
• partial
• uncoordinated
• reliant on ontologies and analysis tools of varying quality
![Page 5: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/5.jpg)
![Page 6: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/6.jpg)
SDY 165: Characterization of in vitro Stimulated B Cells from Human Subjects shared to Semi-Public Workspace (SPW) Project
![Page 7: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/7.jpg)
SDY 165: Characterization of in vitro Stimulated B Cells from Human Subjects shared to Semi-Public Workspace (SPW) Project
During the human B cell (Bc) recall response, rapid cell division results in multiple Bc subpopulations. RNA microarray and functional analyses showed that proliferating CD27lo cells are a transient pre-plasmablast population, expressing genes associated with Bc receptor editing. Undivided cells had an active transcriptional program of non-ASC B cell functions, including cytokine secretion and costimulation, suggesting a link between innate and adaptive Bc responses. Transcriptome analysis suggested a gene regulatory network for CD27lo and CD27hi Bc differentiation.
• In vitro stimulated B cells from human subjects
• B cell receptor editing
![Page 8: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/8.jpg)
SDY 165: Characterization of in vitro Stimulated B Cells from Human Subjects shared to Semi-Public Workspace (SPW) Project
![Page 9: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/9.jpg)
PubMed 22468229
![Page 10: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/10.jpg)
Discoverability: examples
• Find [ImmPort] data pertaining to in vitro stimulated B cells from human subjects
• Find studies of genes associated with B cell receptor editing in human subjects
• Find all data in public and government databases relating to B cell receptor editing
![Page 11: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/11.jpg)
Discoverability through literature search
Two queries:
– In vitro stimulated B cells from human subjects
– B cell receptor editing
on
• PubMed
• MeSH (Medical Subject Headings)
• Google
![Page 12: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/12.jpg)
PubMed 22468229
![Page 13: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/13.jpg)
PubMed retrieves 144 results for “In vitro stimulated B cells from human subjects” – Zand paper not found
![Page 14: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/14.jpg)
PubMed retrieves 0 results for “Zand[Author] AND In vitro stimulated B cells from human subjects”
![Page 15: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/15.jpg)
PubMed retrieves 179 results for “B cell receptor editing” – Zand paper not found
![Page 16: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/16.jpg)
MeSH results for “In vitro stimulated B cells from human subjects”
![Page 17: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/17.jpg)
MeSH results for “in vitro stimulated B cells from human subjects”
![Page 18: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/18.jpg)
MeSH results for “B Cell receptor editing”
![Page 19: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/19.jpg)
Google retrieves 180 results for “In vitro stimulated B cells from human subjects” – Zand paper not found
![Page 20: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/20.jpg)
Jackpot
![Page 21: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/21.jpg)
How to make this [ImmPort data]
SDY 165: Characterization of in vitro Stimulated B Cells from Human Subjects shared to Semi-Public Workspace (SPW) Project
During the human B cell (Bc) recall response, rapid cell division results in multiple Bc subpopulations. RNA microarray and functional analyses showed that proliferating CD27lo cells are a transient pre-plasmablast population, expressing genes associated with Bc receptor editing. Undivided cells had an active transcriptional program of non-ASC B cell functions, including cytokine secretion and costimulation, suggesting a link between innate and adaptive Bc responses. Transcriptome analysis suggested a gene regulatory network for CD27lo and CD27hi Bc differentiation.
discoverable?
![Page 22: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/22.jpg)
B cell receptor editing
GO:0002452
![Page 23: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/23.jpg)
GO definition
GO provides a definition
![Page 24: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/24.jpg)
and position in GO hierarchy
-- hierarchy allows logical reasoning
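The kind of reasoning a hierarchy permits can be sketched in a few lines of Python. This is only an illustration: apart from GO:0002452 (B cell receptor editing), the parent terms and their `EX:` identifiers are made-up placeholders, not the actual GO structure, and the function names are hypothetical.

```python
# Toy is_a hierarchy. Only GO:0002452 comes from the slides; the "EX:" terms
# are hypothetical placeholders and do NOT reflect the real GO graph.
IS_A = {
    "GO:0002452": ["EX:receptor_editing"],  # B cell receptor editing
    "EX:receptor_editing": ["EX:immune_process"],
    "EX:immune_process": [],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def is_kind_of(term, candidate, is_a=IS_A):
    """True if `term` is `candidate` itself or one of its descendants."""
    return term == candidate or candidate in ancestors(term, is_a)


if __name__ == "__main__":
    # Data tagged with the specific term is also found by a broader query.
    print(is_kind_of("GO:0002452", "EX:immune_process"))
```

With tags of this kind, a query for a broad term can retrieve data annotated with any of its more specific descendants, which free-text search cannot do.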
![Page 25: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/25.jpg)
GOPubMed: 179 results for “B cell receptor editing”
![Page 26: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/26.jpg)
(B cell receptor editing Zand) AND ("Zand"[au])
Why are zero documents retrieved?
![Page 27: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/27.jpg)
Proposal
1. Tag ImmPort SDY abstracts with GO URIs
2. Publish the results to the GO Annotation database
During the human B cell recall response, rapid cell division results in multiple B cell subpopulations. RNA microarray and functional analyses showed that proliferating CD27lo cells are a transient pre-plasmablast population, expressing genes associated with B cell receptor editing. Undivided cells had an active transcriptional program of non-ASC B cell functions, including cytokine secretion and costimulation, suggesting a link between innate and adaptive Bc responses. Transcriptome analysis suggested a gene regulatory network for CD27lo and CD27hi Bc differentiation.
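Step 1 of the proposal might look roughly like the following minimal sketch. It assumes a hand-built phrase lexicon and plain substring matching; the identifier GO:0002452 is the one shown in the slides, while the PURL form of the URI, the one-entry lexicon, and the name `tag_abstract` are assumptions made for illustration. A real pipeline would use a curated annotation tool, not this lookup.

```python
import re

# Phrase-to-URI lookup. GO:0002452 is the identifier shown in the slides;
# the PURL form of the URI and the single-entry lexicon are assumptions.
GO_LEXICON = {
    "b cell receptor editing": "http://purl.obolibrary.org/obo/GO_0002452",
}


def tag_abstract(text, lexicon=GO_LEXICON):
    """Return (phrase, URI) pairs for every lexicon phrase found in `text`."""
    # Lower-case and collapse whitespace so line breaks do not hide a phrase.
    normalized = re.sub(r"\s+", " ", text.lower())
    return [(phrase, uri) for phrase, uri in lexicon.items() if phrase in normalized]


if __name__ == "__main__":
    sentence = (
        "Proliferating CD27lo cells are a transient pre-plasmablast population, "
        "expressing genes associated with B cell receptor editing."
    )
    for phrase, uri in tag_abstract(sentence):
        print(phrase, "->", uri)
```

Note that the abbreviated wording "Bc receptor editing" would not match this lexicon, which is one reason the abstract on this slide spells the term out.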
![Page 28: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/28.jpg)
But GO is not enough
See http://ncorwiki.buffalo.edu/index.php/Immunology_Ontologies
• immune disorders
• infectious diseases
• allergies
• immune epitopes, etc.
For the special case of Flow Cytometry and CyTOF:
ImmPort Ontology Meeting, Stanford, September 4-5, 2013: http://x.co/1W1Om
![Page 29: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/29.jpg)
Files in SDY 165
![Page 30: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/30.jpg)
lk_race.txt
American Indian or Alaska Native
Asian
Black or African American
Native Hawaiian or Other Pacific Islander
Not_Specified
Other
Unknown
White
![Page 31: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/31.jpg)
ImmPort Templates
https://immport.niaid.nih.gov/immportWeb/experimental/displaySubmitTemplates.do
![Page 32: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/32.jpg)
ImmPort Templates: Race
https://immport.niaid.nih.gov/immportWeb/experimental/displaySubmitTemplates.do
![Page 33: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/33.jpg)
ImmPort Templates
How to specify Race if Race = ‘Other’?
![Page 34: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/34.jpg)
ImmPort Templates
How to specify “Subject Phenotype”?
![Page 35: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/35.jpg)
NG / BISC proposal
Create controlled vocabularies (ontology drop-down lists) for fields currently populated by submitters with free text
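What a controlled vocabulary buys at submission time can be shown with a small sketch. The allowed values are the ones listed in lk_race.txt earlier in the deck; the function name `validate_field` and the exact-match rule are assumptions for illustration, not ImmPort's actual validation logic.

```python
# Allowed values copied from lk_race.txt as listed in the slides.
LK_RACE = {
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "Not_Specified",
    "Other",
    "Unknown",
    "White",
}


def validate_field(value, vocabulary=LK_RACE):
    """Return the trimmed value if it is a permitted term, else raise ValueError."""
    cleaned = value.strip()
    if cleaned not in vocabulary:
        raise ValueError(f"{cleaned!r} is not a permitted term")
    return cleaned


if __name__ == "__main__":
    print(validate_field(" Asian "))
    try:
        validate_field("asian")  # free-text variant, rejected
    except ValueError as err:
        print("rejected:", err)
```

A drop-down list enforces the same constraint in the user interface, so that variants such as "asian" or "Asian-American" never reach the database.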
![Page 36: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/36.jpg)
Files in SDY 165
![Page 37: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/37.jpg)
![Page 38: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/38.jpg)
lk_sample_type
Proposal: where controlled vocabularies exist, provide definitions for all terms
![Page 39: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/39.jpg)
Two kinds of definitions
• human readable definitions support consistency of data entry
• logical definitions
– allow logical analysis of data
– support aggregation of data
– allow automatic validation of consistent data entry
Definitions can often be taken over from already existing public domain ontologies such as GO
• use of ready-made definitions supports discoverability, and creates automatic linkage to huge bodies of public domain data
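How a logical definition supports aggregation and validation can be sketched as follows. The sketch uses only the simplest machine-readable part of a logical definition, a parent class for each term. The sample-type terms, their parents, and the function name are hypothetical; the real lk_sample_type values are not reproduced here.

```python
from collections import Counter

# Hypothetical sample-type terms, each with one parent class. These are
# illustrative stand-ins, not the actual lk_sample_type vocabulary.
PARENT = {
    "PBMC": "blood-derived sample",
    "serum": "blood-derived sample",
    "whole blood": "blood-derived sample",
    "saliva": "other sample",
}


def aggregate_by_parent(sample_types, parent=PARENT):
    """Count records per parent class; terms without a definition are rejected."""
    unknown = sorted(set(sample_types) - set(parent))
    if unknown:
        raise ValueError(f"undefined terms: {unknown}")
    return Counter(parent[s] for s in sample_types)


if __name__ == "__main__":
    records = ["PBMC", "serum", "saliva", "PBMC"]
    print(aggregate_by_parent(records))
```

The same lookup does double duty: it rolls specific terms up into broader classes and flags any entry that has no definition at all.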
![Page 40: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/40.jpg)
ImmPort Antibody Registry (Diehl et al.)
from BD Lyoplate Screening Panels Human Surface Markers
![Page 41: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/41.jpg)
Discoverability
![Page 42: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/42.jpg)
Where did this lk_sample_type list come from?
![Page 43: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/43.jpg)
CDISC
• Clinical Data Interchange Standards Consortium
• http://www.cdisc.org/
![Page 44: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/44.jpg)
CDISC Glossary
![Page 45: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/45.jpg)
SDTM
• Study Data Tabulation Model developed by FDA as part of CDISC
– for Race, Gender, Ethnicity, …
– no human readable definitions
– no logical definitions
Jan 2013: release of CDISC SDTM Model by CDISC2RDF (Kerstin Forsberg of AstraZeneca)
![Page 46: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/46.jpg)
PHUSE (EU, Roche, AstraZeneca, FDA, …) project to incorporate ontology technology into CDISC
![Page 47: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/47.jpg)
BRIDG
• http://bridgmodel.nci.nih.gov/files/BRIDG_Model_3.2_html/index.htm
• Biomedical Research Integrated Domain Group (BRIDG) Project
![Page 48: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/48.jpg)
BRIDG 3.2 Domain Analysis Model
![Page 49: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/49.jpg)
Other strategies to simplify creation of structured data for submission into ImmPort
• ELN: Electronic Lab Notebooks
– PRIME: “Contur ELN has been automating the process of data deposition into ImmPort, making it much easier for our researchers to submit data to ImmPort”
• CTMS: Clinical Trial Management Systems
• EHR: Electronic Health Records
– experiments to prepopulate EHR data into CTMS and from there into case report forms (and into ImmPort?)
• Minimal Information Checklists
![Page 50: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/50.jpg)
MIFlowCyt: Minimum Information about a Flow Cytometry Experiment
![Page 51: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/51.jpg)
Checklist strategy for creating public data repositories via journals
• 75% of articles in Cytometry A are MIFlowCyt-compliant
• Result: a growing repository of flow cytometry data (flowrepository.org)
• OBI = Ontology for Biomedical Investigations, an ontology to support creation of structured data about clinical and biological experiments
![Page 52: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/52.jpg)
http://mibbi.sourceforge.net/portal.shtml
![Page 53: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/53.jpg)
Proposal
Advertise on the ImmPort website best (= most successful) practices from
• ELN: Electronic Lab Notebooks
• CTMS: Clinical Trial Management Systems
• EHR: Electronic Health Records
• Minimal Information Checklists
![Page 54: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/54.jpg)
![Page 55: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/55.jpg)
NIAID Sample Data Sharing Plan (Last Reviewed February 12, 2013)
• Sharing of data generated by this project is an essential part of our proposed activities and will be carried out in several different ways.
• Presentations at national scientific meetings. … it is expected that approximately four presentations at national meetings would be appropriate. …
• Annual lectureship. A lectureship has brought to the University distinguished scientists and clinicians …
• Newsletter. The [disease interest group] publishes a newsletter …
• Web site of the Interest Group. The [interest group] currently maintains a Web site where information [about the disease] is posted …
• Annual [Disease] Awareness week. …
• SAGE Library Data. It is our explicit intention that these [Serial analysis of gene expression] data will be placed in a readily accessible public database. …
![Page 56: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/56.jpg)
NIAID Sample Data Sharing Plan
• SAGE Library Data. It is our explicit intention that these [Serial analysis of gene expression] data will be placed in a readily accessible public database. …
– but how will these data be described?
![Page 57: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/57.jpg)
Proposal
All data sharing plans for NIAID-funded research should require:
• paper abstracts and SDY summaries be tagged with ontology terms
• tables and figures in papers be tagged with ontology terms
![Page 58: How to make ImmPort data fit for secondary use](https://reader036.vdocuments.net/reader036/viewer/2022081505/568164da550346895dd72af7/html5/thumbnails/58.jpg)
See http://ncorwiki.buffalo.edu/index.php/Immunology_Ontologies
ImmPort Ontology Meeting, Stanford, September 4-5, 2013: http://x.co/1W1Om
Further information from [email protected]