metadata for carmen phillip lord and frank gibson
Post on 18-Dec-2015
Metadata For CARMEN
Phillip Lord and Frank Gibson
Problems
• “In the standard model, one collects data, publishes a paper or papers and then gradually loses the original dataset.”
• THE NEW KNOWLEDGE ECONOMY AND SCIENCE AND TECHNOLOGY POLICY Geoffrey Bowker, University of California, San Diego
The need for clear metadata
• Most neurosciences data is relatively simple in structure
• But often contextually complex
• Sometimes associated with behavioural features
Information Extraction
• How do we extract the information?
http://en.wikipedia.org/wiki/Image:Brain_090407.jpg
http://en.wikipedia.org/wiki/Image:ATTtelephone-large.jpg
istockphoto.com
Multi-Author data
No. Author PMID Type Size
1 Davierwala et al 16155567 Synthetic_Lethality 627
2 Krogan et al 14759368 Affinity_Capture-MS 164
3 Hazbun et al 14690591 Affinity_Capture-MS 3210
4 Gavin et al 11805826 Affinity_Capture-MS 3596
5 Ho et al 11805837 Affinity_Capture-MS 733
6 Ito et al 11283351 Two-hybrid 275
7 Tong et al 11743205 Synthetic_Lethality 3411
8 Tong et al 14764870 Synthetic_Lethality 823
9 Uetz et al 10688190 Two-hybrid 1941
10 Miller et al 16093310 Two-hybrid 104
11 Lindstrom et al 12556496 Affinity_Capture-MS 134
12 Nissan et al 12374754 Affinity_Capture-MS 456
13 Grandi et al 12150911 Affinity_Capture-MS 150
14 Ohi et al 11884590 Affinity_Capture-MS 630
15 Krogan et al 14690608 Affinity_Capture-MS 370
16 Sanders et al 12052880 Affinity_Capture-MS 102
17 Baetz et al 14729968 Affinity_Capture-MS 258
18 Fromont-Racine et al 10900456 Two-hybrid 160
19 Fromont-Racine et al 9207794 Two-hybrid 182
20 Drees et al 11489916 Two-hybrid 232
21 Tong et al 11743162 Affinity_Capture-Western 125
22 Allen et al 11387327 Affinity_Capture-MS 116
23 Panse et al 15292183 Affinity_Capture-MS 181
24 Krogan et al 15353583 Affinity_Capture-MS 113
25 Kong et al 15563457 Protein-peptide 157
26 Hannich et al 15590687 Two-hybrid 134
27 Newman et al 11087867 Two-hybrid 464
28 Zhao et al 15766533 Reconstituted_Complex 125
29 Millson et al 15879519 Affinity_Capture-Western 369
30 Ubersax et al 14574415 Biochemical_Activity 138
31 Ingvarsdottir et al 15657441 Affinity_Capture-Western 175
32 Lesage et al 15166135 Synthetic_Lethality 292
33 Lesage et al 15715908 Synthetic_Lethality 323
34 Pan et al 15525520 Synthetic_Lethality 124
35 Loeillet et al 15725626 Synthetic_Lethality 214
36 Daniel et al 16157669 Synthetic_Lethality 4535
37 Pan et al 16487579 Synthetic_Growth_Defect 7076
38 Krogan et al 16554755 Affinity_Capture-MS 6531
39 Gavin et al 16429126 Affinity_Capture-MS 107
40 Milgrom et al 16118188 Synthetic_Lethality 215
41 Measday et al 16172405 Two-hybrid 477
42 Graumann et al 14660704 Affinity_Capture-MS 4179
43 Ptacek et al 16319894 Biochemical_Activity 103
44 Frazier et al 16476776 Co-fractionation 290
45 Ye et al 16729061 Synthetic_Rescue 3416
46 Schuldiner et al 16269340 Phenotypic_Enhancement 14421
47 Collins et al 17314980 Phenotypic_Enhancement 9064
48 Collins et al 17200106 Affinity_Capture-MS 117
49 Aronova et al 17507646 Co-fractionation 576
50 Wong et al 17634282 Two-hybrid 234
From Katherine James, NCL
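A minimal sketch of why the per-dataset metadata matters: once each dataset carries its author, PMID and experiment type, datasets from many authors can be grouped and compared. The rows are the first six from the table above; the function name `total_size_by_type` is an assumption made for illustration and does not come from the slides.

```python
from collections import defaultdict

# First six rows of the multi-author table: (author, PMID, type, size).
ROWS = [
    ("Davierwala et al", 16155567, "Synthetic_Lethality", 627),
    ("Krogan et al", 14759368, "Affinity_Capture-MS", 164),
    ("Hazbun et al", 14690591, "Affinity_Capture-MS", 3210),
    ("Gavin et al", 11805826, "Affinity_Capture-MS", 3596),
    ("Ho et al", 11805837, "Affinity_Capture-MS", 733),
    ("Ito et al", 11283351, "Two-hybrid", 275),
]


def total_size_by_type(rows):
    """Sum the dataset sizes for each experiment type (hypothetical helper)."""
    totals = defaultdict(int)
    for _author, _pmid, experiment_type, size in rows:
        totals[experiment_type] += size
    return dict(totals)


if __name__ == "__main__":
    for experiment_type, total in sorted(total_size_by_type(ROWS).items()):
        print(experiment_type, total)
```

Without the type column this grouping would not be possible, which is the point of recording it alongside the data.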
How do we represent…
Laboratory Experiments
In silico Analysis
Derived data
Joseph Whitworth
http://en.wikipedia.org/wiki/Image:Joseph_whitworth.jpg
http://en.wikipedia.org/wiki/Image:Screw_thread_Z%C3%A1vit_M16.jpg
The need for standards!
• “established by consensus and approved by a recognized body, that provides, […] rules, […] for […] the optimum degree of order in a given context”
• BSI: http://www.bsi-global.com/en/Standards-and-Publications/About-standards/Glossary/
View from microarrays
Content Standard – Minimal Information
MAGE -- Structure MO -- Terminology
From the MGED society
Life science communities
Society Domain Website
The Genomics Standards Consortium (GSC)
Genomics http://darwin.nox.ac.uk/gsc/
Microarray and Gene Expression Data Society (MGED)
Genomics www.mged.org
Proteomics Standards Initiative (PSI)
Proteomics http://psidev.info
Metabolomics Standards Initiative (MSI)
Metabolomics www.metabolomicssociety.org
Flow Cytometry experiment Community
Flow Cytometry www.flowcyt.org
MINI – electrophysiology
• General Features
• Study Subject
• Recording Location
• Task
• Stimulus
• Recording
• Time Series Data
Recording Location
• Recording Location Structure
• Brain Area
• Slice Thickness
• Slice Orientation
• Cell Type
– Cell Type co-ordinates
– Location conformation
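A minimal sketch of how a checklist like this could be applied to a submitted record. The section and field names are taken from the two lists above. The dictionary layout, the function name `missing_items`, the placeholder values, and the choice to treat every item as required are all assumptions for illustration, not part of MINI itself.

```python
# Checklist sections, as listed on the MINI slide.
REQUIRED_SECTIONS = [
    "General Features",
    "Study Subject",
    "Recording Location",
    "Task",
    "Stimulus",
    "Recording",
    "Time Series Data",
]

# Items listed under Recording Location.
RECORDING_LOCATION_FIELDS = [
    "Recording Location Structure",
    "Brain Area",
    "Slice Thickness",
    "Slice Orientation",
    "Cell Type",
]


def missing_items(record):
    """Return the checklist items that a record (a plain dict) leaves empty."""
    missing = [s for s in REQUIRED_SECTIONS if not record.get(s)]
    location = record.get("Recording Location")
    if location:
        # Only inspect the fields when the section itself is present.
        missing += [
            "Recording Location/" + f
            for f in RECORDING_LOCATION_FIELDS
            if not location.get(f)
        ]
    return missing


# Invented example record; the values are placeholders, not real data.
EXAMPLE = {
    "General Features": {"note": "placeholder"},
    "Study Subject": {"note": "placeholder"},
    "Recording Location": {"Brain Area": "placeholder", "Cell Type": "placeholder"},
    "Task": {"note": "placeholder"},
    "Recording": {"note": "placeholder"},
}

if __name__ == "__main__":
    for item in missing_items(EXAMPLE):
        print("missing:", item)
```

A check of this kind only reports what is absent. It says nothing about whether the supplied values are correct.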
Functional Genomics Experiment (FuGE)
• Model of common components in science investigations, such as materials, data, protocols, equipment and software.
• Provides a framework for capturing complete laboratory workflows, enabling the integration of pre-existing data formats.
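A minimal sketch of the idea of modelling common components (materials, data, protocols, equipment, software) and chaining them into a workflow. It is loosely inspired by the description above and is not the FuGE object model. The class names `Item`, `Protocol` and `Step`, the function `provenance`, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Item:
    """A material or a piece of data that flows through the workflow."""
    name: str
    kind: str  # "material" or "data"


@dataclass
class Protocol:
    """A procedure, with the equipment and software it relies on."""
    name: str
    equipment: List[str] = field(default_factory=list)
    software: List[str] = field(default_factory=list)


@dataclass
class Step:
    """One application of a protocol, turning inputs into outputs."""
    protocol: Protocol
    inputs: List[Item]
    outputs: List[Item]


def provenance(item, steps):
    """List the protocol names that led to `item`, earliest first.

    Assumes the workflow has no cycles and each item has one producing step.
    """
    for step in steps:
        if item in step.outputs:
            history = []
            for source in step.inputs:
                for name in provenance(source, steps):
                    if name not in history:
                        history.append(name)
            return history + [step.protocol.name]
    return []  # nothing produced it, so it is a starting material


# Invented example: a lab experiment followed by an in silico analysis.
sample = Item("sample", "material")
raw = Item("raw recording", "data")
derived = Item("derived result", "data")
STEPS = [
    Step(Protocol("recording", equipment=["amplifier"]), [sample], [raw]),
    Step(Protocol("analysis", software=["analysis script"]), [raw], [derived]),
]

if __name__ == "__main__":
    print(provenance(derived, STEPS))
```

Because laboratory and in silico steps share one structure, derived data can be traced back to the material it came from.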
Robot
Reference set of 5,000 mutant strains
‘Folate’ + - + -
‘MMS’ - - + +
• Data curation.
• Functional analysis.
• Interactions with in silico programme.
Robot
Screen mutants for sensitivity to damage/nutrition
Part of CISBAN in a nutshell
CISBAN dataflow
Neil Wipat, Newcastle University
Data Entry with SyMBA
http://symba.sourceforge.net/
Allyson Lister, Newcastle University
Data Entry with SyMBA
Summary
• We are generating metadata “standards” for neurosciences
• We are following a well-trodden path from bioinformatics
• We adopted FuGE and have built MINI
Future Work
• More neurosciences experimental datatypes.
• Minimal Information about a Service
– Describe analysis software as well as lab experiments.
• Outreach!
Acknowledgements
MINI: Frank Gibson, Paul G Overton, Tom V Smulders, Simon R Schultz, Stephen J Eglen, Colin D Ingram, Stefano Panzeri, Phil Bream, Evelyne Sernagor, Mark Cunningham, Christopher Adams, Christoph Echtermeyer, Jennifer Simonotto, Marcus Kaiser, Daniel C Swan, Martyn Fletcher, Phillip Lord
CISBAN: Anil Wipat (PI), Allyson Lister (Research Associate),