

NEURODATABASE.ORG AND NEUROANALYSIS.ORG: TOOLS AND RESOURCES FOR DATA DISCOVERY

Daniel Gardner¹,², Eliza Chan¹, David H. Goldberg¹, Ajit B. Jagdale¹, Adrian Robert¹, Jonathan D. Victor²

¹Lab of Neuroinformatics, Dept of Physiology; ²Dept of Neurology & Neuroscience, Weill Cornell Med Coll, NY, NY

AID NEUROSCIENCE DATA DISCOVERY

• Share your data at neurodatabase.org

• Download shared data from neurodatabase.org

• Analyze your data or shared data using information-theoretic tools from neuroanalysis.org

• Contribute your analytic algorithms to neuroanalysis.org

• Build a database for your data with the Neurodatabase Construction Kit

• Share terminologies for your domain of neuroscience at BrainML.org

Advances in Neuroinformatics
These new capabilities rely on many enabling advances in the underlying code and resources. We cite three:

BrainML for neuroanalysis.org and neurodatabase.org includes:
• methods and algorithms for additional analyses
• expanded descriptors for recording location, including extensive anatomic/functional locations, Brodmann cytoarchitectonic areas, and depth/layers for cortex, cerebellum, and spinal cord
• many new neuron types (under development)

BrainML is enriched with terminologies from the NIH Blueprint Neuroscience Information Framework (NIF; see adjacent poster 100.9) to allow both neurodatabase.org queries via NIF and also NIF-compatibility via the Neurodatabase Construction Kit.

Neurodatabase.org tools and DataServer have been updated to aid archiving of and access to larger datasets, up to 25 MB in size and up to 20,000 electrophysiological traces per experiment or submission. For convenience, query results yielding such large datasets offer metadata alone for preliminary screening, with selected datasets then served in chunks of 50 traces. An HTML interface is provided for queries using iPhone and similar devices.
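The chunked delivery of large datasets described above can be pictured with a short sketch. This is an illustration only, not the DataServer's code: the helper name `iter_trace_chunks` and the in-memory list of traces are assumptions, and only the chunk size of 50 and the 20,000-trace limit come from the text.

```python
from typing import Iterator, List, Sequence

CHUNK_SIZE = 50      # traces per chunk, as stated in the text
MAX_TRACES = 20_000  # per-submission trace limit, as stated in the text


def iter_trace_chunks(traces: Sequence, chunk_size: int = CHUNK_SIZE) -> Iterator[List]:
    """Yield successive chunks of at most `chunk_size` traces (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(traces) > MAX_TRACES:
        raise ValueError("dataset exceeds the per-submission trace limit")
    for start in range(0, len(traces), chunk_size):
        # The final chunk may be shorter than chunk_size.
        yield list(traces[start:start + chunk_size])


if __name__ == "__main__":
    dataset = list(range(120))  # stand-in for 120 recorded traces
    for number, chunk in enumerate(iter_trace_chunks(dataset), start=1):
        print(f"chunk {number}: {len(chunk)} traces")
```

A client following this pattern would first inspect metadata, then request only the chunks of the datasets it selects.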

6. Data Parsing and Algorithm Selection
Towards integration of the STAToolkit with the data repository at neurodatabase.org and its linked 64-processor parallel array, a suite of tools will allow such fundamental operations as dataset selection, segmentation, concatenation, and grouping, and selection of methods for entropy and information computation.
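As a rough picture of what segmentation, concatenation, and grouping could mean for spike-train data, the sketch below operates on plain lists of spike times. The function names, the `(train, duration)` and `(label, train)` conventions, and the use of seconds are all assumptions made for illustration; the planned tools may define these operations differently.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

SpikeTrain = List[float]  # spike times in seconds, sorted ascending (assumed convention)


def segment(train: SpikeTrain, start: float, stop: float) -> SpikeTrain:
    """Keep spikes in [start, stop) and re-reference them to the segment start."""
    return [t - start for t in train if start <= t < stop]


def concatenate(trains: Iterable[Tuple[SpikeTrain, float]]) -> SpikeTrain:
    """Join (train, duration) pairs end to end on a single time axis."""
    joined: SpikeTrain = []
    offset = 0.0
    for train, duration in trains:
        joined.extend(t + offset for t in train)
        offset += duration
    return joined


def group_by_label(
    trials: Iterable[Tuple[Hashable, SpikeTrain]]
) -> Dict[Hashable, List[SpikeTrain]]:
    """Collect trials that share a label, e.g. a stimulus category."""
    groups: Dict[Hashable, List[SpikeTrain]] = defaultdict(list)
    for label, train in trials:
        groups[label].append(train)
    return dict(groups)


if __name__ == "__main__":
    print(segment([0.25, 0.5, 1.25, 2.0], 0.5, 1.5))
    print(concatenate([([0.25], 1.0), ([0.5], 1.0)]))
    print(group_by_label([("a", [0.1]), ("b", [0.2]), ("a", [0.3])]))
```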

NEUROANALYSIS.ORG AND THE SPIKE TRAIN ANALYSIS TOOLKIT OFFER TWO PATHS FOR EXPLORING NEUROPHYSIOLOGY DATA USING INFORMATION THEORY

ACKNOWLEDGEMENTS: We acknowledge with thanks the many labs providing datasets to neurodatabase.org, and the originators of the information-theoretic methods we have implemented at neuroanalysis.org.

Human Brain Project / Neuroinformatics research in the Laboratory of Neuroinformatics is funded via MH68012 from NIMH, NINDS, NIBIB, NIA & NSF, and MH57153 from NIMH.

LNI: Laboratory of Neuroinformatics, Weill Medical College of Cornell University

NEURODATABASE CONSTRUCTION KIT

1. NDK Version 1.0 Released Open Source, with:
• BrainML ModelServer (brainml.org-derived) for displaying, updating, and expanding compatible terminologies,
• DataServer (neurodatabase.org-derived) for archiving and serving neuroscience experimental data and metadata,
• Step-by-step guided Java interface (UploadTool-derived) for uploading data to a BrainML-compatible data repository,
• A user-friendly Java interface (QueryTool-derived) for searching a data repository using BrainML-compatible terms, and
• Data Viewer (Virtual Oscilloscope) for dynamic display of datasets.

NDK is written in Java and includes SQL for MySQL and PostgreSQL, Java Struts, CSS and XSL allowing customization, current Neuroscience Information Framework-linked BrainML metadata, documentation, and a BSD-style Open Source license.

NEURODATABASE.ORG ACCEPTS, ANNOTATES, ARCHIVES, DOWNLOADS, AND DISPLAYS A BROAD RANGE OF NEUROPHYSIOLOGY DATA

2. UploadTool Guides and Eases Data Submission
User-friendly instructions and explanatory, readily-navigable screens aid users in sharing their datasets. Submitters retain all rights to data.

3. QueryTool Searches Using Extensive BrainML Experimental and Bibliographic Metadata
Neurodatabase.org includes our Java Web Start UploadTool for submissions and QueryTool for search. Both tools derive neuroscience metadata via BrainML, our XML data description language, for consistency and compliance with this emerging standard. The figure shows a query being formed and databased experimental results presented to the user, who can then choose to save or display the archived datasets.

4. Virtual Oscilloscope Lists, Downloads, and Displays Data Archived at Neurodatabase.org
Static images of neurophysiology data lose detail and are isolated from the actual data. The Virtual Oscilloscope (VO), a Java Web Start application, downloads actual underlying datasets from neurodatabase.org and forms dynamic views, with user control of sweep and gain. The VO can display current or voltage timeseries, spike replicas, pre- or un-binned histograms, and bivariate graphs. Data shown from labs of K.H. Knuth, S.S. Hsiao, and E.P. Gardner.
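To make the sweep and gain controls concrete, here is a minimal sketch of how a dynamic view might be derived from a stored timeseries. It is not the Virtual Oscilloscope's code (the VO is a Java application); the function `oscilloscope_view` and its parameters are hypothetical, with sweep interpreted as the displayed time window and gain as a vertical scale factor.

```python
from typing import List, Sequence, Tuple


def oscilloscope_view(
    samples: Sequence[float],
    sample_rate_hz: float,
    sweep_start_s: float,
    sweep_duration_s: float,
    gain: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """Return (times, scaled values) for the requested sweep window.

    Hypothetical helper: the window is clipped to the available samples.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    first = max(0, int(round(sweep_start_s * sample_rate_hz)))
    last = min(len(samples),
               int(round((sweep_start_s + sweep_duration_s) * sample_rate_hz)))
    times = [i / sample_rate_hz for i in range(first, last)]
    values = [gain * samples[i] for i in range(first, last)]
    return times, values


if __name__ == "__main__":
    trace = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]  # stand-in voltage samples
    t, v = oscilloscope_view(trace, sample_rate_hz=4.0,
                             sweep_start_s=0.5, sweep_duration_s=1.0, gain=2.0)
    print(t)
    print(v)
```

Because the view is recomputed from the underlying samples, changing the sweep or gain never loses detail, which is the advantage over a static image that the text points to.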

NEUROINFORMATICS TOOLS ENABLE NEUROSCIENCE DATA DISCOVERY
Neuroinformatics (methods for archiving, classifying, and exchanging neuroscience data) leverages information technology to foster data sharing and collaboration. Computational neuroinformatics synthesizes neuroinformatics and computational neuroscience.

We present recent advances in our evolving neuroinformatic resources for neurophysiology: the Neurodatabase Construction Kit, neurodatabase.org, and neuroanalysis.org.

• Neurodatabase Construction Kit, an Open Source codebase and package, enables neurobiologists to leverage our development by producing customizable versions of neurodatabases and other resources, toward public web resources or local lab notebooks.

• Neurodatabase.org, our freely-accessible neurophysiology data archive and server, has been extended in scope with data from many additional preparations and techniques, improved data capacity, and expanded metadata terminology linked to the NIH Blueprint Neuroscience Information Framework.

• Neuroanalysis.org enables application of information theory to neural coding, using our stand-alone Spike Train Analysis Toolkit or evolving integrated resources for parsing and analyzing data.

5. Spike Train Analysis Toolkit: Capabilities and Options
The Spike Train Analysis Toolkit Version 1.0 (STAToolkit) is implemented for use on desktop workstations and available now for download at http://neuroanalysis.org/toolkit. Features include:
• Full source code, installation/demonstration scripts, documentation, and sample datasets.
• Open source, runs in Windows, Linux, and MacOS
• Implemented in C and includes a Matlab interface
• Uses a simple, platform-independent, human-readable data format
• Mutual information is calculated both formally (information/time for temporally rich stimuli) and categorically (information conveyed about specific experimental parameters)
• The current version includes these methods; others are planned:
  – Direct method, single and multineuron versions (Strong et al 1998)
  – Metric space method, single and multineuron versions (Victor & Purpura 1997)
  – Binless embedding method (Victor 2002)
• Several techniques estimate entropy from a histogram:
  – Plugin ($-\sum_i p_i \log p_i$)
  – Asymptotically debiased (Treves & Panzeri 1995, Miller 1955, Carlton 1965)
  – Jackknife (Efron & Tibshirani 1998)
  – Debiased Ma bound (Ma 1981)
  – Best upper bound (Paninski 2003)
  – Coverage-adjusted (Chao & Shen 2003)
  – Bayesian/Dirichlet prior (Wolpert & Wolf 1995)
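Three of the histogram-based entropy estimators listed above can be sketched in a few lines. These are textbook forms written in Python for illustration, not the STAToolkit's C implementation: the function names are assumptions, the asymptotic debiasing shown is the Miller-style correction of (occupied bins − 1)/(2N ln 2) bits, and the jackknife is the standard leave-one-out form.

```python
import math
from typing import Sequence


def plugin_entropy(counts: Sequence[int]) -> float:
    """Plug-in estimate -sum_i p_i log2 p_i, in bits, from histogram counts."""
    n = sum(counts)
    if n == 0:
        raise ValueError("histogram is empty")
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def debiased_entropy(counts: Sequence[int]) -> float:
    """Plug-in estimate plus the first-order bias correction, in bits."""
    n = sum(counts)
    occupied = sum(1 for c in counts if c > 0)
    return plugin_entropy(counts) + (occupied - 1) / (2 * n * math.log(2))


def jackknife_entropy(counts: Sequence[int]) -> float:
    """Leave-one-out jackknife estimate, in bits."""
    n = sum(counts)
    if n < 2:
        raise ValueError("jackknife needs at least two observations")
    work = list(counts)
    h_full = plugin_entropy(work)
    loo_sum = 0.0
    for i, c in enumerate(work):
        if c == 0:
            continue
        work[i] -= 1                            # drop one observation from bin i
        loo_sum += c * plugin_entropy(work)     # c observations give this same value
        work[i] += 1
    return n * h_full - (n - 1) / n * loo_sum


if __name__ == "__main__":
    histogram = [3, 1]
    print(plugin_entropy(histogram))
    print(debiased_entropy(histogram))
    print(jackknife_entropy(histogram))
```

With small samples the plug-in estimate tends to underestimate entropy, which is why the corrected estimators are offered alongside it.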

We work with members of the computational neuroscience community to incorporate their information-theoretic techniques, and we are looking beyond information theory to other methodologies for analyzing neurophysiology data.
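For readers unfamiliar with the metric space method listed among the toolkit's methods, its core ingredient is a cost-based distance between spike trains (Victor & Purpura 1997): inserting or deleting a spike costs 1, and shifting a spike by Δt costs q·|Δt|. The sketch below computes that spike-time distance by dynamic programming. It is an illustrative Python version under those stated costs, not the toolkit's C code, and the function name is an assumption.

```python
from typing import Sequence


def spike_time_distance(a: Sequence[float], b: Sequence[float], q: float) -> float:
    """Spike-time distance between two sorted spike trains.

    Inserting or deleting a spike costs 1; shifting a spike by dt costs q * |dt|.
    """
    if q < 0:
        raise ValueError("q must be non-negative")
    n, m = len(a), len(b)
    # prev[j]: distance between the first i-1 spikes of a and the first j spikes of b.
    prev = [float(j) for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [float(i)] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j] + 1.0,                                # delete a[i-1]
                curr[j - 1] + 1.0,                            # insert b[j-1]
                prev[j - 1] + q * abs(a[i - 1] - b[j - 1]),   # shift a[i-1] onto b[j-1]
            )
        prev = curr
    return prev[m]


if __name__ == "__main__":
    print(spike_time_distance([1.0], [1.5], q=1.0))   # shifting is cheaper
    print(spike_time_distance([1.0], [1.5], q=10.0))  # delete plus insert is cheaper
```

The parameter q sets the temporal precision: at q = 0 only spike counts matter, while large q makes any timing difference as costly as removing and re-adding a spike.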

100.10
