2015 may19 cbd kd curriculum


Upload: damien

Post on 25-Sep-2015


Training Module 1: Collective Biomedical Data

Vast collections of biomedical data are publicly available, often only a few clicks away. For instance, the NCBI Gene Expression Omnibus (GEO) holds almost 1.4 million transcriptome profiles from nearly 60,000 different studies (as of May 2015). Printing each transcriptome profile would take about 600 pages of paper (in 8-point Arial), and the entire printed GEO record would produce a stack of paper 110 km high. This is likely to be only the tip of the iceberg, as much more data generated using high-throughput profiling technologies has remained private.

This module will help you become familiar with public domain repositories and the tools that give you access to these data. Short screencasts are available on the opposite page. View them at your own pace. You can also use the links provided below to reproduce the steps demonstrated in the screencasts. Then spend some time exploring on your own. Although other types of large-scale datasets are available, this curriculum focuses for the time being on transcriptome data.

Defining Transcriptome: “The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA transcribed in one cell or a population of cells. It differs from the exome in that it includes only those RNA molecules found in a specified cell population, and usually includes the amount or concentration of each RNA molecule in addition to the molecular identities.” http://en.wikipedia.org/wiki/Transcriptome

1.1 Gene Expression Browser (GXB)

This interactive tool has been developed to support the approach described in this document. It can be used for browsing large amounts of data as seamlessly as possible. It is not meant to perform analyses. It works well for demonstration and training purposes. However, its content is limited to human immunology studies.
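As a rough check on the paper-stack figure quoted in the Module 1 introduction, the arithmetic can be sketched in a few lines of Python. The profile count and pages per profile come from the text; the sheet thickness and the single-sided, one-page-per-sheet layout are assumptions made here, since the text does not state them.

```python
# Rough check of the "110 km stack of paper" estimate.
# From the text: ~1.4 million profiles, ~600 pages per profile.
# SHEET_THICKNESS_MM is an assumption, not a value given in the text.

PROFILES = 1_400_000
PAGES_PER_PROFILE = 600
SHEET_THICKNESS_MM = 0.13  # assumed; typical office paper is roughly 0.1 mm


def stack_height_km(profiles: int, pages_per_profile: int, sheet_mm: float) -> float:
    """Height in km of a single-sided printout, one page per sheet."""
    total_sheets = profiles * pages_per_profile
    return total_sheets * sheet_mm / 1_000_000  # millimetres to kilometres


print(f"{stack_height_km(PROFILES, PAGES_PER_PROFILE, SHEET_THICKNESS_MM):.0f} km")
```

With an assumed 0.13 mm per sheet the result is about 109 km, in line with the quoted figure; with 0.1 mm per sheet it would be about 84 km. Either way, the order of magnitude holds.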

1.2 NCBI Gene Expression Omnibus (GEO)

http://www.ncbi.nlm.nih.gov/geo/

GEO is a large public repository for transcriptome data. It is administered by the National Center for Biotechnology Information (NCBI) at the US National Institutes of Health in Bethesda, Maryland.

SC1: Search for Datasets

SC2:
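The screencasts search GEO through its web interface. For readers who prefer a scripted route, the following is a minimal sketch that queries the GEO DataSets database through the NCBI E-utilities ESearch service. It is not part of the curriculum; the function names and the example search term are hypothetical, chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL against the GEO DataSets database (db=gds)."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_esearch(payload: str) -> tuple[int, list[str]]:
    """Return (total hit count, list of record UIDs) from an ESearch JSON reply."""
    result = json.loads(payload)["esearchresult"]
    return int(result["count"]), list(result["idlist"])


def search_geo(term: str, retmax: int = 20) -> tuple[int, list[str]]:
    """Run the search over the network (requires internet access)."""
    with urlopen(build_geo_search_url(term, retmax)) as response:
        return parse_esearch(response.read().decode("utf-8"))
```

For example, search_geo("sepsis AND Homo sapiens[Organism]", retmax=5) would return the total number of matching records and up to five UIDs, which can then be looked up on the GEO website. The search term here is only an illustration; any query accepted by the GEO DataSets search box should work.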

1.3 Others

Several other repositories are probably worth covering and may be added at a later time. These include other public repositories such as the European Bioinformatics Institute's ArrayExpress, The Cancer Genome Atlas (TCGA), ImmGen, and BioGPS.

There are also private ones that provide free basic browsing functionality for academic and not-for-profit institutions: NextBio and Oncomine.

Training Module 2: Biomedical Literature Profiling

Training Module 1: Knowledge Gap Identification

The objective of this module is to become familiar