ICAR 2015 Poster - Araport


Upload: araport

Post on 17-Aug-2015

TRANSCRIPT

Agnes Chan¹, Vivek Krishnakumar¹, Chia-Yi Cheng¹, Maria Kim¹, Erik Ferlanti¹, Irina Belyaeva¹, Seth Schobel¹, Sergio Contrino³, Matthew R. Hanlon², Walter Moreira², Steve Mock², Joe Stubbs², Jason R. Miller¹, Matthew W. Vaughn², Gos Micklem³, Christopher D. Town¹

¹ J. Craig Venter Institute, Rockville, MD, USA; ² Texas Advanced Computing Center, Austin, TX, USA; ³ Cambridge University, Cambridge, UK

Col-0 Genome Reannotation, Araport11 Pre-release

Araport was funded by NSF and BBSRC to develop the Arabidopsis Information Portal, a one-stop shop for a wide range of data sets. As part of its mandate, Araport assumed responsibility for the Col-0 genome sequence and annotation. For the Araport11 release, we used 113 RNA-seq data sets along with annotation contributions from NCBI, UniProt, and labs conducting Arabidopsis thaliana research. Structural and functional annotation have been completed. Consolidation and annotation of non-coding RNAs are in progress.

Araport - A "One-Stop" Data Shop for the 21st Century

www.ARAPORT.org

Araport11 Protein Coding Genes

[Figure: Araport11 protein-coding gene annotation workflow. Labels: NCBI SRA RNA-seq (113 datasets); Trinity Assembly; PASA Update; Maker Novel; NCBI Novel; UniProt Update; Updated Transcripts; Novel Transcripts; Transcripts No Change.]

Web Apollo Community Annotation

Web Apollo is available for the Arabidopsis community to curate and submit gene edits. Web Apollo uses a JBrowse interface, and gene edits are instantly viewable by others, so the community can see and share annotation in real time, much like a Google Doc.

What will happen to your curation? Community curation will be regularly reviewed by curators at Araport and published as a community curation track in the Araport JBrowse, with attribution to the contributors.

Araport Users

Biologists/Bioinformaticians

• Users who use detailed gene reports, protein reports, germplasm reports, or integrated data.

• Users who perform analyses such as GO term or pathway enrichment for gene lists.

• Users who access data via web services or bulk downloads.

Software Developers

• Users who use Araport tools to expose their own data via interoperable web services.

• Users who use Araport tools to create science apps for analysis, visualization, and data integration.

Data Sources

• 1001 Genomes Variants (Ensembl)

• Epigenetics (EPIC-CoGe)

• TDNA-seq (Ecker) and over 70 tracks

• Co-expression (ATTED)

• ePictographs (BAR)

• Array Expression, Interactions (BAR)

Win an iPad

Developers’ Workshop Fall 2015

[Figure: Araport data flow diagram. Labels: ThaleMine; Chado; Gene Report; Gene List Analysis; Query, Web Services; JBrowse; Custom Analysis; Science Apps; Araport11 (updated models, RNA-seq by tissue); TAIR10; Publications, GeneRIF (NCBI, UniProt); Warehousing; Real-time.]

Categories                           TAIR10    Araport11

Gene Loci
  Protein coding loci                27,416    28,565
  Novel loci in Araport11                       1,162
  Gene loci with splice isoform       5,665    10,946

Transcripts
  Transcript isoforms                35,385    50,203

Transcripts altered in Araport11
  CDS altered                                     933
  UTR altered                                  25,079
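To put the TAIR10 and Araport11 columns side by side, the short sketch below computes the absolute and relative change for the three categories that have a value in both releases. The counts are copied from the table; the function names (`percent_change`, `summarize`) are illustrative only and are not part of any Araport tool.

```python
# Counts copied from the poster's summary table: (TAIR10, Araport11).
counts = {
    "Protein coding loci": (27416, 28565),
    "Gene loci with splice isoform": (5665, 10946),
    "Transcript isoforms": (35385, 50203),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def summarize(table):
    """Map each category to (absolute change, percent change rounded to 0.1)."""
    return {
        name: (new - old, round(percent_change(old, new), 1))
        for name, (old, new) in table.items()
    }


if __name__ == "__main__":
    for name, (delta, pct) in summarize(counts).items():
        print(f"{name}: {delta:+,} ({pct:+.1f}%)")
```

The net gain in protein coding loci computed this way (1,149) is not the same number as the 1,162 novel loci listed in the table; the poster does not explain the difference.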

Community Data/Tools

JBrowse provides over 70 tracks of data, including RNA-seq expression, 1001 Genomes variants, TDNA-seq locations, and epigenomics data. A new variant data filter helps users select variants based on their functional consequences. Click on the 1001 track menu (purple box below) to access the data filter function.
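The idea behind filtering variants by functional consequence can be sketched in a few lines. This is only an illustration of the concept, not the JBrowse implementation: the `Variant` record, the `filter_by_consequence` function, the example positions, and the consequence labels (written in the style of Sequence Ontology terms) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    consequence: str  # e.g. "missense_variant" (illustrative label)


def filter_by_consequence(variants, wanted):
    """Keep only variants whose consequence is in the wanted set."""
    wanted_set = set(wanted)
    return [v for v in variants if v.consequence in wanted_set]


# Invented example records, for illustration only.
example_variants = [
    Variant("Chr1", 1050, "A", "G", "synonymous_variant"),
    Variant("Chr1", 2210, "C", "T", "missense_variant"),
    Variant("Chr2", 880, "G", "A", "stop_gained"),
    Variant("Chr3", 4125, "T", "C", "intron_variant"),
]

if __name__ == "__main__":
    protein_altering = filter_by_consequence(
        example_variants, {"missense_variant", "stop_gained"}
    )
    for v in protein_altering:
        print(v.chrom, v.pos, v.ref, v.alt, v.consequence)
```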

ThaleMine Data Warehouse

The ThaleMine Gene Report presents integrated data from a variety of sources, including GO annotation, array expression, co-expression, interactions, pathways, publications, and homologs. Germplasm, genotype, and phenotype data are coming soon. ThaleMine List Analysis takes a gene list of interest and tests it for functional enrichment of GO terms, pathways, domains, and publications, and reports its chromosome distribution.
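The poster does not say which statistic List Analysis uses. A common choice for this kind of enrichment test is the one-sided hypergeometric test, sketched below under that assumption. The function name `hypergeom_pvalue` and all numbers in the usage example are hypothetical, and a real analysis would also correct for testing many terms at once.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """One-sided enrichment p-value, P(X >= k).

    N: genes in the background set
    K: background genes annotated with the term
    n: genes in the list of interest
    k: list genes annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of seeing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 200-gene list, 15 of which carry a term
    # that annotates 300 of 27,000 background genes.
    p = hypergeom_pvalue(k=15, n=200, K=300, N=27000)
    print(f"enrichment p-value: {p:.3g}")
```

A small p-value here means the term appears in the gene list more often than expected if the list had been drawn at random from the background.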

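[Figure: ThaleMine, JBrowse, and Web Apollo screenshots. Panel labels: Germplasm, Genotype, Phenotype; Warehousing; Interactions; Pathways; List Analysis; Data Summary; Data Model; Web Apollo; Variant Data Filter; Pathways (KEGG). Counts shown without recoverable labels: ~16,000, ~1,200, ~26,000.]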

Google Play Android App

References

1. Araport: the Arabidopsis Information Portal. Nucleic Acids Research (2015), 43: D1003-9. PMID: 25414324

2. The Arabidopsis Information Portal: An Application Platform for Data Discovery. Proceedings of the 9th Gateway Computing Environments Workshop (2014). doi: 10.1109/GCE.2014.10

[Figure labels: JBrowse; Bulletin Board; Gene Edit; Real-time]