Transcriptomics. Jim Noonan, GENE 760

Post on 19-Dec-2015

TRANSCRIPT

  • Slide 1
  • Transcriptomics Jim Noonan GENE 760
  • Slide 2
  • Slide 3
  • Transcriptomics
  • Slide 4
  • Introduction to RNA-seq
  • Slide 5
  • RNA-seq workflow. Martin and Wang, Nat Rev Genet 12:671 (2011); Wang et al., Nat Rev Genet 10:57 (2009)
  • Slide 6
  • Illumina RNA-seq library preparation
      • Capture poly-A RNA with poly-T oligo-attached beads (100 ng total) (2x)
          • RNA quality must be high: degradation produces 3′ bias
          • Non-poly-A RNAs are not recovered
      • Fragment mRNA
      • Synthesize ds cDNA
      • Ligate adapters
      • Amplify
      • Generate clusters and sequence
  • Slide 7
  • Ribosomal RNA subtraction: RiboMinus
  • Slide 8
  • Quantifying relative expression levels in RNA-seq
      • Use existing gene annotation:
          • Align to genome plus annotated splices
          • Depends on high-quality gene annotation
          • Which annotation to use: RefSeq, GENCODE, UCSC?
          • Isoform quantification?
          • Identifying novel transcripts?
          • Differential expression
      • De novo transcript assembly:
          • Assemble transcripts directly from reads
          • Allows transcriptome analyses of species without reference genomes
  • Slide 9
  • Mapping RNA-seq reads
  • Slide 10
  • Quantifying relative expression levels in RNA-seq
      • Reads per kilobase of feature length per million mapped reads (RPKM)
      • Fragments per kilobase per million mapped reads (FPKM) (paired-end reads)
      • Transcripts per million (TPM)
      • Counts per million (CPM)
      • What is a feature?
      • What about genomes with poor genome annotation?
      • What about species with no sequenced genome?
      • For a detailed comparison of normalization methods, see: Bullard et al., BMC Bioinformatics 11:94 (2010); Robinson and Oshlack, Genome Biol 11:R25 (2010)
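
The four units listed on this slide differ only in what they divide by. The sketch below computes them from a toy count table. The gene names, counts and lengths are invented for illustration, and "mapped reads" is assumed here to mean the sum of counts over the listed features, which a real pipeline may define differently. FPKM uses the same formula as RPKM with fragments (read pairs) counted instead of single reads.

```python
def cpm(counts):
    """Counts per million: scale by library size only."""
    total = sum(counts.values())
    return {g: c / total * 1e6 for g, c in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase of feature length per million mapped reads."""
    total = sum(counts.values())
    return {
        g: c / (lengths_bp[g] / 1e3) / (total / 1e6)
        for g, c in counts.items()
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rate = {g: c / lengths_bp[g] for g, c in counts.items()}
    denom = sum(rate.values())
    return {g: r / denom * 1e6 for g, r in rate.items()}


# Hypothetical toy data (not from the lecture).
counts = {"geneA": 500, "geneB": 1000, "geneC": 500}
lengths_bp = {"geneA": 1000, "geneB": 4000, "geneC": 500}

for name, table in [("CPM", cpm(counts)),
                    ("RPKM", rpkm(counts, lengths_bp)),
                    ("TPM", tpm(counts, lengths_bp))]:
    print(name, {g: round(v, 1) for g, v in table.items()})
```

In this toy table geneB has the most reads but the lowest RPKM and TPM, because it is also the longest feature. TPM values always sum to one million within a sample, which RPKM values do not.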
  • Slide 11
  • Map reads to genome; map remaining reads to known splice junctions. Composite gene models: requires good gene models; isoforms are ignored
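
One common way to build a composite gene model is to take the union of the exons of all annotated isoforms of a gene, which is why isoform differences are ignored. The sketch below assumes that definition; the slide does not spell it out. Coordinates are half-open 0-based intervals, and the isoforms are invented.

```python
def composite_exons(isoforms):
    """Merge exons from all isoforms into sorted, non-overlapping intervals.

    `isoforms` is a list of isoforms; each isoform is a list of
    (start, end) exon intervals, half-open and 0-based.
    """
    intervals = sorted(exon for exons in isoforms for exon in exons)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def composite_length(isoforms):
    """Total length in bp of the composite model (usable as feature length)."""
    return sum(end - start for start, end in composite_exons(isoforms))


# Two hypothetical isoforms that differ in their middle exon.
isoforms = [
    [(100, 200), (300, 400), (500, 600)],
    [(100, 200), (350, 450), (500, 600)],
]
print(composite_exons(isoforms))
print(composite_length(isoforms))
```

Reads falling anywhere in the merged intervals are counted toward the gene, so the result is a single gene-level number regardless of which isoform produced the reads.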
  • Slide 12
  • Which gene annotation to use?
  • Slide 13
  • Splice-aware short read aligners. Martin and Wang, Nat Rev Genet 12:671 (2011)
  • Slide 14
  • The Tuxedo suite. Trapnell et al., Nature Protocols 7:562 (2012)
  • Slide 15
  • Cufflinks: ab initio transcript assembly. Step 1: map reads to reference genome. Trapnell et al., Nat. Biotechnology 28:511 (2010)
  • Slide 16
  • Cufflinks: ab initio transcript assembly. Isoform abundances estimated by maximum likelihood. Trapnell et al., Nat. Biotechnology 28:511 (2010)
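
To give a feel for what maximum likelihood estimation of isoform abundances involves, here is a generic expectation-maximization sketch. It is not the Cufflinks implementation: it ignores fragment length distributions, positional bias and sequencing errors, and the function name, data layout and toy reads are all assumptions made for illustration. Each read is represented only by the set of isoforms it is compatible with.

```python
def em_isoform_abundance(read_compat, lengths, n_iter=200):
    """Estimate the fraction of reads originating from each isoform.

    read_compat: list of tuples, each naming the isoforms a read could
                 have come from.
    lengths:     dict mapping isoform name to length in bp.
    """
    transcripts = list(lengths)
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        # E-step: split each read among its compatible isoforms in
        # proportion to (read fraction / isoform length).
        assigned = {t: 0.0 for t in transcripts}
        for compat in read_compat:
            weights = {t: theta[t] / lengths[t] for t in compat}
            norm = sum(weights.values())
            for t, w in weights.items():
                assigned[t] += w / norm
        # M-step: the new read fractions are the normalized soft counts.
        total = sum(assigned.values())
        theta = {t: assigned[t] / total for t in transcripts}
    return theta


# Hypothetical data: 30 reads unique to isoform A, 10 unique to B,
# and 20 that map to exons shared by both.
reads = [("A",)] * 30 + [("B",)] * 10 + [("A", "B")] * 20
lengths = {"A": 1000, "B": 1000}
print(em_isoform_abundance(reads, lengths))
```

With equal isoform lengths the ambiguous reads end up split 3:1, following the ratio of uniquely assigned reads, so the estimate converges to about 0.75 for A and 0.25 for B.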
  • Slide 17
  • Differential expression. Garber et al., Nat Methods 8:469 (2011)
  • Slide 18
  • Differential expression. Garber et al., Nat Methods 8:469 (2011)
      • Popular methods: edgeR, DESeq, Cuffdiff
      • Require count data
      • Assume negative binomial or Poisson distribution
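
The choice between a Poisson and a negative binomial model comes down to the mean-variance relationship: a Poisson count has variance equal to its mean, while a negative binomial count has variance mu + alpha * mu^2, where alpha is the dispersion. The sketch below estimates alpha for one gene by the method of moments from replicate counts. This is only an illustration of the idea, with invented counts; edgeR and DESeq use more elaborate estimators that share information across genes.

```python
from statistics import mean, variance


def moment_dispersion(replicate_counts):
    """Method-of-moments negative binomial dispersion for one gene.

    Solves var = mu + alpha * mu**2 for alpha, clipped at 0 so that
    counts no more variable than Poisson give alpha = 0.
    """
    mu = mean(replicate_counts)
    if mu == 0:
        return 0.0
    var = variance(replicate_counts)  # sample variance (n - 1 denominator)
    return max((var - mu) / mu ** 2, 0.0)


# Hypothetical counts for one gene in four biological replicates.
replicates = [90, 110, 130, 70]
alpha = moment_dispersion(replicates)
print(f"mean={mean(replicates)}, variance={variance(replicates):.1f}, "
      f"dispersion={alpha:.4f}")
```

Here the sample variance is several times the mean, so a Poisson model would understate the noise between replicates and overstate the significance of differences between conditions.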
  • Slide 19
  • What depth of sequencing is required to characterize a transcriptome? Wang et al., Nat Rev Genet 10:57 (2009)
  • Slide 20
  • Considerations
      • Gene length: long genes are detected before short genes
      • Expression level: high expressors are detected before low expressors
      • Complexity of the transcriptome: tissues with many cell types require more sequencing
      • Feature type
          • Composite gene models
          • Common isoforms
          • Rare isoforms
      • Detection vs. quantification
          • Obtaining confident expression level estimates (e.g., stable RPKMs) requires greater coverage
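
The first two considerations follow directly from the RPKM definition: the expected number of reads on a feature is RPKM x (length in kb) x (mapped reads in millions). The sketch below turns that into a detection probability under the simplifying assumption that read counts are Poisson distributed. The function names, the `min_reads` threshold and the example values are illustrative assumptions, not values from the lecture.

```python
import math


def expected_reads(rpkm, length_bp, depth):
    """Expected read count for a feature at a given number of mapped reads."""
    return rpkm * (length_bp / 1e3) * (depth / 1e6)


def prob_detected(rpkm, length_bp, depth, min_reads=1):
    """P(at least `min_reads` reads) under a Poisson sampling assumption."""
    lam = expected_reads(rpkm, length_bp, depth)
    p_below = sum(
        math.exp(-lam) * lam ** i / math.factorial(i)
        for i in range(min_reads)
    )
    return 1.0 - p_below


# Same expression level (RPKM = 1), different lengths and depths.
for length_bp in (500, 1000, 4000):
    for depth in (1_000_000, 10_000_000):
        p = prob_detected(1.0, length_bp, depth)
        print(f"length={length_bp:>5} bp  depth={depth:>10,}  P(detect)={p:.3f}")
```

Raising `min_reads` gives a rough sense of the detection vs. quantification point: seeing one read is much easier than accumulating enough reads for a stable expression estimate.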
  • Slide 21
  • Applications of RNA-seq
      • Characterizing transcriptome complexity
          • Alternative splicing
      • Differential expression analysis
          • Gene- and isoform-level expression comparisons
      • Novel RNA species
          • lincRNAs
          • Pervasive transcription
      • Allele-specific expression
          • Effect of genetic variation on gene expression
          • Imprinting
      • RNA editing
          • Novel events
  • Slide 22
  • Alternative isoform regulation in human tissue transcriptomes. Wang et al., Nature 456:470 (2008)
  • Slide 23
  • Diversity of alternative splicing events in human tissues. Wang et al., Nature 456:470 (2008)
  • Slide 24
  • Novel RNA species: annotating lincRNAs. Guttman et al., Nat Biotechnol 28:503 (2010)
  • Slide 25
  • Small RNA sequencing. Rother and Meister, Biochimie 93:1905 (2011)
  • Slide 26
  • Small RNA sequencing. microRNAs ~22 nt; piRNAs ~25-30 nt. Rother and Meister, Biochimie 93:1905 (2011)
  • Slide 27
  • Small RNA sequencing: Illumina protocol. microRNAs ~22 nt; piRNAs ~25-30 nt
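
Because the two classes differ in size, a read-length profile is a common first look at a small RNA library. The sketch below bins adapter-trimmed reads by length. The 20-24 nt window for miRNA-like reads is an assumption chosen around the ~22 nt figure on the slide, and the reads are synthetic. Length alone does not establish that a read is a functional small RNA.

```python
from collections import Counter

# Assumed size windows in nt (inclusive); only the ~22 and ~25-30 nt
# figures come from the slide.
SIZE_CLASSES = {"miRNA-like": (20, 24), "piRNA-like": (25, 30)}


def size_class(read):
    """Assign a trimmed read to a size class by its length alone."""
    n = len(read)
    for name, (low, high) in SIZE_CLASSES.items():
        if low <= n <= high:
            return name
    return "other"


def size_profile(reads):
    """Count reads per size class."""
    return Counter(size_class(r) for r in reads)


# Synthetic reads of lengths 22, 22, 27 and 40 nt.
reads = ["A" * 22, "C" * 22, "G" * 27, "U" * 40]
print(size_profile(reads))
```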
  • Slide 28
  • Distinguishing functional small RNAs from noise. Friedlander et al., Nat Biotechnology 26:407 (2008)
      • Structural similarity to known small RNAs: miRDeep, miRCat
      • Binding to small RNA processing proteins
      • Genetic requirements for processing
  • Slide 29
  • Measuring translation by ribosome footprinting. Ingolia, Nat Rev Genet 15:205 (2014)
  • Slide 30
  • Measuring translation by ribosome footprinting. Ingolia et al., Science 324:218 (2009)
  • Slide 31
  • Measuring translation by ribosome footprinting. Ingolia et al., Science 324:218 (2009)
  • Slide 32
  • Some lincRNAs are translated in mouse ES cells. Ingolia et al., Cell 147:789 (2011)
  • Slide 33
  • Detecting RNA-protein interactions: CLIP. Rother and Meister, Biochimie 93:1905 (2011)
  • Slide 34
  • Enhancer-associated RNAs (eRNAs). Ren B., Nature 465:173 (2010)
  • Slide 35
  • Enhancer-associated RNAs (eRNAs). Kim et al., Nature 465:182 (2010)
  • Slide 36
  • How much of the genome is transcribed? Estimates from ENCODE. Kellis et al., Proc. Natl. Acad. Sci. USA 111:6131 (2014)