introduction to rna-seq& transcriptome...

30
Introduction to RNA-Seq & Transcriptome Analysis Jessica Holmes PowerPoint by Shounak Bhogale RNA-Seq Lab | Shounak Bhogale | 2019 1 6/11/19

Upload: others

Post on 07-Jun-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Introduction to RNA-Seq & Transcriptome Analysis

Jessica Holmes

PowerPoint by Shounak Bhogale

RNA-Seq Lab | Shounak Bhogale | 2019 16/11/19

Page 2: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Exercise

1. Use the RNA-STAR to align RNA-Seq reads

2. Use htseqCounts to count the reads.

3. Use edgeR to find differential expressed genes.

4. Use IGV for visualization (If time permits).

RNA-Seq Lab | Shounak Bhogale | 2019 26/11/19

Page 3: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Pipeline Overview

v

RNA-Seq Lab | Shounak Bhogale | 2019 36/11/19

Page 4: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

sample replicate # fastq name # reads

TP0 Replicate 1,2 a_0.fastaq, b_0.fastaq 100,000

TP8 Replicate 1,2 a_8.fastaq, b_8.fastaq 100,000

name description

mouse_chr12.fna Fasta file with the sequence of chromosome 12 from the mouse genome

Mouse_chr12.gtf GTF file with gene annotation, known genes

RNA-Seq: 100 bp, single end data

Genome & gene information

Input Data

RNA-Seq Lab | Shounak Bhogale | 2019 46/11/19

Page 5: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 1A: Logging into Galaxy

Go to: http://compgen.knoweng.org/galaxy

Click Enter

Click Login

Input your login credentials.

Click Login.

RNA-Seq Lab | Shounak Bhogale | 2019 56/11/19

Page 6: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 1B: Galaxy Start Screen

The resulting screen should look like the figure below:

RNA-Seq Lab | Shounak Bhogale | 2019 66/11/19

Page 7: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 2A: Accessing Input Files

At the top of the page, click Shared Data.

Then click Histories.

RNA-Seq Lab | Shounak Bhogale | 2019 76/11/19

Page 8: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 2B: Accessing Input Files

Click sb_transcriptomics

You should see this page.

Click Import History.

RNA-Seq Lab | Shounak Bhogale | 2019 86/11/19

Page 9: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 2C: Accessing Input Files

Click Import

You should see an imported history like the following.

RNA-Seq Lab | Shounak Bhogale | 2019 96/11/19

Page 10: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

In this exercise, we will be aligning RNA-Seq reads to a reference genome in the absence of gene models. Splice junctions will be found de novo.

Remember, we are not going to provide any genic structure information.

.

Step 1: Alignment using RNA-STAR

RNA-Seq Lab | Shounak Bhogale | 2019 106/11/19

Page 11: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Search RNA STAR in the search box.

RNA-Seq Lab | Shounak Bhogale | 2019 11

Click on RNA STAR under NGS: RNA Analysis

6/11/19

Page 12: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 12

You should see this on the screen.

6/11/19

Page 13: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 13

Set fastaq file as a_0.fastq

Set reference genomes as input.

Select mouse_chr12.fna as the fasta file.

Set mouse_chr12.gtf as theinput gene model.

Set length as 99.

Keep rest of the parameters as default.

Click execute after makingthese changes.

6/11/19

Page 14: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 14

Click on the ’pencil’ symbol next to the dataset 9 to change the name of the dataset.

Once the run is complete three files generated will turn green.

Change the name in the name databox to a_0.bam and then click save.

6/11/19

Page 15: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 15

Repeat above steps for remaining three fastaq sets – b_0.fastq, a_8.fastq andb_8.fastq

Similarly change the names of remaining bam files once they are generated.

Now we are ready for the next step.

6/11/19

Page 16: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 2: Read aligned countsUse htseq-count to generate the aligned counts for each bam files generated in step 1.

RNA-Seq Lab | Shounak Bhogale | 2019 166/11/19

Page 17: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 17

Search htseq-count in the search box.

Click on htseq-count to open the tool.

6/11/19

Page 18: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 18

Select a_0.bam as the input.

Change stranded to Reverse.

Keep rest of the parametersas default and click Execute.

6/11/19

Page 19: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 19

Change the name of the following datasets.

Here we generated the counts for each gene in the 4 datasets.Now we are ready for the final step.

6/11/19

Page 20: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 3: Finding differentially expressed genesNow we will use edgeR to analyze the count files generated in step 2 to find differentially expressed genes between two time points.

RNA-Seq Lab | Shounak Bhogale | 2019 206/11/19

Page 21: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 21

Search edgeR in the search box on the left.

Click on edgeR to open the tool.

This should come up on the screen.

6/11/19

Page 22: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 22

Select all.txt as the input dataset.

Select Single Count Matrix

Enter Mouse as factor name.

Enter groups as TP8,TP8,TP0,TP0

6/11/19

Page 23: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 23

Set contrast as TP8-TP0.

Click on Filter low counts.Set Yes to filter lowly expressed genes.

Set Counts.

Set minimum Counts as 5.

Set Yes for normalized counts table and rscript.

And now Execute!6/11/19

Page 24: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

RNA-Seq Lab | Shounak Bhogale | 2019 24

3 files will be generated after the run is complete.

Click on the eye symbol next to the report file.

Scroll down through the report.For our dataset there are no DEGs.

6/11/19

Page 25: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Conclusion

We did the following today.

1. Alligned RNA-seq data using RNA-STAR

2. Used htseq-count to count the aligned reads for each gene.

3. Used edgeR to find DEGs in the data

RNA-Seq Lab | Shounak Bhogale | 2019 256/11/19

Page 26: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Useful linksOnline resources for RNA-Seq analysis questions –

² http://www.biostars.org/ - Biostar (Bioinformatics explained)

² http://seqanswers.com/ - SEQanswers (the next generation sequencing community)

² Most tools have a dedicated lists

Information about the various parts of the Tuxedo suite is available here -http://ccb.jhu.edu/software.shtml

Genome Browsers tutorials –

² http://www.broadinstitute.org/igv/QuickStart/ - IGV tutorials

² http://www.openhelix.com/ucsc/ - UCSC browser tutorials

(openhelix is a great place for tutorials, UIUC has a campus-wide subscription)

26

Contact us at:

[email protected]

[email protected]

RNA-Seq Lab | Shounak Bhogale | 20196/11/19

Page 27: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Extra MaterialIGV

RNA-Seq Lab | Shounak Bhogale | 2019 276/11/19

Page 28: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

The Integrative Genomics Viewer (IGV) is a tool that supports the visualization of

mapped reads to a reference genome, among other functionalities. We will use it to

observe where hits were called for the alignment for the two samples (TP0 and

TP8), and the differentially(!) expressed genes.

.

Visualization Using IGV

RNA-Seq Lab | Shounak Bhogale | 2019 286/11/19

Page 29: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

In this step, we will start IGV and load the select mouse_chr12.fna file, the known genes file

(mouse_chr12.fna), the hits for both sample groups, and the merged transcriptome.

Step 9: Start IGV

RNA-Seq Lab | Shounak Bhogale | 2019 29

Graphical Instruction: Load Genome

1. Within IGV, click the ‘Genomes’ tab on the menu bar.

2. Click the the ‘Load Genome from File’ option.

3. In the browser window, select mouse_chr12.fna(genome).

Graphical Instruction: Load Other Files

1. Within IGV, click the FILE tab on the menu bar.

2. Click the ‘Load from File’ option.

3. Select the mouse_chr12.gtf file (known genes file).

4. Perform Steps 1-3 for the files to the right.

Files to Load

mouse_chr12.fna

a_0.bam

b_0.bam

a_8.bam

b_8.bam

6/11/19

Page 30: Introduction to RNA-Seq& Transcriptome Analysispublish.illinois.edu/computational-genomics-course/files/...2019/06/04  · Exercise 1.Use the RNA-STAR to align RNA-Seqreads 2. Use

Step 10: Visualization With IGV

RNA-Seq Lab | Shounak Bhogale | 2019 30

Click here and type the following location of a non-differentially expressed gene:NC_000078.6:17,788,398-17,793,435

6/11/19