introduction to rna-seq& transcriptome...
Post on 07-Jun-2020
7 Views
Preview:
TRANSCRIPT
Introduction to RNA-Seq & Transcriptome Analysis
Jessica Holmes
PowerPoint by Shounak Bhogale
RNA-Seq Lab | Shounak Bhogale | 2019 16/11/19
Exercise
1. Use the RNA-STAR to align RNA-Seq reads
2. Use htseqCounts to count the reads.
3. Use edgeR to find differential expressed genes.
4. Use IGV for visualization (If time permits).
RNA-Seq Lab | Shounak Bhogale | 2019 26/11/19
Pipeline Overview
v
RNA-Seq Lab | Shounak Bhogale | 2019 36/11/19
sample replicate # fastq name # reads
TP0 Replicate 1,2 a_0.fastaq, b_0.fastaq 100,000
TP8 Replicate 1,2 a_8.fastaq, b_8.fastaq 100,000
name description
mouse_chr12.fna Fasta file with the sequence of chromosome 12 from the mouse genome
Mouse_chr12.gtf GTF file with gene annotation, known genes
RNA-Seq: 100 bp, single end data
Genome & gene information
Input Data
RNA-Seq Lab | Shounak Bhogale | 2019 46/11/19
Step 1A: Logging into Galaxy
Go to: http://compgen.knoweng.org/galaxy
Click Enter
Click Login
Input your login credentials.
Click Login.
RNA-Seq Lab | Shounak Bhogale | 2019 56/11/19
Step 1B: Galaxy Start Screen
The resulting screen should look like the figure below:
RNA-Seq Lab | Shounak Bhogale | 2019 66/11/19
Step 2A: Accessing Input Files
At the top of the page, click Shared Data.
Then click Histories.
RNA-Seq Lab | Shounak Bhogale | 2019 76/11/19
Step 2B: Accessing Input Files
Click sb_transcriptomics
You should see this page.
Click Import History.
RNA-Seq Lab | Shounak Bhogale | 2019 86/11/19
Step 2C: Accessing Input Files
Click Import
You should see an imported history like the following.
RNA-Seq Lab | Shounak Bhogale | 2019 96/11/19
In this exercise, we will be aligning RNA-Seq reads to a reference genome in the absence of gene models. Splice junctions will be found de novo.
Remember, we are not going to provide any genic structure information.
.
Step 1: Alignment using RNA-STAR
RNA-Seq Lab | Shounak Bhogale | 2019 106/11/19
Search RNA STAR in the search box.
RNA-Seq Lab | Shounak Bhogale | 2019 11
Click on RNA STAR under NGS: RNA Analysis
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 12
You should see this on the screen.
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 13
Set fastaq file as a_0.fastq
Set reference genomes as input.
Select mouse_chr12.fna as the fasta file.
Set mouse_chr12.gtf as theinput gene model.
Set length as 99.
Keep rest of the parameters as default.
Click execute after makingthese changes.
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 14
Click on the ’pencil’ symbol next to the dataset 9 to change the name of the dataset.
Once the run is complete three files generated will turn green.
Change the name in the name databox to a_0.bam and then click save.
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 15
Repeat above steps for remaining three fastaq sets – b_0.fastq, a_8.fastq andb_8.fastq
Similarly change the names of remaining bam files once they are generated.
Now we are ready for the next step.
6/11/19
Step 2: Read aligned countsUse htseq-count to generate the aligned counts for each bam files generated in step 1.
RNA-Seq Lab | Shounak Bhogale | 2019 166/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 17
Search htseq-count in the search box.
Click on htseq-count to open the tool.
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 18
Select a_0.bam as the input.
Change stranded to Reverse.
Keep rest of the parametersas default and click Execute.
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 19
Change the name of the following datasets.
Here we generated the counts for each gene in the 4 datasets.Now we are ready for the final step.
6/11/19
Step 3: Finding differentially expressed genesNow we will use edgeR to analyze the count files generated in step 2 to find differentially expressed genes between two time points.
RNA-Seq Lab | Shounak Bhogale | 2019 206/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 21
Search edgeR in the search box on the left.
Click on edgeR to open the tool.
This should come up on the screen.
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 22
Select all.txt as the input dataset.
Select Single Count Matrix
Enter Mouse as factor name.
Enter groups as TP8,TP8,TP0,TP0
6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 23
Set contrast as TP8-TP0.
Click on Filter low counts.Set Yes to filter lowly expressed genes.
Set Counts.
Set minimum Counts as 5.
Set Yes for normalized counts table and rscript.
And now Execute!6/11/19
RNA-Seq Lab | Shounak Bhogale | 2019 24
3 files will be generated after the run is complete.
Click on the eye symbol next to the report file.
Scroll down through the report.For our dataset there are no DEGs.
6/11/19
Conclusion
We did the following today.
1. Alligned RNA-seq data using RNA-STAR
2. Used htseq-count to count the aligned reads for each gene.
3. Used edgeR to find DEGs in the data
RNA-Seq Lab | Shounak Bhogale | 2019 256/11/19
Useful linksOnline resources for RNA-Seq analysis questions –
² http://www.biostars.org/ - Biostar (Bioinformatics explained)
² http://seqanswers.com/ - SEQanswers (the next generation sequencing community)
² Most tools have a dedicated lists
Information about the various parts of the Tuxedo suite is available here -http://ccb.jhu.edu/software.shtml
Genome Browsers tutorials –
² http://www.broadinstitute.org/igv/QuickStart/ - IGV tutorials
² http://www.openhelix.com/ucsc/ - UCSC browser tutorials
(openhelix is a great place for tutorials, UIUC has a campus-wide subscription)
26
Contact us at:
hpcbiohelp@illinois.edu
hpcbiotraining@illinois.edu
RNA-Seq Lab | Shounak Bhogale | 20196/11/19
Extra MaterialIGV
RNA-Seq Lab | Shounak Bhogale | 2019 276/11/19
The Integrative Genomics Viewer (IGV) is a tool that supports the visualization of
mapped reads to a reference genome, among other functionalities. We will use it to
observe where hits were called for the alignment for the two samples (TP0 and
TP8), and the differentially(!) expressed genes.
.
Visualization Using IGV
RNA-Seq Lab | Shounak Bhogale | 2019 286/11/19
In this step, we will start IGV and load the select mouse_chr12.fna file, the known genes file
(mouse_chr12.fna), the hits for both sample groups, and the merged transcriptome.
Step 9: Start IGV
RNA-Seq Lab | Shounak Bhogale | 2019 29
Graphical Instruction: Load Genome
1. Within IGV, click the ‘Genomes’ tab on the menu bar.
2. Click the the ‘Load Genome from File’ option.
3. In the browser window, select mouse_chr12.fna(genome).
Graphical Instruction: Load Other Files
1. Within IGV, click the FILE tab on the menu bar.
2. Click the ‘Load from File’ option.
3. Select the mouse_chr12.gtf file (known genes file).
4. Perform Steps 1-3 for the files to the right.
Files to Load
mouse_chr12.fna
a_0.bam
b_0.bam
a_8.bam
b_8.bam
6/11/19
Step 10: Visualization With IGV
RNA-Seq Lab | Shounak Bhogale | 2019 30
Click here and type the following location of a non-differentially expressed gene:NC_000078.6:17,788,398-17,793,435
6/11/19
top related