Short read alignment
BNFO 601
Short read alignment
• Input:
  – Reads: short DNA sequences, usually up to 100 base pairs (bp), produced by a sequencing machine
    • Reads are fragments of a longer DNA sequence present in the sample given as input to the machine
    • Usually number in the millions
  – Genome sequence: a reference DNA sequence much longer than the read length
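To make the two inputs concrete, here is a minimal sketch of the alignment problem itself: slide each read along the reference and report the positions where it matches within a small mismatch budget. The function names, the toy sequences, and the mismatch threshold are illustrative assumptions, not part of the slides, and this brute-force scan is far too slow for millions of reads against a real genome.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def naive_align(read: str, reference: str, max_mismatches: int = 2):
    """Return (position, mismatches) for every reference window within the budget."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        d = hamming(read, reference[pos:pos + len(read)])
        if d <= max_mismatches:
            hits.append((pos, d))
    return hits


# Toy data (assumed for illustration): real references are far longer,
# and real sequencing runs produce millions of reads.
reference = "ACGTTGCATGCCGATTACGGA"
reads = ["GCATGCC", "GATTTCG"]

for read in reads:
    print(read, naive_align(read, reference, max_mismatches=1))
```

The cost of this scan grows with (number of reads) × (reference length) × (read length), which is why practical aligners index the reference first.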
Short read alignment
• Applications
  – Genome assembly
  – RNA splicing studies
  – Gene expression studies
  – Discovery of new genes
  – Discovery of cancer-causing mutations
Short read alignment
• Two approaches
  – Hashing-based algorithms
    • BFAST
    • SHRIMP
    • MAQ
    • STAMPY (statistical alignment)
  – Burrows-Wheeler transform
    • Bowtie
    • BWA
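For the Burrows-Wheeler approach, the core idea can be sketched as an index built from the transform of the reference, followed by a "backward search" that consumes the read from its last base to its first. The code below is a toy illustration under stated assumptions: it handles exact matches only, builds the suffix array by plain sorting, and stores full occurrence tables. All function names are hypothetical; this is not how Bowtie or BWA are implemented internally.

```python
def suffix_array(text: str):
    """Suffix array by direct sorting; fine for toy inputs, too slow for genomes."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_index(reference: str):
    """Build the suffix array, first-column offsets C, and occurrence counts occ."""
    text = reference + "$"                    # '$' sorts before A, C, G, T
    sa = suffix_array(text)
    bwt = "".join(text[i - 1] for i in sa)    # text[-1] is '$' when i == 0
    alphabet = sorted(set(text))
    # C[c] = number of characters in text strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def exact_match(read: str, sa, C, occ):
    """Backward search: sorted reference positions where the read occurs exactly."""
    lo, hi = 0, len(sa)                       # half-open range of suffix-array rows
    for c in reversed(read):
        if c not in C:
            return []
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


reference = "ACGTTGCATGCCGATTACGGA"           # toy reference (assumed)
sa, C, occ = bwt_index(reference)
for read in ["GCA", "ACG", "TTT"]:
    print(read, exact_match(read, sa, C, occ))
```

Each base of the read costs a constant number of table lookups, so the search time depends on the read length rather than the reference length.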
BFAST overview
PLoS ONE 4(11): e7767.
BFAST algorithm
PLoS ONE 4(11): e7767.
BFAST masked keys
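The figure for this slide did not survive, so the following is only a sketch of the general idea, assuming a "masked key" means a spaced seed: a binary mask in which bases under a 1 form the hash key and bases under a 0 are ignored. A read with a mismatch under a 0 then still hashes to the same key as the reference. The mask, the function names, and the toy sequences are hypothetical and are not taken from BFAST itself.

```python
from collections import defaultdict


def masked_key(window: str, mask: str) -> str:
    """Keep only the bases at positions where the mask has a '1'."""
    return "".join(b for b, m in zip(window, mask) if m == "1")


def build_index(reference: str, mask: str):
    """Hash every reference window of len(mask) by its masked key."""
    index = defaultdict(list)
    for pos in range(len(reference) - len(mask) + 1):
        index[masked_key(reference[pos:pos + len(mask)], mask)].append(pos)
    return index


def candidate_positions(read: str, index, mask: str):
    """Look up each read window; convert hits into candidate read start positions."""
    candidates = set()
    for offset in range(len(read) - len(mask) + 1):
        key = masked_key(read[offset:offset + len(mask)], mask)
        for pos in index.get(key, []):
            if pos - offset >= 0:
                candidates.add(pos - offset)
    return candidates


reference = "ACGTTGCATGCCGATTACGGA"   # toy reference (assumed)
read = "GATTTCG"                      # differs from reference[12:19] at one base

for mask in ("1111111", "1111011"):   # contiguous key vs. key with one masked base
    index = build_index(reference, mask)
    print(mask, sorted(candidate_positions(read, index, mask)))
```

Candidate positions found this way are only hypotheses; a full alignment of the read against each candidate location would still be needed to accept or reject them. Using several different masks raises the chance that at least one of them hides the mismatching positions.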
Short read alignment
Empirical performance:
• Simulated data:
  – Extract random substrings of fixed length with random mutations and gaps
  – Realign back to the reference genome
• Real data:
  – Paired reads: two ends of the same molecule
  – Count the number of paired reads within 500 to 10,000 bases of each other
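Both evaluation strategies can be sketched in a few lines. The simulator below copies a random fixed-length substring of the reference and perturbs it with substitutions and single-base gaps; the counter tallies read pairs whose two aligned positions fall 500 to 10,000 bases apart. The error rates, function names, and the single-chromosome coordinate model are assumptions made for illustration; the slides do not specify them.

```python
import random


def simulate_read(reference, length, sub_rate=0.02, indel_rate=0.005, rng=random):
    """Copy a random substring and add substitutions and single-base gaps.

    Returns (true_start, read). Gaps can make the read slightly shorter
    or longer than `length`.
    """
    start = rng.randrange(len(reference) - length + 1)
    read = []
    for base in reference[start:start + length]:
        r = rng.random()
        if r < sub_rate:
            # Substitution: replace with a different base.
            read.append(rng.choice([b for b in "ACGT" if b != base]))
        elif r < sub_rate + indel_rate:
            continue                            # deletion: skip this base
        elif r < sub_rate + 2 * indel_rate:
            read.append(base)
            read.append(rng.choice("ACGT"))     # insertion after this base
        else:
            read.append(base)
    return start, "".join(read)


def count_concordant_pairs(pairs, min_dist=500, max_dist=10000):
    """Count pairs whose two aligned positions are min_dist..max_dist bases apart.

    `pairs` holds (pos1, pos2) tuples; None marks an end that failed to align.
    """
    return sum(
        1
        for a, b in pairs
        if a is not None and b is not None and min_dist <= abs(a - b) <= max_dist
    )


rng = random.Random(0)                          # fixed seed for repeatability
reference = "".join(rng.choice("ACGT") for _ in range(1000))
true_start, read = simulate_read(reference, 50, rng=rng)
print(true_start, read)

aligned_pairs = [(100, 700), (100, 200), (0, 20000), (None, 5)]
print(count_concordant_pairs(aligned_pairs))
```

With simulated reads the true origin is known, so an aligner can be scored by how often it recovers `true_start`. With real paired reads there is no ground truth, and the fraction of pairs at a plausible distance serves as a proxy for correctness.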
Short read alignment
Courtesy of Genome Res. June 2011 21: 936-939.
Short read alignment
Courtesy of Genome Res. June 2011 21: 936-939.