Transcript
Page 1: Sequence Alignment technology

Sequence Alignment technology

Chengwei Lei, Fang Yuan

Saleh Tamim

Page 2: Sequence Alignment technology

Goal

• Save time: “PASS: a Program to Align Short Sequences”, Davide Campagna et al., Bioinformatics (2009)

• Save money: “Optimal pooling for genome re-sequencing with ultra-high-throughput short-read technologies”, Iman Hajirasouliha et al., Bioinformatics (2008)

Page 3: Sequence Alignment technology

Keywords in both papers

• Reference sequence: a long genomic sequence.

• Short reads: Input short strings. e.g. ATGCGTAC

Page 4: Sequence Alignment technology

Save time – PASS program

• PASS, a new algorithm to align short DNA sequences allowing gaps and mismatches.

• The performance of the program is very striking both for sensitivity and speed. For instance, gap alignment is achieved hundreds of times faster than BLAST and several times faster than SOAP, especially when gaps are allowed.
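To make "allowing gaps and mismatches" concrete, here is a minimal sketch of a global alignment score computed with the standard Needleman-Wunsch recurrence. The scoring values (match +1, mismatch -1, gap -1) and the function name are illustrative assumptions, not PASS's actual parameters or code.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best global alignment score of a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal,            # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


# The short read from the keywords slide, aligned to itself,
# to a copy with one mismatch, and to a copy with one base deleted.
print(needleman_wunsch_score("ATGCGTAC", "ATGCGTAC"))
print(needleman_wunsch_score("ATGCGTAC", "ATGCCTAC"))
print(needleman_wunsch_score("ATGCGTAC", "ATGGTAC"))
```

A mismatch and a gap both lower the score relative to a perfect match, which is what lets an aligner rank imperfect placements of a read.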

Page 5: Sequence Alignment technology

PASS

• Program to Align Short Sequences

• Performs gapped and ungapped alignment onto a reference sequence

• Seed words (11 and 12 bases)

• Short words (7 and 8 bases)

• PST (precomputed score table): calculated with the Needleman and Wunsch algorithm supplied with PASS

• Handles data generated by Solexa, SOLiD or 454 technologies
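The seed-word idea can be sketched as a lookup table from every 11-base word of the reference to the positions where it occurs. The code below is only an illustration of that general technique; the function names and the plain dictionary index are assumptions, and PASS's real data structures and flank scoring are not reproduced here.

```python
from collections import defaultdict


def build_seed_index(reference: str, k: int = 11) -> dict:
    """Index every k-base word of the reference by its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)


def candidate_positions(read: str, index: dict, k: int = 11) -> set:
    """Return reference start positions suggested by exact seed hits in the read."""
    candidates = set()
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], []):
            start = hit - offset  # where the read would begin on the reference
            if start >= 0:
                candidates.add(start)
    return candidates


# Hypothetical toy data: a 25-base read copied from position 5 of the reference.
reference = "TTGACCATGCGTACGGATCCAAGTCTAGGCTAACG"
read = reference[5:30]
index = build_seed_index(reference)
print(candidate_positions(read, index))
```

Because a read only needs one intact seed to produce a candidate, a read with a single mismatch in the middle is still located; the candidate would then be checked by a finer alignment step.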

Page 6: Sequence Alignment technology

Approach/Algorithm

Page 7: Sequence Alignment technology

Analysis and Results

• Comparison of PASS with SOAP

• PASS has better sensitivity with seed words of 11 bases and runs at least 10 times faster than SOAP

Page 8: Sequence Alignment technology

Save money - Optimal pooling method

• The paper considers a set of experiments using the Solexa technology, based on bacterial artificial chromosome (BAC) clones, and addresses an experimental design problem.

• Basic idea: pool more than one BAC per lane to maximize the throughput of the Solexa technology and hence minimize its cost.

Page 9: Sequence Alignment technology

Input strings (short reads) Reference sequences

Inputs

Page 10: Sequence Alignment technology

Normal pooling method

• One other hurdle in designing a globally optimal experiment is the rapid proliferation of the number of possible configurations. For instance, to pool m = 150 BACs into 15 groups of size 10, we would need to consider every way of partitioning the BACs into such groups.

• It is infeasible to search all these configurations.
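To get a feel for why exhaustive search is out of reach, the number of ways to split m items into g groups of equal size s can be computed directly. The sketch below assumes the groups are unlabeled (only which BACs share a lane matters), which gives m! / ((s!)^g · g!). The function name is hypothetical, and the result is not necessarily the exact figure shown on the slide.

```python
from math import factorial


def count_poolings(num_items: int, num_groups: int, group_size: int) -> int:
    """Count partitions of num_items into num_groups unlabeled groups of group_size."""
    if num_groups * group_size != num_items:
        raise ValueError("groups must cover all items exactly")
    # Divide out the orderings within each group and the ordering of the groups.
    return factorial(num_items) // (
        factorial(group_size) ** num_groups * factorial(num_groups)
    )


# The example from the slide: 150 BACs into 15 groups of size 10.
configurations = count_poolings(150, 15, 10)
print(len(str(configurations)), "decimal digits")
```

Under this assumption the count exceeds 10^150, so any practical method has to avoid enumerating configurations one by one.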

Optimal Pooling method

Page 11: Sequence Alignment technology

Input strings (short reads) Reference sequence

Optimal Pooling method

Page 12: Sequence Alignment technology

Input strings (short reads) Reference sequences

Optimal Pooling method

Page 13: Sequence Alignment technology

Input strings (short reads) Reference sequences

Optimal Pooling method

Pool

Page 14: Sequence Alignment technology

Problem

• How to separate the groups of short reads?
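One way to picture the problem is to check, for each read from a pooled lane, which of the pooled BACs it could have come from. The sketch below assumes exact substring matching and made-up toy sequences; the names `assign_reads`, `BAC_A` and `BAC_B` are hypothetical, and this is not the method from the paper. Reads that match exactly one BAC can be separated; reads that match several are ambiguous.

```python
def assign_reads(reads: list, bac_sequences: dict) -> dict:
    """Map each read to the names of the pooled BACs that contain it exactly."""
    return {
        read: [name for name, seq in bac_sequences.items() if read in seq]
        for read in reads
    }


# Hypothetical pool of two BACs that share the stretch "GCGTAC".
pool = {"BAC_A": "ATGCGTACGGA", "BAC_B": "TTGCGTACCTA"}
assignments = assign_reads(["ATGCG", "GCGTAC", "CCTA", "GGGG"], pool)

for read, sources in assignments.items():
    if len(sources) == 1:
        print(read, "comes from", sources[0])
    elif sources:
        print(read, "is ambiguous between", ", ".join(sources))
    else:
        print(read, "matches no BAC in the pool")
```

The fewer sequences the BACs in a pool share, the fewer reads fall into the ambiguous case.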

Page 15: Sequence Alignment technology

Input strings (short reads) Reference sequence

Optimal Pooling method

Pool

Page 16: Sequence Alignment technology

Two cases

Page 17: Sequence Alignment technology

Result

Page 18: Sequence Alignment technology

Conclusion

• A program to save time: “PASS: a Program to Align Short Sequences”, Davide Campagna et al., Bioinformatics (2009)

• An algorithm to save money: “Optimal pooling for genome re-sequencing with ultra-high-throughput short-read technologies”, Iman Hajirasouliha et al., Bioinformatics (2008)

Page 19: Sequence Alignment technology

Q & A

