Algorithms research Tandy Warnow UT-Austin


Post on 08-Jan-2018



Page 1: Algorithms research Tandy Warnow UT-Austin. “Algorithms group” UT-Austin: Warnow, Hunt UCB: Rao, Karp, Papadimitriou, Russell, Myers UCSD: Huelsenbeck

Algorithms research

Tandy Warnow, UT-Austin

Page 2

“Algorithms group”

• UT-Austin: Warnow, Hunt
• UCB: Rao, Karp, Papadimitriou, Russell, Myers
• UCSD: Huelsenbeck
• UNM: Moret, Bader, Williams
• External participants: Mossel (UCB), Huson (Germany), Steel (NZ), and others

Page 3

Main research foci

• Solving maximum parsimony and maximum likelihood more effectively
• “Fast converging methods”
• Gene order and content phylogeny
• Reticulate evolution
• Multiple sequence alignment at the genomic level

Page 4

GRAPPA (Genome Rearrangement Analysis under Parsimony and other Phylogenetic Algorithms)

http://www.cs.unm.edu/~moret/GRAPPA/

• Heuristics for NP-hard optimization problems
• Fast polynomial-time distance-based methods
• Contributors: U. New Mexico, U. Texas at Austin, Università di Bologna, Italy
• Poster: Jijun Tang

Page 5

Maximum Parsimony on Rearranged Genomes (MPRG)

• The leaves are rearranged genomes.
• Find the tree that minimizes the total number of rearrangement events.

[Figure: four leaf genomes A, B, C, D, and a tree on them with internal nodes E and F; edge labels 3, 6, 2, 3, 4. Total length = 18]
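The MPRG objective can be illustrated with a minimal sketch: the length of a tree is the sum of the rearrangement counts on its edges. The slide gives only the edge labels and the total, so the assignment of labels to particular edges below is an assumption made for illustration, and `tree_length` is a hypothetical helper name.

```python
# Hypothetical edge assignment: the slide shows the labels 3, 6, 2, 3, 4 and
# the total 18, but which edge carries which label is assumed here.
edges = [
    ("A", "E", 3),
    ("B", "E", 6),
    ("E", "F", 2),
    ("C", "F", 3),
    ("D", "F", 4),
]


def tree_length(weighted_edges):
    """Sum the rearrangement counts over all edges of the tree."""
    return sum(weight for _, _, weight in weighted_edges)


print(tree_length(edges))  # 18, matching the total on the slide
```

MPRG asks for the tree, together with genomes at the internal nodes, that makes this sum as small as possible.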


Page 8

Benchmark gene order dataset: Campanulaceae

• 12 genomes + 1 outgroup (Tobacco), 105 gene segments
• NP-hard optimization problems: breakpoint and inversion phylogenies

1997: BPAnalysis (Blanchette and Sankoff): 200 years (est.)

2000: Using GRAPPA v1.1 on the 512-processor Los Lobos Supercluster machine: 2 minutes (200,000-fold speedup per processor)

2003: Using latest version of GRAPPA: 2 minutes on a single processor (1-billion-fold speedup per processor)
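For readers unfamiliar with breakpoint phylogenies, the sketch below shows one common way to count breakpoints between two gene orders. It assumes circular genomes written as lists of signed integers over the same gene set; the function names are illustrative and this is not GRAPPA's or BPAnalysis's code.

```python
def adjacencies(genome):
    """Return the set of adjacencies of a circular signed gene order.

    Each adjacency (a, b) is stored in a canonical form so that reading the
    genome on the opposite strand, which gives (-b, -a), yields the same key.
    """
    n = len(genome)
    result = set()
    for i in range(n):
        a, b = genome[i], genome[(i + 1) % n]
        result.add(min((a, b), (-b, -a)))
    return result


def breakpoint_distance(g1, g2):
    """Count the adjacencies of g1 that are not adjacencies of g2."""
    if sorted(map(abs, g1)) != sorted(map(abs, g2)):
        raise ValueError("genomes must contain the same genes")
    return len(adjacencies(g1) - adjacencies(g2))


reference = [1, 2, 3, 4, 5]
inverted = [1, -3, -2, 4, 5]  # genes 2..3 inverted
print(breakpoint_distance(reference, inverted))  # 2
```

A single inversion breaks the two adjacencies at its endpoints, which is why the example prints 2. Reading the whole circular genome from the other strand leaves the distance at 0.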

Page 9

Reticulate Evolution

• Group leader: Randy Linder
• Software: (1) producing random networks, (2) simulating sequences down networks, (3) performance evaluation of methods, (4) inferring reticulate networks
• Current reconstruction methods limited to one reticulation event
• Poster: Luay Nakhleh

Page 10

20-taxon 1-hybrid network. 0.1 scaling factor.

Page 11

MP/ML heuristics

• Disk-Covering Methods (DCMs): divide-and-conquer strategies that boost the performance of base methods for MP/ML (Warnow)
• MrBayes (Huelsenbeck)
• New I-DCM3 technique improves upon the Ratchet and TBR
• Poster: Usman Roshan (DCM-MP)

Page 12

Gutell dataset: 854 rRNA sequences

Iterative-DCM3 trials find trees of MP score 103210 in 30 hours, whereas ratchet500 trials take 45 hours to find trees of the same score.
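As background on what an MP score measures, the following sketch scores a fixed tree with Fitch's algorithm, the standard way to count the minimum number of substitutions a given binary tree requires. The tree encoding (nested pairs of leaf names), the toy alignment, and the function names are assumptions for illustration; this is not the DCM or ratchet code, which search over trees rather than score a single one.

```python
def fitch_site(tree, states):
    """Return (state_set, cost) for one site on a rooted binary tree.

    `tree` is a leaf name (str) or a pair (left, right); `states` maps each
    leaf name to its character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: one substitution is charged at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch cost over all columns of an equal-length alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


toy_tree = (("A", "B"), ("C", "D"))
toy_alignment = {"A": "ACGT", "B": "ACGA", "C": "GCGA", "D": "GCTA"}
print(parsimony_score(toy_tree, toy_alignment))  # 3
```

The recursion is kept simple for readability; a dataset on the scale of 854 sequences would call for an iterative traversal and a bit-vector state encoding.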

Page 13

Other planned projects (partial list)

• Multiple Sequence Alignment (Myers and Williams)
• Steiner Tree algorithms: error bounds and new heuristics (Rao)
• MCMC methods (Russell and Huelsenbeck)
• Symbolic representation of data (Hunt)
• Parallel algorithms (Bader and Williams)

Page 14

Questions for group

• How should we measure performance?
• How should we use simulated data?
• How should we use real datasets?
• How can we study criteria (MP, ML, etc.) as opposed to methods?
• Should we sponsor DIMACS-style challenges?
• Others? (Please bring questions, comments, and answers to the break-out session.)