Phylogenetic Inference

Construction of Phylogenetic Trees II. Ana Margarida Sousa, Instituto Gulbenkian de Ciência, Grupo de Biologia Evolutiva. [email protected]

Post on 04-Jan-2016

Page 1: Phylogenetic Inference

Phylogenetic Inference

Construction of Phylogenetic Trees II

Ana Margarida Sousa

Instituto Gulbenkian de Ciência, Grupo de Biologia Evolutiva

[email protected]

Page 2: Phylogenetic Inference

True tree

[Figure: tree diagram labelled A, B, C, D, E, F, G and wt]

Page 3: Phylogenetic Inference

Data I - RESTRICTION DATA

[Figure: restriction maps for the taxa (labels A to G visible), marking BclI, BglII, BamHI and Sau3AI sites]

Page 4: Phylogenetic Inference

7 53 1
A 11011100100000111010000001110110101000001110000001110
B 11111010100100111010000001110100101000101110000001010
C 00011000110011111010000001101100101000001100000011010
D 00011000100000011010000011110111101110001100000011010
E 10000001101001111111110100100100111000111101010110100
F 10000110101000011110111111100110101001111101101001011
G 10010000100000111110010100100100101000011101001001010
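As a rough illustration of how a 0/1 site matrix like the one above can be turned into pairwise distances, the Python sketch below computes the proportion of sites whose presence/absence state differs between two taxa. This is a plain mismatch proportion chosen for simplicity. It is not the model-based distance that restdist computes, and the function name `mismatch_distance` and the toy rows are illustrative assumptions, not part of the slides.

```python
def mismatch_distance(row_a: str, row_b: str) -> float:
    """Proportion of sites at which two 0/1 rows differ (not restdist's model)."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same number of sites")
    if not row_a:
        raise ValueError("rows must not be empty")
    differing = sum(a != b for a, b in zip(row_a, row_b))
    return differing / len(row_a)


def distance_matrix(rows: dict) -> dict:
    """Pairwise mismatch distances for a {taxon: '0101...'} mapping."""
    names = sorted(rows)
    return {
        (a, b): mismatch_distance(rows[a], rows[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical toy rows (4 sites each), not taken from the slide data.
toy = {"X": "1101", "Y": "1001", "Z": "0010"}
print(distance_matrix(toy))
```

Rows from the matrix above could be passed in the same way, one string of 53 characters per taxon.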

Page 5: Phylogenetic Inference

7 553
A GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGGGAACTCATGATCAAGGGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGATTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGATAGAATTACAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
B GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGAGAACTCATGATCAAGAGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGTTGCCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGGTTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
C GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGTTGGAAGCGTTTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGGGAATTCATGATCAAGGGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGATCTACGATATGGTAGAATTATAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
D GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTTTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGGGAACTCATGATCAAGGGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGTCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGACTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
E GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGTTTCTACAAAGATGGCTTTATGCTTGATGGGGAACTCATGATCAAGGACGTAGATTTTAACACAGGGTCCGACCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTACTACAGGAATACTTTCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAACAAAAGCGAGCAGAAGGCCATGAGGGTTTCATTGTGAAAGACCC
F GGAACATCTGCGTAGACAATACTGCTAACAGTTATTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGTTTCTACAAAGATGGCTTTATGCTTGATGGGGAATTCATGATCAAGGGCGTAGATTTTAACACAGGGTCCGACCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTACTACAGGAATATTTTCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAAAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
G GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGTTTCTACAAAGATGGCTTTATGCTTGATGGGGAATTCATGATCAAGGGCGTAGATTTTAACACAGGGTCCGACCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTACTACAGGAATACTTTCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAAAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC

Data II - NUCLEOTIDE SEQUENCES
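To make the idea of a sequence-based distance concrete, the following Python sketch computes the proportion of differing sites (p-distance) between two aligned sequences and applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The Jukes-Cantor model is used here only because it is the simplest correction. It is an assumption of this sketch, not necessarily the model selected in dnadist, and the function names and toy sequences are illustrative.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, ignoring sites with gaps or ambiguity codes."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy alignment (8 sites), not taken from the slide data.
p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jukes_cantor(p))
```

For closely related sequences such as those above, the corrected distance is only slightly larger than the raw proportion of differences.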

Page 6: Phylogenetic Inference

Physical maps (restriction data) -> Data matrix (0/1) -> Distance matrix -> UPGMA, NJ, ME
Physical maps (restriction data) -> Data matrix (0/1) -> MP, ML

Nucleotide sequences (alignment) -> Distance matrix -> UPGMA, NJ, ME
Nucleotide sequences (alignment) -> MP, ML

Boolean data (0/1)

Method                 Program
Distance calculation   restdist
UPGMA                  neighbor
NJ                     neighbor
ME                     fitch
MP                     pars
ML                     restml

Sequence data

Method                 Program
Distance calculation   dnadist
UPGMA                  neighbor
NJ                     neighbor
ME                     fitch
MP                     dnapars
ML                     dnaml
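The distance-based branch of the workflow can be illustrated with a small UPGMA sketch in Python: repeatedly join the two closest clusters, place the new node at half their distance, and average distances to the remaining clusters weighted by cluster size. This is a teaching sketch under simple assumptions (a complete symmetric distance table, ties broken by label order). It is not the implementation used by neighbor, and the names `upgma` and the toy distances are illustrative.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a UPGMA tree and return it as a Newick string.

    names: list of taxon labels.
    dist:  dict mapping (a, b) pairs to distances (each pair given once).
    """
    d = {frozenset(pair): value for pair, value in dist.items()}
    # Each cluster label maps to (newick text, number of taxa, node height).
    clusters = {n: (n, 1, 0.0) for n in names}
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2.0
        (text_a, size_a, h_a) = clusters.pop(a)
        (text_b, size_b, h_b) = clusters.pop(b)
        merged = f"({a}+{b})"
        # Size-weighted average distance from the merged cluster to the others.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        newick = f"({text_a}:{height - h_a:.3f},{text_b}:{height - h_b:.3f})"
        clusters[merged] = (newick, size_a + size_b, height)
    return next(iter(clusters.values()))[0] + ";"


# Hypothetical toy distances, not derived from the slide data.
toy = {("A", "B"): 2.0, ("A", "C"): 4.0, ("B", "C"): 4.0}
print(upgma(["A", "B", "C"], toy))
```

UPGMA assumes a constant rate of change along all lineages, which is one reason the slides list NJ and ME alongside it.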

Page 7: Phylogenetic Inference

BOOTSTRAP

Data matrix (0/1) -> Generate 100 pseudo-replicates -> 100 distance matrices -> 100 NJ trees -> Majority-rule consensus tree

Nucleotide sequences (alignment) -> Generate 100 pseudo-replicates -> 100 NJ trees -> Majority-rule consensus tree

Method              Program
Pseudo-replicates   seqboot
Consensus tree      consense
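The pseudo-replicate step can be sketched in Python as follows: each replicate is built by sampling alignment columns with replacement until the replicate has as many columns as the original. This only illustrates the resampling idea. It does not reproduce seqboot's file formats or options, and the function names, the fixed seed and the toy matrix are assumptions made for the example.

```python
import random


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Resample columns with replacement; every taxon gets the same columns."""
    n_sites = len(next(iter(alignment.values())))
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("all rows must have the same length")
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment: dict, n: int = 100, seed: int = 0) -> list:
    """Generate n pseudo-replicates (100 in the slides) from one data set."""
    rng = random.Random(seed)
    return [bootstrap_replicate(alignment, rng) for _ in range(n)]


# Hypothetical toy 0/1 matrix (4 sites), not taken from the slide data.
toy = {"A": "0110", "B": "0100"}
replicates = bootstrap_replicates(toy, n=100)
print(len(replicates), replicates[0])
```

Each replicate would then go through the same distance and NJ steps as the original data, and the resulting 100 trees are summarized as a majority-rule consensus.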

Page 8: Phylogenetic Inference

Steps for using any of the programs in the PHYLIP package.

1. Copy the input file (.txt format) into the folder that contains the executable you are going to use (e.g. restdist.exe).

2. Double-click the executable to start the program.

3. Type the name of the input file (do not forget the '.txt' extension).

4. Change the options you need, as indicated in the menu.

5. Type 'y'. An 'outfile' and/or a 'treefile' is generated automatically.

6. Move these files to another folder and rename them.

7. Open the 'treefile' with the TreeView program to examine the resulting tree.
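Step 6 matters because each program writes to the same default file names, so results from one run are easily overwritten by the next. The Python sketch below moves the output files to a results folder under a descriptive prefix. The file names 'outfile' and 'treefile' are the ones given in the steps above (the tree file may be named differently in some PHYLIP versions, so they are parameters), and the function name, folder names and prefix are hypothetical.

```python
import shutil
from pathlib import Path


def archive_outputs(workdir, dest, prefix, names=("outfile", "treefile")):
    """Move a program's output files out of workdir and rename them.

    Returns the list of new paths; files that do not exist are skipped.
    """
    workdir, dest = Path(workdir), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in names:
        source = workdir / name
        if source.exists():
            target = dest / f"{prefix}_{name}"
            shutil.move(str(source), str(target))
            moved.append(target)
    return moved


if __name__ == "__main__":
    # Hypothetical folders: the current directory holds the executable and its output.
    for path in archive_outputs(".", "results", "restdist_run1"):
        print("saved", path)
```

Using one prefix per program and data set (for example restdist_run1, neighbor_run1) keeps the distance matrices and trees from different analyses apart.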

Page 18: Phylogenetic Inference

Bayesian inference using the MrBayes program (Mrbayes.exe)

Mixed data: restriction data + sequence data

#NEXUS
begin data;
dimensions ntax=14 nchar=5128;
format datatype=mixed(Restriction:1-304,DNA:305-5128) interleave=yes gap=- missing=?;
matrix

A0000100010100?01001000000000001000100001000001011000000000000001010010000000000010000011100000000110000000000000001000010100000100000100000000001000000000000000000000000001010001000000010001010000000001000010100000000100010000000001000001000101000010000010001000001000100000000010100101010000010100000100

B0100010000000?0?001101001000001110000100100001001000000000000001011110000010110000000000101000010001100000010000010000001000000010000101000100000100001100000010001000011001000000100100000001100000000000000001100010000101100000001001010101000000000001000100000100000010100001000000100001000000010011101000

C 0010010000000?0?001101101000001110000100000100001000000000000001111010000010111000011000101000010001000000000000010000001000000010000101000100000100001100000010000000000111000001100100000001100000000000010001100010000101100000000001000101000000000001000100000100000010100001000100100001000000010000101000

Page 19: Phylogenetic Inference

B11P10 TAAAAATCTGAGTGACTATCTCACAGTGTACGGAC-CTAAAGTTCCCCCA
B13P10 TAAAAATCTGAGTGATTATCTCACAGTGTACGGAC-CTAAAGTTCCCCCA
B14P10 TAAAAATCTGAGTGATTATCTCACAGTGTACGGAC-CTAAAGTTCCCCCA
[ 4810 4820 ]
[ * * ]
a5P10 TAGGGGGTACCTAAAGCCCAGCCA
a7P10 TAGGGGGTACCTAAAGCCCAGCCA
a8P10 TAGGGGGTACCTAAAGCCCAGCCA
a9P10 TAGGGGGTACCTAAAACCCAGCCA
a11P10 TAGGGGGTACCTAAAGCCCAGCCA
a13P10 TAGGGGGTACCTAAAGCCCAGCCA
a14P10 TAGGGGGTACCTAAAGCCCAGCCA
B3P10 TAGGGGGTACCTAAAGCCCAGCCA
B7P10 TAGGGGGTACCTAAAGCCCAGCCA
B9P10 TAGGGGGTACCTAAAGCCCAGCCA
B10P10 TAGGGGGTACCTAAAGCCCAGCCA
B11P10 TAGGGGGTACCTAAAACCCAGTCA
B13P10 TAGGGGGTACCTAAAACCCAGTCA
B14P10 TAGGGGGTACCTAAAACCCAGTCA
;
end;

begin mrbayes;
delete 1 4 6 7 12 13 14;
charset Restriction=1-304;
charset DNA=305-5128;
partition Names=2: Restriction, DNA;
set partition=Names;
lset applyto=(2) nst=6 rates=gamma;
unlink shape=(all) pinvar=(all) statefreq=(all) revmat=(all);
prset ratepr=variable;
mcmcp ngen=1000 printfreq=100 samplefreq=100 nchains=4 savebrlens=yes filename=Allenz+Allseqs0;
mcmc;
end;
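The partition commands in the block above depend on the two character sets covering the matrix exactly once: columns 1-304 for the restriction data and 305-5128 for the DNA, with nchar=5128. A small Python sketch, assuming 1-based inclusive ranges as written in the file, can check this and print the corresponding command lines. The helper names are hypothetical, and only commands that already appear in the file are generated.

```python
def check_partition(charsets, nchar):
    """Verify that (name, start, end) ranges tile columns 1..nchar with no gap or overlap."""
    expected_start = 1
    for name, start, end in sorted(charsets, key=lambda c: c[1]):
        if start != expected_start or end < start:
            raise ValueError(f"charset {name} does not start at column {expected_start}")
        expected_start = end + 1
    if expected_start != nchar + 1:
        raise ValueError(f"charsets end at {expected_start - 1}, expected {nchar}")


def partition_commands(charsets, partition_name="Names"):
    """Return the charset/partition lines in the style used in the file above."""
    lines = [f"charset {name}={start}-{end};" for name, start, end in charsets]
    names = ", ".join(name for name, _, _ in charsets)
    lines.append(f"partition {partition_name}={len(charsets)}: {names};")
    lines.append(f"set partition={partition_name};")
    return lines


charsets = [("Restriction", 1, 304), ("DNA", 305, 5128)]
check_partition(charsets, nchar=5128)
print("\n".join(partition_commands(charsets)))
```

A mismatch between the ranges and nchar is a common source of errors when restriction and sequence matrices are concatenated by hand, so a check like this is cheap insurance before starting a long run.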

Page 20: Phylogenetic Inference

Steps for using the MrBayes program (Mrbayes.exe)

1. Save the input file in the same location as the executable.

2. Start the program.

3. Type the command 'execute' followed by the name of the input file (do not forget the '.txt' extension).

4. Increase the number of generations to 1 000 000.

5. Check whether, after this number of generations, the standard deviation between the chains (reported by MrBayes as the average standard deviation of split frequencies) is ≤ 0.01.

6. If it is, the program can be stopped.

7. Type the command 'sump burnin=2500' (summarizes the parameter values).

8. Type the command 'sumt burnin=2500'.

9. Check the result by opening the file with the '.con' extension in TreeView.
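The burn-in value in steps 7 and 8 is easier to interpret with a little arithmetic. Assuming burnin counts sampled states (as in the MrBayes version these slides appear to use) and that the starting state is also sampled, a run of 1 000 000 generations with samplefreq=100 yields 10 001 samples, so burnin=2500 discards roughly the first quarter. The Python sketch below makes that calculation explicit. The function name and the returned keys are illustrative.

```python
def burnin_summary(ngen: int, samplefreq: int, burnin: int) -> dict:
    """Count sampled states and how many remain after discarding the burn-in.

    Assumes one sample every `samplefreq` generations plus the starting state,
    and that `burnin` is expressed in samples rather than generations.
    """
    n_samples = ngen // samplefreq + 1
    if burnin >= n_samples:
        raise ValueError("burn-in would discard every sample")
    return {
        "samples": n_samples,
        "kept": n_samples - burnin,
        "fraction_discarded": burnin / n_samples,
    }


print(burnin_summary(ngen=1_000_000, samplefreq=100, burnin=2500))
```

With the ngen=1000 written in the input file there would be only 11 samples, which is why the number of generations has to be increased (step 4) before a burn-in of 2500 makes sense.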
