Biologia In Silico - Centro de Informática - UFPE
Ivan G. Costa Filho (igcf@cin.ufpe.br)
Centro de Informática, Universidade Federal de Pernambuco
String Processing
Topics
• Biological strings
• Basic problems
– pairwise/multiple alignment
– motif finding
– modeling of protein families
• Methods
– dynamic programming algorithms
– hidden Markov models
– probabilistic methods
Course
• Lectures: March/April
– introduction of basic concepts/methods
– practical classes
• Seminars: April/May
– presentation of course topics
– individual for graduate students
– in pairs for undergraduates
• Project: May to June
– group analysis of real data (from the articles discussed)
Grading
• 40%: seminar presentation
– evaluation by classmates, plus attendance
• 20%: exercise lists
• 40%: group project
– individual grade: each group is responsible for describing each member's participation
Bibliography
• R. Durbin, S. R. Eddy, A. Krogh, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press.
• N. Jones and P. Pevzner, An Introduction to Bioinformatics Algorithms, MIT Press, 2004.
• See the course page for the literature specific to each lecture …
– www.cin.ufpe.br/~igcf
Molecular Biology
Understanding life at the cellular level
• How genetic information is inherited
• How genetic information influences cellular processes
• How genes work together to carry out a cellular function
Genetic Information - DNA
• DNA (deoxyribonucleic acid)
– chain of nucleotides
– 4 types: A, C, G, T
– forms a double strand through base complementarity
• A pairs with T, and C pairs with G
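The pairing rule on this slide is easy to express in code. The sketch below is only an illustration: the function names are made up for this example, and it assumes an input containing only the letters A, C, G, T.

```python
# Watson-Crick pairing as stated on the slide: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Reversing matters because the two strands of the double helix run in opposite directions, so the partner strand is conventionally written as the reverse complement.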
Central Dogma - Transcription
• Transcription: DNA to RNA
• RNA (ribonucleic acid)
– single-stranded
– 4 types: A, C, G, U
– unstable molecules
– carries information from the nucleus to the cytoplasm
Central Dogma - Transcription
• Transcription copies the base sequence of the DNA into RNA (with U instead of T).
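As a string operation, this copy step can be sketched in one line. The function name is illustrative, and the sketch assumes the input is the coding strand (the strand with the same sense as the mRNA); in the cell the polymerase reads the complementary template strand.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand by writing U in place of T."""
    return coding_strand.upper().replace("T", "U")


print(transcribe("ATGGTGCTG"))  # AUGGUGCUG
```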
Central Dogma - Translation
• Translation
– RNA -> proteins
– carried out by the ribosome
– genetic code
• Proteins
– chain of amino acids
– 20 different types
– folds into a three-dimensional structure
– functional entities of the cell
Translation - Genetic Code
• Codons (combinations of 3 bases) each encode one of the 20 amino acids.
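A minimal sketch of the codon lookup follows. It assumes the standard genetic code, written compactly with the bases ordered T, C, A, G and with `*` marking the three stop codons; the helper names are illustrative. For simplicity it reads DNA codons directly (T instead of U).

```python
from itertools import product

# Standard genetic code, codons enumerated as TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    # A trailing incomplete codon (1 or 2 bases) is ignored.
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGTGCTGTCTGCC"))  # MVLSA
```

Since 64 codons map onto 20 amino acids plus a stop signal, several codons encode the same amino acid.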
Central Dogma
• Dogma: information flows DNA -> mRNA -> protein
• Gene: a segment of DNA encoding a protein.
• Transcript: a segment of RNA transcribed from a gene.
• One gene corresponds to one protein and one cellular function.
Control of Gene Expression
• How is gene expression controlled?
• Certain proteins, called transcription factors, bind to DNA and are responsible for initiating transcription.
Control of Gene Regulation
Bioinformatics
• Manage molecular biological data
– Store in databases, organise, formalise, describe...
• Compare molecular biological data
• Find patterns in molecular biological data
– phylogenies
– correlations (sequence / structure / expression / function / disease)
Goals:
• characterise biological patterns & processes
• predict biological properties
– low-level data ⇒ high-level properties (e.g., sequence ⇒ function)
Bioinformatics: neighbour disciplines
• Computational biology
– Broader concept: includes computational ecology, physiology, neurology etc...
• -omics:
– Genomics
– Transcriptomics
– Proteomics
• Systems biology
– Putting it all together...
– Building models, identifying control & regulation
Molecular biology data...
>alpha-D
ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCACCCAGACTGTGGAGCCGAGGCCCTGGAGAGGTGCGGGCTGAGCTTGGGGAAACCATGGGCAAGGGGGGCGACTGGGTGGGAGCCCTACAGGGCTGCTGGGGGTTGTTCGGCTGGGGGTCAGCACTGACCATCCCGCTCCCGCAGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACTTGCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGGCCGCCTTGGGCAACGCTGTCAAGAGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCAGCGACCTGCATGCCTACAACCTGCGTGTCGACCCTGTCAACTTCAAGGCAGGCGGGGGACGGGGGTCAGGGGCCGGGGAGTTGGGGGCCAGGGACCTGGTTGGGGATCCGGGGCCATGCCGGCGGTACTGAGCCCTGTTTTGCCTTGCAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTGGCCACACACCTGGGCAACGACTACACCCCGGAGGCACATGCTGCCTTCGACAAGTTCCTGTCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGATAA
>alpha-A
ATGGTGCTGTCTGCCAACGACAAGAGCAACGTGAAGGCCGTCTTCGGCAAAATCGGCGGCCAGGCCGGTGACTTGGGTGGTGAAGCCCTGGAGAGGTATGTGGTCATCCGTCATTACCCCATCTCTTGTCTGTCTGTGACTCCATCCCATCTGCCCCCATACTCTCCCCATCCATAACTGTCCCTGTTCTATGTGGCCCTGGCTCTGTCTCATCTGTCCCCAACTGTCCCTGATTGCCTCTGTCCCCCAGGTTGTTCATCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACCTGTCACATGGCTCCGCTCAGATCAAGGGGCACGGCAAGAAGGTGGCGGAGGCACTGGTTGAGGCTGCCAACCACATCGATGACATCGCTGGTGCCCTCTCCAAGCTGAGCGACCTCCACGCCCAAAAGCTCCGTGTGGACCCCGTCAACTTCAAAGTGAGCATCTGGGAAGGGGTGACCAGTCTGGCTCCCCTCCTGCACACACCTCTGGCTACCCCCTCACCTCACCCCCTTGCTCACCATCTCCTTTTGCCTTTCAGCTGCTGGGTCACTGCTTCCTGGTGGTCGTGGCCGTCCACTTCCCCTCTCTCCTGACCCCGGAGGTCCATGCTTCCCTGGACAAGTTCGTGTGTGCCGTGGGCACCGTCCTTACTGCCAAGTACCGTTAA
• DNA sequences
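Sequence data like the two globin genes on this slide is commonly exchanged in FASTA format, where a line starting with `>` names a record and the following lines hold its sequence. The reader below is a minimal sketch (the function name is illustrative, and the example text uses only the first few bases of each record from the slide); real files may need more careful handling of descriptions and duplicate names.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into a {record name: sequence} dictionary."""
    records: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]          # header line: start a new record
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence may span several lines
    return {n: "".join(parts) for n, parts in records.items()}


example = ">alpha-D\nATGCTGACC\nGACTCTGAC\n>alpha-A\nATGGTGCTG\n"
for record_name, sequence in parse_fasta(example).items():
    print(record_name, len(sequence), sequence)
```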
Molecular biology data...
• Amino acid sequences
• Protein structure:
– X-ray crystallography
– NMR
Cell biology & proteomics data...
• Subcellular localization
Prediction Methods
• Homology / Alignment
• Simple pattern (“word”) recognition
• Statistical methods
– Weight matrices: calculate amino acid probabilities
– Other examples: regression, variance analysis, clustering
• Machine learning
– Like statistical methods, but parameters are estimated by iterative training rather than direct calculation
– Examples: Neural Networks (NN), Hidden Markov Models (HMM), Support Vector Machines (SVM)
• Combinations
Similarity between sequences
If two sequences look similar, the explanation may be:
• Homology (common descent)
• Convergent evolution (common function → common selective pressure)
• Chance!
Sequences are related
• Darwin: all organisms are related through descent with modification
• => Sequences are related through descent with modification
• => Similar molecules have similar functions in different organisms
Phylogenetic tree based on ribosomal RNA: three domains of life
Sequences are related II
Phylogenetic tree of globin-type proteins found in humans
Why compare sequences?
• Determination of evolutionary relationships
• Prediction of protein function and structure (database searches).
Protein 1 (binds oxygen) -> sequence similarity -> Protein 2 (binds oxygen?)
Biological Databases
• Vast biological and sequence data is freely available through online databases
• Use computational algorithms to efficiently store large amounts of biological data
Examples
• NCBI GenBank http://ncbi.nih.gov
– Huge collection of databases, the most prominent being the nucleotide sequence database
• Protein Data Bank http://www.pdb.org
– Database of protein tertiary structures
• SWISSPROT http://www.expasy.org/sprot/
– Database of annotated protein sequences
• PROSITE http://kr.expasy.org/prosite
– Database of protein active site motifs
Sequence Alignment
BLAST
• A computational tool that allows us to compare query sequences with entries in current biological databases.
• A great tool for predicting functions of an unknown sequence based on alignment similarities to known genes.
BLAST
Some Early Roles of Bioinformatics
• Sequence comparison
• Searches in sequence databases
Biological Sequence Comparison
• Needleman-Wunsch, 1970
– Dynamic programming algorithm to align sequences
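A compact sketch of global alignment by dynamic programming, in the spirit of Needleman-Wunsch, is given below. The scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration, not taken from the slides, and ties between equally good alignments are broken arbitrarily.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Traceback from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        substitution = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = needleman_wunsch("ACGT", "ACT")
print(best)   # 2
print(row_a)  # ACGT
print(row_b)  # AC-T
```

The table has (n+1) x (m+1) cells and each is filled in constant time, so time and memory both grow with the product of the two sequence lengths.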
Searching for Localization Signals
Protein sorting in eukaryotes
• Proteins belong in different organelles of the cell – and some even have their function outside the cell
• In 1999, Günter Blobel was awarded the Nobel Prize in Physiology or Medicine for the discovery that "proteins have intrinsic signals that govern their transport and localization in the cell"
Protein sorting: secretory pathway / ER
Secretory proteins have a signal peptide
Initially, they are transported across the ER membrane
Signal peptides
A signal peptide is an N-terminal part of the amino acid chain, containing a hydrophobic region.
Signal peptides differ between proteins, and can be hard to recognize.
Simple pattern (“word”) recognition
Example: PROSITE entry PS00014, ER_TARGET: Endoplasmic reticulum targeting sequence (“KDEL-signal”).
Pattern: [KRHQSA]-[DENQ]-E-L
NB: only yes/no answers!
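The pattern above maps almost directly onto a regular expression. The converter below is a sketch that handles only the small subset of PROSITE syntax needed here (bracketed alternatives, single letters, and `x` for any residue). Anchoring the match at the end of the sequence is an assumption of this sketch, motivated by the signal being C-terminal; the pattern as written on the slide carries no anchor. The test proteins are made up.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE pattern such as 'A-[BC]-x-D' to a regex string."""
    parts = []
    for element in pattern.split("-"):
        # 'x' means any residue; '[KRHQSA]' and single letters carry over as-is.
        parts.append("." if element == "x" else element)
    return "".join(parts)


# Assumption: require the pattern at the very end of the protein ('$').
ER_TARGET = re.compile(prosite_to_regex("[KRHQSA]-[DENQ]-E-L") + "$")


def has_er_signal(protein: str) -> bool:
    """Yes/no answer: does the sequence end with an ER targeting pattern?"""
    return ER_TARGET.search(protein.upper()) is not None


print(has_er_signal("MKTAYIAKDEL"))  # True  (hypothetical sequence ending in KDEL)
print(has_er_signal("MKDELTAYIA"))   # False (KDEL present, but not at the end)
```

As the slide notes, the result is strictly yes/no: a sequence one residue away from the pattern is rejected just like an unrelated one.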
Statistical Methods
ALAKAAAAM
ALAKAAAAN
ALAKAAAAR
ALAKAAAAT
ALAKAAAAV
GMNERPILT
GILGFVFTM
TLNAWVKVV
KLNEPVLLL
AVVPFIVSV
• Estimate probabilities for nucleotides / amino acids
• Information content in sequences; logos; position-weight matrices.
• Quantitative answers.
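The sketch below estimates per-position amino acid probabilities from the peptides on this slide and computes the information content of each position, the quantity a sequence logo displays. It assumes the run of letters on the slide splits into ten 9-residue peptides, uses plain observed frequencies without pseudocounts or sequence weighting, and measures information against a uniform background over 20 amino acids; all function names are illustrative.

```python
import math
from collections import Counter

# Assumption: the letters on the slide form ten aligned 9-residue peptides.
PEPTIDES = [
    "ALAKAAAAM", "ALAKAAAAN", "ALAKAAAAR", "ALAKAAAAT", "ALAKAAAAV",
    "GMNERPILT", "GILGFVFTM", "TLNAWVKVV", "KLNEPVLLL", "AVVPFIVSV",
]


def position_frequencies(sequences: list[str]) -> list[dict[str, float]]:
    """For each column, return the observed frequency of every residue seen."""
    length = len(sequences[0])
    return [
        {aa: count / len(sequences)
         for aa, count in Counter(seq[i] for seq in sequences).items()}
        for i in range(length)
    ]


def information_content(freqs: dict[str, float], alphabet_size: int = 20) -> float:
    """Information in bits: log2(alphabet size) minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    return math.log2(alphabet_size) - entropy


columns = position_frequencies(PEPTIDES)
for position, freqs in enumerate(columns, start=1):
    print(position, round(information_content(freqs), 2), freqs)
```

Unlike the yes/no pattern match, these frequencies can score any candidate peptide quantitatively, for example by summing log-probabilities over its positions.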
Motif Finding
Random Sample
atgaccgggatactgataccgtatttggcctaggcgtacacattagataaacgtatgaagtacgttagactcggcgccgccg
acccctattttttgagcagatttagtgacctggaaaaaaaatttgagtacaaaacttttccgaatactgggcataaggtaca
tgagtatccctgggatgacttttgggaacactatagtgctctcccgatttttgaatatgtaggatcattcgccagggtccga
gctgagaattggatgaccttgtaagtgttttccacgcaatcgcgaaccaacgcggacccaaaggcaagaccgataaaggaga
tcccttttgcggtaatgtgccgggaggctggttacgtagggaagccctaacggacttaatggcccacttagtccacttatag
gtcaatcatgttcttgtgaatggatttttaactgagggcatagaccgcttggcgcacccaaattcagtgtgggcgagcgcaa
cggttttggcccttgttagaggcccccgtactgatggaaactttcaattatgagagagctaatctatcgcgtgcgtgttcat
aacttgagttggtttcgaaaatgctctggggcacatacaagaggagtcttccttatcagttaatgctgtatgacactatgta
ttggcccattggctaaaagcccaacttgacaaatggaagatagaatccttgcatttcaacgtatgccgaaccgaaagggaag
ctggtgagcaacgacagattcttacgtgcattagctcgcttccggggatctaatagcacgaagcttctgggtactgatagca
Implanting Motif AAAAAAAAGGGGGGG
atgaccgggatactgatAAAAAAAAGGGGGGGggcgtacacattagataaacgtatgaagtacgttagactcggcgccgccg
acccctattttttgagcagatttagtgacctggaaaaaaaatttgagtacaaaacttttccgaataAAAAAAAAGGGGGGGa
tgagtatccctgggatgacttAAAAAAAAGGGGGGGtgctctcccgatttttgaatatgtaggatcattcgccagggtccga
gctgagaattggatgAAAAAAAAGGGGGGGtccacgcaatcgcgaaccaacgcggacccaaaggcaagaccgataaaggaga
tcccttttgcggtaatgtgccgggaggctggttacgtagggaagccctaacggacttaatAAAAAAAAGGGGGGGcttatag
gtcaatcatgttcttgtgaatggatttAAAAAAAAGGGGGGGgaccgcttggcgcacccaaattcagtgtgggcgagcgcaa
cggttttggcccttgttagaggcccccgtAAAAAAAAGGGGGGGcaattatgagagagctaatctatcgcgtgcgtgttcat
aacttgagttAAAAAAAAGGGGGGGctggggcacatacaagaggagtcttccttatcagttaatgctgtatgacactatgta
ttggcccattggctaaaagcccaacttgacaaatggaagatagaatccttgcatAAAAAAAAGGGGGGGaccgaaagggaag
ctggtgagcaacgacagattcttacgtgcattagctcgcttccggggatctaatagcacgaagcttAAAAAAAAGGGGGGGa
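The construction on this slide can be reproduced with a few lines of code. This is a sketch under assumptions: the background sequences are freshly generated random strings rather than the ones shown above, the seed and function name are arbitrary, and the motif is the 15-letter string as it appears inside the sequences.

```python
import random


def implant_motif(sequences: list[str], motif: str, rng: random.Random) -> list[str]:
    """Overwrite one randomly chosen window of each sequence with the motif."""
    implanted = []
    for seq in sequences:
        start = rng.randrange(len(seq) - len(motif) + 1)
        implanted.append(seq[:start] + motif + seq[start + len(motif):])
    return implanted


rng = random.Random(0)  # fixed seed so the example is repeatable
# Ten random lowercase sequences of length 82, mirroring the slide's layout.
background = ["".join(rng.choice("acgt") for _ in range(82)) for _ in range(10)]
for line in implant_motif(background, "AAAAAAAAGGGGGGG", rng):
    print(line)
```

Writing the motif in uppercase over a lowercase background makes the implanted copies easy to spot, as on the slide.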
Where is the Implanted Motif?
atgaccgggatactgataaaaaaaagggggggggcgtacacattagataaacgtatgaagtacgttagactcggcgccgccg
acccctattttttgagcagatttagtgacctggaaaaaaaatttgagtacaaaacttttccgaataaaaaaaaaggggggga
tgagtatccctgggatgacttaaaaaaaagggggggtgctctcccgatttttgaatatgtaggatcattcgccagggtccga
gctgagaattggatgaaaaaaaagggggggtccacgcaatcgcgaaccaacgcggacccaaaggcaagaccgataaaggaga
tcccttttgcggtaatgtgccgggaggctggttacgtagggaagccctaacggacttaataaaaaaaagggggggcttatag
gtcaatcatgttcttgtgaatggatttaaaaaaaaggggggggaccgcttggcgcacccaaattcagtgtgggcgagcgcaa
cggttttggcccttgttagaggcccccgtaaaaaaaagggggggcaattatgagagagctaatctatcgcgtgcgtgttcat
aacttgagttaaaaaaaagggggggctggggcacatacaagaggagtcttccttatcagttaatgctgtatgacactatgta
ttggcccattggctaaaagcccaacttgacaaatggaagatagaatccttgcataaaaaaaagggggggaccgaaagggaag
ctggtgagcaacgacagattcttacgtgcattagctcgcttccggggatctaatagcacgaagcttaaaaaaaaggggggga
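When the motif is implanted without mutations, exact counting is enough to find it: the motif is the one k-mer that occurs in every sequence. The sketch below illustrates this on three short made-up sequences (not the ones from the slide); the function name is also illustrative.

```python
from collections import Counter


def most_shared_kmer(sequences: list[str], k: int) -> tuple[str, int]:
    """Return the k-mer present in the most sequences, with that count."""
    counts: Counter[str] = Counter()
    for seq in sequences:
        # Use a set so each sequence contributes at most once per k-mer.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts.most_common(1)[0]


motif = "aaaaaaaaggggggg"
# Hypothetical toy data: the same motif placed at different offsets.
toy = ["ttgac" + motif + "cat", motif + "tcgatcga", "cgt" + motif + "actg"]
print(most_shared_kmer(toy, 15))  # ('aaaaaaaaggggggg', 3)
```

This approach stops working once each copy carries its own mutations, because the copies then no longer share any exact k-mer of full motif length.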
Implanting Motif AAAAAAAAGGGGGGG with Four Mutations
atgaccgggatactgatAgAAgAAAGGttGGGggcgtacacattagataaacgtatgaagtacgttagactcggcgccgccg
acccctattttttgagcagatttagtgacctggaaaaaaaatttgagtacaaaacttttccgaatacAAtAAAAcGGcGGGa
tgagtatccctgggatgacttAAAAtAAtGGaGtGGtgctctcccgatttttgaatatgtaggatcattcgccagggtccga
gctgagaattggatgcAAAAAAAGGGattGtccacgcaatcgcgaaccaacgcggacccaaaggcaagaccgataaaggaga
tcccttttgcggtaatgtgccgggaggctggttacgtagggaagccctaacggacttaatAtAAtAAAGGaaGGGcttatag
gtcaatcatgttcttgtgaatggatttAAcAAtAAGGGctGGgaccgcttggcgcacccaaattcagtgtgggcgagcgcaa
cggttttggcccttgttagaggcccccgtAtAAAcAAGGaGGGccaattatgagagagctaatctatcgcgtgcgtgttcat
aacttgagttAAAAAAtAGGGaGccctggggcacatacaagaggagtcttccttatcagttaatgctgtatgacactatgta
ttggcccattggctaaaagcccaacttgacaaatggaagatagaatccttgcatActAAAAAGGaGcGGaccgaaagggaag
ctggtgagcaacgacagattcttacgtgcattagctcgcttccggggatctaatagcacgaagcttActAAAAAGGaGcGGa
![Page 43: Biologia In Silico - Centro de Informática - UFPE Ivan G. Costa Filho igcf@cin.ufpe.br Centro de Informática Universidade Federal de Pernambuco Processamento](https://reader035.vdocuments.net/reader035/viewer/2022070507/570638401a28abb8238f1217/html5/thumbnails/43.jpg)
Biologia In Silico - Centro de Informática - UFPE
Where is the Motif??? atgaccgggatactgatagaagaaaggttgggggcgtacacattagataaacgtatgaagtacgttagactcggcgccgccg
acccctattttttgagcagatttagtgacctggaaaaaaaatttgagtacaaaacttttccgaatacaataaaacggcggga
tgagtatccctgggatgacttaaaataatggagtggtgctctcccgatttttgaatatgtaggatcattcgccagggtccga
gctgagaattggatgcaaaaaaagggattgtccacgcaatcgcgaaccaacgcggacccaaaggcaagaccgataaaggaga
tcccttttgcggtaatgtgccgggaggctggttacgtagggaagccctaacggacttaatataataaaggaagggcttatag
gtcaatcatgttcttgtgaatggatttaacaataagggctgggaccgcttggcgcacccaaattcagtgtgggcgagcgcaa
cggttttggcccttgttagaggcccccgtataaacaaggagggccaattatgagagagctaatctatcgcgtgcgtgttcat
aacttgagttaaaaaatagggagccctggggcacatacaagaggagtcttccttatcagttaatgctgtatgacactatgta
ttggcccattggctaaaagcccaacttgacaaatggaagatagaatccttgcatactaaaaaggagcggaccgaaagggaag
ctggtgagcaacgacagattcttacgtgcattagctcgcttccggggatctaatagcacgaagcttactaaaaaggagcgga
Why Is Finding a (15,4) Motif Difficult?
atgaccgggatactgatAgAAgAAAGGttGGGggcgtacacattagataaacgtatgaagtacgttagactcggcgccgccg
acccctattttttgagcagatttagtgacctggaaaaaaaatttgagtacaaaacttttccgaatacAAtAAAAcGGcGGGa
tgagtatccctgggatgacttAAAAtAAtGGaGtGGtgctctcccgatttttgaatatgtaggatcattcgccagggtccga
gctgagaattggatgcAAAAAAAGGGattGtccacgcaatcgcgaaccaacgcggacccaaaggcaagaccgataaaggaga
tcccttttgcggtaatgtgccgggaggctggttacgtagggaagccctaacggacttaatAtAAtAAAGGaaGGGcttatag
gtcaatcatgttcttgtgaatggatttAAcAAtAAGGGctGGgaccgcttggcgcacccaaattcagtgtgggcgagcgcaa
cggttttggcccttgttagaggcccccgtAtAAAcAAGGaGGGccaattatgagagagctaatctatcgcgtgcgtgttcat
aacttgagttAAAAAAtAGGGaGccctggggcacatacaagaggagtcttccttatcagttaatgctgtatgacactatgta
ttggcccattggctaaaagcccaacttgacaaatggaagatagaatccttgcatActAAAAAGGaGcGGaccgaaagggaag
ctggtgagcaacgacagattcttacgtgcattagctcgcttccggggatctaatagcacgaagcttActAAAAAGGaGcGGa
AgAAgAAAGGttGGG
cAAtAAAAcGGcGGG
..|..|||.|..|||
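The comparison above can be checked with a Hamming distance. In the sketch below, the two instances are the ones shown on the slide and the motif is the 15-letter string implanted in the sequences; the function names are illustrative, and the comparison ignores letter case.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def match_line(a: str, b: str) -> str:
    """Draw '|' where the two strings agree and '.' where they differ."""
    return "".join("|" if x == y else "." for x, y in zip(a.upper(), b.upper()))


MOTIF = "AAAAAAAAGGGGGGG"
first, second = "AgAAgAAAGGttGGG", "cAAtAAAAcGGcGGG"

print(hamming(first, MOTIF))    # 4
print(hamming(second, MOTIF))   # 4
print(hamming(first, second))   # 7
print(match_line(first, second))  # ..|..|||.|..|||
```

Each instance is only 4 mutations away from the motif, but two instances can differ from each other in up to 8 positions (7 here), so pairwise comparison of the sequences gives a much weaker signal than comparison against the unknown motif.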
Next Lecture
• Read chapter 1 of Durbin
• Introduction to dynamic programming algorithms (10/08)
Acknowledgments
• Some slides taken from
– Biological Sequence Analysis course, CBS, Technical University of Denmark
– Neil Jones, University of California at San Diego