Multiple sequence alignment. Lesson 3.
Post on 21-Dec-2015
![Page 1: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/1.jpg)
Multiple sequence alignment
Lesson 3
![Page 2: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/2.jpg)
1. What is a multiple sequence alignment?
![Page 3: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/3.jpg)
Multiple sequence alignment
VTISCTGSSSNIGAG-NHVKWYQQLPG
VTISCTGTSSNIGS--ITVNWYQQLPG
LRLSCSSSGFIFSS--YAMYWVRQAPG
LSLTCTVSGTSFDD--YYSTWVRQPPG
PEVTCVVVDVSHEDPQVKFNWYVDG--
ATLVCLISDFYPGA--VTVAWKADS--
AALGCLVKDYFPEP--VTVSWNSG---
VSLTCLVKGFYPSD--IAVEWWSNG--
Similar to pairwise alignment, BUT n sequences are aligned instead of just 2
![Page 4: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/4.jpg)
Multiple sequence alignment
MSA = Multiple Sequence Alignment
Each row represents an individual sequence
Each column represents the ‘same’ position
VTISCTGSSSNIGAG-NHVKWYQQLPG
VTISCTGTSSNIGS--ITVNWYQQLPG
LRLSCSSSGFIFSS--YAMYWVRQAPG
LSLTCTVSGTSFDD--YYSTWVRQPPG
PEVTCVVVDVSHEDPQVKFNWYVDG--
ATLVCLISDFYPGA--VTVAWKADS--
AALGCLVKDYFPEP--VTVSWNSG---
VSLTCLVKGFYPSD--IAVEWWSNG--
![Page 5: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/5.jpg)
Multiple sequence alignment
Homo sapiens
Pan troglodytes
Mus musculus
Canis familiaris
Gallus gallus
Anopheles gambiae
Drosophila melanogaster
Caenorhabditis elegans
Arabidopsis thaliana
Rattus norvegicus
![Page 6: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/6.jpg)
Histone H4 protein
![Page 7: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/7.jpg)
Multiple sequence alignment
NADH dehydrogenase subunit 4
Histone H4 protein
►Which is better: a pairwise alignment, or a pair of rows in an MSA?
![Page 8: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/8.jpg)
2. How MSAs are computed
![Page 9: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/9.jpg)
Alignment – Dynamic Programming
There is a dynamic programming algorithm for n sequences, similar to the one used for pairwise alignment.
Complexity: O(L^n), where L is the sequence length and n is the number of sequences, i.e. exponential in the number of sequences.
![Page 10: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/10.jpg)
Alignment methods
This complexity is not practical, therefore heuristics are used:
• Progressive/hierarchical alignment (Clustal)
• Iterative alignment (MAFFT, MUSCLE)
![Page 11: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/11.jpg)
Progressive alignment
First step:
Compute the pairwise alignments for all against all (10 pairwise alignments for the five sequences A, B, C, D, E). The similarities are converted to distances and stored in a table.

|   | A  | B  | C  | D  | E |
|---|----|----|----|----|---|
| A |    |    |    |    |   |
| B | 8  |    |    |    |   |
| C | 15 | 17 |    |    |   |
| D | 16 | 14 | 10 |    |   |
| E | 32 | 31 | 31 | 32 |   |
![Page 12: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/12.jpg)
Progressive alignment
Second step:
Cluster the sequences to create a tree (guide tree):
• it represents the order in which pairs of sequences are to be aligned
• similar sequences are neighbors in the tree
• distant sequences are distant from each other in the tree
[Figure: the distance table from the first step next to the guide tree built from it, with leaves A, D, C, B, E]
The guide tree is imprecise and is NOT the tree which truly describes the evolutionary relationship between the sequences!
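The clustering step can be sketched in a few lines. The slides do not say which clustering method is used, so the block below assumes average linkage (UPGMA-style) purely for illustration; the function names are hypothetical, and the resulting tree need not have the same shape as the one drawn on the slide.

```python
from itertools import combinations

# Pairwise distances copied from the table on the slide.
DIST = {
    ("A", "B"): 8,
    ("A", "C"): 15, ("B", "C"): 17,
    ("A", "D"): 16, ("B", "D"): 14, ("C", "D"): 10,
    ("A", "E"): 32, ("B", "E"): 31, ("C", "E"): 31, ("D", "E"): 32,
}

def leaf_distance(x, y):
    """Look up a distance regardless of the order of the two labels."""
    return DIST[(x, y)] if (x, y) in DIST else DIST[(y, x)]

def cluster_distance(c1, c2):
    """Average distance over all leaf pairs (assumed linkage rule)."""
    total = sum(leaf_distance(x, y) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))

def guide_tree(leaves):
    """Repeatedly merge the two closest clusters; return tree and merge order."""
    # Each cluster is (tuple_of_leaves, nested_tuple_tree).
    clusters = [((leaf,), leaf) for leaf in leaves]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0][0], pair[1][0]),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append((a[0] + b[0], (a[1], b[1])))
        merges.append((a[0], b[0]))
    return clusters[0][1], merges

tree, merges = guide_tree("ABCDE")
print(tree)
for left, right in merges:
    print("align", "".join(left), "with", "".join(right))
```

The merge order is what the third step consumes: the closest pairs are aligned first, and the most distant sequence joins last.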
![Page 13: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/13.jpg)
Third step:
1. Align the most similar (neighboring) pairs
[Figure: guide tree with leaves A, D, C, B, E; each neighboring pair is a sequence-to-sequence alignment]
![Page 14: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/14.jpg)
Third step:
2. Align pairs of pairs
[Figure: guide tree with leaves A, D, C, B, E; labels: sequence, profile]
![Page 15: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/15.jpg)
Third step:
[Figure: guide tree with leaves A, D, C, B, E; labels: sequence, profile]
Main disadvantages:
• Sub-optimal tree topology
• Misalignments resulting from globally aligning pairs of sequences.
![Page 16: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/16.jpg)
Iterative alignment
[Figure: cycle linking the pairwise distance table, the guide tree and the MSA for sequences A, B, C, D, E]
Iterate until the MSA does not change (convergence)
![Page 17: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/17.jpg)
3. MSA – What is it good for?
A. Conserved positions
B. Consensus
C. Patterns
D. Profiles
E. Much more…
![Page 18: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/18.jpg)
3. MSA – What is it good for?
A. Conserved positions
B. Consensus
C. Patterns
D. Profiles
E. Much more…
![Page 19: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/19.jpg)
Consensus sequence
ATCTTGT
AACTTGT
AACTTCT
Consensus: AACTTGT
A consensus sequence holds the most frequent character of the alignment at each column
![Page 20: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/20.jpg)
Consensus sequence – an example
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
The -10 region of six promoters. There are many variants of the “consensus.”
![Page 21: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/21.jpg)
Consensus sequence – an example
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Consensus: TATAAT
1. Strict majority. In case of equal frequencies, choose one according to alphabetical order.
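The majority rule with the alphabetical tie-break is easy to check by hand or in code. The following is a minimal sketch; `majority_consensus` is a hypothetical helper name, and it assumes all rows of the alignment have the same length.

```python
from collections import Counter

def majority_consensus(rows):
    """Most frequent character per column; ties go to the alphabetically first."""
    consensus = []
    for column in zip(*rows):
        counts = Counter(column)
        best = max(counts.values())
        # min() over the tied characters implements the alphabetical tie-break.
        consensus.append(min(ch for ch, n in counts.items() if n == best))
    return "".join(consensus)

promoters = ["TACGAT", "TATAAT", "TATAAT", "GATACT", "TATGAT", "TATGTT"]
print(majority_consensus(promoters))  # TATAAT
```

Column 4 is the tie case: A and G each appear three times, so A is chosen.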
![Page 22: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/22.jpg)
Consensus sequence – an example
Had we searched the region upstream of genes for this consensus, we would have identified only 2 out of the 6 sequences, so we would miss many cases.
By chance, we expect a “hit” every 4,096 bp.
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Consensus: TATAAT
![Page 23: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/23.jpg)
Consensus sequence – an example
We can search while allowing 1 mismatch.
We would have identified 3 out of the 6 sequences, so we would miss fewer cases.
By chance, we expect a “hit” every ~200 bp → more “noise”.
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Consensus: TATAAT
![Page 24: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/24.jpg)
Consensus sequence – an example
We can search while allowing 2 mismatches.
We would have identified all 6 sequences, so we would not miss any.
By chance, we expect a “hit” every ~30 bp → A LOT OF “noise”.
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Consensus: TATAAT
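The numbers on the last three slides can be reproduced with a short calculation. This sketch assumes the four bases are independent and equally likely, which is the assumption behind "every 4,096 bp"; the helper names are hypothetical, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def found(sites, consensus, max_mismatches):
    """How many known sites are within the allowed number of mismatches."""
    return sum(hamming(s, consensus) <= max_mismatches for s in sites)

def expected_spacing(length, max_mismatches, alphabet=4):
    """Average distance in bp between chance hits under uniform random bases."""
    # Count the distinct words within the allowed Hamming distance.
    matching = sum(
        comb(length, k) * (alphabet - 1) ** k for k in range(max_mismatches + 1)
    )
    return alphabet ** length / matching

sites = ["TACGAT", "TATAAT", "TATAAT", "GATACT", "TATGAT", "TATGTT"]
for k in range(3):
    print(k, found(sites, "TATAAT", k), round(expected_spacing(6, k)))
```

With 0, 1 and 2 mismatches the search recovers 2, 3 and 6 of the six sites, while a chance hit is expected about every 4096, 216 and 27 bp, in line with the rounded figures on the slides.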
![Page 25: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/25.jpg)
Consensus sequence – an example
2. Majority only when it is a clear case. In the remaining cases, use wildcards.
Y = Pyrimidine (C or T)
R = Purine (A or G)
N = Any nucleotide
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Consensus: TATRNT
![Page 26: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/26.jpg)
Reminder: Purines & Pyrimidines
Y = Pyrimidine (C or T)
R = Purine (A or G)
N = Any nucleotide
![Page 27: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/27.jpg)
Consensus sequence – an example
Had we searched the region upstream of genes with the redundant consensus, we would have identified 4/6 sequences.
By chance, we expect a “hit” every ~500 bp.
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Consensus: TATRNT
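A consensus with wildcards can be searched as a regular expression. The sketch below covers only the three wildcard codes used on the slides (R, Y, N) and again assumes uniform, independent bases for the chance estimate; the names `IUPAC`, `to_regex` and `expected_spacing` are illustrative.

```python
import re

# Bases allowed by each code used in the slides.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

def to_regex(consensus):
    """Turn a consensus with wildcards into a compiled regular expression."""
    return re.compile("".join(f"[{IUPAC[c]}]" for c in consensus))

def expected_spacing(consensus):
    """Average distance in bp between chance hits under uniform random bases."""
    spacing = 1.0
    for c in consensus:
        spacing *= 4 / len(IUPAC[c])
    return spacing

sites = ["TACGAT", "TATAAT", "TATAAT", "GATACT", "TATGAT", "TATGTT"]
pattern = to_regex("TATRNT")
hits = [s for s in sites if pattern.fullmatch(s)]
print(len(hits), expected_spacing("TATRNT"))  # 4 512.0
```

The exact chance figure is 512 bp, which the slide rounds to ~500 bp.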
![Page 28: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/28.jpg)
Consensus sequence – an example
There is always a tradeoff between sensitivity and specificity.
Sensitivity: the fraction of actual positives that are predicted as positive, TP / (TP + FN).
Specificity: the fraction of actual negatives that are predicted as negative, TN / (TN + FP).
TATRNT TATAAT
![Page 29: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/29.jpg)
Consensus sequence – an example
Sensitivity: the fraction of actual positives that are predicted as positive, TP / (TP + FN)
Specificity: the fraction of actual negatives that are predicted as negative, TN / (TN + FP)
Permissive consensus: higher sensitivity, lower specificity (more true positives, more false positives ↔ fewer true negatives, fewer false negatives)
Nonpermissive consensus: higher specificity, lower sensitivity (fewer true positives, fewer false positives ↔ more true negatives, more false negatives)
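To make the tradeoff concrete, the two consensuses can be compared numerically. The true-positive counts (2 and 4 of the 6 known sites) come from the earlier slides; the number of non-promoter windows is a hypothetical value chosen only so that the chance-hit counts come out as whole numbers, and uniform random bases are assumed.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that are predicted as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of actual negatives that are predicted as negative."""
    return tn / (tn + fp)

KNOWN_SITES = 6
# Hypothetical number of non-promoter windows (an assumption for illustration).
NEGATIVE_WINDOWS = 4096 * 512

results = {}
# (consensus, known sites found, one chance hit per this many windows)
for name, tp, spacing in [("TATAAT", 2, 4096), ("TATRNT", 4, 512)]:
    fp = NEGATIVE_WINDOWS // spacing
    tn = NEGATIVE_WINDOWS - fp
    results[name] = (sensitivity(tp, KNOWN_SITES - tp), specificity(tn, fp))
    print(name, results[name])
```

The permissive consensus TATRNT doubles the sensitivity while admitting eight times as many false positives.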
![Page 30: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/30.jpg)
3. MSA – What is it good for?
A. Conserved positions
B. Consensus
C. Patterns
D. Profiles
E. Much more…
![Page 31: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/31.jpg)
Patterns
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Pattern: [TG]-A-[TC]-[GA]-[CTA]-T
Patterns are more informative than consensus sequences.
A pattern specifies, for each position, the characters that are possible at that position.
![Page 32: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/32.jpg)
Patterns - syntax
• The standard IUPAC one-letter codes.
• ‘x’ : any amino acid.
• ‘[]’ : residues allowed at the position.
• ‘{}’ : residues forbidden at the position.
• ‘()’ : repetition of a pattern element is indicated in parentheses: x(n) or x(n,m) gives the number or range of repetitions.
• ‘-’ : separates each pattern element.
• ‘<’ : indicates an N-terminal restriction of the pattern.
• ‘>’ : indicates a C-terminal restriction of the pattern.
• ‘.’ : the period ends the pattern.
![Page 33: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/33.jpg)
Patterns
• W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]
x(9,11): any amino acid, between 9 and 11 times
[FYV]: F or Y or V
WOPLASDFGYVWPPPLAWS
ROPLASDFGYVWPPPLAWS
WOPLASDFGYVWPPPLSQQQ
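One way to see how this syntax works is to translate a pattern into a regular expression. The converter below is a hypothetical helper, not part of any library, and covers only the elements listed on the syntax slide (it does not, for example, handle anchors written inside brackets).

```python
import re

def prosite_to_regex(pattern):
    """Translate a pattern in the syntax above into a compiled regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")
        anchor_end = element.endswith(">")
        element = element.strip("<>")
        # Split off a repetition such as (3) or (9,11).
        repeat = ""
        if "(" in element:
            element, count = element.rstrip(")").split("(")
            repeat = "{" + count + "}"
        if element == "x":
            core = "."                          # any residue
        elif element.startswith("{"):
            core = "[^" + element[1:-1] + "]"   # forbidden residues
        else:
            core = element                      # one residue or an [ABC] class
        parts.append(
            ("^" if anchor_start else "") + core + repeat
            + ("$" if anchor_end else "")
        )
    return re.compile("".join(parts))

regex = prosite_to_regex("W-x(9,11)-[FYV]-[FYW]-x(6,7)-[GSTNE]")
sequences = ["WOPLASDFGYVWPPPLAWS", "ROPLASDFGYVWPPPLAWS", "WOPLASDFGYVWPPPLSQQQ"]
matches = [bool(regex.search(s)) for s in sequences]
print(regex.pattern)
print(matches)  # [True, False, False]
```

Only the first sequence matches: the second lacks the leading W at a usable position, and the third ends in Q where one of G, S, T, N, E is required.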
![Page 34: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/34.jpg)
3. MSA – What is it good for?
A. Conserved positions
B. Consensus
C. Patterns
D. Profiles
E. Much more…
![Page 35: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/35.jpg)
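Profile = PSSM = Position Specific Score Matrix
ACCCAA
AACCGG
AACCTT

|   | 1 | 2    | 3 | 4 | 5    | 6    |
|---|---|------|---|---|------|------|
| A | 1 | 0.67 | 0 | 0 | 0.33 | 0.33 |
| C | 0 | 0.33 | 1 | 1 | 0    | 0    |
| G | 0 | 0    | 0 | 0 | 0.33 | 0.33 |
| T | 0 | 0    | 0 | 0 | 0.33 | 0.33 |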
![Page 36: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/36.jpg)
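Profiles / PSSMs

|   | 1 | 2    | 3 | 4 | 5    | 6    |
|---|---|------|---|---|------|------|
| A | 1 | 0.67 | 0 | 0 | 0.33 | 0.33 |
| C | 0 | 0.33 | 1 | 1 | 0    | 0    |
| G | 0 | 0    | 0 | 0 | 0.33 | 0.33 |
| T | 0 | 0    | 0 | 0 | 0.33 | 0.33 |

P(AACCAA) = 1 × 0.67 × 1 × 1 × 0.33 × 0.33
P(GACCAA) = 0
Sequences with higher probabilities → higher chance of being related to the PSSM.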
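The table and the two probabilities can be reproduced from the three aligned sequences. This is a minimal sketch using raw column frequencies, as on the slide; `build_pssm` and `probability` are hypothetical names, and `math.prod` needs Python 3.8 or later.

```python
from collections import Counter
from math import prod

def build_pssm(rows, alphabet="ACGT"):
    """One dict per column, mapping each character to its frequency."""
    pssm = []
    for column in zip(*rows):
        counts = Counter(column)
        pssm.append({ch: counts[ch] / len(column) for ch in alphabet})
    return pssm

def probability(pssm, word):
    """Product of the per-position frequencies of the word's characters."""
    return prod(col[ch] for col, ch in zip(pssm, word))

pssm = build_pssm(["ACCCAA", "AACCGG", "AACCTT"])
print(round(probability(pssm, "AACCAA"), 3))  # 0.074
print(probability(pssm, "GACCAA"))            # 0.0
```

The code works with exact thirds, so it gives 2/27 ≈ 0.074; the slide's rounded factors 0.67 and 0.33 give about 0.073.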
![Page 37: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/37.jpg)
Searching with PSSM
One compares each n-mer to the profile and computes the probabilities. Sequences with probabilities > threshold are considered as hits.
GACGGTACGTAGCGGAGCGACCAA
Compute the probability of the first 6-mer.

|   | 1 | 2    | 3 | 4 | 5    | 6    |
|---|---|------|---|---|------|------|
| A | 1 | 0.67 | 0 | 0 | 0.33 | 0.33 |
| C | 0 | 0.33 | 1 | 1 | 0    | 0    |
| G | 0 | 0    | 0 | 0 | 0.33 | 0.33 |
| T | 0 | 0    | 0 | 0 | 0.33 | 0.33 |
![Page 38: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/38.jpg)
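Searching with PSSM
6-mers with probabilities > threshold are considered as hits.
GACGGTACGTAGCGGAGCGACCAA
[Figure: a 6-mer window slides along the sequence one position at a time, giving probabilities P1, P2, P3, P4, …]

|   | 1 | 2    | 3 | 4 | 5    | 6    |
|---|---|------|---|---|------|------|
| A | 1 | 0.67 | 0 | 0 | 0.33 | 0.33 |
| C | 0 | 0.33 | 1 | 1 | 0    | 0    |
| G | 0 | 0    | 0 | 0 | 0.33 | 0.33 |
| T | 0 | 0    | 0 | 0 | 0.33 | 0.33 |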
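The sliding-window search can be sketched as follows. The helper names and the threshold value are assumptions, and the second test sequence is hypothetical; it is included because, with raw frequencies, every 6-mer of the sequence on the slide scores 0 (the last window, GACCAA, starts with G).

```python
from collections import Counter
from math import prod

def build_pssm(rows, alphabet="ACGT"):
    """One dict per column, mapping each character to its frequency."""
    return [
        {ch: Counter(col)[ch] / len(col) for ch in alphabet}
        for col in zip(*rows)
    ]

def probability(pssm, word):
    """Product of the per-position frequencies of the word's characters."""
    return prod(col[ch] for col, ch in zip(pssm, word))

def scan(pssm, sequence, threshold):
    """Score every window of PSSM width; keep those above the threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        p = probability(pssm, window)
        if p > threshold:
            hits.append((start, window, p))
    return hits

pssm = build_pssm(["ACCCAA", "AACCGG", "AACCTT"])
slide_hits = scan(pssm, "GACGGTACGTAGCGGAGCGACCAA", threshold=0.0)
toy_hits = scan(pssm, "GGAACCAATT", threshold=0.05)  # hypothetical sequence
print(slide_hits)
print(toy_hits)
```

The first call returns an empty list and the second returns the single window AACCAA at offset 2.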
![Page 39: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/39.jpg)
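Profile-pattern-consensus
Multiple alignment:
AACTTG
AAGTCG
CACTTC
Consensus: AACTTG
NANTNN
Pattern: [AC]-A-[GC]-T-[TC]-[GC]
Profile:

|   | 1    | 2 | 3    | 4 | … |
|---|------|---|------|---|---|
| A | 0.66 | 1 | 0    | 0 | … |
| T | 0    | 0 | 0    | 1 | … |
| C | 0.33 | 0 | 0.66 | 0 | … |
| G | 0    | 0 | 0.33 | 0 | … |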
![Page 40: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/40.jpg)
4. HMM: Hidden Markov Models
![Page 41: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/41.jpg)
Definitions & Uses
• A probabilistic model which deals with sequences of symbols. Uses: inferring hidden states.
• Originally used in speech recognition (the symbols being phonemes)
• Useful in biology: the sequence of symbols being DNA or proteins.
![Page 42: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/42.jpg)
Markov Chains
• A sequence of random variables X1, X2, … where each present state depends only on the previous state.
• Weather example: the weather on day x depends only on the weather on day x-1.
• We can easily compute the probability of: Sunny Sunny Rainy Sunny Sunny
![Page 43: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/43.jpg)
Markov Chains
• Similarly, we can assume a DNA sequence is Markovian
• ACGGTA… (vertical or horizontal!)
• These conditional probabilities can be illustrated as follows (in DNA):
[Figure: four states A, T, C, G connected by transition arrows]
• Each arrow has a transition probability: P_CA = P(x_i = A | x_{i-1} = C)
• Thus the probability of a sequence x will be:
$$P(x) = P(x_L, x_{L-1}, \ldots, x_1) = P(x_1) \prod_{i=2}^{L} P_{x_{i-1} x_i}$$
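The product formula translates directly into code. The transition and initial probabilities below are hypothetical placeholders (the slides give no numeric values for DNA); only the structure of the calculation is the point.

```python
# Hypothetical transition probabilities TRANSITIONS[previous][current].
# Each row sums to 1. These values are assumptions for illustration only.
TRANSITIONS = {
    "A": {"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2},
    "C": {"A": 0.1, "C": 0.4, "G": 0.3, "T": 0.2},
    "G": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "T": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}
# Hypothetical probabilities for the first symbol, P(x_1).
INITIAL = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def sequence_probability(seq):
    """P(x) = P(x_1) times the product of transitions between neighbors."""
    p = INITIAL[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= TRANSITIONS[prev][cur]
    return p

print(sequence_probability("ACGGTA"))
```

For long sequences the product underflows quickly, so summing logarithms of the same factors is the usual practical variant.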
![Page 44: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/44.jpg)
Hidden Markov Models
• The state sequence itself follows a simple Markov chain. But:
• In an HMM it is no longer possible to know the state by looking at the symbols; the state is hidden.
[Figure: chain of hidden states S1 … Si-1, Si, Si+1 … Sn linked by transitions P; each state emits an observation K1 … Ki-1, Ki, Ki+1 … Kn with emission probabilities B]
![Page 45: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/45.jpg)
The weather HMM example
• In this weather example only the actions are observable and the weather is hidden:
![Page 46: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/46.jpg)
HMM formalities
• {S, K, Π, P, B}
• S : {s1…sN } are the values for the hidden states
• K : {k1…kM } are the values for the observations
• The hidden states emit/generate the symbols (observations)
• Π = {Πi} are the initial state probabilities
• P = {Pij} are the state transition probabilities
• B = {bik} are the emission probabilities
[Figure: chain of hidden states S1 … Si-1, Si, Si+1 … Sn linked by transitions P; each state emits an observation K1 … Ki-1, Ki, Ki+1 … Kn with emission probabilities B]
![Page 47: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/47.jpg)
Another HMM example – the dishonest casino
• In a casino, they use a fair die most of the time, but occasionally switch to an unfair die. The switch between dice can be represented by an HMM:
FAIR: 1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6
UNFAIR: 1: 1/10, 2: 1/10, 3: 1/10, 4: 1/10, 5: 1/10, 6: 1/2
[Figure: two states, FAIR and UNFAIR, with transition probabilities 0.95, 0.05, 0.9 and 0.1 on the arrows]
![Page 48: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/48.jpg)
Dishonest casino - continued
• The symbols (observations) are the sequence of rolls:
3 5 6 2 1 4 6 3 6…
• What is hidden?
Whether the die is fair or unfair:
f f f f u u u f f
This is a Markov chain.
In addition, we have:
• Emission probabilities:
Given a state, we have 6 possible matching symbols, each with an emission probability.
FAIR: 1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6
UNFAIR: 1: 1/10, 2: 1/10, 3: 1/10, 4: 1/10, 5: 1/10, 6: 1/2
[Figure: two states, FAIR and UNFAIR, with transition probabilities 0.95, 0.05, 0.9 and 0.1 on the arrows]
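Inferring the hidden fair/unfair states from the rolls is commonly done with the Viterbi algorithm, which the slides do not name; it is used here as one possible technique. Two things in this sketch are assumptions: the initial state probabilities (0.5 each, not given on the slide), and the reading of the diagram as fair→fair 0.95, fair→unfair 0.05, unfair→unfair 0.9, unfair→fair 0.1.

```python
from math import log

STATES = ("F", "U")  # F = fair die, U = unfair die
INITIAL = {"F": 0.5, "U": 0.5}  # assumption: not given on the slide
# Assumed reading of the transition diagram.
TRANSITION = {"F": {"F": 0.95, "U": 0.05}, "U": {"F": 0.1, "U": 0.9}}
EMISSION = {
    "F": {roll: 1 / 6 for roll in range(1, 7)},
    "U": {**{roll: 1 / 10 for roll in range(1, 6)}, 6: 1 / 2},
}

def viterbi(rolls):
    """Most probable hidden state path for a list of die rolls."""
    # score[s] is the best log-probability of any path ending in state s.
    score = {s: log(INITIAL[s]) + log(EMISSION[s][rolls[0]]) for s in STATES}
    back = []
    for roll in rolls[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + log(TRANSITION[r][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + log(TRANSITION[prev][s]) + log(EMISSION[s][roll])
            )
        back.append(pointers)
        score = new_score
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))

print(viterbi([3, 5, 6, 2, 1, 4, 6, 3, 6]))
print(viterbi([6] * 20))
```

Working in log space avoids numerical underflow on long roll sequences; a long run of sixes is decoded as the unfair die throughout.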
![Page 49: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/49.jpg)
HMM of MSA
• MSA can be represented by an HMM
– Insertion of A/C/G/T
– Match or Mismatch
– Deletion
![Page 50: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/50.jpg)
HMM of MSA
• MSA can be represented by an HMM
– Insertion of A/C/G/T
– Match or Mismatch
– Deletion
![Page 51: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/51.jpg)
HMM of MSA can get more complex…
![Page 52: 1 Multiple sequence alignment Lesson 3. 2 1. What is a multiple sequence alignment?](https://reader035.vdocuments.net/reader035/viewer/2022062216/56649d565503460f94a350f8/html5/thumbnails/52.jpg)
Questions where HMMs are used:
• Does this sequence belong to a particular family?
• Can we identify regions in a sequence (for instance, alpha helices or beta sheets)?
• Pairwise/multiple sequence alignment
• Searching databases for protein families (building profiles).