Upload: sydney-hood

Post on 04-Jan-2016

Page 1: Classwork II: NJ tree using MEGA. 1.Go to CDD webpage and retrieve alignment of cd00157 in FASTA format. 2.Import this alignment into MEGA and convert

Classwork II: NJ tree using MEGA.

1. Go to the CDD webpage and retrieve the alignment of cd00157 in FASTA format.

2. Import this alignment into MEGA and convert it to MEGA format (http://www.megasoftware.net/mega3/mega.html).

3. Construct an NJ tree using different distance measures, with bootstrap.

4. Analyze the obtained trees.
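To make "different distance measures with bootstrap" in step 3 concrete, here is a minimal Python sketch of two common protein distance measures (the p-distance and its Poisson correction) and of how one bootstrap replicate is drawn by resampling alignment columns. It is an illustration under stated assumptions, not MEGA's implementation: the function names and the toy sequences are hypothetical and do not come from cd00157.

```python
import math
import random

def p_distance(a, b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

def poisson_distance(a, b):
    """Poisson-corrected distance d = -ln(1 - p); larger than p because it
    allows for multiple substitutions at the same site."""
    p = p_distance(a, b)
    return -math.log(1.0 - p)  # not defined when p == 1

def bootstrap_replicate(seqs, rng):
    """Build one pseudo-alignment by sampling columns with replacement."""
    length = len(next(iter(seqs.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(s[c] for c in cols) for name, s in seqs.items()}

# Hypothetical toy alignment (not real cd00157 sequences).
toy = {"seqA": "MKVLAGDSAV", "seqB": "MKVLVGDGAV", "seqC": "MRILVGDGSC"}

rng = random.Random(0)          # fixed seed so the replicate is reproducible
replicate = bootstrap_replicate(toy, rng)
d_raw = p_distance(toy["seqA"], toy["seqB"])
d_corrected = poisson_distance(toy["seqA"], toy["seqB"])
```

In a full analysis, a distance matrix would be computed for each bootstrap replicate, an NJ tree built from it, and the bootstrap support of a branch reported as the fraction of replicate trees that contain it.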

2.1 Maximum parsimony: definition of informative sites.

Maximum parsimony tree – the tree that requires the smallest number of evolutionary changes to explain the differences between external nodes.

Informative site – a site that favors some trees over the others.

Site:        1 2 3 4 5 6 7
Sequence 1:  A A G A C T G
Sequence 2:  A G C C C T G
Sequence 3:  A G A T T T C
Sequence 4:  A G A G T T C
Informative:         *   *

* marks an informative site. A site is informative (for nucleotide sequences) if there are at least two different kinds of letters at the site, each of which is represented in at least two of the sequences.
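The definition can be checked mechanically. The sketch below reads the slide's example as four sequences of seven sites each (an assumption about the flattened layout) and tests every column against the rule; the function names are illustrative.

```python
from collections import Counter

# The example alignment, read as four sequences of seven sites.
alignment = ["AAGACTG", "AGCCCTG", "AGATTTC", "AGAGTTC"]

def is_informative(column):
    """True if at least two different letters each occur in at least two sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_sites(seqs):
    """Return the 1-based positions of the informative columns."""
    return [i + 1 for i, col in enumerate(zip(*seqs)) if is_informative(col)]

print(informative_sites(alignment))  # [5, 7]
```

Site 3 (G, C, A, A) fails the test because only A occurs twice, which agrees with the statement that site 3 is not informative.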

2.2 Maximum parsimony.

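[Figure: the three possible unrooted trees for sequences 1-4 at site 3 (1.G, 2.C, 3.A, 4.A).]

Tree 1: sequences 1 and 2 grouped against 3 and 4; internal nodes G and A.

Tree 2: sequences 1 and 3 grouped against 2 and 4; internal nodes A and A.

Tree 3: sequences 1 and 4 grouped against 2 and 3; internal nodes A and A.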

Site 3 is not informative: all three trees require the same number of substitutions (two).

Advantage: works directly with characters, so no distance matrix has to be computed.

Disadvantages:

- multiple substitutions are not considered

- branch lengths are difficult to calculate

- slow

2.3 Maximum parsimony method.

1. Identify all informative sites in the alignment.

2. Calculate the minimum number of substitutions at each informative site.

3. Sum the number of changes over all informative sites for each tree.

4. Choose the tree with the smallest number of changes.
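The four steps can be sketched in Python for the four-sequence example used earlier. The slides do not say how the minimum number of substitutions is obtained; this sketch assumes Fitch's small-parsimony algorithm for that step, and it assumes the example alignment is four sequences of seven sites. Tree names and the nested-tuple encoding are illustrative choices.

```python
from collections import Counter

# Example alignment, keyed by sequence number (assumed reading of the slide).
alignment = {1: "AAGACTG", 2: "AGCCCTG", 3: "AGATTTC", 4: "AGAGTTC"}

# The three unrooted topologies for four sequences, written as ((a, b), (c, d)).
topologies = {
    "Tree 1": ((1, 2), (3, 4)),
    "Tree 2": ((1, 3), (2, 4)),
    "Tree 3": ((1, 4), (2, 3)),
}

def is_informative(column):
    """Step 1: at least two letters, each present in at least two sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def fitch(node, site):
    """Step 2: return (candidate states, minimum changes) for one site (0-based)."""
    if isinstance(node, int):                 # leaf: its observed letter, no changes
        return {alignment[node][site]}, 0
    left_states, left_cost = fitch(node[0], site)
    right_states, right_cost = fitch(node[1], site)
    common = left_states & right_states
    if common:                                # children agree: no extra change
        return common, left_cost + right_cost
    return left_states | right_states, left_cost + right_cost + 1

def parsimony_scores():
    """Step 3: sum the changes over all informative sites for each tree."""
    n_sites = len(alignment[1])
    sites = [s for s in range(n_sites)
             if is_informative([alignment[k][s] for k in alignment])]
    return {name: sum(fitch(tree, s)[1] for s in sites)
            for name, tree in topologies.items()}

scores = parsimony_scores()
best = min(scores, key=scores.get)            # Step 4: smallest number of changes
print(scores, best)  # {'Tree 1': 2, 'Tree 2': 4, 'Tree 3': 4} Tree 1
```

Enumerating every topology only works for a handful of sequences; the number of possible trees grows very quickly, which is one reason the method is slow.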

Maximum likelihood methods.

• Similarity with maximum parsimony:

- for each column of the alignment all possible trees are calculated

- trees with the least number of substitutions are more likely

• Advantage of maximum likelihood over maximum parsimony:

- takes into account different rates of substitution between different amino acids and/or different sites

- applicable to more diverse sequences

Classwork: maximum parsimony.

1. Search the NCBI Conserved Domain Database for pfam00127.

2. Construct maximum parsimony tree using MEGA3.

3. Analyze this tree and compare it with the phylogenetic tree from the research paper.

Protein engineering and protein design.

Protein engineering – altering a protein sequence to change the protein's function or structure.

Protein design – designing a de novo protein that satisfies a given requirement.

Protein engineering strategies.

Goals:

• Design proteins with certain function

• Increase activity of enzymes

• Increase binding affinity and specificity of proteins

• Increase protein stability

• Design proteins which bind novel ligands

Protein engineering uses combinatorial libraries.

• Random mutagenesis introduces different mutations in many genes of interest.

• Active proteins are separated from inactive ones:

- in vivo (measuring the effect on the whole cell)

- in vitro (phage display: the gene is inserted into phage DNA and expressed, and the phage is selected if the displayed protein binds an immobilized target protein)

Specificity of Kunitz inhibitors can be optimized by protein engineering.

• Kunitz domains are specific inhibitors of trypsin-like proteinases; their structure is highly conserved even though sequence identity is only 33%.

• Each Kunitz domain recognizes one or more proteinases through the binding loop (yellow in the figure).

• Phage display identified mutants of Kunitz inhibitors with higher specificity than the native ones.

• Modeling of the mutant proteins showed that the enhanced specificity is caused by increased complementarity between the binding loop and the active site.

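Native state can be stabilized by reducing the difference in entropy between folded and unfolded conformations.

[Figure: free energy G along the reaction coordinate, showing the unfolded (U) and folded (F) states separated by ΔG.]

ΔG = ΔH − TΔS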

Model system: lysozyme from bacteriophage T4.

• Lysozyme lyses certain bacteria by hydrolyzing the β-linkage between N-acetylmuramic acid (NAM) and N-acetylglucosamine (NAG) of the peptidoglycan layer in the bacterial cell wall.

• The conformational transition in lysozyme involves the cooperative movement of its two lobes relative to each other.

Disulfide bridges increase protein stability.

• A disulfide bridge increases stability by reducing the number of conformations available to the unfolded state (the enthalpic contribution is taken to be the same for the folded and unfolded states).

• Task: find positions on the backbone where cysteines can be introduced to form disulfide bonds.

Strategy of introducing a new disulfide bond.

B. Matthews, 1989:

• Analysis of disulfide bond geometries in existing structures.

• Analysis of all pairs of amino acids which are close in space.

• Energy optimization of candidate disulfide bonds.

• Analysis of the destabilizing effect of replacing native amino acids with Cys.

Result: three disulfide bonds were introduced into lysozyme by mutagenesis.

Stability of mutants compared to wild-type protein.

Measure of stability – the melting temperature (Tm), at which 50% of the enzyme is inactivated during reversible heat denaturation. For the wild type, Tm = 42 °C.

• All mutants were more stable than the wild type.

• The longer the loop between the Cys residues, the larger the effect (the more restricted the unfolded state).

• The more disulfide bonds were introduced, the more stable the mutant.

From B. Matthews et al.
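The loop-length trend is what a simple polymer-theory estimate predicts. The Python sketch below is not taken from the slides: it assumes the commonly quoted approximation ΔS = −2.1 − (3/2) R ln n cal/(mol·K) for the entropy lost by the unfolded chain when a cross-link closes a loop of n residues (usually attributed to Pace and co-workers), and the loop lengths used are hypothetical. Treat the numbers as rough guides only.

```python
import math

R = 1.987  # gas constant in cal/(mol*K)

def loop_entropy_change(n_residues):
    """Assumed estimate of the entropy lost by the unfolded state when a
    cross-link closes a loop of n residues: -2.1 - 1.5 * R * ln(n), cal/(mol*K)."""
    if n_residues < 1:
        raise ValueError("loop length must be at least 1 residue")
    return -2.1 - 1.5 * R * math.log(n_residues)

def stabilization_kcal(n_residues, temperature_k=298.15):
    """Entropic stabilization -T*dS in kcal/mol (positive means stabilizing)."""
    return -temperature_k * loop_entropy_change(n_residues) / 1000.0

# Hypothetical loop lengths, chosen only to show the trend.
for n in (10, 50, 150):
    print(n, round(stabilization_kcal(n), 2))
```

Because the estimate grows with ln n, longer loops give a larger predicted stabilization, but with diminishing returns. It ignores any strain the new bond introduces in the folded state, which is why the destabilizing effect of the Cys substitutions is analyzed separately.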

Attempts to fill cavities to stabilize lysozyme failed…

• Introducing a cavity the size of a –CH3 group destabilizes a protein by ~1 kcal/mol.

• T4 lysozyme has two cavities; the mutations Leu → Phe and Ala → Val destabilize the protein by ~0.5-1.0 kcal/mol.

• The new side chains (Val and Phe) adopt unfavorable conformations in the cavities.

Classwork IV: analyzing lysozyme mutants.

• Retrieve structure neighbors (1PQM and 1KNI) of 2LZM.

• Which mutant might have an increased stability and why?

Can structural scaffolds be reduced in size while maintaining function?

A. Braisted & J.A. Wells used the Z-domain (58 residues) of bacterial protein A:

• removed third helix (truncated protein - 38 residues);

• mutated residues in the first and second helices;

• used phage display to select active forms;

• restored the binding of truncated protein.

Designing an amino acid sequence that will fold into a given structure.

• Inverse protein folding problem: designing a sequence which will fold into a given structure. Much easier than the folding problem!

• B. Dahiyat & S. Mayo designed a zinc finger domain sequence that does not require stabilization by Zn.

• The wild-type domain is stabilized by Zn (bound to two Cys and two His); the mutant is stabilized by hydrophobic interactions.

Paracelsus challenge: convert one fold into another by changing no more than 50% of residues.

• It is a challenge because all proteins with > 30% sequence identity seem to have the same fold.

• L. Regan et al.: protein G (mainly beta-sheet) was converted to the Rop protein fold (alpha-helical) by changing only 50% of the residues.
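The 30% and 50% thresholds refer to percent sequence identity between aligned sequences. A minimal Python sketch of that calculation follows; it assumes identity is counted over ungapped aligned columns (one of several conventions in use), and the two sequences are hypothetical toy strings, not protein G or Rop.

```python
def percent_identity(a, b):
    """Percent of ungapped aligned columns in which the two sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

# Hypothetical aligned sequences that differ at half of their positions.
original = "ACDEFGHIKL"
redesigned = "ACDEFMNPQR"
print(percent_identity(original, redesigned))  # 50.0
```

A redesigned sequence that keeps 50% identity to the original sits well above the 30% level at which proteins are normally expected to share a fold, which is what makes a change of fold surprising.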