Protein Engineering and Directed Evolution May 24, 2011

Upload: herbert-mcdowell

Post on 30-Dec-2015

TRANSCRIPT

Page 1: Protein Engineering and Directed Evolution May 24, 2011

Protein Engineering and Directed Evolution

May 24, 2011

Page 2: Protein Engineering and Directed Evolution May 24, 2011

Protein Engineering

Sequence: A linear chain built from 20 amino acids. The protein sequence is dictated by the gene sequence.

Structure: The protein structure is dictated by its linear sequence.

Function: Protein function is dictated by its three-dimensional structure.

Optimize functions of proteins/enzymes

Structural information can be a guide

Generate variants through alteration of DNA sequence

Page 3: Protein Engineering and Directed Evolution May 24, 2011
Page 4: Protein Engineering and Directed Evolution May 24, 2011

Enzymes in Food Processing

Increasing the quality of beer

The enzymes alpha-amylase, glucoamylase and glucose isomerase convert starch to high fructose corn syrup (HFCS). Alpha-amylase is used to liquefy starch slurry so that the starch is solubilized and readied for the next steps. Alpha-amylase splits the large amylose and amylopectin molecules that make up the starch into soluble dextrin fragments.

Starch Processing

www.genencor.com

Page 5: Protein Engineering and Directed Evolution May 24, 2011

What is protein engineering?

• Optimize the properties of enzymes/proteins through changes in protein sequence. We ask enzymes to function more efficiently, tolerate harsh conditions, last longer, etc.
– Industrial applications
– Chemical applications
– Agricultural applications
– Pharmaceutical applications
– Many more

• Introduce new properties into enzymes. Go beyond the biological context.
– Perform catalysis under completely foreign conditions
– Catalyze reactions that are not observed in nature
– De novo design of enzyme function (difficult, although progressing)

Page 6: Protein Engineering and Directed Evolution May 24, 2011

Which properties can we improve?

Biological Functions:

• Catalysts
• Immune Response
• Control Units
• Structural Scaffolds

GOOD Properties:

• Catalyze many classes of reactions
• High specificity / selectivity
• Low energy input

BAD Properties:

• Low activities for non-natural reactions
• Marginally stable
• Industrial conditions
• Low expression in heterologous hosts

Engineer:

Chemical Synthesis
Bioremediation
Chemical Sensors
Pharmaceuticals
Metabolic Control

Page 7: Protein Engineering and Directed Evolution May 24, 2011

• Enzymes are inherently designable
• Only “optimized” in the context of their biological systems
• Naturally evolvable and evolved. Divergent evolution is nature’s way of generating diversity

These enzymes also function using the same catalytic residues (serine, histidine and aspartic acid). However, they catalyze the cleavage of different substrates (divergence).

P1 (cleavage site) preference:
Large, e.g. tryptophan
Small, e.g. alanine
Positive, e.g. lysine

Nature evolves new sequences, behaviors and biological functions

Evolution--mutation, recombination, and natural selection--has generated a fantastic array of functional molecules

Page 8: Protein Engineering and Directed Evolution May 24, 2011

We can use the evolution algorithm to create “new” enzymes

Cirino

Evolutionary approaches are generally more powerful as long as a suitable search strategy is available.

Page 9: Protein Engineering and Directed Evolution May 24, 2011

[Figure: log-scale vertical axis (1, 10^-1, 10^-2, 10^-3) versus number of mutations; labeled points: wild-type enzyme performing natural function, wild-type enzyme performing new function]

Cirino

We can design enzymes to accommodate our needs:
– novel specificity / activity / stability

New functions can be achieved if:

1) It is physically possible
2) It is evolutionarily feasible (a path of functional enzymes exists in sequence space)
3) We can generate genetic diversity
4) We can select / screen for improvements

Page 10: Protein Engineering and Directed Evolution May 24, 2011

Sequence Space

2 residues, 2 amino acids

3 residues, 2 amino acids

500 residues, 20 amino acids

Huge: 20^500 ≈ 10^650 sequences

Highly dimensional: 19 × 500 ≈ 10^4 dimensions

Cirino
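The sequence-space counts on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function names are invented for this example and are not from the slides.

```python
import math

def log10_sequence_count(length, alphabet=20):
    """log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)

def single_substitution_neighbors(length, alphabet=20):
    """Sequences exactly one substitution away: (alphabet - 1) choices per site."""
    return (alphabet - 1) * length

# Toy cases from the slide: 2 or 3 residues, 2 amino acids each.
print(2 ** 2, 2 ** 3)                        # 4 and 8 sequences

# 500 residues, 20 amino acids: the exponent lies between 650 and 651.
print(log10_sequence_count(500))

# One axis per possible single substitution: 19 * 500 = 9500, roughly 10^4.
print(single_substitution_neighbors(500))
```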

Page 11: Protein Engineering and Directed Evolution May 24, 2011

The Fitness Landscape

[Figure: fitness plotted over {sequence space}, showing a starting point, a finish point, and local maxima]

Fitness landscape: the mapping from genotype (target sequence) to phenotype (fitness, as measured in the experiment). Directed evolution is an optimization on the fitness landscape.

Arnold, Nat. Mol Cell Bio. 2009

Evolution is a random walk on a fitness landscape in sequence space, survival of the fittest

Cirino

Page 12: Protein Engineering and Directed Evolution May 24, 2011

Ruggedness in Proteins

[Figure: fitness of the wild-type, intermediate mutant A, intermediate mutant B, and the improved mutant AB; the red and green residues are interacting]

Mutations can be beneficial and can also be deleterious.
Two single mutations may each be deleterious,
but the combination of the two may be beneficial.

Cirino
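A small numerical sketch may make the ruggedness argument concrete. The fitness values below are invented purely for illustration (they are not data from the slides); they describe two single mutants that are each worse than wild-type and a double mutant that is better.

```python
# Toy fitness values in arbitrary units (hypothetical, for illustration only).
fitness = {
    "wild-type": 1.0,
    "A": 0.6,    # single mutant A: deleterious
    "B": 0.7,    # single mutant B: deleterious
    "AB": 1.8,   # double mutant: beneficial
}

def epistasis(f):
    """Deviation of the double mutant from the additive expectation."""
    wt = f["wild-type"]
    additive = wt + (f["A"] - wt) + (f["B"] - wt)
    return f["AB"] - additive

def uphill_single_steps(f):
    """Single mutants that a one-mutation-at-a-time search would accept."""
    return [m for m in ("A", "B") if f[m] > f["wild-type"]]

print(epistasis(fitness))            # positive: the two mutations interact
print(uphill_single_steps(fitness))  # empty list: no uphill single step to AB
```

With these numbers a search that only accepts improved single mutants never reaches AB, which is the sense in which interacting residues make the landscape rugged.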

Page 13: Protein Engineering and Directed Evolution May 24, 2011

Success requires an intelligent working strategy!

Protein space: sequences for a 300-amino-acid protein

20^300: impossibly large and mostly empty

Search technologies are limited (~10^6 to 10^9 clones)

Beneficial mutations are rare

Local exploration of sequence space around existing functional proteins (1-2 amino acid substitutions)

Generating new, useful proteins requires accumulation of multiple mutations

Rapid screen (or selection) to identify small improvements

Cirino
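A quick count shows why local exploration is limited to one or two substitutions at a time. The sketch below (function name invented for this example) counts the variants that differ from a 300-residue parent at exactly k positions and compares them with a screening capacity of about 10^6 to 10^9 clones.

```python
from math import comb

def n_variants(length, k, alphabet=20):
    """Number of sequences differing from the parent at exactly k positions."""
    return comb(length, k) * (alphabet - 1) ** k

for k in (1, 2, 3):
    print(k, n_variants(300, k))
# k = 1: 5,700 variants
# k = 2: 16,190,850 variants (about 1.6 x 10^7)
# k = 3: about 3 x 10^10 variants, already beyond 10^9 clones
```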

Page 14: Protein Engineering and Directed Evolution May 24, 2011

What can directed evolution do? Examples

Example:

Glyphosate: Very effective herbicide
- Toxic towards most crops
- Decreases crop yield

Herbicide tolerance: more than 75% of genetically modified plants are engineered for herbicide tolerance

Acetyl-Glyphosate: Not herbicidal

Page 15: Protein Engineering and Directed Evolution May 24, 2011

DNA Shuffling to improve activity

Page 16: Protein Engineering and Directed Evolution May 24, 2011

Gradual Improvement of Catalytic Properties

Page 17: Protein Engineering and Directed Evolution May 24, 2011

A very robust enzyme: applications in transgenic plant development

Page 18: Protein Engineering and Directed Evolution May 24, 2011

GFP Can be evolved to Other FPs

Tsien, Annu Rev. Biochem 1998

Page 19: Protein Engineering and Directed Evolution May 24, 2011

Red Fluorescent Protein (RFP)

Isolated from nonbioluminescent reef corals

Tsien, Nature Biotechnology, 2000

Page 20: Protein Engineering and Directed Evolution May 24, 2011

Classic mutagenesis

Chemical mutagenesis with mutagens. Works mostly at the whole-cell level.

Deamination with nitrous acid

C → U: pairs with A instead of G
A → H (hypoxanthine): pairs with C instead of T

Alkylation with EMS or nitrosoguanidine

G → O6-alkylguanine: pairs with T instead of C

Page 21: Protein Engineering and Directed Evolution May 24, 2011

Classic mutagenesis by radiation

UV crosslinks two neighboring pyrimidine bases. Errors and mutations are introduced during DNA repair by host enzymes.

Page 22: Protein Engineering and Directed Evolution May 24, 2011

Directed Mutagenesis/Evolution Strategies

• Most mutagenesis strategies rely on PCR.
• Most of the time, knowing the gene sequence is essential.
• Site-directed mutagenesis (point mutation).
– Introduce a specific mutation at a specified location in the gene
• Random mutagenesis.
– Introduce random mutations at a specified position or throughout the gene of interest
• DNA shuffling.
– Shuffle mutants of the same gene to achieve diversity
• DNA family shuffling.
– Shuffle homologous genes from different species to explore a large sequence space
• Genome shuffling.
– Shuffle genomes through homologous recombination.

Page 23: Protein Engineering and Directed Evolution May 24, 2011

Site-directed mutations

• To probe the importance of a specific amino acid in a protein sequence.
– Is the amino acid involved in catalysis?
– Does the amino acid dictate specificity?
– Is the amino acid essential for protein function?

• If the importance of the amino acid is known (from crystal structure, biochemical analysis)
– Mutate the amino acid to enhance enzyme properties
– Alter the size of the amino acid to tighten/loosen enzyme substrate specificity

Page 24: Protein Engineering and Directed Evolution May 24, 2011

Subtilisin stability can be improved by point mutations

• What is subtilisin?
– A serine protease from Bacillus bacteria
– Broadly specific for proteins that commonly soil cloth
– Used widely as the “enzymatic additive” in commercial laundry detergents

• Wild-type subtilisin can be easily inactivated
– In the presence of bleach, the protein becomes inactive very quickly (~90% inactivation)
– The inactivation is due to oxidation of the methionine at position 222 (M222)

Page 25: Protein Engineering and Directed Evolution May 24, 2011

Second Generation Subtilisin

• M222 was systematically mutated to each of the 19 other amino acids, and the stability of the mutant enzymes was investigated (Genencor). (Estell, JBC, 1985)

[Figure: percent enzyme activity versus time (min) in 1 M H2O2]

Page 26: Protein Engineering and Directed Evolution May 24, 2011

Site directed mutagenesis is limited

• Site-directed mutagenesis is limited in its scope.
– Difficult to predict which substitution will be beneficial.
– More than one residue can contribute to enzyme activity and stability.
– “Key” residues unknown.
– The availability of the crystal structure helps, but does not allow a reliable prediction of what/where the mutations should be.

• Protein engineers therefore need to generate all possible amino acid changes at one residue, or at a combination of residues.

• How do we modify the PCR-based mutagenesis procedures to
– 1) introduce all possible mutations at a single position?
– 2) introduce multiple mutations?

Page 27: Protein Engineering and Directed Evolution May 24, 2011

Degenerate Oligonucleotides

5’ ACG GTC GAT GTA CCA GGG CCC AAC 3’

During normal primer synthesis, the desired nucleotide is added to the growing oligonucleotide. Each of the four nucleotide pools is 100% pure.

N = 25% A, 25% C, 25% G, 25% T

To make a degenerate primer, a mixed nucleotide pool (N) is used in addition to the four pure pools. During DNA synthesis, N can be added to the oligonucleotide instead of one of the four pure nucleotides.

5’ ACG GTC GAT GTA NNN GGG CCC AAC 3’

64 possible combinations, covering all 20 amino acids as well as the stop codons.
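The codon count for an NNN position can be verified by enumeration. The sketch below builds the standard genetic code table and expands a degenerate codon; the helper names are invented for this example. The NNK codon (K = G or T in IUPAC notation) does not appear in the slides and is included only as a point of comparison.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG; "*" is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Nucleotide pools: four pure pools plus the mixed pools N and K.
POOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}

def expand(degenerate_codon):
    """All concrete codons encoded by a degenerate codon such as 'NNN'."""
    return ["".join(c) for c in product(*(POOLS[b] for b in degenerate_codon))]

def summarize(degenerate_codon):
    """Return (number of codons, number of amino acids encoded, number of stop codons)."""
    encoded = [CODON_TABLE[c] for c in expand(degenerate_codon)]
    return len(encoded), len(set(encoded) - {"*"}), encoded.count("*")

print(summarize("NNN"))  # (64, 20, 3)
print(summarize("NNK"))  # (32, 20, 1)
```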

Page 28: Protein Engineering and Directed Evolution May 24, 2011

Saturation mutagenesis example

These authors found 5 residues that interact with the substrate directly.

Saturation mutagenesis was performed simultaneously at all five positions.

Library size: 20 × 20 × 20 × 20 × 20 = 20^5

3.2 million possible combinations

The desired mutant contained mutations in four of the five residues. The new enzyme property cannot be achieved with single-residue mutations.

[Figure: M. jannaschii TyrRS bound to tyrosine]

Wang and Schultz, 2003
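The library size above, and the screening effort it implies, can be estimated as follows. This is a rough sketch under a simplifying assumption that is not stated in the slides: every variant is equally likely to be picked, so the expected fraction of the library seen after screening T clones out of V variants is about 1 - exp(-T/V). The function names are invented for this example.

```python
import math

def library_size(positions, variants_per_position=20):
    """Number of protein variants when each position is fully randomized."""
    return variants_per_position ** positions

def clones_for_coverage(n_variants, coverage=0.95):
    """Clones to screen so that the expected fraction of variants seen at least
    once reaches `coverage`, assuming all variants are equally likely."""
    return math.ceil(-n_variants * math.log(1.0 - coverage))

size = library_size(5)
print(size)                       # 3200000
print(clones_for_coverage(size))  # roughly 3 times the library size
```

Under this assumption, seeing about 95% of a 3.2-million-member library takes close to ten million clones, which is why the number of simultaneously randomized positions is kept small.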

Page 29: Protein Engineering and Directed Evolution May 24, 2011

Error-prone PCR generates random point mutation(s)

Parent gene

Cirino

Page 30: Protein Engineering and Directed Evolution May 24, 2011

Error-prone PCR: Random Mutagenesis

• Altering the PCR conditions to make amplification prone to errors → random incorporation of substitutions.

• Normal PCR reaction: MgCl2, 0.2 mM dNTPs, template DNA, primers, DNA polymerase, thermal cycling (95 °C, 55 °C, 72 °C)
– Taq polymerase error rate: 2 × 10^-4
– Pfu polymerase error rate: 7 × 10^-7

• Error-prone PCR conditions that INCREASE the error rate of Taq polymerase and accumulate mutations
– Staggered dNTP concentrations (0.2 mM dATP & dGTP, 1.0 mM dCTP & dTTP)
– Addition of MnCl2 (affects Taq error rate)
– Increase the number of PCR cycles
– Increase the length of the molecule to be amplified

Cirino
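To see how error rate, gene length and cycle number combine, here is a minimal sketch. It assumes the error rates quoted above are per base per duplication (the slides do not state the units) and that each cycle is one duplication; the gene length, the number of duplications and the function names are invented for illustration.

```python
import random

def expected_mutations(error_rate, gene_length, doublings):
    """Expected substitutions per final gene copy under the stated assumptions."""
    return error_rate * gene_length * doublings

def mutate(seq, per_base_rate, rng):
    """Copy of seq in which each base is substituted with probability per_base_rate."""
    out = []
    for base in seq:
        if rng.random() < per_base_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

# Hypothetical 1000-bp gene and 20 duplications.
print(expected_mutations(2e-4, 1000, 20))  # Taq: about 4 mutations per gene
print(expected_mutations(7e-7, 1000, 20))  # Pfu: about 0.014 mutations per gene

# A tiny simulated library from a made-up parent sequence.
rng = random.Random(0)
parent = "ATG" + "GCT" * 20
library = [mutate(parent, 0.02, rng) for _ in range(5)]
```

The same arithmetic shows why more cycles or a longer template both raise the mutation load per clone.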

Page 31: Protein Engineering and Directed Evolution May 24, 2011

Library creation

1. Gene of interest (GOI): add primers containing restriction sites.
2. Error-prone PCR: mutation and amplification.
3. Cut PCR product with NdeI and BamHI; purify insert library.
4. pET28 expression plasmid (NdeI, BamHI): library of mutant expression plasmids.
5. Screen for desired properties.

Page 32: Protein Engineering and Directed Evolution May 24, 2011

Error Prone PCR – Subtilisin Example

• Goal: To have subtilisin function in a nonaqueous solvent.

• Unlike the previous example, this property cannot be predicted and one has no idea where to start.

• Solution: error prone PCR. 10 successive rounds of mutagenesis were performed. In each round, the improved mutant was selected. The gene encoding the mutant serves as template for the next round of error prone PCR.

Chen and Arnold, PNAS, 1993, p5618

[Figure: change in activity, log scale]
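The round-by-round procedure described above (mutate, screen, carry the best clone into the next round) can be sketched as a simple loop. Everything in this example is hypothetical: the toy assay, which scores similarity to a hidden target sequence, stands in for a real activity measurement, and the library size and sequences are made up.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def point_mutant(parent, rng):
    """Copy of parent with one randomly chosen residue substituted."""
    pos = rng.randrange(len(parent))
    new = rng.choice([a for a in AMINO_ACIDS if a != parent[pos]])
    return parent[:pos] + new + parent[pos + 1:]

def evolve(parent, assay, rounds=10, library_size=200, seed=0):
    """Repeated mutagenesis and screening; the best clone of each round seeds the next."""
    rng = random.Random(seed)
    history = [assay(parent)]
    for _ in range(rounds):
        library = [point_mutant(parent, rng) for _ in range(library_size)]
        best = max(library, key=assay)
        if assay(best) > assay(parent):  # keep the parent if nothing improved
            parent = best
        history.append(assay(parent))
    return parent, history

# Hypothetical assay: matches to a hidden "ideal" sequence stand in for activity.
TARGET = "MKTAYIAKQR"

def toy_assay(seq):
    return sum(a == b for a, b in zip(seq, TARGET))

final, history = evolve("MAAAAAAAAA", toy_assay)
print(history)  # score after each round; it never decreases
```

Because only the single best clone is kept each round, this loop is a greedy uphill walk, which is the behaviour that can stall at a local optimum on a rugged landscape.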

Page 33: Protein Engineering and Directed Evolution May 24, 2011

Aiming for great sequence diversity

If the fitness landscape is rugged, point mutations alone are likely to lead to local optima. Point mutations are too gradual to allow the block changes that are required for continued sequence evolution.

[Figure: the fitness landscape, fitness plotted over {sequence space}]

Page 34: Protein Engineering and Directed Evolution May 24, 2011

DNA shuffling recombines different mutants to allow greater sequence space exploration

Different mutants from a single gene

All combinations of mutations

Cirino

Page 35: Protein Engineering and Directed Evolution May 24, 2011

DNA shuffling recombines mutants

• DNA recombination allows us to look at a larger portion of sequence space (compared to what point mutagenesis allows).

• Those sequences which are being explored are already “solutions” (i.e., the sequences already correspond to fold and function, at least in another protein) → reduction in search space

• Combines additive mutations and removes deleterious mutations (e.g., after several rounds of error-prone PCR)

• More likely to result in “new” functions (compared to accumulating single point mutations)

Page 36: Protein Engineering and Directed Evolution May 24, 2011
Page 37: Protein Engineering and Directed Evolution May 24, 2011

DNA Shuffling

1. Digest PCR products of homologous genes (DNase I digestion). Create pool of ssDNA fragments (short single-stranded DNA).

2. Perform “primerless” PCR to reassemble genes.

3. Cut and clone reassembled genes for expression.
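The outcome of fragmentation and reassembly can be imitated in silico. The sketch below is a deliberately simplified model and not the actual mechanism: it assumes the parents are already aligned and of equal length, and it builds a chimera by copying consecutive fragments from randomly chosen parents, ignoring the sequence identity that real template switching needs at crossover points. The function name and the fragment-length range are invented for this example.

```python
import random

def shuffle_genes(parents, rng, min_frag=10, max_frag=50):
    """Build one chimera from equal-length, aligned parent sequences by copying
    consecutive fragments, each taken from a randomly chosen parent."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    chimera = []
    pos = 0
    while pos < length:
        frag_len = rng.randint(min_frag, max_frag)  # fragment size, inclusive bounds
        template = rng.choice(parents)              # which parent supplies this stretch
        chimera.append(template[pos:pos + frag_len])
        pos += frag_len
    return "".join(chimera)

# Two made-up parents written in different letters so crossovers are visible.
rng = random.Random(0)
parents = ["A" * 200, "C" * 200]
library = [shuffle_genes(parents, rng) for _ in range(3)]
for gene in library:
    print(gene)
```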

Page 38: Protein Engineering and Directed Evolution May 24, 2011

Genetic recombination assay

Wildtype Lac Zα on pUC 18 plasmid

Transform E. coli in presence of X-Gal

Lac Zα Mutants

75 b.p.

stop codons

Transform E. coli in presence of X-Gal

Stemmer, W. P. PNAS Vol. 91 pp. 10747-10751 1994

Page 39: Protein Engineering and Directed Evolution May 24, 2011

Genetic recombination assay (cont.)

Recombine mutant genes

white

white

white

blue

Transform E. coli in presence of X-Gal. Count blue and white colonies to measure recombination frequency.

Mutant 1

Mutant 2

Ratio of active recombinant colonies after assembling 50-100bp fragments was 24% (n=386)

Page 40: Protein Engineering and Directed Evolution May 24, 2011

Negative mutations are suppressed

Starting mutants may have both positive and negative mutations. The net change of the mutant may be positive → negative mutations masked.

DNA shuffling generates all possible combinations of point mutants → large library.

Backcrosses with the wild-type region can remove negative mutations.

Recombinants with a large number of negative mutations are eliminated from the next round of DNA shuffling. Positive mutants are selected to go to the next round of shuffling.

Page 41: Protein Engineering and Directed Evolution May 24, 2011

Error prone PCR and Shuffling together are powerful protein engineering techniques

[Figure: relative activity (axis ticks 1, 10, 20) versus generation (0 to 6), starting from wt, with steps labeled random mutagenesis, recombination, random mutagenesis, and further shuffling]

Page 42: Protein Engineering and Directed Evolution May 24, 2011

Family shuffling

Key: the starting genes are already nature’s solutions after natural evolution. They contain functional domains.

Page 43: Protein Engineering and Directed Evolution May 24, 2011

Example of family shuffling

Goal: Increase the activity of cephalosporinase towards moxalactam (an antibiotic)

1. Select four related cephalosporinases from different species.

2. Generate point mutants of each gene and shuffle the mutants of each gene separately (8-fold improvement in activity for each cephalosporinase).

3. Combine all the mutants from all four genes and perform family shuffling. The best mutants from family shuffling were 270- to 540-fold more active.

Nature, 391, 1998, p288

Page 44: Protein Engineering and Directed Evolution May 24, 2011

Cephalosporinase Family Shuffling

Page 45: Protein Engineering and Directed Evolution May 24, 2011

Genome Shuffling of Antibiotic Producing Streptomyces Strains

• Streptomyces are important industrial organisms for the production of antibiotics, anticancer drugs and other small molecule pharmaceutical compounds

• Examples: tetracyclines, erythromycin, daunorubicin, mithramycin, lovastatin (Mevacor)

• Streptomyces are soil borne, gram-positive bacteria that live under unfavorable conditions (starvation, among a population of other bacteria)

• The antibiotics are produced as secondary metabolites, mostly for self-defense.

Page 46: Protein Engineering and Directed Evolution May 24, 2011

Classic Mutagenesis is often used to find high-producing mutant strains

• How do we find a mutant strain of Streptomyces fradiae that produces higher amounts of the antibiotic tylosin (Eli Lilly)?

• The directed evolution of microorganisms has traditionally been carried out through the asexual process of classical strain improvement (CSI): sequential random mutagenesis and screening.

• The sequential mutagenesis is performed using mutagens and UV radiation.

• Most of the time, the nature of the mutation is not important. (Black box approach)

• Although CSI is the method of choice in pharmaceutical companies, the process is inefficient and usually takes decades and $$$$ to isolate a significantly improved mutant.

Page 47: Protein Engineering and Directed Evolution May 24, 2011

CSI vs. Genome shuffling

In CSI, during one round of mutagenesis, a large number of mutants can be recovered. Usually, only the best-performing mutant strain is selected and subjected to additional mutagenesis. Genome shuffling takes all the mutants that show improvement over the parent strain and shuffles their genomes together to generate combinations of mutations (mimicking the natural evolution of species). This process is analogous to DNA shuffling, but on a much grander scale (genomes vs. genes).

Maxygen, Nature, 2002

Page 48: Protein Engineering and Directed Evolution May 24, 2011

How is genome shuffling possible?

• Combine the cellular contents of several mutant strains through protoplast fusion.

• During protoplast fusion, homologous recombination between homologous chromosomal regions will take place, allowing mutations to be passed from one strain to another.

• Fused protoplasts can be regenerated into single cells carrying shuffled genomes.

Page 49: Protein Engineering and Directed Evolution May 24, 2011

Genome Shuffling

Page 50: Protein Engineering and Directed Evolution May 24, 2011

Screening / Selecting Improved Variants (generally considered the hard part)

Key Point: You get what you screen for! And other properties or functions not selected for may be lost.

Some Concerns:

• How well does your screen reflect your desired function?

• Sensitivity of the screen (what is the background, and how well can you identify small improvements?)

• Screening capabilities / sampling of library / library size

• Equipment requirements (robotics, cell sorter, imaging)

Page 51: Protein Engineering and Directed Evolution May 24, 2011

How do we look for the desired mutant?

Page 52: Protein Engineering and Directed Evolution May 24, 2011

Selection vs screening

• Selection is unambiguous (as long as all the control experiments have been done). Easy to identify a mutant enzyme that has evolved to allow the bacteria to survive under certain selection criteria.

• For many enzymes, selection is difficult or impossible to set up (i.e. many enzyme functions are not essential to bacterial function, such as therapeutic proteins)

• Screening is the systematic method to find the mutant of choice from a large library of mutants.
– Screening cell phenotype if possible.
– Color or fluorescence screening is efficient and easy.
– The least efficient method is to analyze each sample manually for the desired properties (e.g. product formation)

Page 53: Protein Engineering and Directed Evolution May 24, 2011

Screening – improving with technology

A few clones needed for screening / site-directed mutagenesis

Colonies hand-picked by an unlucky graduate student

Color assay

Product assay

Picked by a robot

Page 54: Protein Engineering and Directed Evolution May 24, 2011

Examples of selection

• A mutant aminoacyl-tRNA synthetase that can incorporate an unnatural amino acid in an antibiotic selection marker

• Improved antibiotic resistance enzymes that allow the cells to survive higher concentrations of an antibiotic

• A regulatory protein that can turn on gene expression when induced by a small molecule

Page 55: Protein Engineering and Directed Evolution May 24, 2011

Examples of screening

• Antibodies with high affinities for antigen.

• Design a small-molecule inhibitor that tightly binds to and blocks a cell-surface protein or an enzyme inside a cell

• To generate a growth factor or hormone with increased affinity for its receptor

• Mutant enzyme catalyzing a novel reaction