Structural Biology 8/27/10

Upload: randell-parker

Post on 12-Jan-2016



Why determine structures?

Visualize primary sequence in context of folded protein (buried vs. solvent exposed)

Highlight residues important for intermolecular interactions (seen in co-crystals or crystal packing, or predicted by computational docking)

Allow for the design of properly folded mutant proteins

Visualize surface features to aid in identifying or designing binding partners (e.g. clefts, promontories, hydrophobic patches, or specifics of the fold)

Provide a template for modeling studies to understand the function of related molecules

Allow use of structural databases to gain insight into function/evolution


Structural Biology techniques

● Electron microscopy ≈ 5Å ?

● NMR equivalent resolution ≈ 2Å

● X-ray crystallography ≈ 1Å

● Hybrid techniques EM + NMR/Crystallography

Resolution particulars

[Figure: what can be distinguished as resolution goes from low to high]

4.0 Å: molecules

3.0 Å: secondary structure elements

1.8 Å: residues

1.0 Å: atoms

Center for Biological Sequence Analysis, DTU

X-ray crystallography & Nuclear Magnetic Resonance (NMR)

X-ray crystallography utilizes information gleaned from bouncing X-rays off an ordered array of molecules. NMR utilizes information about the magnetic environment of nuclei with non-zero spin.

NMR provides several snapshots of the object of interest, all ~equally valid. X-ray crystallography provides one snapshot of the object of interest.

NMR cannot be practically used for large molecules (at least not yet). X-ray can be used for even very large molecules and complexes.

Most importantly, structures that have been determined using both techniques are very similar!

NMR (very basic drill)

Purify protein

Collect data

Analyze data (make assignments)

Apply distance constraints and calculate structures

[Figure: block diagram of an NMR spectrometer: magnet, probe, sample, transmitter, pre-amp, receiver, detector, continuous reference, and ADC (binary numbers to computer). The reference frequency is 500 MHz = 500,000,000 Hz; the detected signals lie between 499,995,000 and 500,005,000 Hz, an offset of ±5,000 Hz from the reference.]


Varian Inova-600 spectrometer


NMR - How it works

NMR uses the behavior of nuclei with magnetic moments in an applied magnetic field.

For a given type of nucleus (e.g. 1H), introduce RF radiation and excite transitions of nuclei from the low to the high energy state. Monitor the emitted RF radiation as the nuclei descend to the low energy state (decay).

Biochemistry, 5th edition, Berg, Tymoczko & Stryer

FID (Free induction decay)


NMR (continued)

De-convolute (separate) all the RF emissions from the FID to get a spectrum
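The separation step is usually done with a Fourier transform. The following is a minimal sketch, not the processing used for the slides: the sweep width, number of points, and the two resonances (offsets, amplitudes, decay times) are all made-up values chosen only to show that overlapping decaying signals in the FID come out as distinct peaks in the spectrum.

```python
import numpy as np

# Assumed acquisition parameters (illustrative only, not from the slides).
sweep_width_hz = 10_000.0            # covers offsets of +/-5,000 Hz from the reference
n_points = 4096
dwell_s = 1.0 / sweep_width_hz       # time between samples
t = np.arange(n_points) * dwell_s

# Two hypothetical resonances: (offset from reference in Hz, amplitude, decay time in s).
resonances = [(1200.0, 1.0, 0.05), (-2500.0, 0.5, 0.05)]

# The FID is the sum of decaying oscillations, one per resonance.
fid = np.zeros(n_points, dtype=complex)
for offset_hz, amplitude, decay_s in resonances:
    fid += amplitude * np.exp(2j * np.pi * offset_hz * t) * np.exp(-t / decay_s)

# The Fourier transform separates the signals by frequency.
spectrum = np.fft.fftshift(np.fft.fft(fid))
freqs_hz = np.fft.fftshift(np.fft.fftfreq(n_points, d=dwell_s))
magnitude = np.abs(spectrum)

# Strongest peak, then the strongest peak at least 100 Hz away from it.
first_peak_hz = freqs_hz[np.argmax(magnitude)]
far_from_first = np.abs(freqs_hz - first_peak_hz) > 100.0
second_peak_hz = freqs_hz[far_from_first][np.argmax(magnitude[far_from_first])]
```

In the time domain the two signals are superimposed at every sample; in the frequency domain each appears as its own peak near the offset that was put in.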

Individual nuclei transition at slightly different frequencies (resonances) depending on their chemical environments (electron clouds, other nuclei). The differences between the resonance frequencies of nuclei and those of the same nuclei in a standard compound are called chemical shifts. Therefore, each protein has a unique spectrum for a given nucleus (1H, 13C, 15N, etc.).

Example reference compound: tetramethylsilane (TMS)
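Chemical shifts are conventionally quoted in parts per million (ppm) of the spectrometer frequency, so that the value does not depend on the magnet. A small sketch of that arithmetic follows; the function name is hypothetical, and for simplicity it assumes the reference compound resonates exactly at the 500 MHz spectrometer frequency quoted earlier.

```python
def chemical_shift_ppm(resonance_hz, reference_hz, spectrometer_hz):
    """Chemical shift in ppm: frequency difference from the reference,
    divided by the spectrometer frequency, times one million."""
    return (resonance_hz - reference_hz) / spectrometer_hz * 1e6

# Assumption: reference signal sits at exactly 500,000,000 Hz.
reference_hz = 500_000_000.0

# A nucleus resonating 5,000 Hz above the reference on a 500 MHz instrument.
edge_shift_ppm = chemical_shift_ppm(500_005_000.0, reference_hz, 500_000_000.0)

# The same 10 ppm shift would be a 6,000 Hz offset on a 600 MHz instrument.
offset_at_600_hz = edge_shift_ppm * 600_000_000.0 / 1e6
```

Under these assumptions the ±5,000 Hz window at 500 MHz corresponds to about ±10 ppm.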

A nuclear Overhauser effect (NOE) experiment gives peaks between protons that are close in space even though they’re not bonded.

A correlation spectroscopy (COSY) experiment results in peaks between protons that are connected through covalent bonds. In this way, individual amino acids have a characteristic signature (e.g. Ala vs. Ser).

Intro to Protein Structure, Branden & Tooze

By using COSY and NOESY experiments, one can identify the various amino acids and their neighboring amino acids (sequential assignment). Once assignments are made, NOE information gives distance constraints. Distance constraints between atoms, once the atoms have been identified, reveal the structure!

Refinement is used in conjunction with known geometric and energetic constraints (in addition to the acquired distance constraints).

Because of the limited number of distance constraints and the nature of solution-structure determination, one ends up with a set of structures that satisfy the distance criteria: the so-called “lowest penalty structures”.
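To make the idea of a "penalty" concrete, here is a toy sketch that scores candidate models against upper-bound distance constraints and keeps the one with the lowest score. The penalty form (sum of squared violations), the constraint list, and both three-atom models are invented for illustration; real structure-calculation programs use more elaborate target functions that also include the geometric and energetic terms mentioned above.

```python
import numpy as np

def restraint_penalty(coords, restraints):
    """Sum of squared violations of upper-bound distance constraints.

    coords: (n_atoms, 3) array of positions in Angstroms.
    restraints: list of (atom_i, atom_j, upper_bound_in_Angstroms).
    """
    penalty = 0.0
    for i, j, upper in restraints:
        distance = np.linalg.norm(coords[i] - coords[j])
        penalty += max(0.0, distance - upper) ** 2   # only over-long distances count
    return penalty

# Hypothetical constraint: atoms 0 and 2 must be within 5 Angstroms (e.g. from an NOE).
restraints = [(0, 2, 5.0)]

# Two hypothetical candidate models of the same three atoms.
models = {
    "bent":     np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.0, 3.0, 0.0]]),
    "extended": np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]]),
}

penalties = {name: restraint_penalty(xyz, restraints) for name, xyz in models.items()}
lowest_penalty_model = min(penalties, key=penalties.get)
```

With many constraints and many starting models, several models typically end up with similarly low penalties, which is why an NMR structure is reported as a set.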

Kim et al., Nature 404, 151-158 (09 March 2000)

X-ray crystallography (basic drill)

Grow crystals

Collect diffraction data

Solve structure

Protein phase diagram

[Figure: phase diagram at constant temperature, pressure, and pH, with [precipitant] on the vertical axis and [protein] on the horizontal axis. Labeled regions: undersaturated, clear, metastable, nucleation, precipitate.]

How do you get a protein crystal? This is the hard part!

Start with very pure protein

Get a supersaturated solution

Wait (sometimes a long time!)

Keep trying….


Crystallization

Most common technique is vapor diffusion.

Reservoir (0.5 mL of 20% PEG 8,000, 200 mM MgCl2, 100 mM Tris pH 8)

Drop (2 µL protein (20 mg/mL), 2 µL reservoir solution)

Cover with clear tape and place at RT or 4ºC

The reservoir will slowly pull water out of the drop and the drop will concentrate. Hopefully you’ll get crystals. Many screens are commercially available.
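The concentration changes in the drop can be estimated with simple arithmetic. This sketch uses the volumes and concentrations from the recipe above under idealized assumptions that are not stated in the slides: only water leaves the drop, the protein stock contains no precipitant, nothing precipitates, and the 0.5 mL reservoir is so much larger than the drop that its composition does not change. The function names are hypothetical.

```python
def mix_drop(protein_ul, protein_mg_ml, reservoir_ul, precipitant_pct):
    """Concentrations immediately after mixing protein and reservoir solution."""
    total_ul = protein_ul + reservoir_ul
    protein_conc = protein_mg_ml * protein_ul / total_ul
    precipitant_conc = precipitant_pct * reservoir_ul / total_ul
    return total_ul, protein_conc, precipitant_conc

def equilibrate(volume_ul, protein_mg_ml, drop_pct, reservoir_pct):
    """Idealized end point: the drop loses water until its precipitant
    concentration matches the reservoir."""
    shrink_factor = reservoir_pct / drop_pct
    return volume_ul / shrink_factor, protein_mg_ml * shrink_factor

# 2 uL of 20 mg/mL protein plus 2 uL of reservoir containing 20% PEG 8,000.
start_volume, start_protein, start_peg = mix_drop(2.0, 20.0, 2.0, 20.0)
final_volume, final_protein = equilibrate(start_volume, start_protein, start_peg, 20.0)
```

Under these assumptions the drop starts at 10 mg/mL protein and 10% PEG, then slowly shrinks to half its volume, ending near 20 mg/mL protein in 20% PEG. The slow approach to supersaturation is the point of the method.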

0.3mm

But remember, before you try crystallizing…

If you want to make a well-behaved, soluble expression construct spanning a region of a protein with unknown structure, you would:

A) Do a database search to identify other proteins with similar protein sequences

B) Use several different sequence alignment algorithms to align any homologous sequences

C) Use several different 2º structure prediction algorithms with your sequence of interest and any homologs

D) Compare all of these 2º structure predictions (and decide!)

E) Make several different constructs with different starts and stops

F) All of the above

Then you must work out expression, purification details!

When X-rays shine on atoms, the atoms become new sources of X-radiation. Each atom scatters X-rays in all directions. There is structural information in the scattered X-rays, but it’s too weak when the atoms are from just one protein molecule.

A crystal aligns a very large number of molecules in the same orientation.

This provides the potential for a much stronger signal than when using just one molecule.

[Figure: X-rays striking a crystal. Scattered X-rays reinforce in certain directions and cancel in most others.]


Home X-ray setup


Cryo-protected crystal in rayon loop

Another way of thinking about it…

A crystal is composed of many families of “planes” of atoms. The planes in each family are parallel, and each is separated from the next by a specific distance “d”. Reflection of X-rays from these planes is reinforced when the geometric situation pictured above is achieved.

Bragg’s law: $2d\sin\theta = n\lambda$, where n usually = 1 and $\lambda$ is the wavelength, which is known

Two dimensional crystal

“a” and “b” are the lengths of the sides of the unit cell (each unit cell in black). O is the origin.

The sets of planes (green, blue, pink) are called Miller planes. The green set intersects the cell edge “a” at a = 1/2 and cell edge “b” at b = 1. Therefore, the green set of planes is the (2,1) family of Miller planes. What you do is invert the 1/2 and it becomes 2. If the planes intersected “a” at 1/3 and “b” at 1/4, they would be the (3,4) family of Miller planes, etc. You just look at the unit cell in the upper left corner. The planes are drawn in all the cells to show that they intersect all the cells in the same way.
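The "invert the intercepts" rule can be written as a few lines of code. This is a small sketch with a hypothetical function name; it takes the fractional intercepts of the plane nearest the origin along each cell edge, uses `None` for an edge the planes never cross (index 0), and clears any remaining fractions so the result is a set of integers.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def miller_indices(*intercepts):
    """Miller indices from fractional intercepts along the cell edges.

    Use None for an edge the planes are parallel to (they never cross it).
    """
    reciprocals = [Fraction(0) if x is None else 1 / Fraction(x) for x in intercepts]
    # Multiply through by the least common denominator to get whole numbers.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in reciprocals), 1)
    return tuple(int(r * scale) for r in reciprocals)

green = miller_indices(Fraction(1, 2), 1)                 # a = 1/2, b = 1
example = miller_indices(Fraction(1, 3), Fraction(1, 4))  # a = 1/3, b = 1/4
green_3d = miller_indices(Fraction(1, 2), 1, None)        # never crosses c
```

The first two calls reproduce the (2,1) and (3,4) families described above; the third shows how the same planes would be indexed in three dimensions when they run parallel to the third cell edge.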

Note: if you slowly rotated this crystal in the X-ray beam, each set of planes would in turn satisfy the requirements of Bragg’s law. Each set of planes would diffract in a different direction.

This is a real diffraction pattern of a crystal in a special orientation (X-rays are being shined directly into the side of a unit cell)

[Figure: diffraction pattern with axes “a” and “b” and four colored reflections]

Color    h   k   l
Green    2   1   0
Blue     1   1   0
Pink     1  -1   0
Orange  -4   4   0

Every reflection arises from a different set of Miller planes. Every reflection has an index h,k,l; no two are the same.

So since we know the crystal to “film” distance, the wavelength, and where the spots are on the film, we can use geometry and calculate the size of “a” and “b”.

This diffraction pattern shows that this crystal has systematic absences. But given the regularity of the diffraction pattern, we can easily measure the spacings along “a” and “b”.

[Figure: diffraction pattern showing the direct beam and the position where the (1,0,0) reflection would be, 1.5 mm from the direct beam.]

[Figure: X-rays pass through the crystal to a detector 80 mm away; the (1,0,0) position lies 1.5 mm from the direct beam.]

So:

$\tan 2\theta = 1.5/80 = 0.01875$

$2\theta = \tan^{-1}(0.01875) = 1.074^\circ$

$\theta = 0.537^\circ$

Bragg’s law with $n = 1$: $2d\sin\theta = \lambda = 1.5418$ Å (Cu Kα)

Re-arranging: $d = 1.5418\ \text{Å} / (2\sin\theta)$

Solving: $a = 82.2$ Å
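The same calculation can be checked numerically. This sketch uses the slide's numbers (1.5 mm spot position, 80 mm crystal-to-detector distance, 1.5418 Å wavelength); the function name is hypothetical, and it assumes a flat detector perpendicular to the direct beam.

```python
import math

def spacing_from_spot(spot_mm, distance_mm, wavelength_angstrom, n=1):
    """Plane spacing d from a spot position, via Bragg's law 2 d sin(theta) = n lambda.

    Assumes a flat detector perpendicular to the direct beam, so that
    tan(2 theta) = spot distance from direct beam / crystal-to-detector distance.
    """
    two_theta = math.atan2(spot_mm, distance_mm)
    theta = two_theta / 2.0
    return n * wavelength_angstrom / (2.0 * math.sin(theta))

two_theta_deg = math.degrees(math.atan2(1.5, 80.0))
a_angstrom = spacing_from_spot(1.5, 80.0, 1.5418)
```

Running this gives a scattering angle close to 1.074º and a cell edge close to 82.2 Å, in agreement with the hand calculation.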

We can do the same for b and c. Actually, programs have gotten so sophisticated that you can feed a picture taken at any random orientation to a program, and it scans the image, finds the spots, and uses them to determine a, b, c, any angles between them, the lattice type, the symmetry, and the orientation of the crystal! In other words, the program knows the Miller indices for all the spots.

So you simply start turning the crystal and collecting images. For example, you turn the crystal 1º and take a 1º oscillation picture. Do this for 180º, and you have a full data set.

Integrate spot (add counts in pixels)

Integrate background (then subtract from spot)
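A minimal sketch of the integrate-and-subtract idea follows. It is not how any particular data-processing program does it: the square spot box, the surrounding background frame, their sizes, and the synthetic test image are all assumptions made for illustration.

```python
import numpy as np

def integrate_spot(image, row, col, spot_radius=2, bg_width=2):
    """Net spot intensity: counts summed in a square box around the spot,
    minus the background expected in that box. The background per pixel is
    estimated from a frame of pixels surrounding the box."""
    r_in = spot_radius
    r_out = spot_radius + bg_width
    outer = image[row - r_out: row + r_out + 1, col - r_out: col + r_out + 1]
    inner = image[row - r_in: row + r_in + 1, col - r_in: col + r_in + 1]
    n_background_pixels = outer.size - inner.size
    background_per_pixel = (outer.sum() - inner.sum()) / n_background_pixels
    return inner.sum() - background_per_pixel * inner.size

# Synthetic detector image: flat background of 10 counts per pixel,
# plus a small spot carrying 300 counts in total.
image = np.full((21, 21), 10.0)
image[10, 10] += 100.0
for d_row, d_col in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
    image[10 + d_row, 10 + d_col] += 50.0

net_counts = integrate_spot(image, 10, 10)
```

On this synthetic image the raw sum inside the box is 550 counts, of which 250 are background, leaving the 300 counts that were put into the spot.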

Do this for all (e.g.) ~40,000 spots in your data set.

Now you must scale the spots from one image to the next (sometimes you’re shooting through a thicker part of the crystal, etc.)

When all spots have been integrated and scaled, you have a data set.

Now each spot (h,k,l) should really be considered to be a wave. The intensity of the spot gives the amplitude (the amplitude is proportional to the square root of the intensity), and the number of oscillations across the unit cell is revealed by its Miller indices. The (1,0,0) reflection would have one wavelength (of a sinusoidal wave) in the unit cell along the a direction, the (2,0,0) would have two wave crests, etc.

These waves can be added together, sometimes reinforcing, sometimes cancelling out. When they’ve all been added together, they describe the shape of the “thing” that scattered them originally.

X-ray diffraction data

 h   k   l   I
 2   0   3   1483.6
 3  -1  -3   19999.9
 3  -1  -2   6729.6
 3  -1  -1   30067.1
 3  -1   1   8227.0
 3  -1   2   29901.5
 3  -1   3   24487.5
 3  -1   4   502.1

Each data point has an index and an intensity

Bragg’s Law: $n\lambda = 2d\sin\theta$ (in 3 dimensions)

Now all we need is the “phase” for each data point (reflection)

Fourier Series (1D example)

$$f(x) = F_0\cos 2\pi(0x + \alpha_0) + F_1\cos 2\pi(1x + \alpha_1) + F_2\cos 2\pi(2x + \alpha_2) + F_3\cos 2\pi(3x + \alpha_3) + F_4\cos 2\pi(4x + \alpha_4) + \cdots + F_n\cos 2\pi(nx + \alpha_n)$$

$$f(x) = \sum_{h=0}^{n} F_h\cos 2\pi(hx + \alpha_h)$$

Gale Rhodes, Crystallography Made Crystal Clear (2nd edition)
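The 1D Fourier series can be evaluated directly. In this sketch the amplitudes and phases are invented values, the index h starts at 0, and phases are expressed as fractions of a cycle (so 180º is 0.5), matching the form cos 2π(hx + α). Summing the same amplitudes with two different sets of phases shows how strongly the resulting function depends on the phases.

```python
import numpy as np

def fourier_sum(x, amplitudes, phases):
    """f(x) = sum over h of F_h * cos(2*pi*(h*x + alpha_h)), h = 0..n.

    x is a fractional position across the cell; phases are in cycles.
    """
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for h, (amplitude, alpha) in enumerate(zip(amplitudes, phases)):
        total += amplitude * np.cos(2.0 * np.pi * (h * x + alpha))
    return total

x = np.linspace(0.0, 1.0, 101)             # one unit cell
amplitudes = [0.0, 1.0, 0.5, 0.25]          # hypothetical F_0..F_3
phases_a = [0.0, 0.0, 0.5, 0.0]             # each phase is 0 or 180 degrees
phases_b = [0.0, 0.0, 0.0, 0.0]             # same amplitudes, different phases

density_a = fourier_sum(x, amplitudes, phases_a)
density_b = fourier_sum(x, amplitudes, phases_b)
```

Both curves repeat from one cell to the next, but they have different shapes even though every amplitude is identical.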

The only trouble is, we must know the offset (phase) for each of the waves. In the previous 1D example, the phases were either 0º or 180º. Remember, we have ~40,000 of these “waves”. We know how tall they are and we know their wavelengths, but we don’t know the phases. This is the so-called “Phase Problem”.

[Figure: two waves, each with 1 wavelength across the width of the cell, offset differently from the origin. This? Or this?]

One way to address this is to introduce a “heavy” atom into a crystal and collect another data set (say, an Hg data set).

Now sinusoidal waves can also be represented as vectors.

[Figure: a vector on a circle with 0º and 360º at the same position, drawn at a phase of 45º.]

The length of the vector is the amplitude of the wave, the direction is the phase.

Now we have two data sets. One set is HG, the other is native (NAT). We can use a technique called the Patterson function to locate the coordinates of the Hg atom. The Patterson function doesn’t require phases.
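A one-dimensional toy shows why no phases are needed. In this sketch two identical point atoms are placed at made-up positions in a 1D cell, intensities are computed from them, and the phases are then thrown away; the Patterson-style sum uses only the intensities. It is an illustration of the principle under those assumptions, not the difference-Patterson procedure used to find a real heavy atom.

```python
import numpy as np

# Two hypothetical point atoms in a 1D cell (fractional coordinates).
atom_positions = np.array([0.10, 0.35])

# Structure factors for h = 0..20; only their squared amplitudes are kept,
# which is all a diffraction experiment measures.
h = np.arange(0, 21)
structure_factors = np.exp(2j * np.pi * np.outer(h, atom_positions)).sum(axis=1)
intensities = np.abs(structure_factors) ** 2

# Patterson-style sum: a Fourier series with intensities as coefficients
# and every phase set to zero.
u = np.linspace(0.0, 1.0, 200, endpoint=False)
patterson = (intensities[:, None] * np.cos(2.0 * np.pi * np.outer(h, u))).sum(axis=0)

# Ignore the large origin peak (every atom paired with itself) and find
# the strongest remaining peak in the first half of the cell.
away_from_origin = (u > 0.1) & (u <= 0.5)
peak_u = u[away_from_origin][np.argmax(patterson[away_from_origin])]
```

The peak appears at the separation between the two atoms (0.35 - 0.10 = 0.25), not at their absolute positions. Peaks in a Patterson map correspond to interatomic vectors, which is what makes it usable for locating a few heavy atoms.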

Once we locate (in x,y,z) the Hg atom, we actually know its contribution to each diffraction spot: its little vector!

Now Miller indices are a very convenient way of thinking about diffraction from a crystal. A more accurate way of thinking about what makes a given data point (h,k,l) relatively intense or weak is given by this formula:

$$F_{hkl} = \sum_{j} f_j\, e^{2\pi i (hx_j + ky_j + lz_j)}$$

Here $f_j$ is the scattering factor of atom $j$ and $(x_j, y_j, z_j)$ is its position in the cell.

Fhkl is a vector. It is the sum of all the little vectors from all the atoms in the cell. But we have located the Hg atom, so we know its x, y, z. So we know the magnitude and the phase (direction) of the contribution to the reflection made by the Hg atom! We will call this Fhg.
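The "sum of little vectors" can be computed with complex numbers, where the magnitude is the amplitude and the argument is the phase. This sketch uses the standard structure factor sum over atoms; the atom list is hypothetical, positions are fractional coordinates, and each scattering factor is approximated by the atom's electron count (80 for Hg), ignoring its fall-off with scattering angle.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F_hkl as a complex number: sum over atoms of f * exp(2*pi*i*(hx + ky + lz)).

    atoms: list of (f, x, y, z) with fractional coordinates x, y, z.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

# Hypothetical cell contents: one Hg atom plus two light atoms.
hg_atom = [(80.0, 0.25, 0.0, 0.0)]
light_atoms = [(6.0, 0.10, 0.20, 0.30), (8.0, 0.40, 0.15, 0.70)]

f_hg = structure_factor((1, 0, 0), hg_atom)                 # the heavy-atom vector
f_total = structure_factor((1, 0, 0), hg_atom + light_atoms)

hg_amplitude = abs(f_hg)
hg_phase_deg = math.degrees(cmath.phase(f_hg))
```

Because the total is a plain sum, the heavy atom's vector can be computed on its own once its coordinates are known, which is the point made in the text.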

[Figure: vectors Fhg and Fhkl]

So what we have are a bunch of |Fhkl|s: we have magnitudes but not directions. So we will represent them as circles with radii that are proportional to their magnitude.

[Figure: two circles for reflection hkl, one of radius |FNAT| (native reflection hkl) and one of radius |FHG| (HG reflection hkl).]

And for the hkl reflection, we know the vector Fhg (note: there is an Fhg for each hkl).

We also know that, as vectors, FNAT + Fhg = FHG

Or FHG - Fhg = FNAT

[Figure: the native circle of radius |FNAT| and the heavy-atom (HA) circle of radius |FHG|, drawn together with the vectors Fhg and -Fhg.]

The |FHG| circle is offset by -Fhg
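The circle construction can be solved algebraically. Since FNAT + Fhg = FHG as vectors, the law of cosines relates the two measured magnitudes to the angle between FNAT and Fhg, which leaves two possible native phases. The sketch below uses hypothetical names and invented test values (a "true" native vector is made up so that the answer can be checked).

```python
import cmath
import math

def native_phase_candidates(amp_native, amp_derivative, f_heavy):
    """Two candidate native vectors consistent with |FNAT|, |FHG| and the
    known heavy-atom vector Fhg, using
    |FHG|^2 = |FNAT|^2 + |Fhg|^2 + 2 |FNAT| |Fhg| cos(phase_NAT - phase_hg)."""
    amp_heavy = abs(f_heavy)
    cos_term = (amp_derivative ** 2 - amp_native ** 2 - amp_heavy ** 2) / (
        2.0 * amp_native * amp_heavy)
    cos_term = max(-1.0, min(1.0, cos_term))   # guard against measurement noise
    delta = math.acos(cos_term)
    heavy_phase = cmath.phase(f_heavy)
    return [cmath.rect(amp_native, heavy_phase + sign * delta) for sign in (+1, -1)]

# Invented example: a "true" native vector that the experiment cannot see directly.
true_native = cmath.rect(100.0, math.radians(60.0))
f_heavy = cmath.rect(30.0, math.radians(10.0))
amp_derivative = abs(true_native + f_heavy)     # what the HG data set measures

candidates = native_phase_candidates(abs(true_native), amp_derivative, f_heavy)
candidate_phases_deg = [math.degrees(cmath.phase(c)) for c in candidates]
```

One candidate matches the true phase (60º) and the other is its mirror image about the heavy-atom vector (-40º). A single derivative cannot tell them apart, which is why a second derivative or other phase information is needed.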


Another derivative (or other help)…

Structure solved at CAMD

IQGAP1 “GAP-related domain”, 43 kDa


IQGAP1 GRD vs p120 RasGAP


HIV matrix


Tiam1 Rac1