
Assigning Transmembrane Segments to Helices in Intermediate-Resolution Structures

Angela Enosh, Sarel J. Fleishman, Nir Ben-Tal & Dan Halperin

Adapted from a presentation made by Angela Enosh

Post on 18-Dec-2015




Lecture Outline

Background

The assignment problem

The algorithm

Validation


TM proteins form helix bundles

Figure 1: 3D structure of Bacteriorhodopsin

Transmembrane (TM) proteins cross membrane planes

Constitute approximately 50% of contemporary drug targets

Helices typically cross the membrane

Loops are typically located on the external/internal side of the membrane, connecting consecutive helices


Adapted from http://vertrees.org/ by Jason Vertrees


TM protein amino-acid sequence

The 2D arrangement of TM and extra-membrane (EM) segments can be predicted on the basis of the sequence data alone

[Figure labels: membrane, membrane]


TM protein 3D structure

Technical problems hamper TM protein structure determination

Only 30 distinct folds have been solved using high resolution methods such as X-ray crystallography


Cryo-electron microscopy (Cryo-EM)

Determines protein structure at low resolution (> 4 Å)

Individual amino acids cannot be identified

Supplies the locations of the helices

Exact structure is left ambiguous


Cryo-electron microscopy (cryo-EM)

Bovine rhodopsin; adapted from Krebs et al. (2003) J. Biol. Chem. 278, 50217.



Problem description: Input and Target

Input: position, orientation and azimuth of the helices with respect to the membrane planes

Input: partitioning of the sequence into TM segments (helices) and extra-membrane segments (loops)

Target: find the correspondence between the TM helix segments and the cryo-EM helices

Attempt to reduce the number of possible assignments


Example

Given the helices seen in cryo-EM maps (A-G)

Given the sequence classified as TM/EM segments (I-VII)

Find the native assignment of TM segments (I-VII) to cryo-EM helices (A-G)


The Algorithm, Stage I: Pruning by distance constraints

Eliminate helix assignments based on the estimated maximal length of the loops.

Construct an assignment graph that contains only the set of feasible assignments.


The Algorithm, Stage II: Ranking the feasible assignments

Use known protein structures taken from the Protein Data Bank (PDB)

Score each assignment based on the ability of the loops to connect pairs of helices in 3D.


Formal Statement of the problem

Sequence of all segments: $S = \{T_1, X_1, T_2, X_2, \ldots, X_{n-1}, T_n\}$

TM segments: $T_i \in S$, $T_i = \{t_{i1}, t_{i2}, \ldots, t_{ik_i}\}$

EM segments: $X_i \in S$, $X_i = \{x_{i1}, x_{i2}, \ldots, x_{ik_i}\}$


Formal Statement of the problem (cont.)

3D helix, denoted $C_i = \{c_{i1}, c_{i2}, \ldots, c_{ik_i}\}$: coordinates of the $C_\alpha$ atoms

Membrane: defined by an inner and an outer plane

The maximal distance between two points that can be connected by $X_i$ is denoted $\mathrm{max\_dist}(X_i)$; it is deduced from the distance between consecutive $C_\alpha$ atoms, typically 3.8 Å

The external and internal ends of $C_i$ are denoted $\mathrm{ext}(C_i)$ and $\mathrm{int}(C_i)$
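As a rough illustration of how the loop-length bound could be turned into a reachability test, the Python sketch below derives an upper bound from the 3.8 Å spacing and compares it with the distance between two helix ends. The function names `max_dist` and `can_connect`, and the choice of counting `k + 1` links for a loop of `k` residues, are assumptions made for this example; the slides only state that the bound is deduced from the distance between consecutive atoms.

```python
import math

CA_CA_DISTANCE = 3.8  # Å, typical spacing of consecutive C-alpha atoms (value from the slides)


def max_dist(loop_length: int) -> float:
    """Upper bound (in Å) on the distance a loop of `loop_length` residues can span.

    Assumption: a loop of k residues between two helix ends contributes
    k + 1 consecutive C-alpha links, each at most CA_CA_DISTANCE long.
    """
    return CA_CA_DISTANCE * (loop_length + 1)


def can_connect(end_a, end_b, loop_length: int) -> bool:
    """True if the two helix ends are close enough for the loop to join them."""
    return math.dist(end_a, end_b) <= max_dist(loop_length)


# Two hypothetical helix ends, 10 Å and 25 Å apart, joined by a 4-residue loop.
print(can_connect((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), 4))
print(can_connect((0.0, 0.0, 0.0), (25.0, 0.0, 0.0), 4))
```

A tighter or looser link count only changes the constant in `max_dist`; the test itself stays the same.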


Formal Goals

Find all feasible assignments of the $T_i$'s to the $C_i$'s

An assignment is a permutation $\sigma$ where $T_i$ is assigned to $C_{\sigma(i)}$

Attribute a score $F(\sigma)$ to each assignment based on its compatibility with the locations of the helices

Remark: the N-termini and C-termini can be deduced experimentally


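Stage I: Pruning by Distance Constraints

Acyclic graph: $G = (V, E_{int} \cup E_{ext})$

Vertices: $V = \{(T_i, C_j) : 1 \le i, j \le n\}$

Edges:

$$E_{ext} = \{((T_i, C_j), (T_{i+1}, C_m)) : \mathrm{dist}(\mathrm{ext}(C_j), \mathrm{ext}(C_m)) \le \mathrm{max\_dist}(X_i)\}$$

$$E_{int} = \{((T_i, C_j), (T_{i+1}, C_m)) : \mathrm{dist}(\mathrm{int}(C_j), \mathrm{int}(C_m)) \le \mathrm{max\_dist}(X_i)\}$$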
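The edge sets can be built with a direct double loop over helix pairs. The sketch below is one possible reading of the definitions on this slide, not the authors' implementation: vertices are written as `(i, helix_name)` pairs, the bound `3.8 * (length + 1)` is an assumed form of max_dist, and edges from a helix to itself are skipped because a valid assignment never reuses a helix. The coordinates are made up for the example.

```python
import math

CA_CA = 3.8  # Å, typical spacing of consecutive C-alpha atoms (value from the slides)


def build_edges(ends, loop_lengths):
    """Return (E_int, E_ext) as sets of ((i, j), (i + 1, m)) vertex pairs.

    ends: {helix_name: {"int": (x, y, z), "ext": (x, y, z)}}
    loop_lengths[i - 1]: number of residues in the loop between TM
    segments i and i + 1.
    Assumptions (not from the slides): max_dist = CA_CA * (length + 1),
    and pairs with j == m are left out.
    """
    edges = {"int": set(), "ext": set()}
    for i, length in enumerate(loop_lengths, start=1):
        limit = CA_CA * (length + 1)
        for j in ends:
            for m in ends:
                if j == m:
                    continue
                for side in ("int", "ext"):
                    if math.dist(ends[j][side], ends[m][side]) <= limit:
                        edges[side].add(((i, j), (i + 1, m)))
    return edges["int"], edges["ext"]


# Hypothetical helix ends: A and B are 10 Å apart, C is 20 Å beyond B.
ends = {
    "A": {"int": (0.0, 0.0, 0.0), "ext": (0.0, 0.0, 30.0)},
    "B": {"int": (10.0, 0.0, 0.0), "ext": (10.0, 0.0, 30.0)},
    "C": {"int": (30.0, 0.0, 0.0), "ext": (30.0, 0.0, 30.0)},
}
e_int, e_ext = build_edges(ends, loop_lengths=[4, 1])
print(sorted(e_ext))
```

With a 4-residue first loop only A and B can be neighbors, and the 1-residue second loop is too short to join any pair, so no complete assignment survives in this toy layout.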


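Graph Example

A valid path in G corresponds to a feasible assignment

Shorter EM segments allow fewer feasible assignments

[Figure: helices A, B and C; loop I-II is 12 AA long and loop II-III is 4 AA long. Graph vertices by level: (I,A) (I,B) (I,C); (II,A) (II,B) (II,C); (III,A) (III,B) (III,C)]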


Graph construction

Construction is bottom up

A valid path in the graph is a path $\{v_1, e_1, v_2, e_2, \ldots, e_{n-1}, v_n\}$ which:

Starts at the first level

Ends at the last level

Is an alternating sequence of internal/external edges

Does not contain two vertices with the same helix
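The four conditions for a valid path translate naturally into a depth-first search. The following sketch is illustrative only: it assumes the edge sets are given as sets of `((i, j), (i + 1, m))` pairs, that levels are numbered from 1, and that the caller knows on which side of the membrane the first loop lies (the `first_side` parameter is a hypothetical name introduced here).

```python
def valid_paths(n, helices, e_int, e_ext, first_side="ext"):
    """Yield every valid path as a tuple of helix names, one per TM segment.

    A path starts at level 1, ends at level n, alternates between
    internal and external edges, and never uses a helix twice.
    """
    edge_sets = {"int": e_int, "ext": e_ext}
    flip = {"int": "ext", "ext": "int"}

    def extend(path, side):
        level = len(path)  # number of TM segments assigned so far
        if level == n:
            yield tuple(path)
            return
        for m in helices:
            if m in path:
                continue  # each helix may appear only once
            if ((level, path[-1]), (level + 1, m)) in edge_sets[side]:
                yield from extend(path + [m], flip[side])

    for j in helices:
        yield from extend([j], first_side)


# Toy graph with three TM segments and helices A, B, C (made-up edges).
e_ext = {((1, "A"), (2, "B")), ((1, "C"), (2, "B"))}
e_int = {((2, "B"), (3, "C")), ((2, "B"), (3, "A")), ((1, "A"), (2, "C"))}
paths = sorted(valid_paths(3, ["A", "B", "C"], e_int, e_ext))
print(paths)
```

The internal edge from level 1 is never used here because the first loop is taken to be external, which is how the alternation rule removes candidates.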


Stage II: Ranking Feasible Assignments

A score is assigned to each feasible assignment stored in G

For each $\sigma_k$, $1 \le k \le n!$, we define $F(\sigma_k)$:

$$F(\sigma_k) = \sum_{i=1}^{n-1} f(X_i, C_{\sigma_k(i)}, C_{\sigma_k(i+1)})$$

$f$ defines the feasibility of connecting two helices in 3D space by $X_i$
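The score of an assignment is a sum of per-loop terms, so ranking reduces to summing and sorting. The sketch below assumes a caller-supplied function `f(loop, helix_a, helix_b)` and treats a larger score as more plausible; the lookup table used as `f` here is a made-up stand-in, not the statistical function described in the talk.

```python
def assignment_score(sigma, loops, f):
    """Sum f over consecutive helix pairs of the assignment.

    sigma: tuple of helix names, sigma[i] is the helix given to TM segment i + 1.
    loops: loops[i] describes the loop between TM segments i + 1 and i + 2.
    f:     callable f(loop, helix_a, helix_b) -> float (supplied by the caller).
    """
    return sum(f(loops[i], sigma[i], sigma[i + 1]) for i in range(len(loops)))


def rank_assignments(assignments, loops, f):
    """Sort assignments from highest to lowest score (assumes higher is better)."""
    return sorted(assignments, key=lambda s: assignment_score(s, loops, f), reverse=True)


# Hypothetical pairwise feasibility values, for illustration only.
TOY_F = {("A", "B"): 2.0, ("B", "C"): 1.0, ("A", "C"): 0.5}


def toy_f(loop, helix_a, helix_b):
    return TOY_F.get((helix_a, helix_b), 0.0)


loops = [4, 4]  # two loops, described here only by their lengths
candidates = [("A", "C", "B"), ("A", "B", "C")]
ranked = rank_assignments(candidates, loops, toy_f)
print(ranked)
```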


Evaluation of f

Based on the length of $X_i$ and a statistical analysis conducted on solved structures of soluble proteins

Only helix-loop-helix motifs are used, denoted motif (A, L, B)

We examine all motifs with the same loop length (2-7)


Loop length classification

Only proteins which were less than 20% similar were selected


Evaluation of f: preprocessing

All motifs with loop length $l$ are placed in a common orthogonal reference frame so that all A's overlap

The starting points of the B's are placed in separate data structures $KD_l$ ($2 \le l \le 7$)

KD-trees are used for efficient axis-aligned queries
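To make the preprocessing step concrete, here is a small pure-Python KD-tree with an axis-aligned box query, with one tree kept per loop length. It is a teaching sketch under stated assumptions: the dictionary-based node layout, the function names and the sample points are all invented for this example, and a production version would more likely rely on an existing spatial-index library.

```python
def build_kdtree(points, depth=0):
    """Build a 3D KD-tree from a list of (x, y, z) tuples."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }


def box_query(node, lo, hi, out=None):
    """Collect all stored points p with lo[d] <= p[d] <= hi[d] on every axis d."""
    if out is None:
        out = []
    if node is None:
        return out
    p, axis = node["point"], node["axis"]
    if all(lo[d] <= p[d] <= hi[d] for d in range(3)):
        out.append(p)
    if lo[axis] <= p[axis]:  # the box may overlap the left subtree
        box_query(node["left"], lo, hi, out)
    if p[axis] <= hi[axis]:  # the box may overlap the right subtree
        box_query(node["right"], lo, hi, out)
    return out


# One tree per loop length l, 2 <= l <= 7; the points are placeholders.
start_points = {l: [] for l in range(2, 8)}
start_points[4] = [(0, 0, 0), (1, 1, 1), (5, 5, 5), (2, 0, 1), (-3, 2, 0)]
kd = {l: build_kdtree(pts) for l, pts in start_points.items()}

hits = box_query(kd[4], lo=(-1, -1, -1), hi=(2, 2, 2))
print(sorted(hits))
```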


Distribution of end points of short loops

Kinematics considerations allow a reachable space limited only by the length of the loop

Example: loop length of 4 results in 8 degrees of freedom

In reality the distribution of the end points tends to be highly nonuniform

Highly significant with loops of length two to five

Still noticeable in loops of lengths up to seven


Distribution of the end points of EM loops of length 4


Distribution of the end points of EM loops of lengths 3 (left) and 4 (right)


Evaluation of f: scoring

The two helices are placed in the same reference frame

$Q$ is a cube around $q$, the start of B, with a side length of $10 \cdot \mathrm{length}(X_i)$ Å

We define a colony function

$$f(X_i, C_{\sigma_k(i)}, C_{\sigma_k(i+1)}) = \sum_{r \in Q} e^{-\mathrm{dist}(q, r)}$$

The score depends on:

the number of neighboring points in the vicinity of $q$

the distances between these neighboring points and $q$
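A minimal sketch of such a colony score follows, assuming the score sums an exponentially decaying term over every stored end point that falls inside the cube around the query point q. The negative exponent is inferred from the statement that the score depends on the number and distance of neighboring points, the cube side is left as a parameter, and a plain scan stands in for the KD-tree query; none of the names come from the original work.

```python
import math


def colony_score(q, end_points, side):
    """Sum exp(-dist(q, r)) over the end points r inside the cube around q.

    q:          query point (x, y, z), the start of the second helix.
    end_points: observed loop end points for the relevant loop length.
    side:       side length of the axis-aligned cube centered at q, in Å.
    """
    half = side / 2.0
    total = 0.0
    for r in end_points:
        inside = all(abs(r[d] - q[d]) <= half for d in range(3))
        if inside:
            total += math.exp(-math.dist(q, r))
    return total


# Made-up end points: one at q, one 1 Å away, one far outside the cube.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
score = colony_score((0.0, 0.0, 0.0), points, side=4.0)
print(score)
```

Many close neighbors give a high score, while an empty cube gives zero, which matches the two factors listed on the slide.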


Evaluation of F

The score of an assignment is the total score of its extra-membrane segments

Define a weight for each edge $e$:

$$\mathrm{weight}(e) = f(X_i, C_{\sigma_k(i)}, C_{\sigma_k(i+1)})$$

For each path $\{v_1, e_1, v_2, e_2, \ldots, e_{n-1}, v_n\}$ we define $F$ to be:

$$F(\sigma_k) = \sum_{e} \mathrm{weight}(e)$$


Validation

19 TM proteins with a known high-resolution structure were tested

Two distinct cases:
• Accurate data
• Noisy data regarding the locations and orientations of the helices


Dealing with uncertainty in cryo-EM data

Unknown orientation of the helix with respect to its axis

Unknown translation of the helix

Solution: a cylindrical envelope is constructed around each helix terminus
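One simple way to fold such an envelope into the distance test is to relax the bound by the largest displacement the envelope allows. The sketch below is an assumption-laden illustration and not the construction used in the talk: it supposes each true terminus lies within a cylinder of given radius and half-height centered on the nominal terminus, and uses the triangle inequality to obtain a test that never rejects a pair that could really be connected. All names and numbers are hypothetical.

```python
import math


def cylinder_slack(radius, half_height):
    """Largest distance from the cylinder's center to any point inside it."""
    return math.hypot(radius, half_height)


def may_connect(end_a, end_b, reach, radius, half_height):
    """Conservative reachability test under positional uncertainty.

    If some pair of points inside the two cylinders is within `reach`,
    then the nominal ends are within reach + 2 * slack of each other,
    so a pair failing this test can be pruned safely.
    """
    slack = cylinder_slack(radius, half_height)
    return math.dist(end_a, end_b) <= reach + 2.0 * slack


a, b = (0.0, 0.0, 0.0), (25.0, 0.0, 0.0)
print(may_connect(a, b, reach=19.0, radius=0.0, half_height=0.0))
print(may_connect(a, b, reach=19.0, radius=3.0, half_height=4.0))
```

The relaxed test keeps more candidate edges than the exact one, so noisier data leaves more feasible assignments to rank.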


Performance of the Algorithm

Name                    #h  Loop lengths (#AA)       Possible    Feasible  Rank
Bacteriorhodopsin       7   3,14,2,3,10,4            7! = 5040   948       13
Sensory rhodopsin       7   7,12,2,3,3,4             7! = 5040   512       48
Lactose permease        12  3,2,1,3,1,24,3,1,3,1,1   12! > 10^8  12        1
Cytochrome c oxidase E  5   5,6,1,1                  5! = 120    2         1
Cytochrome c oxidase H  3   7,2                      3! = 6      6         1
Acetylcholine receptor  4   4,4,103                  4! = 24     22        1
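The effect of pruning is easier to see as a fraction of the n! possible assignments. The short script below recomputes the "Possible" column from the number of helices and reports the share of assignments that remain feasible; the helix counts, feasible counts and ranks are copied from the table, while the function name `surviving_fraction` is introduced only for this example.

```python
from math import factorial

# (name, number of helices, feasible assignments, rank of the native assignment)
RESULTS = [
    ("Bacteriorhodopsin", 7, 948, 13),
    ("Sensory rhodopsin", 7, 512, 48),
    ("Lactose permease", 12, 12, 1),
    ("Cytochrome c oxidase E", 5, 2, 1),
    ("Cytochrome c oxidase H", 3, 6, 1),
    ("Acetylcholine receptor", 4, 22, 1),
]


def surviving_fraction(n_helices, feasible):
    """Share of the n! possible assignments that survive pruning."""
    return feasible / factorial(n_helices)


for name, n, feasible, rank in RESULTS:
    print(f"{name}: {feasible} of {factorial(n)} remain "
          f"({surviving_fraction(n, feasible):.2%}), native ranked {rank}")
```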


Summary

Provides more than a single assignment

The complexity of the problem scales with the number of amino acids in the extra-membrane segments, not with the number of TM helices


Questions