Base stacking classification via automated clustering method

Eli Hershkovits¹, Xavier Le Faucheur¹, Neocles Leontis², Allen Tannenbaum¹

¹Georgia Institute of Technology, ²BGSU

Post on 17-Dec-2015

Data Classification

• Coordinate system and parameterization

• Clustering of the data (“by eye” or automated)

Base stacking: Ring coordinate system

• The three orthogonal directions are calculated with the Cremer and Pople method.

• The coordinates y1 and y2 can be used to define the face of the ring (up or down).

[Figure: two rings with local axes X1, Y1, Z1 and X2, Y2, Z2, and the inter-ring vector r12]
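A minimal sketch of how a ring frame could be built, assuming the ring atoms are given in bonding order as (x, y, z) tuples. The normal follows the Cremer and Pople mean-plane construction. The choice of in-plane axes (first axis through the projection of the first atom) and the names `ring_frame`, `e1`, `e2` are assumptions made for illustration, not taken from the slides.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)

def ring_frame(atoms):
    """Return (center, e1, e2, normal) for a ring given as ordered (x, y, z) atoms."""
    n_atoms = len(atoms)
    center = tuple(sum(a[k] for a in atoms) / n_atoms for k in range(3))
    rel = [sub(a, center) for a in atoms]

    # Cremer-Pople auxiliary vectors: position sums weighted by sin and cos
    # of 2*pi*(j-1)/N, with j = 1..N (here the 0-based index plays j-1).
    r_sin = tuple(sum(p[k] * math.sin(2 * math.pi * j / n_atoms)
                      for j, p in enumerate(rel)) for k in range(3))
    r_cos = tuple(sum(p[k] * math.cos(2 * math.pi * j / n_atoms)
                      for j, p in enumerate(rel)) for k in range(3))
    normal = unit(cross(r_sin, r_cos))

    # Assumed in-plane axes: e1 points to the projection of the first atom
    # onto the mean plane, e2 completes a right-handed frame (e1 x e2 = normal).
    first = rel[0]
    in_plane = sub(first, tuple(dot(first, normal) * c for c in normal))
    e1 = unit(in_plane)
    e2 = cross(normal, e1)
    return center, e1, e2, normal
```

With this construction, reversing the atom order flips the normal, so a consistent atom ordering is what makes the up and down faces well defined.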

Base stacking: Relative coordinate system

• Relative ring coordinates are defined by the spherical coordinates r, θ and φ.

[Figure: the relative position vector r between the two rings]
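The relative coordinates can be sketched as follows, assuming the first ring supplies an origin and an orthonormal frame, and the second ring is represented by its center. The angle convention used here (polar angle measured from the ring normal, azimuth measured in the ring plane, both in degrees) and the function name `relative_spherical` are assumptions; the slides do not state which convention was used.

```python
import math

def relative_spherical(origin, e1, e2, normal, point):
    """Spherical coordinates (r, polar, azimuth) of `point` in a ring frame.

    origin       -- center of the reference ring
    e1, e2       -- orthonormal in-plane axes of the reference ring
    normal       -- unit normal of the reference ring
    point        -- center of the other ring
    Angles are returned in degrees.
    """
    d = [p - o for p, o in zip(point, origin)]
    x = sum(a * b for a, b in zip(d, e1))
    y = sum(a * b for a, b in zip(d, e2))
    z = sum(a * b for a, b in zip(d, normal))

    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("ring centers coincide; angles are undefined")

    # Clamp guards against rounding slightly outside [-1, 1].
    polar = math.degrees(math.acos(max(-1.0, min(1.0, z / r))))
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    return r, polar, azimuth
```

A ring sitting directly above the reference ring gives a polar angle near 0, while a ring lying in the same plane gives a polar angle near 90.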

Primary Classification

• For each base stacking candidate, the two closest rings are chosen to represent the pair. This choice gives a classification into four groups: Pyrimidine-pyrimidine, Pyrimidine-imidazole, Imidazole-pyrimidine and Imidazole-imidazole.

• There are four possible combinations of face-face interactions: up-up, up-down, down-up and down-down.
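The primary classification above can be sketched as follows. Each ring is assumed to be a dict with a type label ('Pyr' or 'Im'), a center and a unit normal. The face rule used here (a ring shows its "up" face when the other ring lies on the side its normal points to) is an assumption for illustration; the slides only state that a face is assigned to each ring. The function names are hypothetical.

```python
import math
from itertools import product

def closest_rings(rings_a, rings_b):
    """Pick the closest pair of rings, one from each base."""
    return min(product(rings_a, rings_b),
               key=lambda pair: math.dist(pair[0]["center"], pair[1]["center"]))

def facing(ring, other):
    """Return 'U' if `other` lies on the side `ring`'s normal points to, else 'D'."""
    offset = [o - c for o, c in zip(other["center"], ring["center"])]
    side = sum(d * n for d, n in zip(offset, ring["normal"]))
    return "U" if side >= 0 else "D"

def primary_class(rings_a, rings_b):
    """Label a stacking candidate, e.g. 'Im-Pyr UD'."""
    a, b = closest_rings(rings_a, rings_b)
    return f"{a['type']}-{b['type']} {facing(a, b)}{facing(b, a)}"
```

A purine contributes two rings (pyrimidine and imidazole) and a pyrimidine base contributes one, so the closest-pair step is what decides which of the four ring-type groups a candidate falls into.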

Parameters relevant for clustering

[Figure: two histograms, horizontal axis 1 to 379, vertical axes 0 to 160 and 0 to 120]

Parameters relevant for clustering

[Figure: histogram of r, horizontal axis 0 to 6, vertical axis 0 to 70]

Parameters relevant for clustering

[Figure: histogram, horizontal axis 0 to 400, vertical axis 0 to 70]

Secondary classification

• The polar coordinates r, θ and φ are correlated and separate into two clusters: “proper stacking” and “improper stacking”.

• Together, these classifications give 4 × 4 × 2 = 32 classes.
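The slides do not name the clustering algorithm, so the following is only a stand-in: a plain two-cluster k-means over feature tuples such as (r, angle), to illustrate how a proper/improper split could be automated. The function name `two_means`, the deterministic starting centers and the sample values in the usage note are all assumptions, not the authors' method or data.

```python
import math

def two_means(points, iterations=50):
    """Split points (tuples of floats) into two clusters with plain k-means (k = 2).

    Returns (labels, centers), where labels[i] is 0 or 1.
    """
    # Deterministic start: the lexicographically smallest and largest points.
    centers = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, new_labels) if lab == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centers
```

For example, with made-up (r, angle) values such as `[(3.4, 10.0), (3.5, 12.0), (3.6, 8.0), (5.0, 60.0), (5.2, 65.0), (4.9, 55.0)]`, the first three points and the last three points end up in different clusters. In practice the features would need rescaling to comparable units, and periodic angles would need special handling before a Euclidean distance is meaningful.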

Pyr-Pyr

Relative orientation   proper       improper
UU                     143C:G142    155C:C154
DD                     511A:A509    743G:C699
UD                     144A:G135    172U:G164
DU                     147G:U146    897A:G765

Im-Pyr

Relative orientation   proper       improper
UU                     132A:A131    231G:C230
DD                     2813A:A2811  2792A:U2791
UD                     226A:A215    273G:C271
DU                     174A:C173

Pyr-Im

Relative orientation   proper       improper
UU                     129A:A128    1360C:A1358
DD                     129A:A116    2058G:G636
UD                     176U:A174    922A:G921
DU                     893G:G892    866U:A776

Im-Im

Relative orientation   proper       improper
UU                     159G:G158    223G:G222
DD                     2564G:A2513  1190G:A1189
UD                     1626A:A1624
DU                     1664A:G1663

Examples: Pyr-Pyr, up-up

Examples: Pyr-Pyr, up-down

Examples: Im-Pyr, up-up

Examples: Im-Pyr, up-down

Examples: Im-Im, up-up

Examples: Im-Im, up-down

Possible problems

• For stacking of residues that are not neighbors the distribution of is broad.

• Possible overlap between clusters.

Stacking of RNA on protein

• Stacking interactions between nucleic acids and amino acids are not abundant (9 for the large subunit, RR0033).

• Most of the stacking interactions are with histidine (6). Of the stacking cases, 5 are with the pyrimidine ring.