conformational proteomics of macromolecular architecture



Conformational Proteomics of

Macromolecular Architecture

Approaching the Structure of Large Molecular Assemblies and

Their Mechanisms of Action


R. Holland Cheng & Lena Hammar, editors

Karolinska Institutet, Sweden


World Scientific
New Jersey • London • Singapore • Beijing • Shanghai • Hong Kong • Taipei • Chennai


Published by

World Scientific Publishing Co. Pte. Ltd.

5 Toh Tuck Link, Singapore 596224
USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

To give full credit to the unique structural information given in illustrations, an all color electronic version is enclosed, see attached CD.

CONFORMATIONAL PROTEOMICS OF MACROMOLECULAR ARCHITECTURE Approaching the Structure of Large Molecular Assemblies and Their Mechanisms of Action (With CD-Rom)

Copyright © 2004 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-238-614-9 ISBN 981-238-615-7 (pbk)

Printed in Singapore by World Scientific Printers (S) Pte Ltd


PREFACE

In this book we have gathered a few examples of macromolecular architectures that elaborate their machinery in biological systems. The invited authors were selected from among the leading researchers at the frontline of their fields. The aim is to present to the young generation of researchers the fascination of structural science. It is also an attempt to provide background knowledge and theories for discussions at the forum of structural dynamics.

Above all, this book is dedicated to the young scientists and new researchers on their way into dedicated studies. With this selection of articles from the wild and progressive biological field, it is our hope that the book can find its way to students in advanced structural sciences and conformational proteomics. Hopefully, it will serve as a basis for seminars and discussions, and provide a platform for future explorers.

The selection covers a portion of today’s structural biology, focusing on dynamic aspects of the large macromolecular assemblies. Among these are the viruses, the ribosome, the giant enzymes, and the molecular machineries for moving around. We stress the importance of the tools, the ways and means by which to acquire new information and knowledge. Therefore the selection of articles is also meant to illustrate contemporary techniques at their present best, in a fast developing stage as they are. It is fascinating to find that biological science and technology are so well integrated in the minds of those who dwell in the frontline, actively advancing the field.

Dynamic Architectures

In the middle of the last century there was a gap between high and low resolution: the small molecules were seen by X-ray crystallographers and the large structures by electron microscopists, both handling breakthrough techniques in structural biology. In the gap between them a new field developed, where details down to the atomic level could gradually be mapped and understood in their surroundings, and where the building blocks of the large molecular architectures of life could be revealed and the construction be appreciated. We now have a long record of macromolecular architectures revealed in detailed structure. Thus, high- and low-resolution methods are converging, and additional ones appear. Along with that comes the question of dynamics: the functionality, and the structure in action, locomotion. Actually, it is with the structure that action is performed. Therefore we now search not only the details of the structure or the beauty of the architectonic construction per se, but the nature of its embedded dynamics!

Today, we have acquired methods allowing snapshots of life events, glimpses of the machinery of giant molecules, membrane intercommunication and other macromolecular architectures. Around the corner one can envision what is to be understood as life systems, that is, the assembly in its biological environment, in the cell, in the tissue and the body. The comprehensible information from high- and low-resolution structural methods has come closer together than ever before. The development of real-time techniques and nanotechnology makes science more exciting than ever. In this volume a selection of the visionaries, with their strong knowledge and elegant tools, present to you a few of the present day’s frontlines in dynamic structural science.

The Theory of Quasi-equivalence

For more than 40 years the theorem of Don Caspar and Aaron Klug has guided our structural thinking. Plain and simple in its approach, the theory of quasi-equivalence brings us into futuristic structural virology with the simple insight that the possibilities lie in the deviations as well as in the perfect lattice (Caspar D, and Klug A. Cold Spring Harbor Symposia on Quantitative Biology, XXVII: 1-24, 1962). It appeared in the search for a common construction principle of virus capsids. It is timely that we can present, at the beginning of this volume, a historian’s summary of how the insight emerged, together with Don Caspar’s present view on its usefulness. The latter is further supported by the second generation’s studies in this realm, as seen throughout this volume. A virus structure database, based on the nomenclature standard originating from the theorem, is available and is presented in the last chapter.


Approaching Large Assemblies in Membranes

Membranes continue to fascinate us by their dynamic character and capacity to create compartments for different cellular activities. They provide a support for the assembly of large molecular constructs and contain the complicated water channel. Solving the structure of these requires method innovators at their best, as presented in Chapters 5-7.

Protein Shuttle

Evolution has created specialized transport and communication systems between membrane compartments. Clathrin with its companions represents a cellular assembly designed to recycle multiple membrane receptors: an adaptable cage of polyhedral shape that has been in the textbooks for years, but has only recently been understood in some detail (Chapter 8).

Molecular Machines

Giant enzymes, the ribosome, and motion engines can all be regarded as complex macromolecular machines (Chapters 9-16). It is striking that several giant enzymes, built from multiple copies of a few components, share architectonic designs and dynamic constructions with viruses. In contrast, the ribosome is the master of concerted activity. Like a very efficient printing office for peptide polymers, it is constructed from a multitude of different pieces acting together. Here, motion of individual parts reveals some of its dynamic logic. Two motion engines are presented: the propeller motion device of the bacterial flagella (Chapter 15) and the muscular engine for our own motion (Chapter 16). Both structures, the result of masterly teamwork and dedicated insight into dynamics, exemplify the role of conformational quasi-equivalence in highly ordered assemblies.

Conformational Proteomics

The proteomics part of our approach is considerable, and it demands new and refined tools for data acquisition. This is amply exemplified in the different chapters of this volume. Some tools and comments are provided in the last section (Chapters 17-20).

Acknowledgements

As editors we would like to express our gratitude to the chapter authors who have devoted enormous efforts to fulfill the goals of this book. Their dedication and positivism have been truly stimulating, and made the editing of this work an exciting experience. We would also like to thank all the participants of the Structural Forum at Karolinska Institute for airing contemporary problems in structural biology, and thereby providing a base for this selection of articles, and Amersham-Biotech Inc. and CristalResearch AB for their contributions. Special thanks are due to professors emeriti Bror Strandberg, Uppsala, and Bjorn Afzelius, Stockholm, pioneers in Swedish crystallography and electron microscopy, respectively, for sharing with us their never-ending scientific enthusiasm and support in this project. The publisher deserves great appreciation for creative support throughout the collaboration on this volume.

Holland Cheng and Lena Hammar

R Holland Cheng, PhD
Professor of Molecular and Cellular Biology
Adv Microscopy & Proteomics
University of California
Briggs Hall, Davis, CA 95616, USA
[email protected]

Karolinska Institute
Structural Virology
Novum, Halsovagen 7
Huddinge, SE 14157, Sweden
holland.cheng@biosci.ki.se

Lena Hammar, PhD
Associate Professor in Biochemistry
Karolinska Institute
Structural Virology
Novum, Halsovagen 7
Huddinge, SE 14157, Sweden
lena.hammar@biosci.ki.se


CONTENTS

PREFACE

PART I. GEOMETRY AND ACTION IN VIRUSES

Chapter 1. Early Theories of Virus Structure
Gregory J. Morgan

Chapter 2. Quasi-Equivalence and Adaptability in Living Molecular Assemblies
Donald L. Caspar and Lena Hammar

Chapter 3. The Role of Disordered Segments in Viral Coat Proteins
Lars Liljas

Chapter 4. Prefusion Dynamics in an Enveloped Virus - Alphavirus Model
Lena Hammar, Lars Haag, Bomu Wu and R. Holland Cheng

PART II. APPROACHING LARGE ASSEMBLIES IN MEMBRANES

Chapter 5. Strategy to Obtain High Resolution Structure of Membrane Proteins by X-ray Crystallography
Hiroshi Aoyama, Eiki Yamashita, Keisuke Sukarai and Tomitake Tsukihara

Chapter 6. On the Possibility of Determining Structures of Membrane Proteins in Two-Dimensional Crystals Using X-Ray Free Electron Lasers
Michael Becker and Edgar Weckert

Chapter 7. Functional Details on Membrane Proteins Observed Through an Electron Beam
Yoshinori Fujiyoshi

PART III. PROTEIN SHUTTLE

Chapter 8. Clathrin and Companions
Barbara Pearse

PART IV. GIANT ENZYMES

Chapter 9. Multifunctional Enzyme Complexes: Multistep Catalysis by Molecular Machines
Richard N. Perham

Chapter 10. Metamorphosis of an Enzyme
Rudolf Ladenstein, Winfried Meining, Xiaofeng Zhang, Markus Fischer and Adelbert Bacher

Chapter 11. Optimizing an Enzyme for Its Physiological Role: Structural and Functional Comparisons of ATP Sulfurylases from Three Different Organisms
Andrew J. Fisher, Ian J. Macrae, John D. Beynon, Eric B. Lansdon and Irwin H. Segel

PART V. RIBOSOMES

Chapter 12. Ribosomal Crystallography: Dynamics, Flexibility and Peptide Bond Formation
Ada Yonath

Chapter 13. The Dynamics of the Ribosome as Inferred by Cryo-EM: Induced and Self-organized Motions
Joachim Frank

Chapter 14. How do Translation Factors Catalyze Protein Synthesis
Martin Laurberg, Ole Kristensen, Maria Selmer, Xiao-Dong Su and Anders Liljas

PART VI. MOTION ENGINES

Chapter 15. Dynamic Aspects of the Bacterial Flagellum
Keiichi Namba

Chapter 16. Myosin Polymorphism and Muscle Contraction
Kenneth C. Holmes and Rasmus R. Schroeder

PART VII. AROUND THE BENCH PROTEOMICS

Chapter 17. Is Crystallization a "Bottleneck" of Modern Structural Crystallomic?
Jan Sedzik

Chapter 18. Sensor Surface Interactions in the Study of Macromolecular Assemblies
José M. Casasnovas, Sevak Markarian and Lena Hammar

Chapter 19. PPiDB - A Protein-Protein Interactions Database
Prasanna R. Kolatkar and Lin Kui

Chapter 20. Virus Particle Explorer (VIPER): A Repository of Virus Capsid Structures
Vijay S. Reddy, Padmaja Natarajan, Gabriel Lander, Chunxu Qu, Charles L. Brooks, III and John E. Johnson

Index


PART I GEOMETRY AND ACTION IN VIRUSES


Chapter 1

EARLY THEORIES OF VIRUS STRUCTURE

Gregory J. Morgan*

*Department of Philosophy, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA, [email protected]

This paper traces the beginnings of structural virology, from the early 1950’s to the presentation of the Caspar-Klug theory of virus structure in 1962. It focuses primarily on the virus research of Francis Crick, James Watson, Rosalind Franklin, Aaron Klug, and Donald Caspar. Collaborative efforts in X-ray crystallography and electron microscopy, in combination with intellectual triggers from the art world, provided the soil from which the early theories of virus structure grew and matured.

Keywords: Virus structure, Caspar-Klug theory, geodesic domes, tensegrity, self-assembly, quasi-equivalence, history of molecular biology, 5-fold symmetry

INTRODUCTION

In the 1950’s and early 1960’s the field of structural virology blossomed. Theoretical speculation was driven by technical developments in biochemistry, electron microscopy, and X-ray crystallography. The improved methodology for data collection and evaluation made it possible to explore the structure of large molecular assemblies, viruses among them. In the early to mid 1950’s, Francis Crick and James Watson speculated that “spherical viruses” possess cubic symmetry and quite possibly icosahedral symmetry. Using single crystal X-ray diffraction, Donald Caspar confirmed the icosahedral symmetry of Tomato Bushy Stunt Virus (TBSV or BSV). Watson showed that the rod-shaped Tobacco Mosaic Virus (TMV) is helical and Rosalind Franklin later determined its exact helical parameters. Aaron Klug found that Turnip Yellow Mosaic Virus (TYMV) and poliovirus also possess icosahedral symmetry. The Crick-Watson theory of virus structure applied best to 60 equivalently placed, identical subunit structures. However, emerging evidence suggested that many viral shells exceeded 60 subunits. Aaron Klug and Don Caspar endeavored to explain the lacuna. Analogies with Linus Pauling’s alpha helix and Francis Crick’s coiled coil motif, as well as with Buckminster Fuller’s geodesic domes and Kenneth Snelson’s tensegrity sculptures, allowed them to conceive an innovative principle of virus structure, which they called “quasi-equivalence.” They realized that large numbers of identical protein subunits could bond together in quasi-equivalent ways to build genome-carrying shells while maintaining the same inter-subunit contact pattern. They generalized Buckminster Fuller’s architectural design principles to cover spherical viruses and derived a single formula that describes all possible icosahedral quasi-equivalent structures. In June 1962, at the Cold Spring Harbor Symposium, “Basic Mechanisms in Animal Virus Biology”, Aaron Klug presented their theory in a famous paper entitled “Physical Principles in the Construction of Regular Viruses” (Caspar & Klug, 1962). The Caspar-Klug theory was widely accepted until the early 1980s, when the first apparent deviation was discovered in Caspar’s laboratory (Rayment et al., 1982). However, this structure, polyoma virus, was solved in a way that arguably conserved the basic principle of quasi-equivalence.
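The “single formula” is not written out in this chapter. As a minimal sketch, assuming the form in which it is usually quoted, T = h² + hk + k² with h and k non-negative integers and 60T subunits per shell, the allowed shells can be enumerated as follows. The function names are illustrative only and do not come from the original text.

```python
import math


def triangulation_number(h: int, k: int) -> int:
    """Triangulation number T = h^2 + h*k + k^2 (assumed standard form)."""
    return h * h + h * k + k * k


def allowed_t_numbers(limit: int) -> list[int]:
    """All distinct T values up to `limit`, in increasing order."""
    bound = math.isqrt(limit) + 1
    found = set()
    for h in range(bound):
        for k in range(bound):
            if h == 0 and k == 0:
                continue  # (0, 0) does not describe a lattice step
            t = triangulation_number(h, k)
            if t <= limit:
                found.add(t)
    return sorted(found)


def subunit_count(t: int) -> int:
    """Number of subunits in an icosahedral shell of triangulation number T."""
    return 60 * t


for t in allowed_t_numbers(13):
    print(f"T = {t:2d} -> {subunit_count(t)} subunits")
```

Under this assumption, T = 1 reproduces the 60 strictly equivalent subunits of the Crick-Watson picture, and every larger T gives a shell that exceeds 60 subunits in a multiple of 60.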

In Chapter 2, Don Caspar outlines the dynamic implications of quasi-equivalence. In the present chapter, I narrate the story of the scientific constellations that drove the development of Caspar and Klug’s insight. To my knowledge, this work constitutes the first detailed history of the beginnings of structural virology. Further considerations of symmetrical designs are presented elsewhere (Morgan, 2003; Morgan, 2004).

1950’s TOBACCO MOSAIC VIRUS STUDIES

In molecular biology, the 1950’s were the decade of the helix: the alpha helix, the double helix, and the helical nature of TMV all significantly influenced the awakening discipline. As we will see, many of the general ideas on spherical virus structure emerged from collaborative efforts on the helical TMV, established during this period (for more on TMV history, see Creager, 2002). At the beginning of the decade, the University of Michigan physicist Richard (Dick) Crane wrote an influential article explaining why one should expect to see helices in nature (Crane, 1950). Crane drew inspiration from ideas of efficient assembly processes, which were applicable to the assembly of viruses from smaller subassemblies. The Scientific Monthly article, which Crane referred to as “mainly speculation, not research,”ᵃ was written on a train while he traveled to Pasadena for a sabbatical. He argued that assembly processes involving sub-processes are more efficient than those that do not, because mistakes that arise in a subassembly can be more easily discarded and not incorporated into the final product (Crane, 1950). But Crane’s contribution to the early history of molecular biology goes beyond the importance of the efficiency of subassembly processes. His 1950 article contained an important related idea: any structure built from identical subunits, each making two determinate identical contacts, would be repetitive along a screw axis. In other words, structures made by adding bivalent subunits according to the rule that every “bond” between subunits is identical will lead to a helical structure. Crane illustrated his idea using a small spiral staircase made of matchboxes. Each matchbox is related to the next matchbox by the same angles and distances; the result is a spiral structure, or more correctly, a helix. Crick met Crane in Michigan in the late 1940’sᵇ and late 1950’sᶜ and Watson was familiar with his work. Crick and Watson’s 1956 papers and Caspar and Klug’s later work extend Crane’s project. Whereas Crane considered bivalent subunits that form linear chains, Caspar and Klug would later consider multivalent subunits that form nonlinear closed shells.
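Crane’s matchbox argument can be checked numerically. The sketch below is an illustration under simplifying assumptions, not Crane’s own construction: each box is related to the previous one by the same rotation about the vertical, the same horizontal shift, and the same rise. All numeric values and function names are hypothetical. If the rule really produces a helix, every box must sit at the same distance from a common axis.

```python
import cmath
import math


def stack_subunits(count, turn_deg, shift_xy, rise):
    """Place `count` subunits, each related to the previous one by the SAME move.

    The move is a rotation by `turn_deg` about the vertical, followed by a
    horizontal shift `shift_xy` (a complex number x + iy) and a vertical `rise`.
    All values are illustrative, not measurements.
    """
    rotation = cmath.exp(1j * math.radians(turn_deg))
    xy, z = 0j, 0.0
    placed = [(xy, z)]
    for _ in range(count - 1):
        xy = rotation * xy + shift_xy  # the identical "bond" at every step
        z += rise
        placed.append((xy, z))
    return placed


def screw_axis(turn_deg, shift_xy):
    """Fixed point of the horizontal part of the move, i.e. where the axis sits."""
    rotation = cmath.exp(1j * math.radians(turn_deg))
    return shift_xy / (1 - rotation)  # undefined if turn_deg is a multiple of 360


boxes = stack_subunits(count=12, turn_deg=30.0, shift_xy=1.0 + 0.5j, rise=0.4)
axis = screw_axis(30.0, 1.0 + 0.5j)
radii = [abs(xy - axis) for xy, _ in boxes]
print(f"radius spread around the axis: {max(radii) - min(radii):.2e}")
```

The spread in radius is zero up to rounding error, so the boxes lie on a cylinder and climb it in equal steps, which is the helix Crane described.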

Virus Crystallography in the 1950’s

“Virus crystallography,” the famous crystallographer J.D. Bernal at Birkbeck College stated, “that’s my property!”ᵈ He thereby reassured Rosalind Franklin that she could proceed with work on TMV at Birkbeck, contrary to the wishes of John Randall, her former boss at King’s College, who did not want her to work further on any helical biological material.ᵈ

ᵃ Letter, Crane to the author, 27 May 1999.
ᵇ Crane, interview.
ᶜ Watson, personal communication, 1 April 2002.
ᵈ Vittorio Luzzati, personal communication, 3 Sep 2001.

Rosalind Franklin

As is well known, before she came to work at Birkbeck College, Rosalind Franklin had worked on DNA at King's College using X-ray diffraction (Watson, 1968; Sayre 1975). The rod-shaped TMV was to be the first of several viruses Franklin hoped to examine using the same technique. A long-term goal of her research was to determine the structure of viral nucleic acid and its relation to protein, a relationship she thought was essential in understanding life. Her group, in addition to herself, came to consist of Aaron Klug, John Finch and Ken Holmes, the latter two being PhD students who joined her in 1955. (For Ken Holmes' later work, see Chapter 16.)

Although she arrived at Birkbeck in March 1953, there were delays due to difficulties in obtaining the necessary apparatusᵇ and samples.ᶜ Finally in November, Roy Markham of the Molteno Institute sent her samples of TMV.ᵈ By late 1955, Franklin had managed to obtain a radial density distribution for re-aggregated TMV protein using material supplied by Gerhard Schramm. By comparing this density distribution with Caspar’s protein and RNA density distribution obtained by that time, one was able to conclude that the TMV RNA was neither in the center of the rod-shaped virus nor 20 Å from the center as Caspar had thought, but lay 40 Å from the center. Franklin wanted to publish the comparative analysis but first required that Caspar publish his results. Caspar procrastinated so, using his dissertation, Franklin wrote a first draft of Caspar’s paper herself!ᵉ The first draft was finished in February.ᶠ She also presented these results at the 1956 Ciba Foundation colloquium (Franklin, Klug, and Holmes, 1957).

ᵃ Vittorio Luzzati, personal communication, 3 Sep 2001.
ᵇ Annual Report Jan 1953-Jan 1954, Rosalind Franklin, UMBC Box 3, Folder 6.
ᶜ Vittorio Luzzati, personal communication, 4 Sep 2001.
ᵈ Letter, Markham to Franklin, 23 Nov 1953, Churchill Archive.
ᵉ Caspar, interview.
ᶠ Letter, Franklin to Watson, 10 February 1956, Norman Archive.


Aaron Klug

Aaron Klug originally intended to study medicine, but discovered that he wanted a deeper understanding of nature so he shifted academic focus, first to chemistry, then to physics. He enrolled for an MSc at the University of Cape Town, where he met R.W. James, from whom he learned crystallography. In 1949, Klug left for Trinity College, Cambridge, hoping to work on protein X-ray crystallography at the Cavendish Laboratory. Unfortunately there were no openings in the MRC unit where Max Perutz and John Kendrew worked. Instead Klug completed his PhD on the kinetics of phase changes in solids. At the end of 1953, Klug moved to Birkbeck College, London, as a Nuffield fellow. Here Klug met Rosalind Franklin and began working with her on the structure of TMV.

In fact, it was happenstance that Klug began working with Franklin. Franklin and Klug shared adjoining rooms at the top of 21 Torrington Square, a house converted into laboratories. He had begun to work with Harry Carlisle on the structure of ribonuclease, but progress was slow. One day he met Franklin on the stairs of Torrington Square while she was carrying “beautiful” diffraction photographs of TMV and was drawn to her “fascinating” work.ᵃ After some theoretical work on TMV, Klug began working on spherical viruses so he would have his own project independent of Rosalind Franklin’s TMV work. He chose TYMV despite the fact that its crystal had a larger unit cell than BSV (700 Å versus 380 Å) and thus was probably a more difficult project. Klug was aware that Roy Markham had shown that during the purification of TYMV one could distinguish a “top component” as well as the infectious virus (Markham 1951).ᵃ Some correctly thought that the top component was hollow viral shells of TYMV (Schmidt, Kaesberg, and Beeman, 1954). Klug hoped that in the longer term he would be able to compare the X-ray diffraction diagrams of normal virus and top component and infer structural information about the viral RNA. Klug’s work on TYMV did not progress as fast as Caspar’s work on BSV. The larger unit cell of TYMV meant that longer exposure times were needed to get useful diffraction patterns. Klug and Finch, however, ran into a further difficulty: the unit cell contained 16 particles in two different orientations, not 8 particles as earlier proposed by Bernal and Carlisle in 1948. Klug and Finch proposed a “double-diamond” arrangement for the TYMV crystal, so called because it consists of two intermeshed diamond-like lattices. Relatedly, instead of getting 10 spikes of intensity as Caspar had with BSV, Klug and Finch had a more complex diffraction pattern, and thus at first it was difficult to perceive the 5-fold symmetry of TYMV.

ᵃ Klug, interview.

Jim Watson

After his success in determining the structure of DNA with Francis Crick in 1953, James Watson sought the structure of RNA. TMV contains RNA and consequently Watson hoped it would be the key to understanding the structure of RNA. Watson’s pursuit of RNA had begun during the spring of 1952, when Lawrence Bragg, head of the Cavendish Laboratory, declared a moratorium on Watson and Crick’s work on DNA. Bragg considered it ungentlemanly to compete directly against Maurice Wilkins and Rosalind Franklin, who were at that time working on the structure of DNA at King’s College, London. Consequently, Watson thought he would look at TMV. This change was more in emphasis than in intellectual orientation. As Watson himself said, “A vital component of TMV is nucleic acid, and so it is the perfect front to mask my continued interest in DNA.” (Watson, 1968, 67)

Crick taught Watson the rudiments of helical diffraction theory in his lessons “Helical Diffraction Theory for Birdwatchers.” (Watson was an avid bird watcher.) Using his newly gained knowledge, Watson could see evidence for helices in Fankuchen’s superb X-ray patterns taken in 1938 and published in 1941, but needed more experimental evidence. He obtained purified TMV from Roy Markham at the Molteno Institute and, with the help of Hugh Huxley, obtained mediocre X-ray diffraction patterns from imperfectly oriented dry para-crystalline specimens. Unlike Bernal and Fankuchen, Watson and Huxley tilted the TMV specimen to obtain patterns that would indicate the number of subunits in the helical repeat. Using the newly developed theory of the diffraction of helical structures, Watson, with help from Crick, inferred that TMV was a helix that had an integral number of subunits in 3 turns of the helix, possibly 31 (Cochran et al., 1952; Watson, 1954). The central idea of Cochran, Crick, and Vand (1952) was that there would be certain reflections in the diffraction pattern that are not permitted if the diffracting particle is helical. The useful result for Watson was that for a helix with n residues per repeat, a J₀ Bessel function contributes to the nth layer line of the diffraction pattern. Watson’s difficulty was that he could not distinguish between a J₀ and a higher order Bessel function on the basis of his data. Franklin would later correct his estimate of 31 subunits per 3 turns of the helix to 49 subunits per 3 turns of the TMV helix (Franklin and Holmes, 1958).
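The chapter does not write out the selection rule behind this result. A minimal sketch follows, assuming the form usually attributed to Cochran, Crick, and Vand: for a helix with u subunits in t turns, a Bessel function of order q can contribute to layer line l only if l = t·q + u·m for some integer m. The symbols u, t, q, m and the function names are chosen here for illustration.

```python
def allowed_bessel_orders(layer_line, units, turns, max_order=60):
    """Bessel orders q with layer_line = turns * q + units * m for some integer m."""
    return [
        q
        for q in range(-max_order, max_order + 1)
        if (layer_line - turns * q) % units == 0
    ]


def lowest_order(layer_line, units, turns):
    """Smallest |q| allowed on a layer line (assumes units and turns are coprime)."""
    return min(abs(q) for q in allowed_bessel_orders(layer_line, units, turns))


for units in (31, 49):
    j0_lines = [l for l in range(1, 60) if lowest_order(l, units, turns=3) == 0]
    print(f"{units} subunits in 3 turns: first J0 layer line above the equator = {j0_lines[0]}")
```

Under this rule, the J₀ term first reappears on layer line 31 for Watson’s estimate and on layer line 49 for Franklin’s corrected value, while layer line 31 of a 49-subunit helix carries nothing lower than order 6. Telling the two models apart therefore depends on distinguishing J₀ from a higher-order Bessel function, which is the distinction Watson’s data could not support.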

Alongside speculations about the rod-shaped TMV, Crick and Watson developed analogous hypotheses about the structure of spherical viruses. While the problem of determining the first structure of TMV involved the question of how to arrange subunits according to helical symmetry, an analogous problem for spherical viruses was how to arrange the subunits in a spherical shell. Given the appropriate projection, the arrangement of TMV subunits can be considered as a problem of determining the correct line group. Likewise, determining the arrangement of subunits in a spherical virus requires determining which point group best describes its structure.

Don Caspar

In 1953, Donald Caspar began his PhD in biophysics at Yale University, after majoring in physics at Cornell. Many of his fellow biophysics graduate students were using radiation to inactivate biological systems, such as viruses, with the goal of determining the sizes of targets sensitive to radioactivity. Caspar thought X-ray crystallography might be a more constructive tool for probing biological structures. At Yale, Caspar studied fibers of the rod-shaped TMV. Fibers of TMV are not true crystals since, although the central axis of each TMV particle aligns parallel to the fiber length, each virus particle is randomly rotated around its central axis.


Often overshadowed by the discovery of the DNA structure, the year of 1953 was also an important year for protein crystallography. In this year, Max Perutz, working on the structure of hemoglobin, made an important technical breakthrough with the successful application of heavy metal isomorphous replacement to a biological specimen, thereby overcoming what is known as “the phase problem” (Green, Ingram, and Perutz, 1954). Inspired by Perutz’s innovation, Caspar created a lead derivative of TMV and estimated the phases of his pattern. At low resolution the TMV fiber is centro-symmetric and in this special case, determining the phases requires merely determining the sign of each phase. Caspar collected his data over the period 1953-4. In December 1953, over a year before he finished his PhD, Caspar pitched a post-doctoral research proposal to George Beadle of Caltech. He proposed examining solutions of Southern Bean Mosaic Virus (SBM or SBMV), a spherical virus, using low angle scattering apparatus. Watson, now a Senior Research Fellow at Caltech, supported Caspar’s application, indicating he was “willing and anxious to help Caspar to make his project a success.”ᵃ By September of 1954, most of Caspar’s PhD research was finished and in December, Caspar joined Watson at Caltech.

Figure 1. Radial density plot based on TMV fiber diffraction data (horizontal axis: radius R, in angstroms).

ᵃ Memo from Max Delbrück, 23 December 1953, Caltech Archives, Bio Division 21.33.

Watson and Caspar at Caltech

Since the fall of 1953, Watson had attempted to make well oriented fibers of RNA using techniques successfully applied to DNA. Earlier he had tried and failed to obtain informative diffraction patterns of plant viral RNA (Watson, 2002: 47). Results in Pasadena were equally disappointing. Watson and Alex Rich tried, with little success, to take informative X-ray photographs of purified RNA (Rich, 1995). The diffraction data did allow for the possibility of an RNA helix, but were inconclusive (Rich and Watson, 1954a; Rich and Watson, 1954b). All that Watson and Rich could conclude was that RNA was “DNA-like,” perhaps having bases stacked on one another, but there was nothing analogous to Chargaff’s rules (Rich and Watson, 1954~). Watson and Rich could not determine if helical sections of RNA existed or even if it was a single or double chain molecule, and gave up working on naked RNA (Watson, 2000: 25).

The structure of TMV RNA might prove to be more tractable than naked RNA, Watson hoped. It was in this context that Caspar arrived at Caltech. While there, Caspar analyzed the data he brought on TMV: he calculated a cylindrically averaged radial mass distribution function for TMV (see Fig. 1). Most remarkably there was no significant density in the center: the virus was hollow!

Watson and Caspar speculated about the structure of TMV. At first they took the innermost peak at 20 Å to be the RNA.a They then speculated about how the RNA was wound within the protein shell: “For the RNA case we favor a 10-12 stranded model in which the RNA chains follow the same helical grid as the protein.”b At the time, the best estimates of the molecular weight of the TMV RNA suggested that there was more than one piece of RNA in each bipolar particle. On this early model, as in later models, the length of RNA determines the length of the TMV particle. Watson wrote to Franklin describing the model: “The main thing in favor of the P-0-P [pyrophosphate] model is that it is very, very pretty steriochemically. But does nature always like to be pretty?” Franklin was skeptical of the model: she replied that she thought the RNA might as well be a disordered core as far as the X-ray data were concerned!c In retrospect, the estimated molecular weight of TMV RNA was too low: the TMV particles are polar with only one strand of RNA running through the entire length of the virus particle.

a Letter Watson to Franklin, 28 February 1955, Churchill College Archive. Unpublished 1955 manuscript in the possession of Don Caspar. b Letter Watson to Franklin, 9 April 1955, Churchill College Archive. c Letter Franklin to Watson, 10 June 1955, Churchill College Archive.


SPHERICAL VIRUSES

The Five-Fold Symmetry

The well-studied Tomato Bushy Stunt Virus (BSV) and Turnip Yellow Mosaic Virus (TYMV) were natural choices for the crystallographers Caspar and Klug. Bernal and Fankuchen had studied BSV by X-ray diffraction in 1938. Because only small crystals were available, they took powder diffraction measurements to calculate the dimensions of the unit cell (Bernal, Fankuchen, and Riley, 1938). TYMV was also investigated by Bernal at Birkbeck, this time in collaboration with Carlisle. In 1948 they published powder photographs of TYMV purified by Kenneth Smith and Roy Markham (Bernal and Carlisle, 1948). Bernal and Carlisle judged TYMV to be in a diamond-type lattice with 8 particles per unit cell, a lattice in which the four nearest neighbors of any given particle are on the vertices of a tetrahedron.

In the late summer of 1955, both Caspar and Watson traveled to England. Caspar met Rosalind Franklin for the first time in September, although previously they had been corresponding about the structure of TMV. In Cambridge, Caspar began working with spherical viruses. He had earlier obtained some samples of BSV from Art Knight at Berkeley, but was now in search of more virus material. He traveled with Peter Pauling in his sporty Porsche to Rothamsted Experimental Station to meet Fred Bawden and Bill Pirie. They gave him some BSV preparations in solution and crystalline form. They also told him that years ago they had given Harry Carlisle crystals of BSV and of TYMV and that these samples might still be stored in the refrigerators at Birkbeck College. Caspar started growing BSV crystals and analyzing the larger ones on Tony Broad’s powerful rotating anode X-ray tube. He then traveled down to London to look for the samples at Birkbeck College. To his surprise, Franklin would not turn over the TYMV samples. However, Watson exaggerates when he claims that Caspar and Franklin got in a “verbal fight” over them (Watson, 2002: 188).a She was saving them for her colleague Aaron Klug!

a Caspar, Interview.


Nevertheless, results with BSV came quickly for Caspar, but they were unexpected. First, using a partially disordered crystal, he obtained 10 “smudges,” which to a crystallographer surprisingly suggested 5-fold rotational symmetry. He repeated the results and showed that there were “spikes” in the diffraction pattern that conclusively indicated 5-fold symmetry (Caspar, 1956a). Max Perutz, who helped Caspar refine his manuscript, coined the name “spike.”a A 5-fold symmetry indicates that the virus has icosahedral, or 532, symmetry. Of the platonic solids, only the icosahedron (20 triangles) and the dodecahedron (12 pentagons) have this symmetry. Caspar showed the results to Francis Crick, who happened to be working in the same building.

Caspar’s results supported Crick and Watson’s hypothesis that viruses have cubic symmetry, since icosahedral symmetry implies cubic symmetry. Crick and Watson had a draft paper written the year before,b which Crick now shortened for Nature and adapted in light of the new experimental results (Crick and Watson, 1956; Watson, 2002: 123).c This article was rewritten sometime after October 10 when Crick and Rich submitted a manuscript on collagen (Rich and Crick, 1955; Rich, 1998).d They argued that a virus possessing cubic symmetry must necessarily be built from a regular aggregation of smaller asymmetrical building bricks and this can only be done in three types of ways. Each way corresponds to one of the three cubic point-groups. Caspar’s experimental results for BSV were published in Nature (Caspar, 1956a) immediately following Crick and Watson’s article. Earlier in 1950, the crystallographer Dorothy Crowfoot Hodgkin had suggested that if BSV crystals were cubic then the BSV particles might consist of 12n identical “submolecules.” Hodgkin’s early comments had much less impact than Crick and Watson’s report. Part of the difference may be due to emphasis. Hodgkin made her remarks in passing in a review (Hodgkin, 1950), while Crick and Watson devoted a full article to the theme. Additionally, Crick and Watson saw that a virus with icosahedral symmetry, and thus 60n subunits, was also a possibility.
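The subunit numbers in this argument (12n and 60n) correspond to the sizes of the cubic rotation groups. As a rough illustration that is not taken from the original papers, the sketch below uses the standard fact that the rotation group of a regular solid has twice as many elements as the solid has edges. The names `EDGES`, `max_equivalent_subunits` and `group_orders` are hypothetical, chosen only for this example.

```python
# Illustrative sketch: sizes of the three cubic point groups (rotations only).
# For a regular polyhedron the rotation group has 2 * (number of edges)
# elements, because any directed edge can be rotated onto any other.

EDGES = {
    "23 (tetrahedral)": 6,    # tetrahedron
    "432 (octahedral)": 12,   # cube or octahedron
    "532 (icosahedral)": 30,  # icosahedron or dodecahedron
}


def max_equivalent_subunits(edges: int) -> int:
    """Number of asymmetric subunits that can occupy equivalent positions."""
    return 2 * edges


def group_orders() -> dict:
    """Map each cubic point group to its number of equivalent positions."""
    return {name: max_equivalent_subunits(e) for name, e in EDGES.items()}


if __name__ == "__main__":
    for name, order in group_orders().items():
        print(f"{name}: {order} equivalent positions")
```

On this counting, 12 is the smallest number of equivalent positions any cubic point group allows, and 60 is the largest, which is the case Caspar's 5-fold spikes pointed to.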

a Caspar, Interview; Perutz, Interview. b Watson, Interview. See also Watson (2002) p. 123. c Crick, Interview. d Watson, Interview.


Why does a virus have subunits?

Founded in 1949, the Ciba Foundation held small informal meetings of elite researchers from around the world in its elegant establishment at 41 Portland Place, London. The general goal of these symposia was to stimulate fruitful and friendly conversations in a moderately informal atmosphere (Lee and Spufford, 1993: 144). In March 1956, a symposium was held to “revivify” virology in England.a The major centers for basic viral research were represented: Cambridge, Berkeley, Birkbeck, and Tübingen. It proved to be a meeting of the old and the new. The new were represented by Crick, Watson, Franklin, Caspar, Klug, and others who were convinced that the new physical techniques would spawn a new molecular biology founded on information-containing nucleic acids. The old were mainly traditional virologists who were somewhat skeptical of the new approach.

Here Crick, as the first speaker, presented his and Watson’s ideas to the group of virologists (Crick and Watson, 1957). In addition to the theme of their recent Nature article, Crick also considered a new question: “Why does a virus have subunits?” His answer was both elegant and remarkably simple. First, he assumed, a virus consists of RNA that is surrounded by a protein “coat,” and that the amino acid sequence of the coat protein is determined by the molecular structure of the RNA. Crick then argued that given a “coding ratio” of 3 nucleotides to 1 amino acid, the size of the viral genome is not big enough to code for a large number of non-identical subunits. Therefore, the viral shell must consist of at most a few protein subunits, repeated a number of times. This argument complements the earlier consideration of the number of subunits in the particle and how they are arranged. It is also one of the first arguments to utilize the idea of quantifiable genetic information.
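Crick's coding-ratio argument is essentially a piece of arithmetic, and a minimal sketch may make it concrete. The genome length and subunit size used below are hypothetical round numbers chosen for illustration; they are not figures given by Crick or in this chapter, and the function names are likewise invented for the example.

```python
# Illustrative sketch of the coding-ratio argument.
# All numerical inputs are hypothetical round numbers, not historical data.


def max_coded_residues(genome_nucleotides: int, coding_ratio: int = 3) -> int:
    """Upper bound on amino acids a genome can specify at a given coding ratio."""
    return genome_nucleotides // coding_ratio


def max_distinct_subunits(genome_nucleotides: int,
                          residues_per_subunit: int,
                          coding_ratio: int = 3) -> int:
    """Upper bound on the number of different subunits the genome could encode,
    assuming every nucleotide is used for coat protein (a generous assumption)."""
    return max_coded_residues(genome_nucleotides, coding_ratio) // residues_per_subunit


if __name__ == "__main__":
    genome = 6000       # hypothetical genome length in nucleotides
    subunit = 150       # hypothetical subunit length in amino acids
    shell_size = 60     # subunits needed for a full icosahedral shell
    distinct = max_distinct_subunits(genome, subunit)
    print(f"At most {distinct} distinct subunits can be encoded")
    print(f"A shell of {shell_size} subunits must therefore reuse subunits:",
          distinct < shell_size)
```

Even with the whole genome devoted to coat protein, the bound falls well short of the number of subunits in the shell, so the same few subunits must be repeated.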

At the time, the most surprising result of the meeting was the news that nucleic acid alone could be responsible for an infection. Robley Williams mentioned the work of his Berkeley colleague Heinz Fraenkel-Conrat who had shown that pure TMV RNA would infect a tobacco plant

a Letter, Wolstenholme to Franklin, 21 June 1955, Churchill Archive 2/34.


if rubbed on the leaves (Fraenkel-Conrat, 1956; Williams 1957: 31). To the surprise of Watson and Crick, some of the participants did not appreciate the scientific impact of such findings (Wolstenholme and Millar, 1957: 35). Watson later explained the root of the difference in opinion: “They were not at home with the concept that information flows unidirectional from nucleic acids to protein and never backwards.” (Watson, 2000: 28) Even Williams’ skepticism about the assumptions made by Crick is apparent in the question session after the talk. He asks Crick, “But are you not going to get into geometrical difficulties if you say that the RNA codes all of one subunit? How does the RNA expose itself to the whole of the subunit?” To which Crick replies that the RNA takes an extended form to code for the peptide, which then folds up, foreshadowing what would become one of the central problems in molecular biology, the protein folding problem: what principles govern how the extended protein folds correctly?

Figure 2. Madrid International Union of Crystallography meeting 1956. From left to right, Anne Cullis, Francis Crick, Don Caspar, Aaron Klug, Rosalind Franklin, Odile Crick, and John Kendrew. Image by courtesy of Don Caspar.

The concept of a virus as a “surface crystal”

A week after the Ciba Foundation meeting, Caspar, Klug, and Franklin traveled to Madrid to attend the International Union of Crystallography Easter meeting (Fig. 2). Caspar presented a paper entitled “The Molecular Viruses Considered as Point-Group Crystals,” that combined Crick and Watson’s theoretical speculations and his own experimental


results. The paper illustrates how Crick, Watson, and Caspar now thought of viruses as a type of “surface crystal.” They conceived of the virus as consisting of identical subunits bonded together in identical ways. Each subunit is equivalent to every other subunit, as is true of crystals, but unlike space-group crystals, point-group crystals necessarily have a finite number of subunits. The maximum number of equivalently related subunits in a cubic point-group crystal is 60: think of the 20 triangular faces of an icosahedron each subdivided into three.

The idea that viruses behave like crystals goes back at least to Stanley’s well-publicized “crystallization” of TMV in 1935.a What was new in 1956 was to regard the components of a virus as crystallizing into a well-formed virus. Crick mentions this possibility at the Ciba Foundation conference: “The process of aggregation (of subunits into a virus) is one which you might reasonably call crystallization.” (Crick and Watson, 1957: 17) Furthermore, the Madrid paper developed a “formal crystallographic classification of viruses.” Caspar experienced some resistance from crystallographers to the idea that viruses possess 5-fold symmetries, since every student of crystallography learns that true 5-fold lattice symmetry is impossible! However, Caspar was not proposing that the lattice itself possesses 5-fold symmetry, but that the viruses, which sit on the cubic lattice points, possess 5-fold symmetry. It is quite possible to crystallize a molecule with higher symmetry in a lower symmetry lattice. Nonetheless, 5-fold symmetry in molecules is rare and many in the audience remained unconvinced by Caspar’s presentation.

Klug presented after Caspar. He had been trying to discover the selection rule to express the Fourier transform of the icosahedral point groups in terms of spherical harmonics and Bessel functions. Klug’s novel applied mathematics could be seen as an attempt to create the same theoretical approach to spherical viruses that Cochran, Crick and Vand (1952) had created for helical viruses. However, unlike the theory of helical diffraction, Klug’s theory proved difficult to use, especially in pre-computational crystallography.

a Technically, Stanley did not create true crystals of TMV since they were not 3-D crystalline.


The Structure of Polio Virus

The first animal virus to be crystallized was poliovirus. In 1955, Fred Schaffer and Carlton Schwerdt, from the Virus Laboratory at Berkeley, purified the virus from 15-liter batches of monkey kidney tissue culture (Schaffer and Schwerdt, 1955). Eventually crystals large enough for single crystal X-ray diffraction were grown. In 1957 some of these were given to Rosalind Franklin, who wanted to study poliovirus in parallel with her group’s studies on TYMV. Some in the Birkbeck College laboratories were unhappy that Franklin was working with an infectious human virus and convened a meeting that resulted in Franklin being banned from using polio at Birkbeck. Consequently the crystals were taken to the London School of Hygiene and Tropical Medicine to be mounted and later transported in a Thermos to the Royal Institution’s rotating anode X-ray tube.

Franklin’s attempts to mount the polio crystals in capillary tubes in early to mid 1957 were unsuccessful.a The crystals dissolved spontaneously in the capillary before she could get any diffraction pictures. She attributed this instability to the glass of the capillary. In the month before her death, she wrote to Bawden suggesting that Pyrex tubes might be better.b

Klug found quartz capillaries to be more suitable for maintaining the polio crystals, and quartz is now widely used by protein crystallographers. A further technical innovation of Klug and Finch’s was to use dry ice to keep the crystals at 5°C while exposed to the X-rays in order to minimize damage and prolong the exposure time. From the “beautiful”c data obtained by December 1958,d Finch and Klug were able to infer that polio, like the plant viruses, possesses icosahedral symmetry and they had a draft of a paper by February 1959.e This result was especially significant since it showed that there was a major structural similarity between animal and plant viruses (Finch and Klug, 1959: 1714). Finch and Klug’s work was published in Nature in June of 1959.

a Letter, Franklin to Caspar, 16 March 1958, Norman Archive. b Letter, Franklin to Bawden, 20 March 1958, folder 2/33, Churchill Archive. c Finch, Interview. d Letter, Crick to Klug, 22 December 1958, Norman Archive. e Letter, Klug to Crick, 13 February 1959, Norman Archive.


Figure 3. Two pictures that were printed in The Observer June 21, 1959 that probably caught the attention of the artist and author John McHale.

Immediately the popular press took notice. Polio research in the 1950s was newsworthy. The Manchester Guardian printed a half-page story, entitled “The Architecture of Viruses,” on June 30, 1959. “Scientific Correspondent” John Maddox wrote the article. However, it was a shorter article, “New Light on How Polio Starts” in The Observer on June 21, that probably was more important for the history of structural virology. This article concentrated exclusively on the structure of polio and included pictures of a 60 ping-pong ball model and an icosahedron with three subunits distributed on each face (Fig. 3). The unnamed scientific correspondent wrote: “The reason for the particular geometry displayed by the spherical viruses is probably that this icosahedral arrangement is the most economical way of “packing” the small protein units around a central core.” The mention of efficient packing is reminiscent of Buckminster Fuller, who based much of his theory of architecture on the closest packing of spheres.


PRINCIPLES OF VIRUS STRUCTURE

The Caspar-Klug Collaboration

Caspar and Klug’s first collaborative venture was forged in the light of a great tragedy. In August 1958, Rosalind Franklin was scheduled to talk in Bloomington, Indiana at a meeting organized to celebrate the 50th anniversary of the American Phytopathological Society. She would have spoken on her recent work on TMV. Tragically, Rosalind Franklin died of ovarian cancer on April 16, 1958 at the age of 37. Caspar was invited by the committee to speak in Franklin’s place and he asked Klug to join him. For that purpose they wrote a review of the X-ray diffraction of viruses and added her as first author posthumously (Franklin, Caspar, and Klug, 1959). Among other things, they noted that additional viruses (Tobacco Ringspot Virus and perhaps Coxsackie Virus, an animal virus) were now known to exhibit icosahedral symmetry. They also reported that Robley Williams, using electron microscopy, had shown that a large insect virus, Tipula Iridescent Virus (TIV), possessed icosahedral symmetry and was shaped like an icosahedron (Williams and Smith, 1958). As a review of the state of the field, the paper did not contain very much theoretical speculation beyond Crick and Watson’s 1956 ideas. However, they explain the structural similarity between BSV and TYMV in terms of efficiency:

The structural correspondence between these two significantly different virus particles is probably not fortuitous, but is likely a reflection of the fact that this type of cubic symmetry is a very efficient way for nature to build a compact particle out of smaller protein subunits (Franklin, Caspar, and Klug, 1959).

A further advance beyond the simplest Crick and Watson model was driven by recent biochemical work that suggested that there were at least 120 molecules in the viral shell of BSV. As it turns out, there are 180 protein molecules per virus particle. Regardless, the important point is that a number greater than 60 subunits per virus calls for a richer picture of virus assembly and additional principles than those given by Crick and Watson. Presumably, there might be a multi-stage process of virus assembly. For example, first chemical subunits might come together in


pairs and then the bonded pairs assemble into a viral shell, but clearly there are other options also. In ending the paper, based on the cases of TMV, BSV, and TYMV, Caspar and Klug indulge in generalization about how all of the small viruses (and perhaps other “particulate nucleoproteins”) are put together:

... it seems likely that the parts are made by a type of subassembly process before being assembled to build the virus. The forces holding together the protein subunits in the virus particle are like those of a crystal. The configuration of the RNA is determined by its regular packing with the protein (Franklin, Caspar and Klug, 1959: 458).

Thus, a crystal of virus particles is constructed by at least three levels of crystallization-like processes. After writing the review on the X-ray diffraction of viruses, Klug and Caspar continued their collaboration and wrote a long review article for Kenneth Smith’s Advances in Virus Research. Caspar was responsible for most of the material about helical viruses (exemplified by TMV) and Klug wrote much of the material on spherical viruses.a In the section on the symmetry and morphology of spherical viruses, Klug and Caspar discuss the relationship between three types of subunit. First, they define the “chemical subunit” as an individual protein molecule. Second, they define the “crystallographic subunit” as the smallest asymmetric unit. Finally, they define the “morphological unit” as one of the “bumps” or “knobs” on the virus seen in electron micrographs. One of the goals of their virus research was to see how these three types of subunit are related. First, one has to determine the number of each type of unit for a given virus. However, the determination of the size and number of chemical, crystallographic, or morphological subunits in a virus does not explain how these units come together to form the shell. Second, one hopes to determine construction principles that underlie the formation of a viral shell. What one ultimately needs to establish is the regular contact pattern between individual protein molecules. The regular shell is a consequence of this contact pattern; that is, the shell is assembled from protein molecules arranged in regular groups. They argued that since virologists do not yet have techniques to follow the process of virus particle assembly, they

a Klug to Caspar, 12 January 1960, Norman Archive.


Figure 4. Micrograph of Herpes virus taken by Horne and Wildy in 1959 (Wildy, Russell and Horne, 1960). Reproduced with permission of Robert Horne and Academic Press.

must deduce the construction principles from the symmetry and morphology of the finished product (Klug and Caspar, 1960: 297). Klug and Caspar also suggest that there could be more than one type of chemical subunit.

Electron Microscopy and Viral Subunit Patterns

The development in electron microscopy (EM) during the 1950s and 1960s advanced the understanding of virus structure. The innovation known as “negative staining” allowed higher resolution micrographs. Surprisingly, it was discovered more than once (Hall, 1955; Huxley, 1956; Brenner and Horne 1959; Horne and Wildy, 1979). In spite of its sometimes misinterpreted results, it has revealed much about virus morphology.

In 1956, Jim Watson provided a TMV sample to Hugh Huxley, a microscopist at University College, London, to see if the RNA could be made visible by staining in the newly installed electron microscope, a Siemens Elmiskop-I.a Huxley developed an “outlining” technique, and, although he was not able to reveal the RNA, he showed that TMV appeared to have a hollow core of 20-30 Å (Huxley, 1956). The following year Sydney Brenner and Robert Horne, in Cambridge, on the day the Russians launched Sputnik 1 (October 4, 1957), took their first micrograph

a Hugh Huxley, interview.

of “negatively stained” bacteriophage T2.a They also looked at the plant viruses TMV and TYMV (Brenner and Horne, 1959).

In December 1958, Peter Wildy, working on the growth of herpes virus (Wildy, 1954), approached Robert Horne to see if EM could be used to quantify his purified virus. On January 10, 1959, Wildy wrote in his otherwise bland lab book, “E.M. Photographs of great interest”. In his biographical curriculum vitae he put it like this:

Initial observations with herpes virus [Fig. 4] were so promising that we immediately resolved to examine a variety of animal viruses simultaneously and brought in all the virologists we could find in Cambridge. Between January and June 1959 we examined particles from all the groups of vertebrate viruses then known. We were able to discover a unity in the structural pattern that went beyond the conventional boundaries limiting virologists at the time.b

Indeed by 1961, sugar beet virus, poliovirus, herpes simplex, myxoviruses, polyoma viruses, bacteriophage ΦX174, and parainfluenza virus had been seen through the Siemens Elmiskop-I electron microscope, providing ample material for many articles in the newly formed Journal of Molecular Biology. Perhaps the most impressive were the micrographs of adenovirus (Horne, 1959), which inspired Brenner and Horne to visit a sports store on King’s Parade and buy 252 table tennis balls to build a model (Fig. 5).

Figure 5. The left panel shows an electron micrograph of Adenovirus taken by Brenner and Horne, and the right panel shows the 252 table-tennis-ball model they built. Reproduced with permission of Robert Horne and Academic Press.

a Robert Horne, interview. b CV in the possession of the author.


Micrographs of TYMV

In February 1960, while Klug was finishing up the manuscript for the review in Advances in Virus Research, H. L. Nixon of Rothamsted Experimental Station wrote to Klug about some recent electron micrographs of TYMV that he and his colleague had taken. Nixon’s interpretation of the pictures was “markedly at variance” with Klug’s 1957 proposal that TYMV consists of 60 subunits lying at the vertices of a snub dodecahedron, as he wrote:a

Two aspects are visible, one with a ring of six subunits with one in the center . . . This ring is in turn surrounded by eight other units, although eight are not usually visible there is room for them and I have little doubt that eight is the correct number.b The other common aspect shows four subunits in a diamond shaped arrangement, with its center tilted slightly away from the observer. These 4 units can be seen to form parts of rings of 6 and 5. If one constructs a complete particle according to this pattern the result is a body with 532 symmetry, consisting of 32 subunits of two kinds.

At first Klug was skeptical of Nixon’s result, and replied by asking if Nixon meant that the proposed structure consisted of two types of unit: 12 units at the corners of an icosahedron and 20 units at the corners of a dodecahedron. Klug called the first set of units “white” and the second “black” according to a diagram that he included (Fig. 6, left). Klug said, “I do not see any such arrangements in the picture you sent. In fact, the electron microscopy pictures would look to an unprejudiced eye as though the particles were made up of 14 knobs at the vertices of a rhombic dodecahedron, which has just the right features of lozenge-shaped rings of 1 + 6 neighbours.”c

In retrospect, as Klug would later point out, one reason one does not often see five knobs around one knob is that by looking down a virus’s 5-fold axes on an electron micrograph, one observes a superposition of the front and the back of the particle, which are out of phase. Of course at

a Letter Nixon to Klug, 10 February 1960, Norman Archive. b There should be nine units, if a 3-fold axis is pictured. c Letter Klug to Nixon, 17 February 1960, Norman Archive.


the time, many microscopists mistakenly thought that negative staining produced a one-sided “footprint” rather than a two-sided “superposition.”

Nonetheless, Nixon remained convinced he had the correct structure and replied to Klug that TYMV could not be a rhombic dodecahedron, to which Klug wrote:

I am pleased that your pictures cannot be explained by a rhombic dodecahedron, since that would not have the right [i.e. 532] symmetry. I mentioned it only to make quite sure that you could rule it out. As I pointed out in my last letter, the 20+12 unit structure (which would, incidentally, be a rhombic triacontahedron if all rhomb edges were equal) is quite compatible with the X-ray observations.a

He also informed Nixon that a few days before he got his first letter, Hugh Huxley of University College, London had shown Klug similar pictures and had independently come up with a similar model of TYMV. In fact, Klug had given Hugh Huxley the samples of TYMV.b Klug and Finch addressed the electron micrographs of Huxley and Zubay, and Nixon and Gibbs in a 1960 article (Klug and Finch, 1960; Huxley and Zubay, 1960; Nixon and Gibbs, 1960). One could account for the electron microscope data and remain consistent with the X-ray data if the 32 “knobs” or “bumps” were arranged resembling a pentakis dodecahedron or a rhombic triacontahedron. These two semiregular solids are two extremes in a continuous series: by varying the radii of the set of 12 icosahedral knobs relative to the 20 dodecahedral knobs one can generate any one of the series. All members of the series have true 532 symmetry.
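The 32-knob arrangement can be made concrete with a short geometric sketch. The code below is an illustration only, not a reconstruction of Klug and Finch's calculation: it places 12 knobs along the 5-fold axes (icosahedron vertex directions) and 20 along the 3-fold axes (dual dodecahedron vertex directions), with two free radii `r5` and `r3` standing in for the continuous series described above. Which ratio of radii yields which named solid is not computed here, and all function names are hypothetical.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def cyclic_perms(v):
    """Return the three cyclic permutations of a 3-vector."""
    x, y, z = v
    return [(x, y, z), (z, x, y), (y, z, x)]


def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def icosahedral_directions():
    """12 unit vectors along the 5-fold axes (icosahedron vertices)."""
    pts = []
    for s1, s2 in itertools.product((1, -1), repeat=2):
        pts.extend(unit(p) for p in cyclic_perms((0.0, s1 * 1.0, s2 * PHI)))
    return pts


def dodecahedral_directions():
    """20 unit vectors along the 3-fold axes (dual dodecahedron vertices)."""
    pts = [unit(p) for p in itertools.product((1.0, -1.0), repeat=3)]
    for s1, s2 in itertools.product((1, -1), repeat=2):
        pts.extend(unit(p) for p in cyclic_perms((0.0, s1 * PHI, s2 / PHI)))
    return pts


def knob_positions(r5=1.0, r3=1.0):
    """Knob centres: 12 at radius r5 on 5-fold axes, 20 at radius r3 on 3-fold axes."""
    five = [tuple(r5 * c for c in d) for d in icosahedral_directions()]
    three = [tuple(r3 * c for c in d) for d in dodecahedral_directions()]
    return five, three


def nearest_neighbours(direction, candidates, tol=1e-9):
    """Count the candidate directions that are angularly closest to `direction`."""
    dots = [sum(a * b for a, b in zip(direction, c)) for c in candidates]
    best = max(dots)
    return sum(1 for d in dots if best - d < tol)


if __name__ == "__main__":
    five, three = knob_positions(r5=1.0, r3=0.9)
    print(len(five) + len(three), "knobs in total")
    print(nearest_neighbours(icosahedral_directions()[0], dodecahedral_directions()),
          "3-fold knobs surround each 5-fold knob")
```

Changing `r5` relative to `r3` moves the two sets of knobs in and out along fixed axes, so the 532 symmetry of the arrangement is unaffected by the choice of radii.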

Klug and Finch suggest a way to resolve the apparent tension with the Crick-Watson 60 subunit model. If the particles have true 532 symmetry, then the knobs themselves can be further subdivided. These sub-sub-units Klug and Finch call “structural units”. For example, each of the 12 knobs that lie on 5-fold axes can be divided into 5 identical structural units and those 20 knobs on 3-fold axes can be divided into 3 (Fig. 6, right). Thus, in this example, the 32-knob structure consists of 120 structural units of two types ((12·5)+(20·3)). Klug and Finch also

a Letter Klug to Nixon, 26 February 1960, Norman Archive. b Hugh Huxley, interview.


Figure 6. Klug and Finch’s 1960 model of TYMV. Left, the 32 “knobs” or capsomeres seen by electron microscopists. Right, each “white knob” might be divided into 5 structural units and each “black knob” divided into 3 structural units each consisting of two chemical units, one black and one shaded. Reproduced with the permission of Academic Press and John Finch.

allow that each structural unit might itself be divided into “chemical units” (i.e., distinct polypeptides). Given this possibility, the number of chemical units will be a multiple of 60. Caspar and Klug’s later work would be concerned with exactly how the units are differently situated in the structure, though they would later rule out a 120-subunit model on theoretical grounds.

ARCHITECTURAL DESIGN

Upon reading a newspaper report of the polio work, the pop artist John McHale,a a popularizer of the architectural designer Buckminster Fuller (McHale, 1961, 1962), noticed the potential similarities between viral structure and Fuller’s architectural creation, the geodesic dome. He wrote a letter to Fuller and arranged a meeting with Klug and Finch in London (Marks, 1960: 44).

The Geodesic domes - Buckminster Fuller

After stints at Harvard, the Navy, and a construction company, Buckminster Fuller in 1927, at the age of 32, decided to devote his remaining

a John McHale (1922-1978) wrote Fuller a letter asking him the simple question “Did the Bauhaus influence your work?” to which Fuller replied, in a 7000-word letter, that it did not! (Hamilton, 1984) Fuller’s letter was dated January 7, 1955 and McHale edited it and eventually had it published in Architectural Design (McHale, 1961). This letter excited McHale to the degree that he wrote one of the first books devoted to Fuller (McHale, 1962).

Figure 7. Buckminster Fuller’s Radome (Marks, 1960). Among his different “geodesic domes”, Fuller designed a plastic radar dome, later shortened to “Radome,” that was 55 feet in diameter, 40 feet high, and could withstand 220 mile-an-hour winds. Notice in the framework construction the circles surrounded by 6 triangles and those surrounded by 5 triangles.

life to improving the world through better design. Over the next few years, he designed and built a streamlined car, the “Dymaxion” car, a mass-producible bathroom, and an aluminum house. He also began to formulate his seemingly incomprehensible “synergetic geometry”; he saw his inventions as applications of synergetic geometry (Fuller, 1975). It was, however, his geodesic domes that made Fuller most famous.

The exemplary geodesic dome is constructed from many approximately equilateral triangular faces (Fig. 7). It approximates a tessellated icosahedron, although in most cases not all the triangles are identical. Many of the edges of the faces, when projected onto a sphere, follow arcs of great circles. Fuller called his enclosures geodesic domes because the minor arc of a great circle is the shortest path between two points on the surface of a sphere, i.e., a geodesic. In 1946, Fuller organized and incorporated the Fuller Research Foundation. He traveled around the USA showing students how to build geodesic domes. Fuller’s big break came in 1952, when Henry Ford II contracted him to build a ninety-three foot diameter dome at the Ford Motor Company plant in Dearborn, Michigan. After this publicly visible structure was built, Fuller’s ideas began to gain more interest. Prudently, Fuller patented his geodesic dome and could claim royalties on any dome built (Patent No. 2,682,235, June 29, 1954). Fuller also worked with the Department of Defense to build domes that would house radar equipment necessary for a missile early warning system. These domes had to withstand harsh arctic weather as well as be invisible to radio waves.

Early Theories of Virus Structure 21

Contacted by McHale, Buckminster Fuller was intrigued to hear about possible similarities between domes and viruses and met with Klug and Finch in July 1959. There is no written record of the meeting, but Fuller subsequently wrote: “In 1957 [sic] Klug asked if I could identify the geodesic-like protein shell of the polio virus. I was able to give him the mathematical explanation of the structuring” (Fuller, 1969: 104; see also Urner, 1991). This recollection is a slight exaggeration on Fuller’s part, since at the time it was unclear what the exact details of the analogy were, and the comparison with domes did not directly give Klug and Finch an explanation of viral structure. However, in April 1960, John McHale sent Klug Fuller’s manuscript Energetic Synergetic Geometry. In this manuscript Fuller revealed how he made a “270 Strut Isotropic Tensegrity” sphere (Fig. 8). There were 5 types of strut, each made to precise pre-calculated dimensions, not one type with “give” in it as Klug had inferred when he looked at a picture of the model.ᵃ The idea of “give” was a precursor to the idea of quasi-equivalence. Klug calls his insight a “misprision,” a term borrowed from Caspar’s friend, the literary critic Harold Bloom (Laszlo, 1986: 50). Bloom contends that later writers are influenced by earlier writers through misconstruing and reinterpreting the earlier texts.

Artistically inclined, Don Caspar became intrigued by the work of Fuller through the work of Robert Marks in 1960, and later established his own direct contact. When Harvard University awarded Fuller the 1962 Charles Eliot Visiting Professorship in Poetry, Caspar arranged for the two of them to meet at the Children’s Cancer Research Foundation, where Caspar worked.ᵇ These discussions of how structural organization relates to tensional integrity, contracted to “tensegrity,” inspired some of Caspar’s insights into virus architecture.

ᵃ Buckminster Fuller, “An Introduction to Energetic Synergetic Geometry,” MS, pp. 212-224. Fuller also gave a similar MS to Robert Horne. Some of this material was published in Fuller (1975); see, for example, p. 394. Klug Interview. Each of these struts represents 2 structural units.
ᵇ Caspar Interview.


Tensegrity structures - Kenneth Snelson

Before continuing with the development of the virus theories, let us detour further into the cultural world. Although there is some controversy over the origins of tensegrity, the American sculptor Kenneth Snelson constructed the first tensegrity structure/sculpture in 1948 while he was a student of Buckminster Fuller’s at Black Mountain College, North Carolina. There are a number of his sculptures dotted around the USA. For example, his “Needle Tower” is exhibited at the Hirshhorn Sculpture Garden on the Mall in Washington DC (Snelson et al., 1981). The basic idea of a tensegrity structure is to isolate the components of the structure that are under tension (wires) from those under compression (struts). Tensegrity structures are in an energy minimum and will return to their original state after deformation.

The Idea of Quasi-Equivalence

The key to quasi-equivalence was to see that the angles or “contact points” between identical structure units need not be “absolutely equivalent.” Caspar first called his insight non-crystallographic equivalence.ᵃ Caspar suggested that his idea was analogous to Pauling’s use of a non-integral helix to solve the structure of the alpha helix. Klug replied that the true analogy would be Crick’s coiled coil, and later coined the term “quasi-equivalence.”ᵇ Caspar then asked Klug to join him in another collaborative project. He included a rough draft of his notes and suggested that they rewrite them. These findings reoriented Caspar’s research agenda: “Now that there is a unifying principle to work with I am more interested in the spherical viruses and will be glad to collaborate with you in anyway possible on BSV.”ᶜ A collaborative project between Klug and Caspar seemed natural since, as Caspar noted, his own breakthrough fitted well with Klug’s earlier ideas on the subject.

ᵃ Letter, Caspar to Klug, 11 November 1960, Norman Archive.
ᵇ See also the postcard, Klug to Caspar, 17 November 1960, Norman Archive.
ᶜ Letter, Caspar to Klug, 14 November 1960, Norman Archive.

Klug welcomed the renewed collaboration with Caspar and mentioned that in fact he had recently been trying to classify all the structures that can be made up of linear combinations of 30, 20, 12, and 60 subunits “on the basis of both density and uniformity of packing.”ᵃ Klug’s idea was to triangulate the sphere with triangles as closely equilateral as possible. Before Klug’s letter reached him in Boston, Caspar had found one of the first books written on Buckminster Fuller, The Dymaxion World of Buckminster Fuller, by Robert Marks (Marks, 1960), and wrote to Klug, “In retrospect, I expect you will find it surprising that you did not recognize how well Fuller’s geodesic structures do, in fact, represent virus structure.”ᵇ Fuller derived formulae which describe the close packing of small spheres around the tetrahedron, the cube, the octahedron and the cuboctahedron, the last of which is applicable to a subset of spherical viruses (Marks, 1960: 46). On the other hand, Fuller was concerned with building shells out of triangles, which, unlike the virus case, need not compound into integral numbers of hexagons and pentagons; he also did not consider skew geodesic domes.

ᵃ Letter, Klug to Caspar, 17 November 1960, Norman Archive; underline in original.
ᵇ Letter, Caspar to Klug, 18 November 1960, Norman Archive.

Figure 8. The 270 Subunit Tensegrity Sphere. From Marks (1960).

An important shift occurred in interpreting Fuller’s geodesic domes: rather than identify vertices as viral subunits, Caspar identified the triangular faces as subunits. Technically this shift involves moving from one figure to its dual.

Klug’s response to Caspar’s exuberance was a little more reserved. He wondered what the justification could be for why virus subunits cluster into fives and sixes, if indeed they do.

My difficulty is still the same one that led me to identify the structure units with the vertices of the triangulations rather than with the centers as you have done. In the first type (my early ideas) the structure units are packed as densely as possible on the surface of the sphere, i.e. make as many contacts as possible and share out the stresses at any one point in the framework in the most favorable way (Buckminster Fuller’s principle). In the second type (i.e. your idea) each structure unit has only three neighbours so that if one joins up the centers of the units by lines one has a polyhedron with trihedral vertices.ᵃ

Klug was aware that polyhedra with trihedral vertices span the maximum volume with the minimum material and therefore that there was an efficiency-based reason for preferring this second class (rather than its dual). A further problem was whether one should allow sub-clustering and whether removing this constraint would lead to many more difficulties. “There are so many aspects to this problem and so many loose points that I have difficulty in deciding in what detail to deal with them.”ᵇ

After reading Marks’ book on Fullerᶜ and examining pictures of domes, Klug gleaned a couple of conceptual advances. First, some domes had been assembled by following simple building rules.ᵈ For example, unskilled workers following a system of color-coded building rules had assembled a 100-foot diameter dome within 48 hours in Kabul, Afghanistan. Presumably viruses also might assemble by “following” simple building rules. Caspar and Klug toyed with a similarly simple scheme for the viral binding rule: if you label each triangular subunit’s three edges 1, 2, and 3, then 3 bonds only with 3, 1 bonds with 2, and 2 bonds with 1, reflecting the analogous specific bond types between protein subunits (see Fig. 9, right). Crick had even enunciated a related principle: “virus assembly should be simple enough that even a child could do it.”

By 1961 Klug was aware of two other groups who were also working on a theory of spherical virus structure. The Italian physicist Mario Ageno sent a manuscript to Crick, who passed it to Klug.ᵉ

ᵃ Letter, Klug to Caspar, 2 December 1960, Norman Archive.
ᵇ Letter, Klug to Caspar, 2 December 1960, Norman Archive.
ᶜ Letter, Klug to Caspar, 11 January 1961, Norman Archive.
ᵈ Letter, Klug to Caspar, 21 February 1961, Norman Archive.
ᵉ Letter, Crick to Ageno, 31 January 1961, Norman Archive; Letter, Crick to Klug, 31 January 1961, Norman Archive.

In his manuscript, Ageno derives the formula for the number, N, of “subunits” in an icosahedral shell structure, $N = 10(n-1)^2 + 2$, which is closely related to Buckminster Fuller’s formula $P = 10F^2 + 2$, where F is the number of subunits on an edge of the icosahedron. In effect, these formulae describe the knobs seen in EM, each of which consists of 5 or 6 of Caspar and Klug’s structural subunits. Implausibly, Ageno argues that the icosahedral shape is to be preferred over other polyhedra because of “surface tension.”ᵃ The second group now working on the structure of viruses was closer to home. George Hirst, editor of the journal Virology, had commissioned Robert Horne and Peter Wildy to write a paper on the structure and symmetry of viruses (Horne and Wildy, 1961). Klug learned that they too had derived the formulae $10q^2 + 2$ and $30q^2 + 2$, as well as a skew type $10(m^2 + m) + 12$ and a further type based on the icosidodecahedron. Klug thought their paper was mainly descriptive and did not compete directly with his and Caspar’s more theoretical project, but nonetheless the existence of two groups working on similar problems provided more urgency for Caspar and Klug to finish up and submit a manuscript.ᵇ
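The cluster-count formulae quoted in this passage are easy to tabulate. The following is a minimal Python sketch, not taken from any of the manuscripts discussed; the function names are hypothetical. The final checks rely on an assumption not stated in this passage, namely the standard result that an icosahedral surface lattice with index h² + hk + k² carries 10(h² + hk + k²) + 2 clusters.

```python
def fuller_points(F: int) -> int:
    """Fuller's count P = 10*F^2 + 2 for F subunits along an icosahedron edge."""
    return 10 * F**2 + 2


def ageno_subunits(n: int) -> int:
    """Ageno's count N = 10*(n-1)^2 + 2."""
    return 10 * (n - 1) ** 2 + 2


def cluster_count(h: int, k: int) -> int:
    """Assumed general form: 10*(h^2 + h*k + k^2) + 2 clusters."""
    return 10 * (h * h + h * k + k * k) + 2


if __name__ == "__main__":
    for q in range(1, 5):
        # Ageno's n is Fuller's F shifted by one, so the two formulae agree.
        assert ageno_subunits(q + 1) == fuller_points(q)
        # Horne and Wildy's three series as special cases of the assumed form.
        assert 10 * q**2 + 2 == cluster_count(q, 0)
        assert 30 * q**2 + 2 == cluster_count(q, q)
        assert 10 * (q**2 + q) + 12 == cluster_count(q, 1)
        print(q, fuller_points(q), 30 * q**2 + 2, 10 * (q**2 + q) + 12)
```

Read this way, the three Horne and Wildy series correspond to the lattice choices (q, 0), (q, q), and (m, 1), which is one way to see why a single general expression could subsume them.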

In March 1961, Caspar finally derived a complete general formula for the total number, N, of structure units in 1/60 of the shell (i.e., in the asymmetric unit of the shell) of any possible icosahedral virus structure:

$N = h^2 + hk + k^2$    (1)

where h and k can be any integers.ᶜ

Over the summer, Caspar refereed the paper by Horne and Wildy on viral symmetry from the point of view of electron microscopy, and in August he wrote to Klug to discuss how they should continue.ᵈ Two distinct research projects persisted. First, a geometrical project: to enumerate and classify icosahedral structures. Caspar’s derivation of a general formula made significant progress on this project. Second, a physical project: to explain why so few of the possible structures are realized in nature. Caspar suggested they write two papers based on this division. The first paper would be applicable to boron molecules, viruses, and geodesic domes. The second would be focused more specifically on virology. However, for the remainder of 1961 Caspar and Klug made few conceptual advances, until January 1962, when Klug put the situation this way:

ᵃ Ageno, M. “Some Remark (sic) on the Shape of Viruses,” MS, Norman Archive.
ᵇ Letter, Klug to Caspar, 21 February 1961, Norman Archive.
ᶜ Letter, Caspar to Klug, 13 March 1961, Norman Archive.
ᵈ Letter, Caspar to Klug, 8 August 1961, Norman Archive.


I am afraid I have got into a complete malaise with the icosahedral virus paper. I seem unable to take it up again although on all sides of us people are busy dealing with points that have arisen. For instance have you seen Pawley in the January Acta and Ageno’s paper in Nuovo Cimento.ᵃ Also I see that Horne and Wildy mention the connection with Buckminster Fuller, which I am sure they learned only from us after I spoke about it at the Biophysics Conference in July 1959 and you presumably showed Marks’s book to Wildy. Therefore, despite the doubts I have indicated previously, I think it would be a good idea to publish our thoughts as soon as possible as we had intended last summer.ᵇ

One thing that Caspar and his new post-doc Ken Holmes had been concerned with since the summer was an analysis of the Dahlemense strain of TMV. This strain yielded an interesting diffraction pattern. Additional meridional and near-meridional diffraction maxima appeared on the layer lines halfway between those given by the common strain (Caspar and Holmes, 1969).ᶜ Caspar’s laboratory partner Carolyn Cohen saw that the results could be explained by a periodic perturbation in the TMV helix. The relevant finding for Caspar and Klug’s theoretical aspirations was that, due to the periodic perturbations of the Dahlemense TMV helix, the rod-shaped virus essentially consisted of identical subunits lying in non-identical environments. It provided empirical support for Caspar and Klug’s notion of quasi-equivalence. Therefore, later that month, in January 1962, when Klug was invited to present at a June meeting at Cold Spring Harbor on “Basic Mechanisms in Animal Viruses,” he saw this as an appropriate place to present some of the Caspar-Klug ideas, and wrote to the organizers requesting that Caspar be included as a co-author. Robley Williams was to chair a session on the “Structure and Intracellular Location of Viruses,” in which, along with Klug, Peter Wildy was scheduled to talk on electron microscopy. Caspar suggested that Klug come to Boston for at least a month so that they would have time to write the Cold Spring Harbor paper.ᵈ

ᵃ Pawley, 1962; Ageno, 1960.
ᵇ Letter, Klug to Caspar, 9 January 1962, Norman Archive. Italics mine.
ᶜ Note that although this paper was published in 1969 it was first received at the JMB on 25 August 1965.
ᵈ Letter, Caspar to Klug, 29 January 1962, Norman Archive.


Meanwhile Klug continued to think about spherical viruses. He spoke with Crick to “try out” some ideas with him. “I noticed that even he did not see at first the implications of near equivalence and I had to make a diagram [Fig. 9] to convince him that by following a simple bonding rule, one could make the structure.”ᵃ Notice that with two types of bonds, those that bind the subunits into fives or sixes and those that bind the groups of fives and sixes together, one can assemble an icosahedral structure with more than 60 subunits.


Figure 9. Diagram, drawn by Klug’s hand in February 1962, indicating how simple bonding rules and near equivalence can lead to icosahedral structures. Only one face of the icosahedron is shown in each case.

ᵃ Letter, Klug to Caspar, 6 February 1962, Norman Archive.

There are many different sizes of structure that can be assembled using these rules. But what determines the correct ratio of the number of clusters of six and clusters of five for each species of virus? Caspar and Klug attributed to the subunit a property that they called “built-in-curvature.” This property guaranteed the unique size of the closed structure that could be built from subunits with a particular degree of built-in-curvature. However, Crick remained skeptical and wondered how the built-in curvature of the subunits guided the formation of strict icosahedral structures. Klug wrote to Caspar, “If you look through our correspondence at the time when you were building the 540-unit (92 cluster) shell I queried [Feb 3, 1961] whether the units could not have assembled themselves in a less symmetrical form. You replied no [Feb 7, 1961], but it is clear that one can build isolated fragments of shell or even pieces of plane sheet which will not close up to make the complete structure. Francis [Crick] wondered whether there was not some actual rule of assembly which forced five-rings to form in the appropriate places.”ᵃ
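The figures in Klug’s letter can be checked with a line of arithmetic. As a hedged worked example, assume (following the triangulation-number scheme of the 1962 paper, and a standard result not spelled out in the letter itself) that a shell of 60T units is grouped into 12 clusters of five and 10(T − 1) clusters of six:

```latex
\begin{align*}
60\,T = 540 \quad &\Longrightarrow \quad T = 9,\\
\text{clusters} &= 12 + 10\,(T-1) = 12 + 80 = 92,\\
\text{units} &= 12 \times 5 + 80 \times 6 = 60 + 480 = 540.
\end{align*}
```

Under that assumption the “540-unit” and “92 cluster” descriptions refer to one and the same shell.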

Caspar wrote to Klug: “It is clear to me that our model for the virus shells are tensegrity structures.”ᵇ Klug had come to a similar conclusion after looking at the 270-strut model. What attracted Caspar to tensegrity structures was that, like viruses, “the structure units naturally arrange themselves in as near equivalent environments as possible.” Klug acknowledged the applicability of the analogy between viruses and tensegrity structures, but cautioned Caspar on the limits of its fruitfulness. He had discussed the connection between tensegrity and near equivalence with Fuller, but Fuller either did not understand or did not see the importance of the connection.ᶜ However, Caspar continued to be intrigued by Fuller’s ideas and built new, more complex tensegrity-inspired models. This model building delayed their manuscript writing. Klug argued that they should not overemphasize the principle of tensegrity in their paper since many tensegrity structures were not built from 60n units and many were not even approximately spherical (such as Snelson’s “Needle Tower”). Indeed, “tensegrity” merely denoted a separation of tension and compression components in a structure. On the face of it, the separation of tension and compression had little to do with protein structures or a structure that can “build itself” as a virus does.ᵈ

ᵃ Letter, Klug to Caspar, 6 February 1962, Norman Archive.
ᵇ Letter, Caspar to Klug, 12 March 1962, Norman Archive.
ᶜ Letter, Klug to Caspar, 16 March 1962; Letter, Klug to Caspar, 7 April 1962, Norman Archive.
ᵈ Letter, Klug to Caspar, 7 April 1962, Norman Archive.

Caspar replied that the significant analogy between virus structure and tensegrity structure was that they both exist as minimum energy structures. If one deforms a tensegrity structure and then removes the source of deformation, the structure will spring back to its original shape.ᵃ He was able to show that under plausible assumptions the energy of a closed viral shell would be lower than that of a flat sheet.

Caspar “deduced” the possible quasi-equivalent virus shells by cutting and folding a hexagonal net, following G. S. Pawley, who had shown that only two types of plane nets, cubic and hexagonal, can be folded onto the surface of a convex polyhedron while maintaining the same nearest neighbor contact pattern (Pawley, 1962).ᵇ

The classic 1962 paper

Caspar & Klug’s classic work appeared as the first paper in the 27th volume of the Cold Spring Harbor Symposia proceedings (Caspar and Klug, 1962). Their paper introduced to the biology community a number of influential ideas, notably, “self assembly”, “triangulation number”, “built-in-curvature”, and of course the concept of “quasi-equivalence.”

They intended their theory to supersede and explain the earlier speculations about economy and efficiency, arguing that icosahedral symmetry is a feature of minimum energy structures. In other words, the lowest energy structure will have the maximum number of stable bonds formed, and in spherical virus shells with more than 60 identical subunits, this is physically realized as an icosahedrally symmetric structure with quasi-equivalent bonding between identical structure units. The self-assembly of a virus, like crystallization, is driven by the laws of statistical mechanics. These physical considerations have led to an extension of the traditional concepts of symmetry more specifically applicable to highly organized biological structure. Indeed, quasi-equivalence might be thought of as a form of symmetry breaking.

ᵃ Letter, Caspar to Klug, 11 April 1962, Norman Archive.
ᵇ Letter, Caspar to Klug, 9 April 1962, Norman Archive.

The fourth section of the paper, “The Geometry of Icosahedral Viruses,” contains the heart of the Caspar-Klug theory. The basic point is that “the shell is held together by the same type of bonds throughout but that these bonds may be deformed in slightly different ways in the different, non-symmetry related environments” (Caspar and Klug, 1962: 10). From the general equation derived by Caspar, they classify all possible icosahedral deltahedral assemblies of similar structural units. A deltahedron is a polyhedron whose faces are equilateral triangles. Each type of icosahedral virus can be assigned a particular triangulation number, T:

$T = (h^2 + hk + k^2) f^2$    (2)

where f is an integer and h and k are integers with no common factor. The entire protein shell of the virus is made up of 60T structure units. Typically, the structure units are individual protein molecules.

Given the synergy between the TMV and spherical virus research programs, it is fitting that Caspar and Klug end their seminal article by comparing helical and icosahedral designs. Since both can be constructed by quasi-equivalent bonding, further reasons are needed to explain why nature might prefer one design to the other. A rod-shaped virus allows for more contact between the nucleic acid and protein and may result in a more stable particle. On the other hand, an icosahedral virus exposes a minimum amount of surface to the environment and does not require the shell to be completely disassembled to release the nucleic acid in the infection process. But the disassembly of viruses is another story.
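The triangulation-number formula of Eq. (2) lends itself to a short enumeration. The sketch below is an illustrative Python example under stated assumptions, not Caspar and Klug’s own procedure: the function names are hypothetical, h and k are taken to be non-negative, and the pair (1, 0) is treated as having no common factor.

```python
from math import gcd


def triangulation_number(h: int, k: int, f: int = 1) -> int:
    """T = (h^2 + h*k + k^2) * f^2, with h and k sharing no common factor."""
    if h < 0 or k < 0 or f < 1:
        raise ValueError("h and k must be non-negative and f must be positive")
    if gcd(h, k) != 1:
        raise ValueError("h and k must have no common factor")
    return (h * h + h * k + k * k) * f * f


def allowed_T(limit: int) -> list[int]:
    """All triangulation numbers T <= limit, in increasing order."""
    values = set()
    for h in range(limit + 1):
        for k in range(h + 1):  # (h, k) and (k, h) give the same T
            if gcd(h, k) != 1:
                continue
            p = h * h + h * k + k * k
            f = 1
            while p * f * f <= limit:
                values.add(p * f * f)
                f += 1
    return sorted(values)


if __name__ == "__main__":
    for T in allowed_T(25):
        # Each shell is made up of 60*T structure units.
        print(f"T = {T:2d}  structure units = {60 * T}")
```

Only certain integers appear in the resulting list (2, 5, and 6, for example, are absent), which illustrates how restrictive the classification is.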

ACKNOWLEDGEMENTS

This paper was made possible by financial support from the US National Science Foundation (Doctoral Dissertation Award 9910891). Lena Hammar and Stacey Welch helped with editing. Lindley Darden, Peter Achinstein, Ed Lattman, Martin Kemp, Angela Creager, Robert Olby, and Kenneth Snelson provided useful comments. The Novartis Foundation (formerly Ciba) allowed me access to their archives, as did Stanford University Archives, LMB Cambridge Archive, Caltech Archives, Wellcome Institute of Medicine, Royal Society Archives, UMBC Archives, Tate Gallery Archives, and the Churchill College Cambridge Archives. Jeremy Norman generously allowed me access to his impressive private archive for the history of molecular biology. Kirby Urner and Ed Applewhite shared their knowledge of Buckminster Fuller. Many people kindly agreed to share information, including Sylvia Alloway, Sydney Brenner, Carolyn Cohen, Bob Connelly, Dick Crane, Francis Crick, David DeRosier, John Finch, Alfred Gierer, Jeremy Goldberg, Ken Holmes, Robert Horne, Hugh Huxley, Reuben Leberman, Vittorio Luzzati, John Maddox, Lee Makowski, Anne Massey, Magda Cordell McHale, Tony Minson, Alison Newton, Max Perutz, Ivan Raymont, Alex Rich, Michael Rossmann, Willy Russell, Herbert Simon, Michael Stoker, James Watson, Jo Wildy and lastly, but certainly not least, Don Caspar and Aaron Klug.

REFERENCES

1. Ageno M. Some remarks on the shape of viruses. Nuovo Cimento, 1960; XVIII(2): 160-175.

2. Bernal JD and Carlisle CH. Unit cell measurements of wet and dry crystalline turnip yellow mosaic virus. Nature, 1948; 162: 139-140.

3. Bernal JD, Fankuchen I, and Riley DP. Structure of the crystals of tomato bushy stunt virus preparations. Nature, 1938; 142: 1075.

4. Brenner S and Horne RW. A negative staining method for high-resolution electron microscopy. Biochim Biophys Acta, 1959; 34: 103-110.

5. Caspar DLD. Structure of Bushy Stunt Virus. Nature, 1956a; 177: 475.

6. Caspar DLD. Structure of Tobacco Mosaic Virus. Nature, 1956b; 177: 928-930.

7. Caspar DLD. Movement and self-control in protein assemblies. Quasi-equivalence revisited. Biophysical J, 1980; 32: 103-38.

8. Caspar DLD and Holmes KC. Structure of dahlemense strain of tobacco mosaic virus: A periodically deformed helix. J Mol Biol, 1969; 46: 99-133.

9. Caspar DLD and Klug A. Physical principles in the construction of regular viruses. Cold Spring Harbor Symposia on Quantitative Biology, 1962; XXVII: 1-24.

10. Cochran W, Crick FHC, and Vand V. The structure of synthetic polypeptides I: The transform of atoms on a helix. Acta Crystallogr, 1952; 5: 581-586.

11. Crane HR. Principles and problems of biological growth. The Scientific Monthly, 1950; LXX(6): 376-389.

12. Creager ANH. The Life of a Virus: Tobacco Mosaic Virus as an Experimental Model, 1930-1965. University of Chicago Press, Chicago, 2002.

13. Crick FHC and Watson JD. Structure of small viruses. Nature, 1956; 177: 473-475.


14. Crick FHC and Watson JD. Virus structure: general principles, in GEW Wolstenholme (ed.) Ciba Foundation Symposium on The Nature of Viruses, Little Brown and Co., Boston, 1957, pp. 5-13.

15. Finch JT and Klug A. Structure of poliomyelitis virus. Nature, 1959; 183: 1709-1714.

16. Fraenkel-Conrat H. Infectivity of ribonucleic acid from tobacco mosaic virus. J Am Chem Soc, 1956; 78: 882.

17. Franklin R, Klug A, and Holmes KC. X-ray diffraction studies of the structure and morphology of tobacco mosaic virus, in GEW Wolstenholme (ed.) Ciba Foundation Symposium on The Nature of Viruses, Little Brown and Co., Boston, 1957, pp. 39-52.

18. Franklin R, Caspar DLD, and Klug A. The structure of viruses as determined by X-ray diffraction, in Plant Pathology: Problems and Progress 1908-1958. University of Wisconsin Press, Madison, 1959, pp. 5-13.

19. Franklin RE and Holmes KC. Tobacco mosaic virus: An application of the method of isomorphous replacement to the determination of the helical parameters and radial density distribution. Acta Cryst, 1958; 11: 213-220.

20. Fuller RB. Utopia or Oblivion: the Prospects for Humanity. Bantam Books, New York, 1969.

21. Fuller RB. Synergetics. Macmillan, New York, 1975.

22. Green DW, Ingram VM, and Perutz MF. The structure of haemoglobin IV. Sign determination by the isomorphous replacement method. Proc Royal Soc, London, 1954; A225: 287.

23. Hall CE. Electron densitometry of stained virus particles. J Biophys Biochem Cytol, 1955; 1(1): 1-15.

24. Hamilton R. Comments on McHale and his work. In Charlotta Kotik (ed.) The Expendable Ikon. Albright-Knox Gallery, Buffalo, NY, 1984, pp. 45-47.

25. Hodgkin DC. X-ray Analysis and protein structure. Cold Spring Harbor Symposium on Quantitative Biology, 1950; XIV: 65-78.

26. Horne RW, Brenner S, and Waterson AP. The icosahedral form of adenovirus. J Mol Biol, 1959; 1: 84-86.

27. Horne RW and Wildy P. Symmetry in virus structure. Virology, 1961; 15: 348-373.

28. Horne RW and Wildy P. An historical account of the development and applications of the negative staining technique to the electron microscopy of viruses. J Microscopy, 1979; 117(1): 103-22.


29. Huxley HE. Some observations on the structure of tobacco mosaic virus. Proceedings Stockholm Conference in Electron Microscopy, 1956: 260.

30. Huxley HE and Zubay G. The structure of the protein shell of turnip yellow mosaic virus. J Mol Biol, 1960; 2: 189-196.

31. Judson HF. The Eighth Day of Creation, Jonathan Cape, London, 1979.

32. Klug A and Caspar DLD. The structure of small viruses. Adv Virus Res, 1960; 7: 225-325.

33. Klug A and Finch JT. The symmetries of the protein and nucleic acid in turnip yellow mosaic virus: X-ray diffraction studies. J Mol Biol, 1960; 2: 201-215.

34. Klug A, Finch JT, and Franklin RE. The structure of turnip yellow mosaic virus: X-ray studies. Biochim Biophys Acta, 1957a; 25: 242-252.

35. Klug A, Finch JT, and Franklin RE. Structure of turnip yellow virus. Nature, 1957b; 179: 683-684.

36. Laszlo P. Molecular Correlates of Biological Concepts, Elsevier Science Publications, Amsterdam, 1986.

37. Lee K and Spufford NG. Portrait of a Foundation: A Brief History of the Ciba Foundation and its Environment, Ciba Foundation, London, 1993.

38. Markham R. Physiochemical studies of the turnip yellow mosaic virus. Discussions of the Faraday Society, 1951; 11: 221.

39. Marks RW. The Dymaxion World of Buckminster Fuller, Reinhold, New York, 1960.

40. McHale J. Richard Buckminster Fuller. Architectural Design, 1961; July: 290-319.

41. McHale J. R. Buckminster Fuller, Braziller, New York, 1962.

42. Morgan GJ. The Beauty of Symmetrical Design. PhD Dissertation in Philosophy, Johns Hopkins University, Baltimore, 2004.

43. Morgan GJ. Historical Review: Viruses, crystals, and geodesic domes. Trends in Biochemical Sciences, 2003; 28(2): 86-90.

44. Nixon HL and Gibbs AJ. Electron microscope observations in the structure of turnip yellow mosaic virus. J Mol Biol, 1960; 2: 197-200.

45. Pawley GS. Plane groups on polyhedra. Acta Crystallogr, 1962; 15: 49-53.

46. Rayment I, Baker TS, Caspar DLD, and Murakami WT. Polyoma virus capsid structure at 22.5 Å resolution. Nature, 1982; 295: 110-115.

47. Rich A. The nucleic acids: A backward glance. Ann N Y Acad Sci, 1995; 758: 97-142.

48. Rich A. Follow that fiber. Nature Struct Biol, 1998; 5(8): 675.


49. Rich A and Crick FHC. The structure of collagen. Nature, 1955; 176: 915-916.

50. Rich A and Watson JD. Physical studies on ribonucleic acid. Nature, 1954a; 173: 995-996.

51. Rich A and Watson JD. Some relations between DNA and RNA. Proc Nat Acad Sci, USA, 1954b; 40: 758-764.

52. Sayre A. Rosalind Franklin and DNA, Norton, New York, 1975.

53. Schaffer FL and Schwerdt CE. Crystallization of purified MEF-1 poliomyelitis virus particles. Proc Nat Acad Sci, USA, 1955; 41: 1020-1023.

54. Schmidt P, Kaesberg P, and Beeman WW. Small angle scattering from turnip yellow mosaic virus. Biochim Biophys Acta, 1954; 14: 1-11.

55. Snelson K, Fox HN, and Schultz OG. Kenneth Snelson: An Exhibition, Albright-Knox Art Gallery, Buffalo, N.Y., 1981.

56. Urner K. The invention behind the invention: Synergetics in the 1990s. Synergetica Journal, 1991; 1: 8-25.

57. Watson JD. The structure of tobacco mosaic virus I: X-ray evidence of a helical arrangement of sub-units around a longitudinal axis. Biochim Biophys Acta, 1954; 13: 137-149.

58. Watson JD. The Double Helix. Atheneum, New York, 1968.

59. Watson JD. A Passion for DNA. Oxford University Press, London, 2000.

60. Watson JD. Genes, Girls, and Gamow. Alfred A. Knopf, New York, 2002.

61. Wildy P. The growth of herpes simplex virus. Australian J Exp Biol Med Sci, 1954; 32: 605-620.

62. Wildy P, Russell WC, and Horne RW. The morphology of herpes virus. Virology, 1960; 12: 204-222.

63. Williams RC. Structure and substructure of viruses as seen under the electron microscope. Ciba Foundation Symposium on the Nature of Viruses. GEW Wolstenholme and ECP Millar (eds.) Little, Brown and Co., Boston, 1957, pp. 19-32.

64. Williams RC and Smith KM. The polyhedral form of Tipula iridescent virus. Biochim Biophys Acta, 1958; 28: 464-469.

65. Wolstenholme GEW and Millar ECP. The Nature of Viruses. Little Brown and Co, Boston, 1957.


Chapter 2

QUASI-EQUIVALENCE AND ADAPTABILITY IN LIVING MOLECULAR ASSEMBLIES

Donald L. Caspar* and Lena Hammar†

The quest for relations between form and function is the substance of structural molecular biology. Biological systems combine a high degree of organization with a capacity for dynamic behavior. These features seem to originate from the specific bonding properties of protein molecules that can appear in more than one structural configuration. An attempt to explain how multiple copies of a protein can self-assemble into a cage with a strict symmetry lattice was given in the 1962 theory of quasi-equivalence. The same concept of protein adaptability may be used to approach the dynamic behavior of regular biological assemblies in general. We here describe the principle with two examples to which the theory was not originally applied.

Keywords: Virus architecture, spherical shell lattice, quasi-equivalent environment, tensegrity structure, 5-fold and icosahedral symmetry.

INTRODUCTION

Knowledge is the astonishing consequence of man's shortcomings; it has evolved from confusion, errors and chaos. But, as put by Francis Bacon, truth emerges more easily from error than from confusion. Therefore, to make any progress with confusing and complicated problems, some simple rules are very helpful, even if they are later proven wrong! Discoveries and theories very often turn out to be wrong regarding the phenomenon they were developed for. Nevertheless, they may be right in some other aspects! Searched and utilized with an open mind, theorems could make life a lot easier for a scientist, as well as an end user, since they open the window towards understanding and prediction.

*Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306-4380. Email address: [email protected]
†Department of Biosciences, Karolinska Institute, Stockholm, Sweden. Email address: lena.hammar@biosci.ki.se

Fig. 1. When the first virus crystals appeared with a diffraction pattern indicating the "unnatural" and crystallographically "forbidden" 5-fold symmetry, this was met with suspicion, since we did not understand how it could be integrated among the natural crystals. However, already in 1596 Kepler, although his theories on the matter were wrong, understood the icosahedron as one among the harmonic bodies of nature.

Without disturbing the dynamics of life, we can search for its basis in structural terms! It is through its structure that a living system performs its actions. The structure should therefore contain information for our imagination to comprehend! Looking at the structures of viruses, one is puzzled by both the regularity they present and the dynamic trap they hide! Part of the basis for this dynamic property was laid by the theory of quasi-equivalence. The theory has held for some time, and has now and then evoked discussions about its generality. However, the essence remains and is applicable to a variety of moving or transforming regular structures, as exemplified in other chapters of this book.

Page 56: Conformational Proteomics of Macromolecular Architecture

Quasi-Equivalence and Adaptability in Living Molecular Assemblies 43

GEOMETRY AND ACTION IN VIRUSES

Viruses have found the economical solution of using multiple copies of a single protein, or of a few proteins, to build their cages for intercellular genome transfer. This minimizes the genome space needed to encode the cage and provides a self-correcting structural solution, since only the fitting pieces will work.

There are two main types of simple viruses composed of protein and nucleic acid only: rod-shaped and isometric (spherical). The former is exemplified by the tobacco mosaic virus, TMV, a tubular container for the RNA genome. It assembles into rods of defined length in the presence of the genome. The dynamics of its structure still attracts our curiosity, resulting in reevaluations of early results.¹²

The isometric viruses were found to have icosahedral symmetry, first proven by X-ray diffraction studies on the tomato bushy stunt virus.³ An icosahedral shape provides a low-energy solution for shell formation, which is why it is common among viruses.

Sixty identical units can form an icosahedron. However, while the capsid proteins are relatively small, many virus capsids are of considerable size and built from a large number of building blocks! One may ask how these identical units can form the large spherical capsids, with obviously more than 60 units per shell, within a conserved symmetry. Furthermore, and applying also to the rod-shaped capsids, what is the fundamental principle behind the dynamics displayed by these regular bodies on assembly and disassembly?

The Theory of Quasi-Equivalence

The general idea for how identical structural units in a virus would assemble into a closed shell of icosahedral symmetry was much inspired by the geometrical principles applied by Buckminster Fuller in the construction of the geodesic domes.¹⁷ The central idea here is the theme of quasi-symmetry in the framework of the structure! This is the engineering discovery that enables the virion, like the geodesic dome, to withstand its own pressure! Thus, "the basic principle is that the shell is held together by the same type of bonds throughout, but that these bonds may be deformed in slightly different ways in the different, non-symmetry related environments." That means that, although it is impossible to place more than 60 chemically identical proteins in symmetrically identical positions in a spherical coat, multiples of 60 proteins can be arranged such that they are all in nearly identical environments. The triangulation of the icosahedral faces was introduced, and has since been the basis for structural classification of icosahedral viruses (see Chapter 20). The concept thus postulated that proteins could be adaptable to a limited extent in order to carry out required functions.

Fig. 2. Polyoma virus reconstructions derived at 25 Å from electron cryomicroscopy data. The outer shell is built from 72 pentameric capsomers of major capsid protein VP1 in a T=7d icosahedral surface lattice. The prongs inside the pentamers are VP2 and VP3, protruding from the less ordered interior core filled with DNA and histone proteins.

Considering the virus shell in terms of these principles, and with plausible assumptions on the degree of quasi-equivalence required, there is a general way in which isodimensional shells may be constructed from a large number of identical subunits, and this preferentially leads to icosahedral symmetry. Moreover, virus subunits organized on this scheme would self-assemble into shells of discrete sizes.

The capsid proteins of many viruses have the capacity to self-assemble, not only into the original icosahedral shells, but also into subviral particles, including helical tubules. Therefore the same principles of assembly would apply to both the helical tubule viruses and the isometric ones. A key characteristic of the structural units is then that they are capable of polymorphic transitions. In assembly, this builds on bonding specificity and conformational polymorphism.² The specificity of protein interactions is the unifying concept in biological organization, but to lead to a defined minimum-energy arrangement in assemblies, the units need to adapt in a quasi-equivalent manner to symmetry constraints.²;²⁰

Our understanding of the principles governing "self-assembly" of protein molecules is based largely on structural studies of viruses, or virus-like particles. Polymorphism is understood as the capacity of a single molecule to form different structures, and provides the functional basis for protein assemblies. This concept might seem to be the antithesis of specificity. The two aspects of design are, however, compatible in many structures and are in fact crucial to biological activity.

Any dynamic structure, i.e. a structure which changes its state, is by definition polymorphic. The distinctive aspect of polymorphism in protein structures, in contrast to non-living states of matter, is presumably that the molecular design has been selected to carry out a function, and that this function is part of an integrated system. Here quasi-equivalence may be defined as a small variation in conformation or bonding which leads to a more stable structure than strictly equivalent bonding would give. Quasi-equivalent bonding may occur if all the interactions among identical parts are not compatible with a regular arrangement.

Quasi-equivalent bonding is a topological necessity in the formation of a closed shell from multiple copies of identical units. Icosahedral surface lattices represent the optimal design for spherical shell structures, since the distortions in the specific bonds are minimized.² The degree of quasi-equivalence in icosahedral viruses may be determined.¹⁰

The specific bonding properties of a structure unit determine the surface curvature and thereby define which of the possible icosahedral surface lattices provides the most stable form on assembly. Nevertheless, polymorphism is likely, since a variety of lattices with only slightly higher energy than the most stable one could be formed using the same specific bonds. Such variants may arise through failure of a normal control mechanism or through variation in the kinetics of assembly.


Fig. 3. In the polyoma virus all capsomers are pentamers. Essentially, the C-terminal domain of the structural protein VP1 establishes the intercapsomer contacts. The capsomers at the 5-fold axis are usually referred to as pentavalent pentamers (shown in gray at the top in the right-hand panel) and those around the 3-fold axis as hexavalent pentamers. In total there are 6 different configurations of the VP1 arm. The left panel shows the connecting arms of the hexavalent pentamer. The atomic structure of the six VP1 configurations in the similar SV40 virus can be seen on page 68, Figure 6, or in PDB: 1SVA.¹⁶

Triangulation Number

The icosahedron itself has 20 equilateral triangular faces, and any icosadeltahedron has 20T facets, where T is the triangulation number given by the rule $T = h^2 + hk + k^2$, for all pairs of h and k having no common factor. This provides a classification rule for icosahedral virus geometry,ᵃ which may not always apply to the number of structural units.
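As a worked illustration of the rule (a sketch that uses only the formula stated above; the choice of index pairs is ours), the first few pairs of h and k without a common factor give:

```latex
\begin{align*}
T &= h^2 + hk + k^2 \\
(h,k) = (1,0):\quad T &= 1 + 0 + 0 = 1,  & 20T &= 20 \text{ facets} \\
(h,k) = (1,1):\quad T &= 1 + 1 + 1 = 3,  & 20T &= 60 \text{ facets} \\
(h,k) = (2,1):\quad T &= 4 + 2 + 1 = 7,  & 20T &= 140 \text{ facets} \\
(h,k) = (3,1):\quad T &= 9 + 3 + 1 = 13, & 20T &= 260 \text{ facets}
\end{align*}
```

The pairs (2,1) and (1,2) give the same T = 7 but mirror-image lattices, which is the distinction that a handedness suffix such as the d in T=7d records. Classes such as T = 4 or T = 9 are usually obtained by multiplying these values by the square of an integer; that extension is recalled from the general literature and is not spelled out in this chapter.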

Icosahedral Particles Revisited

A novel approach to the description of the protein stoichiometry of viral capsids, that is, the protein shells protecting the viral genome, has been introduced based on tiling theory.⁶ In this way we generalize the theory of quasi-equivalence to account also for non-quasi-equivalent subunit arrangements in icosahedral virus capsids that have been observed experimentally but are not directly covered by the earlier approach. The structures of the polyoma virus, Simian Virus 40 and L-A virus capsids, which were considered structural puzzles in view of the original theory, are thereby now regarded as applications of the principle.

ᵃThe last chapter in this book presents a website and database of solved high-resolution virus structures (VIPER: http://mmtsb.scripps.edu/viper/) classified according to this icosahedral triangulation convention.


Fig. 4. Negatively stained polyoma virus tubes as obtained by digital image processing of low-irradiation electron micrographs. Both the narrow pentamer tubes and the wider, so-called hexamer tubes are assemblies of pairs of pentameric capsomers.¹

Fig. 5. Polymorphism in the assembly of polyomavirus capsid protein VP1.²² Recombinant VP1 forms stable pentamers, corresponding to viral capsomers, in low ionic strength neutral or alkaline environments. These can assemble into spherical shells of different diameters. The 72-capsomer virus is 500 Å in diameter. An octahedral, 24-capsomer particle and an icosahedral, 12-capsomer one are 320 Å and 260 Å in diameter, respectively. The virus model is viewed near a quasi-threefold axis. Similar trimer contacts are made among the three capsomers at the center of each picture, but the angles between neighboring pentamer axes vary (about 25°, 44°, and 63°, respectively, for the three particles in the order displayed).²²

Studies on the polyoma viruses, which are DNA-containing tumor viruses, have demonstrated that the molecular adaptability goes beyond the modest conformational adjustments anticipated in the quasi-equivalence theory; nevertheless, essential bonding specificity is conserved in the contacts that tie the coat protein molecules together.

The polyomaviruses have triggered curiosity due to the very special arrangement in their shell. Initially there was a misconception about the number of protein copies in the capsid shell, which displays a T=7d icosahedral surface lattice. Judged by the plain rule, it would have 7 structural units per asymmetric unit, as demonstrated in the cauliflower mosaic virus, which also has a T=7 shell lattice.⁸ This represents a total of 420 units in the capsid. However, results from X-ray crystallography and electron microscopy of the polyomaviruses proved that all 72 capsomers are pentamers (Fig. 2). Therefore, the virus coat is assembled from 360 VP1 proteins.¹;¹⁹ In their high-resolution structure of SV40, Harrison and co-workers¹⁶ showed that the coat protein is remarkably adaptable in the 6 non-equivalent positions of the SV40 virion (Fig. 3, right). The various assembly forms observed further demonstrate the adaptability of VP1 pentamers: the recombinant VP1 may form narrow or wide tubes, or different spherical particles (Figs. 4, 5).
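The bookkeeping behind the 420 versus 360 discrepancy can be sketched in a few lines. This is an illustration only: the function names are ours, and the split of a T lattice into 12 pentavalent and 10(T-1) hexavalent positions is the standard geometric count rather than something stated in this chapter.

```python
# Sketch: subunit bookkeeping for an icosahedral surface lattice.
# Function names are illustrative; they do not come from any library.

def capsomer_positions(t_number: int) -> tuple:
    """Return (pentavalent, hexavalent) lattice positions: 12 and 10*(T-1)."""
    return 12, 10 * (t_number - 1)

def expected_subunits(t_number: int) -> int:
    """Subunits predicted by the plain quasi-equivalence rule: 60*T."""
    return 60 * t_number

def all_pentamer_subunits(t_number: int) -> int:
    """Subunits when every capsomer position is occupied by a pentamer."""
    pentavalent, hexavalent = capsomer_positions(t_number)
    return 5 * (pentavalent + hexavalent)

T = 7
pentavalent, hexavalent = capsomer_positions(T)
print(pentavalent + hexavalent)   # 72 capsomers
print(expected_subunits(T))       # 420 subunits by the plain rule
print(all_pentamer_subunits(T))   # 360 subunits if all capsomers are pentamers
```

The 60 missing subunits correspond to one subunit fewer at each of the 60 hexavalent positions, which is why the six VP1 environments cannot all be quasi-equivalent.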

That pentameric capsomers alone make up the polyoma structure was further supported by the observation that the wide "hexamer" tubes, with approximately hexagonally arrayed capsomers, form from purified VP1.¹ Similar pentamers were found in the narrower "pentamer" tubes. In both helical tubes (Fig. 4), the capsomers were arrayed in a particular pentagonal tessellation. This arises from the pairing of pentamers across 2-fold axes of the surface lattice. In both tube structures examined, at least one pair-wise contact between adjacent pentamers closely resembles the contact between the pentavalent and the hexavalent capsomers in the icosahedral capsid.

The TMV Revisited

Tobacco mosaic virus, another paradigm of a self-assembling structure, has been the focus of renewed studies to investigate structural switching. Details revealed in the structure and assembly properties have led to a fundamental switch in the model of the self-assembly process: rather than being nucleated by the hypothetical two-layer disk, virus assembly appears to be initiated by interaction of a specific RNA sequence with a short helical aggregate of the coat protein arranged as in the virus. Further TMV assembly does appear to involve conservation of bonding specificity, as initially presumed, but only in helical packing arrangements of the protein subunits (Fig. 6). Switching from disordered to ordered conformations of the protein, dependent on changes in the electrostatic interactions among the protein subunits, appears to be critical in controlling the assembly process.⁷,⁹,¹¹⁻¹³,¹⁸,²¹

Fig. 6. The packing of TMV protein in the disk aggregate is bipolar, and the repeating two-layer unit is similar to the dihedrally symmetrical A-ring pair in the disk crystal. Three-dimensional reconstructions of the stacked disk aggregate were obtained by electron microscopy of ice-embedded samples. The resolution achieved in the image processing of the electron micrographs is on the order of 9 Å in the meridional direction and 12 Å in the equatorial.

DYNAMICS

Philosophically, motion implies adaptability, and adaptability implies motion. Consequently, it is clear that action by a protein implies its ability to take on different structures. We all seek to explore as deeply as possible the relationship between the adaptability and movement of proteins and their many functions in life. Chemically identical proteins may be in quasi-equivalent environments, or perhaps non-equivalent environments, like the fingers of the hand (Fig. 7). There their design takes on the multiple conformations that their functioning requires. The corollary of this is that a component of stress can be built in, providing an energy resource for its dynamics. To explore flexibility and self-controlled movements in virus particles, the distribution of energy in the structure should be sought in the mechanical design and arrangement of matter. Concerted action by assemblies of identical proteins may often depend on individually differentiated movements. Such movements are seen in the T4 bacteriophage tail structure, where they are controlled by switching the subunits from an inactive, unsociable form to an active, associable form (Fig. 8). Energy to drive this change is provided by the intersubunit bonding in the growing structure. Moving from the low-resolution images of viruses that first hinted at the adaptability of proteins in forming assemblies, to the high-resolution molecular models of proteins in viruses, in helical assemblies, in actin, in myosin, is a remarkable journey. Proteins have proven more clever than structural biologists, and possibly much more adaptable.

Fig. 7. Fivefold, by DLC.

Fig. 8. The gradient of quasi-equivalent conformations modeled in the contracting sheath of the bacteriophage T4 tail has suggested a workable mechanism for self-determination of the tail tube length. Top left shows an electron micrograph of the tail and a filtered image. Bottom left shows models of the elongated and contracted protein assemblies. To the right is a mechanical model of the contractile T4 tail sheath. This was constructed to demonstrate how self-controlled activation of a latent bonding potential can drive a purposeful movement.⁵

ACKNOWLEDGMENTS

Through the years, great teamwork efforts, along with fights about interpretation and principles, have been experienced! The fact that many of the ideas, right and wrong ones, have evoked discussions is to be regarded as a great privilege, for which those involved are thankfully acknowledged! We sincerely thank the copyright holders for permission to publish the selection of figures included here.

REFERENCES

1. Baker TS, Caspar DL, and Murakami WT. Polyoma virus 'hexamer' tubes consist of paired pentamers. Nature, 1983; 303:446-8.

2. Caspar D, and Klug A. Cold Spring Harbor Symposia on Quantitative Biology, XXVII: 1-24, 1962.

3. Caspar DL. Structure of bushy stunt virus. Nature, 1956; 177:475-6.

4. Caspar DL. An analogue for negative staining. J Mol Biol, 1966; 15:365-71.

5. Caspar DL. Movement and self-control in protein assemblies. Quasi-equivalence revisited. Biophys J, 1980; 32:103-38.

6. Caspar DL, and Fontano E. Five-fold symmetry in crystalline quasicrystal lattices. Proc Natl Acad Sci USA, 1996; 93:14271-8.

7. Caspar DL, and Namba K. Switching in the self-assembly of tobacco mosaic virus. Adv Biophys, 1990; 26:157-85.

8. Cheng RH, Olson NH, and Baker TS. Cauliflower mosaic virus: a 420 subunit (T = 7), multilayer structure. Virology, 1992; 186:655-68.

9. Cross TA, Opella SJ, Stubbs G, and Caspar DL. 31P nuclear magnetic resonance of the RNA in tobacco mosaic virus. J Mol Biol, 1983; 170:1037-43.

10. Damodaran KV, Reddy VS, Johnson JE, and Brooks III CL. A general method to quantify quasi-equivalence in icosahedral viruses. J Mol Biol, 2002; 324:723-737.

11. Diaz-Avalos R, and Caspar DL. Structure of the stacked disk aggregate of tobacco mosaic virus protein. Biophys J, 1998; 74:595-603.

12. Diaz-Avalos R, and Caspar DL. Hyperstable stacked-disk structure of tobacco mosaic virus protein: electron cryomicroscopy image reconstruction related to atomic models. J Mol Biol, 2000; 297:67-72.

13. Dore I, Ruhlmann C, Oudet P, Cahoon M, Caspar DL, and Van Regenmortel MH. Polarity of binding of monoclonal antibodies to tobacco mosaic virus rods and stacked disks. Virology, 1990; 176:25-9.

14. Griffith JP, Griffith DL, Rayment I, Murakami WT, and Caspar DL. Inside polyomavirus at 25-Å resolution. Nature, 1992; 355:652-4.

15. Klug A, and Finch JT. Structure of viruses of the papilloma-polyoma type. IV. Analysis of tilting experiments in the electron microscope. J Mol Biol, 1968; 31:1-12.

16. Liddington RC, Yan Y, Moulai J, Sahli R, Benjamin TL, and Harrison SC. Structure of simian virus 40 at 3.8-Å resolution. Nature, 1991; 354:278-84.

17. Marks R. "The Dymaxion world of Buckminster Fuller." 1960; Reinhold, New York.

18. Raghavendra K, Salunke DM, Caspar DL, and Schuster TM. Disk aggregates of tobacco mosaic virus protein in solution: electron microscopy observations. Biochemistry, 1986; 25:6276-9.

19. Rayment I, Baker TS, Caspar DL, and Murakami WT. Polyoma virus capsid structure at 22.5 Å resolution. Nature, 1982; 295:110-5.

20. Reddy VS, Giesing HA, Morton RT, Kumar A, Post CB, Brooks CL, and Johnson JE. Energetics of quasiequivalence: computational analysis of protein-protein interactions in icosahedral viruses. Biophys J, 1998; 74:546-58.

21. Ruiz T, Ranck JL, Diaz-Avalos R, Caspar DL, and DeRosier DJ. Electron diffraction of helical particles. Ultramicroscopy, 1994; 55:383-95.

22. Salunke DM, Caspar DL, and Garcea RL. Polymorphism in the assembly of polyomavirus capsid protein VP1. Biophys J, 1989; 56:887-900.

23. Stehle T, Gamblin SJ, Yan Y, and Harrison SC. The structure of simian virus 40 refined at 3.1 Å resolution. Structure, 1996; 4:165-82.


Chapter 3

THE ROLE OF DISORDERED SEGMENTS IN VIRAL COAT PROTEINS

Lars Liljas*

Viral coat proteins have a variety of functions in addition to forming the protective shell around the viral nucleic acid. In many cases, the protein has disordered arms. These arms are used to regulate the assembly of coat proteins into symmetric shells, to interact with the viral nucleic acid and to control the release of the viral genome.

Keywords: Virus assembly, coat protein, protein disorder, protein-nucleic acid interaction

ORDER AND DISORDER IN PROTEINS

Most proteins depend on a specific conformation for their function. The amino acid sequence of the protein is thought to code uniquely for this conformation, which is obtained in the folding process. Functional proteins are often more or less globular with a hydrophobic core, and the formation of this core is what drives the folding process.

Although the folded protein has a unique conformation, it is not rigid. Most proteins need some sort of flexibility for their function. One common form of flexibility is the domain rotation found in many enzymes. Hexokinase is a classical example of domain rotation, where substrate binding causes the two halves of the molecule to close around the active site. Another well-known example is the bacterial elongation factor EF-Tu, where three relatively rigid domains have drastically different relative positions depending on whether the protein is in its active GTP form or its inactive GDP form. In other proteins, smaller regions of the structure undergo movements or conformational changes as normal parts of their function.

*Department of Cell and Molecular Biology, Uppsala University, Box 596, 751 24 Uppsala, Sweden. Email address: [email protected]

In the cases mentioned above, the proteins are fully folded but able to change conformation. In the case of domain rotation, there is often a linker region where the main chain torsion angles can vary easily at one or several positions. There are also several cases where the protein or part of it is not completely folded, or becomes folded only when interacting with other molecules. These proteins or protein segments might have a unique conformation when bound to another molecule, or might be able to take up different conformations depending on the molecule to which they are bound. Proteins interacting with RNA have in several cases been found to be partly unfolded before binding to their target (Williamson, 2000; Leulliot and Varani, 2001). In ribosomes, several proteins do not have a globular conformation but are extended, filling cavities between the RNA segments (Wimberly et al., 2000; Ban et al., 2000). These ribosomal proteins are unlikely to have a fixed conformation in the absence of their binding sites in the ribosomal RNA molecules.

The property of these proteins to fold only when bound to another molecule is probably important for their function. A protein designed to adopt its conformation only on binding will decrease its entropy considerably when the complex is formed. For the complex to form, the decreased entropy of the protein therefore has to be compensated by the interactions in the complex. As the interacting molecules evolved, this might have been a way of achieving the right level of binding affinity and specificity. Interaction of large rigid surfaces would allow a high degree of specificity of the binding, but might make the affinity too high. Another reason for keeping a protein unfolded before binding might be that the folded protein would be unable to interact properly with its target, perhaps for steric reasons.
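The entropy argument can be written out with the standard thermodynamic relation. The decomposition below is a sketch: the symbols ΔS_fold (the conformational entropy lost by the disordered segment on folding) and ΔS_other (all remaining entropy changes) are our own labels, T is the absolute temperature, and K_d is taken relative to a 1 M standard state.

```latex
\begin{align*}
\Delta G_{\mathrm{bind}} &= \Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{bind}},
  \qquad \Delta S_{\mathrm{bind}} = \Delta S_{\mathrm{other}} + \Delta S_{\mathrm{fold}},
  \quad \Delta S_{\mathrm{fold}} < 0 \\
\Delta G_{\mathrm{bind}} &=
  \underbrace{\Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{other}}}_{\text{interactions in the complex}}
  \;+\; \underbrace{\left(-T\,\Delta S_{\mathrm{fold}}\right)}_{>\,0} \\
K_d &= \exp\!\left(\frac{\Delta G_{\mathrm{bind}}}{RT}\right)
\end{align*}
```

Because the folding term is positive, it raises the free energy of binding and hence K_d: the same interface binds more weakly than it would if the segment were already folded, while the specificity carried by the interface itself is unchanged.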

How is a protein designed to fold in this way? Disordered regions might have unique properties that make them unable to form a hydrophobic core. Existing disordered proteins tend to have an amino acid composition that differs from that of well-folded proteins. For example, unfolded proteins might on average have a lower proportion of hydrophobic residues and a higher net charge (Uversky et al., 2000), or a sequence of low complexity where some amino acids are found less frequently than in ordered proteins (Dunker et al., 2001).
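The two composition measures mentioned here, hydrophobic content and net charge, are easy to compute for any sequence. The following is a minimal sketch under stated assumptions: it uses the Kyte-Doolittle hydropathy scale rescaled to the range 0 to 1, counts only Lys/Arg and Asp/Glu as charged, and applies both measures to two toy sequences invented for illustration. It is not the procedure of the cited studies.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(sequence: str) -> float:
    """Mean hydropathy, rescaled so each residue lies between 0 and 1."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in sequence) / len(sequence)

def mean_net_charge(sequence: str) -> float:
    """Absolute net charge per residue, counting K/R as +1 and D/E as -1."""
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return abs(positive - negative) / len(sequence)

# Hypothetical toy sequences, invented for illustration only.
core_like = "MVLIAGFLVAVLKDELIAW"   # rich in hydrophobic residues
arm_like = "MSKRKRRNSTPGKGSRRE"     # rich in basic and polar residues

for name, seq in (("core_like", core_like), ("arm_like", arm_like)):
    print(name, round(mean_hydropathy(seq), 2), round(mean_net_charge(seq), 2))
```

A segment resembling `arm_like`, with low hydropathy and high net charge, is the kind of composition that the text associates with disorder; real analyses would use complete sequences and a calibrated boundary between the two classes.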

THE FUNCTIONS OF VIRAL COAT PROTEINS

All viruses must have a protein coat, since that is a part of the definition of these entities. The main role of this coat is to protect the viral genome. The nucleic acid in viruses is either DNA or RNA, and it might be double-stranded or single-stranded. The protein coat is in some cases a single shell of protein molecules, but some viruses have two or even three protein shells, and others have a lipid bilayer with inserted proteins in addition to a protein coat enclosing the nucleic acid. This paper will describe some of the properties of coat or capsid proteins, and especially the use of disordered segments for various purposes.

The protein coat is always built up by many copies of one or a few coat proteins. In elongated rod-shaped viruses, the coat protein molecules are arranged with helical symmetry. In all the different viruses with a more spherical appearance, the number of molecules and the details of their interactions differ enormously, but the protein molecules are always arranged with icosahedral symmetry or a variant of that. The most basic property of the coat proteins is to form this symmetric shell, with or without the assistance of other components like the viral nucleic acid, separate scaffolding proteins or a lipid bilayer.

In the case of icosahedral capsids, the coat protein has to be able to form contacts that create a coat of the correct size. A regular icosahedron can be built up of 60 identical triangles related by two-, three- and five-fold symmetry axes. In icosahedral viruses, 60 copies of a basic building block build up a protein shell. The basic building block might be a single protein subunit or several identical or non-identical proteins. With a single protein molecule or non-identical proteins forming the basic unit, each protein molecule will have one set of interactions with its neighbors. In the common case of a unit with several identical proteins in the basic unit, the subunits will have different sets of interactions depending on their position in relation to the icosahedral symmetry axes.
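The number 60 follows from counting the rotations of the icosahedron. As a short worked check (standard geometry, added here as an illustration), the 6 five-fold axes pass through opposite pairs of the 12 vertices, the 10 three-fold axes through opposite pairs of the 20 faces, and the 15 two-fold axes through opposite pairs of the 30 edges:

```latex
\begin{align*}
\text{rotations} &= \underbrace{1}_{\text{identity}}
  + \underbrace{6 \times 4}_{\text{five-fold axes}}
  + \underbrace{10 \times 2}_{\text{three-fold axes}}
  + \underbrace{15 \times 1}_{\text{two-fold axes}} \\
  &= 1 + 24 + 20 + 15 = 60
\end{align*}
```

Each rotation carries one asymmetric building block onto another, so exactly 60 copies occupy strictly equivalent positions.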


Long before any high-resolution structures of viruses were known, Caspar and Klug (1962) suggested how multiples of 60 identical subunits could be arranged with similar (quasi-symmetric) interactions. Their theory led to the prediction that only some multiples of 60 subunits are allowed, according to the rule $T = h^2 + hk + k^2$, where h and k are integers and T is called the triangulation number. Although there are some exceptions, the coat proteins of most icosahedral viruses are arranged according to these predictions, so the packing of the coat protein is usually well described by the triangulation number.
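Which multiples of 60 the rule allows can be listed directly. The following is a minimal sketch that enumerates non-negative integer pairs (h, k); the function name is ours and is not taken from any library.

```python
def triangulation_numbers(limit: int) -> list:
    """Sorted values of T = h^2 + h*k + k^2 up to `limit` for integers h >= k >= 0."""
    found = set()
    h = 1
    while h * h <= limit:
        # k <= h is sufficient because the formula is symmetric in h and k.
        for k in range(0, h + 1):
            t = h * h + h * k + k * k
            if t <= limit:
                found.add(t)
        h += 1
    return sorted(found)

allowed = triangulation_numbers(25)
print(allowed)                        # [1, 3, 4, 7, 9, 12, 13, 16, 19, 21, 25]
print([60 * t for t in allowed[:5]])  # [60, 180, 240, 420, 540]
print(2 in allowed, 5 in allowed)     # False False
```

Shells of 120 or 300 quasi-equivalent subunits (T = 2 or T = 5) are therefore not predicted, whereas 180 (T = 3) and 240 (T = 4) are.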

In many cases, the coat proteins have additional roles. The most obvious function is to recognize and bind to the viral nucleic acid, but viruses have widely different mechanisms for this recognition. In other cases, the coat protein can catalyze hydrolysis of specific peptide bonds. The most well-known example is the capsid protein of alphaviruses like Sindbis virus, which is a serine protease with a fold similar to chymotrypsin (Choi et al., 1991). This protein is used to cleave itself from a structural polyprotein that also contains the proteins that are inserted in the lipid bilayer of these viruses. Another well-characterized function is found in phage MS2 and other leviviruses, where the coat protein acts as a translational repressor of one of the viral genes (Witherell et al., 1991).

DISORDER IN VIRAL COAT PROTEINS

The Presence of Arms in Viral Coat Proteins

The first virus structures that were determined showed a coat protein with a globular part formed by two antiparallel four-stranded sheets with a topology that was labeled jellyroll or Swiss roll (Harrison et al., 1978; Abad-Zapatero et al., 1980). In addition to the globular parts, N-terminal segments were extended and partly invisible in the electron density maps due to disorder. Crystal structures have now been determined for a large number of very different viruses and in a few cases for isolated viral coat proteins. The jellyroll fold (Fig. 1a) is found in many apparently unrelated icosahedral viruses (Chapman and Liljas, 2003). This includes viruses infecting animals, plants and bacteria, ranging in size from the small T = 1 plant satellite viruses (Liljas et al., 1982) to the large Paramecium bursaria Chlorella virus with more than 10 000 jellyroll domains in its outer protein shell (Nandhagopal et al., 2002). In all these viruses, the two four-stranded sheets are found, but the connecting loops vary enormously in length and conformation. In plant viruses, they are mostly relatively short. In other viruses, some of the connecting loops are very long (Tsao et al., 1991; Roberts et al., 1986) or even contain a separate domain (Munshi et al., 1996). The amino acid sequences of these proteins in most cases do not show any significant similarity, but the common fold and function indicate that they all have a common ancestor.

Figure 1. Examples of folds in viral coats. a) Jellyroll fold (poliovirus VP3). The eight strands of the jellyroll are highlighted. b) The Sindbis capsid protein (serine protease fold). c) The MS2 coat protein dimer (levivirus fold). d) The coat protein dimer of hepatitis B. e) The HK97 coat protein. All the figures are produced using the program Molscript (Kraulis, 1991).

There are also a number of other proteins with this fold, but there is no evidence for a common origin of the viral coat proteins and non-viral proteins with an unrelated function having this fold.

There are also some cases of viral coat proteins with other types of fold (see Tables 1 and 2). The MS2 fold is found only in the small RNA bacteriophages belonging to the Leviviridae family. It is formed by an antiparallel sheet and two helices that are inserted between structural elements in another monomer in a tightly bound dimer (Fig. 1c). This fold is similar, but not identical, to the fold of many small RNA-binding protein domains. The hepatitis B virus coat protein is unusual in that it consists only of helices (Fig. 1d). The coat protein of the tailed bacteriophage HK97 has an unusual shape with two domains with mixed α and β conformation (Fig. 1e). Neither of these two domain folds has been observed in other proteins.

Table 1. Types of folds observed in capsids of icosahedral viruses

| Type of fold | Found in |
|---|---|
| Jellyroll fold: sandwich of two four-stranded antiparallel sheets | All families of non-enveloped positive-strand RNA plant, insect and animal viruses, some DNA phages, adenoviruses, polyoma- and papillomaviruses, parvoviruses |
| MS2 fold: dimer with a large antiparallel sheet and helices clamping monomers together | Leviviridae |
| HK97 fold: two domains formed by mostly antiparallel sheets and helices | Tailed bacteriophages |
| Hepatitis B fold: helical with two helices forming bundle in dimer | Hepadnaviridae |
| Serine protease fold | Alphavirus nucleocapsids |
| Reovirus nucleocapsid protein foldᵃ: elongated protein formed by several helices and sheets | Reoviruses (animal, plant), totiviruses |
| Retrovirus capsid protein: two helical domains with flexible connection | Retroviruses |

ᵃReoviruses have an inner protein shell formed by 120 copies of a protein. This protein is highly variable in sequence and conformation. A protein with a similar shape is found in the totivirus L-A, which has a single protein shell of 120 subunits.

Many viral coat proteins have extended arms that are used for interactions with other molecules (Table 2). This is true for almost all proteins with the jellyroll fold. In many simple viruses, an N-terminal part of the chain is not part of the fold. In some viruses with a jellyroll fold, the C-terminus is also an extension of the basic β sandwich. In alphaviruses, the serine protease-like nucleocapsid protein has an N-terminus of (in the case of Sindbis virus) 113 amino acid residues that is partly basic, variable between alphaviruses, and completely disordered in the crystal structure of the isolated protein (Choi et al., 1991). The HK97 protein has an N-terminal arm extending from the core (Wikoff et al., 2000) and an N-terminal segment of about 100 amino acid residues that is cleaved and removed in a maturation step in the assembly process. The structure of the hepatitis B coat protein is known from structural studies of recombinant particles (Wynne et al., 1999). To produce these, the C-terminal 36 residues of the protein, which are expected to be extended, were excluded. There is one clear exception to the rule that viral coat proteins have extensions to the globular folded domain: in leviviruses, the complete chain is used to form the observed fold (Valegård et al., 1990).

The conformation of the extended arms before assembly of the protein coat is not known. Most likely, the arms are more or less disordered. The conformation of the arms in solution could be studied by NMR, but most capsid proteins have a strong tendency to aggregate at the concentrations required for this type of study. The apparent disorder of arms in coat protein subunits after assembly is probably mainly of a static kind. The segments have different conformations in different subunits, and these differences, as well as the asymmetry of the viral nucleic acid, do not influence the symmetrical surface of the capsid. During crystallization, the particles will pack in any of 60 equivalent orientations. Atoms that do not follow the icosahedral symmetry will not contribute significantly to the diffraction, since the scattering from different particles in the crystal will be different. That means that the nucleic acid is normally not seen in the map even if it might have a unique fold. In the same way, the extended arms will not be visible when they differ in conformation.
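The averaging argument can be illustrated with a deliberately simplified numerical analogy. The Python sketch below replaces the icosahedral particle with a ring of 60 positions (an assumption made only for illustration, since the real symmetry group is icosahedral rather than cyclic). Every position carries the same "shell" density, and one position carries an extra unique feature. Averaging over all 60 equivalent orientations, as happens over the particles of a crystal, leaves the symmetric shell unchanged and dilutes the unique feature to 1/60 of its height.

```python
N_ORIENTATIONS = 60  # equivalent orientations of an icosahedral particle


def rotate(ring, shift):
    """Cyclic shift of a list; stands in for one symmetry-equivalent orientation."""
    return ring[shift:] + ring[:shift]


def crystal_average(ring):
    """Average the ring over all of its equivalent orientations."""
    n = len(ring)
    total = [0.0] * n
    for shift in range(n):
        for i, value in enumerate(rotate(ring, shift)):
            total[i] += value
    return [t / n for t in total]


# Symmetric shell (height 1 everywhere) plus one asymmetric feature (height 1 at position 0).
particle = [1.0] * N_ORIENTATIONS
particle[0] += 1.0

averaged = crystal_average(particle)
diluted_feature = averaged[0] - 1.0  # what remains of the unique feature at any position
print(f"shell density: 1.0, residual feature density: {diluted_feature:.4f}")
```

In this toy model the unique feature survives only as a uniform background of about 0.017, which is the sense in which a uniquely folded nucleic acid or a variable arm disappears from the electron density map.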

Table 2. Types of folds observed in capsids of icosahedral viruses

Abbreviations: STNV: satellite tobacco necrosis virus; CPV: canine parvovirus; SCPMV: southern cowpea mosaic virus, formerly called southern bean mosaic virus (SBMV), cowpea strain (this virus is not yet associated with a specific family); TYMV: turnip yellow mosaic virus; CCMV: cowpea chlorotic mottle virus; NωV: Nudaurelia capensis ω virus; SV40: simian virus 40. Values in parentheses refer to the disordered part of the extension.

| Virus | Virus family | Type of foldᵃ | Number of subunits in capsid | Length of N-terminal extension (disordered) | Charges of N-terminal extension (disordered) | Length of C-terminal extension (disordered) | Charges of C-terminal extension (disordered) |
|---|---|---|---|---|---|---|---|
| STNV | | Jellyroll | 60 | 24 (11) | 7+ (4+) | | |
| CPV | Parvoviridae | Jellyroll | 60 | 55 (36) | 3+, 4- (2+, 2-) | | |
| Poliovirus VP2ᵇ | Picornaviridae | Jellyroll | 60 | 122 (6) | 10+, 13- (1-) | 12 (0) | |
| Poliovirus VP3 | Picornaviridae | Jellyroll | 60 | 43 (0) | 1+, 6- | 13 (2) | |
| Poliovirus VP1 | Picornaviridae | Jellyroll | 60 | 76 (19) | 6+, 7- (1+, 2-) | 30 (0) | |
| SCPMV | Sobemoviruses | Jellyroll | 180 | 64 (38)ᶜ | 15+ (12+) | | |
| TYMV | Tymoviridae | Jellyroll | 180 | 26 (0) | 2+, 4- | 5 | |
| CCMV | Bromoviridae | Jellyroll | 180 | 49 (26) | 11+, 1- (9+) | 12 (0) | 1+, 2- |
| Pariacoto virusᵈ | Nodaviridae | Jellyroll | 180 | 82 (6) | 18+, 5- | 80 | |
| Norwalk virus | Caliciviridae | Jellyroll + | 180 | 49 (9) | 1+, 5- (1+, 1-) | External domain | |
| NωVᵈ | Tetraviridae | Jellyroll + | 240 | 120 (41) | 19+, 12- (12+, 1-) | 114 (30-50) | 14+, 8- |
| Hepatitis B | Hepadnaviridae | Helical | 240 | | | (37) | 17+, 2- |
| SV40 | Polyomaviridae | Jellyroll | 360 | 43 (17) | 9+, 3- (5+, 1-) | 65 (var) | 5+, 8- |

ᵃJellyroll + indicates that extra domains are involved. ᵇThe N-terminal of poliovirus VP2 includes VP4, which is cleaved from VP2 after assembly. ᶜThe length of the disordered segment is given for the subunit with the least degree of disorder. ᵈThe C-terminal of Pariacoto virus and NωV includes the γ-peptide that is cleaved after assembly.

Arms Used for Stabilization of Structure and Control of Disassembly

The arms in picornaviruses

The protein shells of picornaviruses are built up of 60 copies of three different jellyroll proteins called VP1, VP2 and VP3. A small fourth protein, VP4, is initially an N-terminal extension of VP2 that is cleaved at a late stage of the assembly. All these proteins have long N-terminal arms (40-80 residues) that are found on the inside of the shell (Rossmann et al., 1985, Hogle et al., 1985, Luo et al., 1987, Acharya et al., 1989). Most of these arms are ordered and extend to interact with other subunits and form a network on the inside of the protein shell (Fig. 2). These arms are less conserved in sequence and conformation when different picornaviruses are compared, but they still mostly have a similar path on the inside of the shell. This similarity of the conformation of the N-terminal arms also extends to insect viruses of the Dicistroviridae family (Liljas et al., 2002).

Figure 2. A protomer of the poliovirus capsid as seen from the inside of the particle with VP4 and the N-terminal arms of VP1-3 highlighted.

The arms in picornaviruses obtain their conformation after assembly. The coat protein molecules are part of a structural polyprotein VP4-VP2-VP3-VP1 that is initially cleaved between VP2 and VP3 and between VP3 and VP1 to allow the assembly to proceed. The free termini of the proteins produced by the cleavage are far apart in the final structure. The mechanism for the final cleavage to release VP4 from VP2 is still unclear, but it does not appear to be catalyzed by a separate protease (Arnold et al., 1987, Harber et al., 1991, Lee et al., 1993). A crystal structure of poliovirus capsids where the final cleavage to release VP4 from VP2 is inhibited shows that the network of N-terminal arms has not yet formed (Basavappa et al., 1994).

The function of the arms in picornaviruses appears to be to stabilize the mature particle. The area of contact between relatively rigid surfaces can be used as a measure of the affinity between interacting proteins. In this case the contact area will be large, but since the extended arms will become ordered only at assembly, it is difficult to estimate how much these interactions will contribute to the particle stability.

It is also possible that the arms are important for the release of the viral RNA during infection. There is evidence that VP4 and the N-terminal segment of VP1 become externalized when the virus particles bind to their receptor at the cell surface (Fricks and Hogle, 1990). The exposed N-terminal arm of VP1 is able to attach particles to liposomes (Fricks and Hogle, 1990), and it is possible that this portion of the protein is able to interact with the cell membrane and create a pore through which the viral RNA can enter the host cell cytoplasm. The cleavage of VP4 would therefore be an important maturation step allowing the release of the RNA at the correct moment. No structural details are, however, known of intermediates in the disassembly process.

Helical regions in nodaviruses and tetraviruses

The insect viruses in the T = 3 Nodaviridae and the T = 4 Tetraviridae families both have N-terminal and C-terminal regions that are found on the inside of the shell (Hosur et al., 1987, Munshi et al., 1996). These regions contain several helices connected by loops of various lengths (Fig. 3). The helices in the mature particles form a protein layer inside the layer formed by the jellyroll domains. Similar to the picornaviruses, a cleavage occurs after assembly. The γ peptide, a C-terminal segment of 44 and 74 residues in nodaviruses and tetraviruses, respectively, is cleaved autocatalytically (Friesen and Rueckert, 1981, Gallagher and Rueckert, 1988, Agrawal and Johnson, 1992). In the case of nodaviruses, the cleavage has been shown to increase particle stability (Gallagher and Rueckert, 1988), but more importantly it is necessary for full infectivity (Schneemann et al., 1992). There is no direct evidence for a role of the arms in the disassembly process in these viruses, but it has been suggested that the cleavage in nodaviruses will allow a pentameric bundle of helices formed by the γ peptide to be externalized and interact with the cell membrane and allow RNA entry into the host cell (Cheng et al., 1994). This is supported by studies suggesting that part of the γ peptide will modify the properties of lipid bilayers (Janshoff et al., 1999, Bong et al., 1999).

Figure 3. The NωV coat protein (C subunit). The view is tangential to the viral surface. The C-terminal and N-terminal segments (pale) form helices and extended segments on the inside surface of the capsid.

Figure 4. The N-terminal glycine-rich segment of CPV passes through a thin pore at the five-fold axis (Xie and Chapman, 1996). The jellyroll of the five subunits surrounding the five-fold axis is drawn, but only one N-terminal segment.

In contrast to many of the other N- and C-terminal segments, the arm segments in the coat proteins of nodaviruses and tetraviruses are partly helical. These elements of secondary structure might be formed already before assembly, but the position of the helices in the assembled virus is dependent on the interaction with other subunits. The loop regions are relatively long and flexible and would allow the rearrangement of the helices upon assembly. In NωV, part of one helix has a completely different conformation in different subunits, indicating that some of the helices might be only marginally stable before the assembly (Helgstrand et al., 2003).

The N-terminal arm in parvoviruses

In canine parvovirus (CPV), which has T = 1 symmetry, an N-terminal segment of about 40 residues is disordered in most of the subunits, but some of these arms appear to pass through a small pore at the five-fold axis and extend to the outer surface of the virus (Tsao et al., 1991). A part of this segment has the very unusual sequence:

GSGNGSGGGGGGGSGG

The large number of glycines explains why this segment can pass through the pore (Fig. 4). Some of the subunits in the CPV shell are cleaved at the N-terminus, and the extension of the N-terminus through the five-fold pore might be necessary for this cleavage (Xie and Chapman, 1996). Similar features are found in many viruses in the Parvoviridae family.
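The composition of the segment can be checked with a few lines of Python. This is only a counting sketch using the sequence exactly as printed above; the variable names are illustrative.

```python
from collections import Counter

# Glycine-rich part of the CPV N-terminal segment, as printed in the text.
segment = "GSGNGSGGGGGGGSGG"

counts = Counter(segment)
glycine_fraction = counts["G"] / len(segment)

for residue, n in counts.most_common():
    print(f"{residue}: {n}")
print(f"glycine fraction: {glycine_fraction:.2f}")
```

Roughly three quarters of the residues are glycine and the rest are serine or asparagine, so the segment carries no bulky side chain, in line with the suggestion that it can thread through the narrow five-fold pore.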

Arms Used for Regulation of Assembly

Flat and bent contacts controlled by switches

The extended arms in the first virus structures gave a good explanation of how the assembly of multiples of 60 identical coat protein molecules can be regulated. In tomato bushy stunt virus (TBSV) and southern cowpea mosaic virus (SCPMV, formerly called southern bean mosaic virus, SBMV), 180 coat protein molecules form the protein shell (T = 3), and an N-terminal segment turned out to be ordered in only one out of three subunits (Harrison et al., 1978, Abad-Zapatero et al., 1980). The ordered segment is inserted between subunits to create a flat contact between pairs of subunits, whereas other subunit contacts without the inserted arm are bent (Fig. 5). The ordered parts of the three arms meet at the three-fold symmetry axes of the icosahedral particle and form a ring-like structure. A very similar pattern of ordered and disordered arms has been found in some other plant viruses (Hogle et al., 1986, Oda et al., 2000, Qu et al., 2000). The difference in order of this arm in the subunits has been described as a switch to regulate the assembly of the T = 3 shell (Harrison, 1980). The interactions between the ordered arms from different subunits result in a network. This network of ordered arms allows all subunits to sense their position in the lattice and form the desired bent or flat contacts independent of the exact pathway of assembly. This solves the problem of packing identical molecules with different contacts: the presence of the arm leads to two types of contacts where only a few interactions are the same at the atomic level. The arm has been removed from the viral coat protein of two sobemoviruses by proteolysis or by recombinant techniques, and the truncated protein forms T = 1 particles with only 60 subunits, supporting the hypothesis that the arm is crucial for the correct assembly of T = 3 particles in this group of viruses (Erickson and Rossmann, 1982, Lokesh et al., 2002).

Figure 5. Arms regulating assembly in T = 3 viruses. a) Packing of six icosahedral asymmetric units in rice yellow mottle virus, a sobemovirus (Qu et al., 2000). The view is down a three-fold axis. b) The same view as in a) but with the N-terminal arms of the C subunit highlighted. The arms interact at the three-fold axis to form what is labeled “annulus”, referring to the ring-like appearance of the corresponding feature in TBSV. c) The flat quasi-sixfold contact between A and B subunits and d) the bent five-fold contact between C subunits as seen in a tangential direction. e) The arms of the A and B subunits in desmodium yellow mottle virus (Larson et al., 2000), a tymovirus, form a sixfold β barrel at the three-fold axis.

A similar switch is also found in nodaviruses. These insect viruses are similar to SCPMV in that they have a T = 3 lattice with flat and bent contacts between the jellyroll subunits (Hosur et al., 1987). The flat contacts are stabilized by a peptide segment, but this segment does not form the ring-like structure found in the plant viruses, and it is connected to a different coat protein molecule in the lattice (Tang et al., 2001). The flat contacts are further stabilized by the binding of segments of the viral RNA forming a double helix (Fisher and Johnson, 1993, Tang et al., 2001). Also in the case of nodaviruses, the importance of the N-terminal arm for assembly has been analyzed by mutational studies (Dong et al., 1998). The truncated coat protein of Flock house virus was still able to form T = 3 particles and did not form T = 1 particles in the same way as sobemoviruses. The assembly product was inhomogeneous, however, indicating an important role of the N-terminal extension also for these viruses.

Another version of flat and bent contacts in a T = 3 virus is found in Norwalk virus, which belongs to the Caliciviridae family. There is an N-terminal arm that is disordered to different degrees in the three subunits, and the ordered part of the arm is found in subunit-subunit interfaces (Prasad et al., 1999). The pattern of order/disorder and the path of the ordered arm are, however, completely different from the network of ordered arms found in the T = 3 plant viruses described above.

The tetraviruses have some properties in common with the nodaviruses. These viruses have a T = 4 lattice, and the structure of NωV shows that the subunits in the virus form flat and bent contacts in a similar way as the nodaviruses (Munshi et al., 1996, Helgstrand et al., 2003). In NωV, the flat contacts are stabilized by C-terminal arms from two of the four subunits rather than the N-terminal arms in other viruses. A helix at the very C-terminus of the γ peptide cleaved from the coat protein is inserted in the interface between pairs of subunits. This region is disordered in the other two subunits.

There are T = 3 plant viruses that do not have this network of ordered segments regulating flat and bent contacts. The protein coats of cowpea chlorotic mottle virus (Speir et al., 1995) and turnip yellow mosaic virus (Canady et al., 1996), belonging to different virus families, both have an arrangement of jellyroll subunits where two out of three subunits have an ordered N-terminal arm. Six arms meet at the three-fold symmetry axes to form a short β cylinder with approximate sixfold symmetry (Fig. 5e). In cucumber mosaic virus, which is related to cowpea chlorotic mottle virus, a similar arrangement but with six helices is found (Smith et al., 2000). The subunit-subunit contacts are more similar than in the viruses with flat and bent contacts, and in this way the subunit arrangement is in close agreement with the predictions of the Caspar and Klug hypothesis. There is still a type of order/disorder switching that in some way seems to regulate the assembly.

Figure 6. Arms in SV40 (Stehle et al., 1996). The arms in SV40 extend from each subunit of a pentamer and join the jellyroll of the subunit in another pentamer. Part of the arm forms different contacts depending on its position in the lattice. Four pentamers are drawn as well as the C-terminal segment from a further subunit to complete the interaction at the three-fold axis.

The arms in SV40

In polyomavirus and SV40, the protein shell is formed by units arranged in a T = 7 lattice. The lattice is built up of stable pentamers, rather than the expected mixture of hexamers and pentamers. Pentamers are found also at the positions predicted to have a hexamer of subunits according to the original rules suggested by Caspar and Klug (1962). There are thus six independent subunits in the lattice. These protein molecules might in principle have six different types of interactions. An analysis of the SV40 protein shell shows that there are essentially three different kinds of contacts (Liddington et al., 1991). A C-terminal arm mediates all the contacts. Part of this arm invades another pentamer and interacts with the jellyroll of another subunit (Fig. 6). Another part of the arm forms different contacts depending on its position in the lattice. In one contact, this segment is disordered, but in the other contacts it forms a helix that makes a two-fold or a three-fold contact with the corresponding helices of other subunits. An N-terminal arm is partly disordered, but an ordered part clamps the invading C-terminal arm to the subunit. In the similar papillomavirus, truncation of part of the N-terminal of the coat protein leads to the formation of T = 1 particles (Chen et al., 2000).
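The count of six independent subunits follows from simple bookkeeping, sketched below in Python. The sketch relies on the standard Caspar and Klug result, not stated explicitly in the text, that a lattice with triangulation number T has 12 pentamer positions and 10(T - 1) hexamer positions. The function and variable names are illustrative.

```python
def capsomer_positions(t_number):
    """Pentamer and hexamer positions in a Caspar-Klug lattice (standard result)."""
    pentamer_positions = 12
    hexamer_positions = 10 * (t_number - 1)
    return pentamer_positions, hexamer_positions


T = 7
pentamer_positions, hexamer_positions = capsomer_positions(T)

# Caspar-Klug prediction: true hexamers at the hexamer positions.
predicted_subunits = 5 * pentamer_positions + 6 * hexamer_positions

# SV40 and polyomavirus: pentamers at every position.
observed_subunits = 5 * (pentamer_positions + hexamer_positions)

# Subunits per icosahedral asymmetric unit (60 asymmetric units per particle).
independent_subunits = observed_subunits // 60

print(predicted_subunits, observed_subunits, independent_subunits)
```

With pentamers at all 72 positions the shell holds 360 subunits instead of the 420 predicted for T = 7, which gives six rather than seven subunits per asymmetric unit and matches the 360 subunits listed for SV40 in Table 2.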

Quasi-symmetry without arms

As described above, extended arms appear to be involved in the regulation of subunit packing in many viruses with quasi-symmetry, and this is normally connected with a disorder of a segment in a set of the quasi-equivalent subunits. One might therefore ask if this type of regulation is necessary for the formation of capsids with quasi-symmetry. If the ordered arms form a network through interactions with other ordered arms, this network will store the information about the type of interaction required from every subunit that is inserted in the growing capsid. If there is no such network formed during the assembly, the incoming subunits will not easily sense their position in the lattice and might form interactions leading to an incorrect curvature, as has been found for some plant virus proteins lacking the N-terminal arm (Erickson and Rossmann, 1982, Lokesh et al., 2002).

The structure of MS2, a T = 3 bacterial virus, showed that efficient assembly of a capsid with quasi-symmetry is possible without extended arms (Valegård et al., 1990). In this virus, all of the polypeptide chain is ordered. The fold of the protein is different from other viruses, and it forms tightly interacting dimers. There is a relatively long loop connecting two β strands, and this loop has two types of conformations depending on the position in the lattice, but it is unclear if this conformational switching is in any way involved in the assembly control. The quasi-equivalent subunit-subunit interactions are very different at the atomic level, although the same surfaces are used. In the absence of a network of arms, the assembly of this capsid has to be regulated in another way. Possibly, the interacting surfaces are designed such that the T = 3 arrangement gives the optimal contacts. Growing particles with the correct curvature will be assembled and closed more efficiently than particles with some incorrect contacts.

Many large virus particles have more than one protein layer in the assembled capsid or use scaffolding proteins that are removed after assembly. One example is the viruses in the Reoviridae family. In bluetongue virus, the T = 13 arrangement of the VP7 protein (a two-domain protein with a large helical domain and an outer jellyroll domain) is formed on the inner VP3 shell containing 120 protein molecules (Grimes et al., 1998). A similar arrangement appears to be found in reovirus (Reinisch et al., 2000). The 780 VP7 subunits are arranged in trimers, and the subunit contacts are relatively similar. The size is controlled by the inner shell, and there is no need for conformational switching in the VP7 layer.
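The subunit numbers quoted for bluetongue virus can be reproduced with the 60T rule. The following Python lines are only an arithmetic check; the variable names are illustrative, and the trimer count is derived here rather than taken from the text.

```python
ASYMMETRIC_UNITS = 60  # icosahedral asymmetric units per particle

# Outer layer: VP7 in a T = 13 arrangement.
vp7_t_number = 13
vp7_subunits = ASYMMETRIC_UNITS * vp7_t_number
vp7_trimers = vp7_subunits // 3

# Inner shell: 120 copies of VP3, i.e. two copies per asymmetric unit.
vp3_subunits = 120
vp3_per_asymmetric_unit = vp3_subunits // ASYMMETRIC_UNITS

print(vp7_subunits, vp7_trimers, vp3_per_asymmetric_unit)
```

Two copies per asymmetric unit does not correspond to any allowed triangulation number, which fits the statement that the inner shell, rather than a quasi-equivalent switch, sets the size of the particle.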

Arms Used for Binding to Nucleic Acid

In many viruses, the coat protein molecules interact with the viral nucleic acid. This interaction might be of two kinds: a specific binding to a recognition signal in the nucleic acid to ensure that the correct viral nucleic acid is encapsidated, and an unspecific binding that might be required for the packaging of the viral genome.

An extended part of the coat protein is used to pack the nucleic acid in many simple RNA viruses. These viruses have N-terminal arms that contain several lysines and arginines that can interact with the negatively charged phosphates of the RNA. This was first observed in the structures of RNA plant viruses like TBSV and SCPMV (Harrison et al., 1978, Abad-Zapatero et al., 1980). In SCPMV, 38 residues are disordered in all three subunits. These include 12 positively charged residues of which eight are found in an arginine-rich region with the sequence:

RRKRRAKRR

The 2160 positive charges of the disordered arms will be able to neutralize roughly half of the approximately 4200 nucleotides in the viral RNA.
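The numbers in this estimate can be retraced in a few lines of Python. The sketch assumes that only lysine (K) and arginine (R) count as positively charged and that each nucleotide contributes one negative phosphate charge; both are simplifications made for the purpose of the check.

```python
arginine_rich_region = "RRKRRAKRR"  # sequence as printed in the text

# Positively charged residues in the arginine-rich region.
positive_in_region = sum(1 for residue in arginine_rich_region if residue in "KR")

subunits_in_capsid = 180       # T = 3 shell of SCPMV
positive_per_arm = 12          # positive residues in the 38 disordered residues
genome_nucleotides = 4200      # approximate size of the viral RNA

total_positive = subunits_in_capsid * positive_per_arm
neutralized_fraction = total_positive / genome_nucleotides

print(positive_in_region, total_positive, round(neutralized_fraction, 2))
```

The region contributes eight of the twelve positive residues per arm, and 180 arms give 2160 charges, about 51% of the phosphate charges under these assumptions.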

Page 84: Conformational Proteomics of Macromolecular Architecture

The Role of Disordered Segments in Viral Coat Proteins 71

Disordered N-terminal segments with many positively charged residues are found in all plant T = 3 viruses with flat and bent contacts. Positively charged RNA-binding arms are also found in bromoviruses, nodaviruses and some plant satellite viruses with T = 1 capsids. There are, however, many simple viruses like picornaviruses, tymoviruses and leviviruses that lack positively charged arms. In these viruses, the negative charges are neutralized by polyamines that are incorporated in the virus particles.

The positively charged arms probably have a random, extended conformation in the absence of the nucleic acid. When interacting with the RNA and its negatively charged phosphates, the repulsion between the positive charges of the side chains will be much reduced. In some cases, there is evidence that the arm forms a helix in contact with the RNA (van der Graaf et al., 1991). In satellite tobacco necrosis virus, part of the N-terminal is visible in the crystal structure. This part forms a helix, and a bundle of three helices around the three-fold axis is ordered and interacts with the RNA (Liljas et al., 1982).

The specific recognition of the viral nucleic acid is in general not well characterized. The main exception is the leviviruses, where a translational operator, an RNA hairpin in the viral RNA, is used also as a packaging signal (Witherell et al., 1991). This recognition has been well characterized structurally (Valegård et al., 1994). The recognition site is found on the surface of a β sheet and does not involve any extended arms. A specific interaction has also been characterized in turnip crinkle virus, a virus similar to TBSV (Wei and Morris, 1991), and in SCPMV (Hacker, 1995), but no structural details are known for this interaction. In the nodavirus Pariacoto virus, the partly visible N-terminal of one of the coat proteins interacts with the segment of visible double-stranded RNA centered on the two-fold axis (Tang et al., 2001). The interactions involve both phosphate and base contacts. Both the N-terminal and the C-terminal ends of the coat protein of nodaviruses have been suggested to interact specifically with the viral RNA (Marshall and Schneemann, 2001, Schneemann and Marshall, 1998). It is not clear, however, to what extent the interactions observed in the crystal structure, which are thus found at all 60 equivalent positions in the capsid, are similar to the specific interactions.

CONCLUSIONS

The frequent use of extended arms in viral coat proteins is probably caused by the need for viruses to use the machinery of the host cell efficiently. Instead of coding for different proteins for different mechanisms needed in their life cycle, they have evolved extended segments that can be used for other purposes than forming the protecting shell. There is in general little conservation of these additional features of the coat protein. One example is the regulation of flat and bent contacts. Ordered segments from different subunits are used in the plant or insect T = 3 viruses, and in the T = 4 tetraviruses a C-terminal region is used instead of the N-terminal segments in the T = 3 viruses. Another example is the lack of a positively charged N-terminal arm in turnip yellow mosaic virus, which is in other respects similar to cowpea chlorotic mottle virus. Our understanding of the mechanisms where these segments are used is in general poor. This is especially true for the interactions with host cells, where it is still unclear to what extent coat protein segments interact with the cell membrane and how this might lead to insertion of the viral nucleic acid.

ACKNOWLEDGMENTS

This work has been supported by the Swedish Research Council.

REFERENCES

1. Abad-Zapatero C, Abdel-Meguid SS, Johnson JE, Leslie AGW, Rayment I, Rossmann MG, Suck D and Tsukihara T. Structure of southern bean mosaic virus at 2.8 Å resolution. Nature, 1980; 286: 33-39.

2. Acharya KR, Fry E, Stuart D, Fox G, Rowlands D and Brown F. The three-dimensional structure of foot-and-mouth disease virus at 2.9 Å resolution. Nature, 1989; 337: 709-716.

3. Agrawal DK and Johnson JE. Sequence and analysis of the capsid protein of Nudaurelia capensis ω virus, an insect virus with T=4 icosahedral symmetry. Virology, 1992; 190: 806-814.

4. Arnold E, Luo M, Vriend G, Rossmann MG, Palmenberg C, Parks GD, Nicklin MJH and Wimmer E. Implication of the picornavirus capsid structure for polyprotein processing. Proc Natl Acad Sci (USA), 1987; 84: 21-25.

5. Ban N, Nissen P, Hansen J, Moore PB and Steitz TA. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 2000; 289: 905-20.

6. Basavappa R, Syed R, Flore O, Icenogle JP, Filman DJ and Hogle JM. Role and mechanism of the maturation cleavage of VP0 in poliovirus assembly: Structure of the empty capsid assembly intermediate at 2.9 Å resolution. Prot Sci, 1994; 3: 1651-1669.

7. Bong DT, Steinem C, Janshoff A, Johnson JE and Reza Ghadiri M. A highly membrane-active peptide in Flock House virus: implications for the mechanism of nodavirus infection. Chem Biol, 1999; 6: 473-81.

8. Canady MA, Larson SB, Day J and McPherson A. Crystal structure of turnip yellow mosaic virus. Nature Struct Biol, 1996; 3: 771-781.

9. Caspar DLD and Klug A. Physical principles in the construction of regular viruses. Cold Spring Harbor Symp. Quant Biol, 1962; 27: 1-24.

10. Chapman MS and Liljas L. Structural Folds of Viral Proteins, in Chiu, W. and Johnson, J. E. (eds.), Advances in Protein Chemistry. Academic Press, 2003, pp. in press.

11. Chen XS, Garcea RL, Goldberg I, Casini G and Harrison SC. Structure of small virus-like particles assembled from the L1 protein of human papillomavirus 16. Mol Cell, 2000; 5: 557-67.

12. Cheng RH, Reddy VS, Olson NH, Fisher AJ, Baker TS and Johnson JE. Functional implications of quasi-equivalence in a T=3 icosahedral animal virus established by cryo-electron microscopy and X-ray crystallography. Structure, 1994; 2: 271-282.

13. Choi H-K, Tong L, Minor W, Dumas P, Boege U and Rossmann MG. Structure of Sindbis virus core protein reveals a chymotrypsin-like serine proteinase and the organization of the virion. Nature, 1991; 354: 37-43.

14. Dong XF, Natarajan P, Tihova M, Johnson JE and Schneemann A. Particle polymorphism caused by deletion of a peptide molecular switch in a quasiequivalent icosahedral virus. J Virol, 1998; 72: 6024-33.

15. Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, Oldfield CJ, Campen AM, Ratliff CM, Hipps KW, Ausio J, Nissen MS, Reeves R, Kang C, Kissinger CR, Bailey RW, Griswold MD, Chiu W, Garner EC and Obradovic Z. Intrinsically disordered protein. J Mol Graph Model, 2001; 19: 26-59.

Page 87: Conformational Proteomics of Macromolecular Architecture

74 Lars Liljas

16. Erickson JW and Rossmann MG. Assembly and crystallization of a T=l icosahedral particle from trypsinized southern bean mosaic virus coat protein. Virology, 1982; 116: 128-136.

17. Fisher AJ and Johnson JE. Ordered duplex RNA controls capsid architecture in an icosahedral animal virus. Nature, 1993; 361: 176-179.

18. Fricks CE and Hogle JM. Cell-induced conformational change in poliovirus: Externalization of the amino terminus of VP1 is responsible for liposome binding. J Virol, 1990; 64: 1934-1945.

19. Friesen PD and Rueckert RR. Synthesis of black beetle virus proteins in cultured Drosophila cells: differential expression of RNAs 1 and 2. J Virol,

20. Gallagher TM and Rueckert RR. Assembly-dependent maturation cleavage in provirions of a small icosahedral insect ribovirus. J. Virol., 1988; 62:

21. Grimes JM, Burroughs JN, Gouet P, Diprose JM, Malby R, Zientara S, Mertens PP and Stuart DI. The atomic structure of the bluetongue virus core. Nature, 1998; 395: 470-478.

22. Hacker DL. Identification of a coat protein binding site on the southern bean mosaic RNA. Virology, 1995; 207: 562-565.

23. Harber JJ, Bradley J, Anderson CW and Wimmer E. Catalysis of poliovirus VPO maturation cleavage is not mediated by serine 10 of VP2. J Virol,

24. Harrison SC. Protein interfaces and intersubunit bonding. The case of tomato bushy stunt virus. Biophys J , 1980; 32: 139-153.

2.5. Harrison SC, Olson AJ, Schutt CE, Winkler FK and Bricogne G. Tomato bushy stunt virus at 2.9 A resolution. Nature, 1978; 276: 368-373.

26. Helgstrand C, Munshi S, Johnson JE and Liljas L. The refined structure of the insect virus Nudaurelia capensis o virus. 2003, submitted.

27. Hogle JM, Chow M and Filman DJ. Three-dimensional structure of poliovirus at 2.9 A resolution. Science, 1985; 229: 1358-1365.

28. Hogle JM, Maeda A and Harrison SC. Structure and assembly of turnip crinkle virus. I. X-ray crystallographic structure analysis at 3.2 A resolution. J M o l Biol, 1986; 191: 625-638.

29. Hosur MV, Schmidt T, Tucker RC, Johnson JE, Gallagher TM, Selling BH and Rueckert RR. Structure of an insect virus at 3.0 A resolution. Proteins: Struct Funct Gen, 1987; 2: 167-176.

1981; 37: 876-886.

3399-3406.

1991; 65: 326-34.

Page 88: Conformational Proteomics of Macromolecular Architecture

The Role of Disordered Segments in Viral Coat Proteins 75

30. Janshoff A, Bong DT, Steinem C, Johnson JE and Ghadiri MR. An animal virus-derived peptide switches membrane morphology: possible relevance to nodaviral transfection processes. Biochemistry, 1999; 38: 5328-36.

31. Kraulis PJ. Molscript: a program to produce both detailed and schematic plots of protein structures. JAppl Cryst 1991; 24: 946-950.

32. Larson SB, Day J, Canady MA, Greenwood A and McPherson A. Refined structure of desmodium yellow mottle tymovirus at 2.7 A resolution. J Mol Biol, 2000; 301: 625-42.

33. Lee WM, Monroe SS and Rueckert RR. Role of maturation cleavage in infectivity of picornaviruses: activation of an infectosome. J Virol, 1993;

34. Leulliot N and Varani G. Current topics in RNA-protein recognition: control of specificity and biological function through induced fit and conformational capture. Biochemistry, 2001 ; 40: 7947-56.

35. Liddington R, Yan Y, Moulai J, Sahli R, Benjamin TL and Harrison SC. Structure of simian virus 40 at 3.8 A resolution. Nature, 1991; 354: 278- 284.

36. Liljas L, Lin T, Tate J, Christian P and Johnson JE. Evolution of the picornavirus superfamily: implications of conserved structural motifs between picornaviruses and insect picorna-like viruses. Arch Virol, 2002;

37. Liljas L, Unge T, Jones TA, Fridborg K, Lovgren S, Skoglund U and Strandberg B. Structure of satellite tobacco necrosis virus at 3.0 A resolution. J M o l Biol, 1982; 159: 93-108.

38. Lokesh GL, Gowri TD, Satheshkumar PS, Murthy MR and Savithri HS. A molecular switch in the capsid protein controls the particle polymorphism in an icosahedral virus. Virology, 2002; 292: 2 1 1-23.

39. Luo M, Vriend G, Kamer K, Minor I, Arnold E, Rossmann MG, Boege U, Scraba DG, Duke GM and Palmenberg AC. The atomic structure of mengo virus at 3.0 A resolution. Science, 1987; 235: 182-191.

40. Marshall D and Schneemann A. Specific packaging of nodaviral RNA2 requires the N-terminus of the capsid protein. Virology, 2001; 285: 165-75.

41. Munshi S, Liljas L, Cavarelli J, Bomu W, McKinney B, Reddy V and Johnson JE. The 2.8 A resolution structure of a T=4 animal virus and its implications for membrane translocation of RNA. J Mol Biol, 1996; 261: 1- 10.

42. Nandhagopal N, Simpson AA, Gurnon JR, Yan X, Baker TS, Graves MV, Van Etten JL and Rossmann MG. The structure and evolution of the major

67: 2110-22.

147: 59-84.

Page 89: Conformational Proteomics of Macromolecular Architecture

76 Lurs Liljus

capsid protein of a large, lipid-containing DNA virus. Proc Nut1 Acad Sci ( U S A ) , 2002; 99:14758-1476.

43. Oda Y, Saeki K, Takahashi Y, Maeda T, Naitow H, Tsukihara T and Fukuyama K. Crystal structure of tobacco necrosis virus at 2.25 A resolution. J Mol Biol, 2000; 300: 153-69.

44. Prasad BVV, Hardy ME, Dokland T, Bella J, Rossmann MG and Estes MK. X-ray crystallographic structure of the Norwalk virus capsid. Science, 1999; 286: 287-290.

45. Qu C, Liljas L, Opalka N, Brugidou C, Yeager M, Beachy RN, Fauquet CM, Johnson JE and Lin T. 3D domain swapping of a molecular switch for quasi-equivalent symmetry modulates the stability of an icosahedral virus. Structure, 2000; 8: 1095-1 103.

46. Reinisch KM, Nibert ML and Harrison SC. Structure of the reovirus core at 3.6 A resolution. Nature, 2000; 404: 960-7.

47. Roberts MM, White JL, Griitter MG and Burnett RM. Three-dimensional structure of the adenovirus major coat protein hexon. Science, 1986; 232:

48. Rossmann MG, Arnold E, Erickson JW, Frankenberger EA, Griffith JP, Hecht H-J, Johnson JE, Kamer G, Luo M, Mosser AG, Rueckert RR, Sherry B and Vriend G. Structure of a human common cold virus and functional relationship to other picornaviruses. Nature, 1985; 317: 145-153.

49. Schneemann A and Marshall D. Specific encapsidation of nodavirus RNAs is mediated through the C terminus of capsid precursor protein alpha. J Virol, 1998; 72: 8738-46.

50. Schneemann A, Zhong W, Gallagher TM and Rueckert RR. Maturation cleavage required for infectivity of a nodavirus. J Virol, 1992; 66: 6728-34.

51. Smith TJ, Chase E, Schmidt T and Perry KL. The structure of cucumber mosaic virus and comparison to cowpea chlorotic mottle virus. J Virol,

52. Speir JA, Munshi S, Wang G, Baker TS and Johnson JE. Structures of the native and swollen forms of cowpea chlorotic mottle virus determined by X-ray crystallography and cryo-electron microscopy. Structure, 1995; 3: 63- 78.

53. Stehle T, Gamblin SJ, Yan Y and Harrison SC. The structure of similan virus 40 at 3.1 A resolution. Structure, 1996; 4: 165-182.

54. Tang L, Johnson KN, Ball LA, Lin T, Yeager M and Johnson JE. The structure of pariacoto virus reveals a dodecahedra1 cage of duplex RNA. Nut Struct Biol, 2001; 8: 77-83.

1148-1151.

2000; 74: 7578-86.

Page 90: Conformational Proteomics of Macromolecular Architecture

The Role of Disordered Segments in Viral Coat Proteins 77

55. Tsao J, Chapman MS, Agbandje M, Keller W, Smith K, Wu H, Luo M, Smith TJ, Rossmann MG, Compans RW and Parrish CR. The three- dimensional structure of canine parvovirus and its functional implications. Science, 1991; 251: 1456-1464.

56. Uversky VN, Gillespie JR and Fink AL. Why are "natively unfolded proteins unstructured under physiologic conditions? Proteins, 2000; 41:

57. Valegird K, Liljas L, Fridborg K .and Unge T. The three-dimensional structure of the bacterial virus MS2. Nature, 1990; 345: 36-41.

58. Valegird K, Murray JB, Stockley PG, Stonehouse NJ and Liljas L. Crystal structure of an RNA bacteriophage coat protein-operator complex. Nature,

59. van der Graaf M, Kroom G and Hemminga MA. Conformation and mobility of the RNA-binding N-terminal part of the intact coat protein of cowpea chlorotic mottle virus. A two-dimensional proton nuclear magnetic resonance study. J Mol Biol, 1991; 220: 701-709.

60. Wei N and Morris TJ. Interactions between viral coat protein and a specific binding region on turnip crincle virus. J M o l Biol, 1991; 222: 437-443.

61. Wikoff WR, Liljas L, Duda RL, Tsuruta H, Hendrix RW and Johnson JE. Topologically linked protein rings in the bacteriophage HK97 capsid. Science, 2000; 289: 2 129-2 133.

62. Williamson JR. Induced fit in RNA-protein recognition. Nut Struct Biol, 2000; 7: 834-7.

63. Wimberly BT, Brodersen DE, Clemons WM, Jr., Morgan-Warren RJ, Carter AP, Vonrhein C, Hartsch T and Ramakrishnan V. Structure of the 30s ribosomal subunit. Nature, 2000; 407: 327-39.

64. Witherell GW, Gott JM and Uhlenbeck OC. Specific interaction between RNA phage coat proteins and RNA. Progr Nucl Acid Res Mol Biol, 1991;

65. Wynne SA, Crowther RA and Leslie AG. The crystal structure of the

66. Xie Q and Chapman MS. Canine parvovirus capsid structure, analyzed at

4 15-27.

1994; 371: 623-626.

40: 185-220.

human hepatitis B virus capsid. Mol Cell, 1999; 3: 771-80.

2.9 A resolution. J Mol Biol, 1996; 264: 497-520.

Page 91: Conformational Proteomics of Macromolecular Architecture

Chapter 4

PREFUSION DYNAMICS IN AN ENVELOPED VIRUS - ALPHAVIRUS MODEL

Lena Hammar, Lars Haag, Bomu Wu and R. Holland Cheng*

New findings challenge our understanding of the alphavirus structure and fusion mechanism. It is evident from recent work in electron cryo-microscopy, cryoEM, that the external domains of the membrane-anchored glycoproteins, E1 and E2, form a shell at some distance above the membrane. From there, the glycoproteins protrude further outwards as three-lobed spikes. They present a receptor-binding site residing in E2 at their outermost domains, distal to the center of the spike. The ectodomain of the fusion protein, E1, has an elongated shape, as revealed by X-ray crystallography. Fitted in the cryoEM structure of the virus, the C-terminal and central parts of the E1 ectodomain fill the major portion of the shell, while the fusion peptide loop hides under the receptor-binding domain in the spike. With this structural background, the alphaviruses represent an intriguing new fusion principle, differing in many aspects from the established influenza model. This mechanism is now on its way to being revealed.

Keywords: alphavirus, cryoEM, fusion mechanisms, fusion protein, membrane glycoproteins, pH effects, virus structure.

*From the Department of Biosciences, Karolinska Institute, Stockholm, Sweden. Email address: [email protected]

CELL ENTRY OF ENVELOPED VIRUSES

Virus particles comprise multiple copies of a few basic units. A concept of symmetrical geometry in subunit arrays applies to many viruses and seems to be part of their strategy of efficiency. This includes the assembly into a compact genome carriage, hiding machinery for cell entry to be put in action at the target. Thus, the single unit has a metastability design that guides the folding and assembly into a construct with transport stability, as well as infectious capacity.

The surface proteins of the virus mediate cell attachment and genome entry. For the enveloped viruses, this involves fusion with a cellular membrane. After the virus docks at the cell surface, its envelope may fuse with the plasma membrane, as is the case with HIV. Here, the binding to the receptors provides the key that transforms the virus into a fusogenic mode. Thereby the external gp120 is released and the membrane-anchored fusion protein, gp41, can refold to grab the target plasma membrane and promote fusion. Alternatively, as exemplified by influenza virus and the alphaviruses, the enveloped virion is taken up by endocytosis for later, acid-activated, fusion with the endosomal membrane. However, the molecular mechanisms leading to virus-cell membrane fusion are more closely related between HIV and influenza virus than between influenza virus and the alphaviruses.

Prefusion Events

Triggered by the binding of the virion to the cell receptor(s), or by acidification in the endosome, a series of events leads to integration of the virus membrane into a cellular bilayer membrane. There could be a synergistic trigger by more than one initiator, or several selecting halts in the process before fusion is completed and the genome is released into the cell. Depending on their fundamentals, different viruses may utilize quite different strategies to approach fusion. This is exemplified by the two mechanisms described below, the influenza and the alphavirus models.

Contrary to intracellular trafficking with recycling of vesicles, the enveloped viruses usually operate unidirectionally. Therefore, budding and fusion are normally consecutive, essentially non-reversible events in a metastability pathway towards infection. From the mechanical point of view, structural configurations in the virus envelope proteins would contribute the energy needed and the gears and wheels of the fusion machinery. That is, the particle provides attachment point details, as well as cooperative effects. The mechanism is loaded during folding and assembly of the virions, and released by the environment encountered at the target. Therefore, to obtain an understanding of the fusion mechanism, it is appropriate to study both the loading of the trap and the release processes. Along with peptide folding and oligomerization, other posttranslational events, such as precursor cleavage, fatty acylation and the sequential decoration and modulation by glycosylation, would help not only in the assembly of a stable transport particle, but also to load the energy battery for cell invasion. Although we here only touch on structural aspects of the fusion machinery, it is evident that this sector of the virus life cycle provides targets for antiviral interventions.

Fig. 1. Fusion intermediates. Lipid bilayer membranes in contact by a local protrusion that penetrates the polar surface of the target and disturbs its hydrophobic interior phase (a). This may result in a stalk (b), or possibly a hemifusion (c) intermediate configuration, before formation of a pore and merging of the two membranes into a confluent layer (d). Viral fusion proteins enhance this process by providing a fusion peptide (here depicted as a cone) and a trap, loaded for membrane capture (e), close encounter (f) and membrane fusion (g). During the process the fusion protein refolds in a series of intermediates that may include different levels of oligomerization. (Inspired by Chernomordik & Kozlov12 and Epand14;15)

Virus Fusion Proteins

The fusion of biological membranes requires, in addition to a close encounter, the transient formation of membrane discontinuities. This could involve hydrophobic disturbances in between the lipid bilayers, and hydrophobic or hydrophilic pores. The fusion of two stable bilayers would likely proceed through intermediates in which the membrane acquires curvature. Virus fusion proteins carry a hydrophobic "fusion peptide" sequence that is essential for their action. Isolated small viral fusion peptides, when inserted in the target membrane, have been shown to do so at an angle to the surface. This would promote formation of the local curvature needed to initiate membrane fusion according to the stalk-pore model15 (Fig. 1). Although this could be only one of several mechanisms by which fusion proteins accelerate the rate of fusion, the inserted peptide will provide both an anchor in the hydrophobic domains of the target and a disorganizing element between its bilayers.

The energy price for disruption and elastic bending of the target and the viral lipid bilayers would be paid by the virus membrane proteins, possibly in conjunction with a cellular receptor molecule. According to Chernomordik & Kozlov12 the job description for a viral fusion protein would include:

i) establishment of membrane close encounter,
ii) formation of point-like dehydration contacts between the membranes to decrease the hydrophobic energy of monolayer rupture and allow tilt deformation,
iii) disturbance of the bilayers' internal domains with transient discontinuities forming a fusion stalk,
iv) a possible stage of hemifusion intermediate, before
v) induction of a pore, or a slit, to complete merging of virus envelope and target membranes.12;43

In most viruses the building units organize themselves into a Caspar-Klug quasi-equivalent surface lattice. This requires local flexibility, but also implies that what happens at one location would, elastically, affect the whole particle. Thus, viral fusion proteins operate in a dynamic architecture that helps to control and perform the task of membrane merging and delivery of the infectious genome into the target cell.

Virus Fusion Mechanism, Class 1 - Influenza Virus Model

Enveloped viruses such as HIV-1, influenza and Ebola virus have been the subject of extensive fusion studies and serve today as models for virus fusion mechanisms, referred to as class 1, or type 1.33-35;69 The structural organization of these viruses suggests that they have all evolved to use a similar fusion mechanism with refolding into long α-helices and formation of oligomeric helical bundles in their fusion glycoprotein.

The precursor glycoproteins of HIV-1 (env, gp160), influenza virus (hemagglutinin, HA) and Ebola virus (Gp) are proteolytically cleaved by host proteases to produce the receptor binding domain in one glycopeptide and a fusion domain in another, membrane-anchored, glycopeptide. This maturation cleavage places a hydrophobic glycine-rich fusion peptide at the N-terminus of the fusion protein. The receptor binding and the fusion glycoproteins form pairs in trimeric protrusions, also referred to as spikes, on the surface of the mature virion. In the case of HIV-1 the receptor binding subunit, gp120, is external and non-covalently associated with the membrane-anchored fusion protein, gp41, while the corresponding influenza subunits HA1 and HA2, as well as Ebola virus Gp1 and Gp2, are both anchored in the membrane and covalently linked to each other by disulphide bonds. Fusion potential is either activated directly by receptor binding, or requires the acid environment of the endosome as a trigger.

Much has been learned about the molecular mechanism of viral fusion by studies with liposomes, as recently summarized by Smit et al.62 and by the Kielian group.19;38 Model studies utilizing peptide segments of the glycoproteins have also been informative, in particular in the influenza and HIV cases, as reviewed by White and by Skehel & Wiley. One of the best-characterized viral membrane fusion proteins is the hemagglutinin of the influenza virus. The molecular rearrangements revealed seem to be relevant also for the HIV case, although the nature of the receptor and the triggered release of the fusion protein from the agglutinin component differ considerably. The plain model assumes that in both HIV and influenza virus the membrane proteins are fairly free to move laterally in the fluid mosaic structure of the lipid envelope.

Activation of the influenza hemagglutinin is highly dependent on protonation of the globular HA1 domain, resulting in a weakened interaction between the subunits. In that sense HA1 can be regarded as a lock for fusogenic activity, as gp120 would be for HIV fusion. X-ray structures of influenza HA2 show different conformations at neutral and low pH.6;73 The neutral conformation has two vertically aligned antiparallel α-helices, separated by a loop, in the center of the HA1/HA2 spike heterotrimer. The fusion peptide is hidden in a pocket near the base of the spike. For it to extrude towards the target membrane, a fundamental relocation is needed. In the low pH structure, the loop between the two helical domains has refolded to be part of a long helix, thereby extending the central triple-stranded coiled-coil of the spike. This relocates the fusion peptide more than 100 Å from the previously buried position to the top of the spike, allowing insertion into the target membrane.6 Following insertion, the second half of the long α-helix is unfolded at the middle, forming a reverse turn, to lie antiparallel against the first half in the triple-stranded coiled-coil, making a six-helix bundle. Thus the α-helix C-terminus folds back in a jack-knife manner and relocates the C-terminal membrane anchor to the same end of the rod-shaped molecule as the fusion peptide. This brings the host-cell and viral membranes into close proximity. The exact nature of the steps towards fusion is not clear, but it is assumed that on formation of the six-helix bundle there is a bending of the polar surfaces of the viral and target membranes, promoting stalk/hemifusion, pore formation and ultimately fusion with merged membranes.

The formation of a six-helix coiled-coil bundle that provides juxtaposition of the virus membrane anchor and an N-terminal fusion peptide is also implicated in HIV and would be a recurring feature characterizing this fusion mechanism, extensively reviewed by Skehel & Wiley58;59 and others.4;70 Less understood in their details are the mechanisms prevailing in viruses which, like the flavi- and alphaviruses, have a low content of α-helical structure in their envelope proteins and which carry an endosequence fusion peptide. In these viruses, although derived from one precursor peptide chain, both spike proteins are anchored in the virus membrane and, apparently, tightly but non-covalently associated in the external domains. To accommodate these features an alternative fusion mechanism should be operating, not based on helical lever arms and formation of helical bundles.

Virus Fusion Mechanism, Class 2 - Alphavirus Model

Enveloped viruses such as the alphaviruses, exemplified by Sindbis and Semliki Forest virus, SFV, and the flaviviruses, exemplified by tick-borne encephalitis virus, TBE, and Dengue virus, essentially lack helical motifs in the external part of their envelope glycoproteins. These viruses make up the class 2, or type 2, fusion group, as earlier discussed by Heinz and Allison30 and recently by Gibbons et al.22 and Modis et al.52

The ectodomains of the isolated fusion proteins E and E1 of TBE and SFV, respectively, have been crystallized as dimers and show similar folds, composed mainly of β-strands with connecting loops and a few short helices.44;55;56 The fusion peptide in these proteins is not N-terminal, but located in a loop between two antiparallel segments of a β-sheet. In TBE the virion surface is rather smooth,3;17;56 while the alphaviruses carry spike-like protrusions.11;16;44 These are composed of three copies each of the receptor binding glycoprotein and the fusion glycoprotein, both anchored in the envelope membrane. The major portion of the fusion glycoprotein contributes to an external protein shell. This provides an extramembrane shield that has to disintegrate to allow membrane-to-membrane encounter between virion and target. The initiation and rearrangements needed to promote a close encounter between the membranes, and fulfillment of infection by fusion, involve several steps that have been only partly revealed.19;20;27;29;36;55;64;77 As in the class 1 mechanism, but with a rather different appearance, stable trimers of refolded fusion glycoprotein appear at a late stage of the process.22;52;61

As a background for a discussion of the details of this model, the structure and assembly of an alphavirus will be described.

ALPHAVIRUSES

Along with its many biotechnical applications, SFV has joined the Sindbis virus as a prototype for studies on structure and function in alphaviruses. The alphaviruses form a group in the Togaviridae family of small, enveloped RNA viruses. They infect birds and mammals and are important causes of mosquito-borne viral encephalitis. The virion is spherical with T=4 icosahedral symmetry in both the envelope and the capsid layers (Fig. 2). The outer diameter is about 70 nm, while that of the capsid is about 40 nm.
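The T=4 lattice mentioned above can be made concrete with the Caspar-Klug triangulation number. The following is a minimal sketch, not taken from the chapter; the function names are illustrative. It uses the standard relations T = h² + hk + k², 60T subunits per shell, 12 pentamers and 10(T − 1) hexamers, and assumes, as the chapter states for alphaviruses, that each spike contains three copies of each glycoprotein.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def shell_composition(h: int, k: int, copies_per_spike: int = 3) -> dict:
    """Subunit bookkeeping for an icosahedral shell with lattice indices (h, k).

    copies_per_spike is an assumption of this sketch: the number of copies
    of one glycoprotein in each spike (three for the alphavirus spike).
    """
    t = triangulation_number(h, k)
    subunits = 60 * t
    return {
        "T": t,
        "subunits": subunits,          # copies of each protein per shell
        "pentamers": 12,               # always 12, at the 5-fold axes
        "hexamers": 10 * (t - 1),
        "spikes": subunits // copies_per_spike,
    }


if __name__ == "__main__":
    # T=4 corresponds to (h, k) = (2, 0)
    print(shell_composition(2, 0))
```

For (h, k) = (2, 0) the bookkeeping is consistent: 12 pentamers and 30 hexamers account for 12 × 5 + 30 × 6 = 240 capsid protomers, matching the 240 copies of each glycoprotein arranged in 80 spikes.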

Fig. 2A. A cryoEM-derived structure of Semliki Forest virus with enclosing icosahedron to indicate symmetry axes. Parts of the envelope are virtually peeled off to show the membrane surface (green, bottom right) and the nucleocapsid (blue, upper right) with its RNA core (marine). Capsid proteins connect laterally, forming pentameric or hexameric rings, and, towards the ring centers, to the membrane where this is depressed. The semi-fluid hydrophobic domain of the lipid bilayer appears empty (inner layer in jade and outer layer in green). The limbs of the envelope glycoproteins emerge from the membrane surface above the capsid protomers and connect to the shell domain (yellow), from where the spikes rise as three-lobed structures (copper colored). The centers of the spikes are located above the interstitial domain between three capsid rings in the T=4 icosahedral symmetry lattice. Fig. 2B. To demonstrate details of the layered structure, a virtual bore kernel is drilled under the spike at the 3-fold axis, down to the RNA layer, as shown in the center of the figure. This is surrounded by top views at the different radial cutoffs indicated.

The capsid protomers are rooted in the RNA core and organized in hexameric or pentameric rings around the 2- and 5-fold symmetry axes in the lattice. Attached to these capsomers are the submembrane tails of the membrane glycoprotein E2. Thereby the spikes on the external side of the membrane knot the capsid rings together.

The arrangement of the elongated E1 molecules provides an example of a tensegrity spherical shell network (see Chapters 1 and 2) with lateral interactions between the E1 molecules and a locking function provided by the E2 molecules. In brief, the alphaviruses have a strict icosahedral structure with interlocked protein layers in three dimensions (Fig. 2, A&B). This configuration, with its symmetry and organization, is the result of an elaborate folding during protein synthesis and maturation.

Fig. 3. Alphavirus genome organization and precursors of the structural proteins. (Figure labels: genomic RNA, 42S (11,442 nt), with the nonstructural ORF giving the ns proteins; mRNA, 26S (4,074 nt), with the structural ORF; translation.) Modified from Strauss & Strauss, 1994.65

Genes & Proteins

The alphaviruses carry a single-stranded positive-sense RNA genome with a 5' cap and a poly-A 3' tail. The structural proteins are translated from the same open reading frame in the sequential order of capsid protein C, envelope E3, E2, 6K, and E1, as outlined in Fig. 3. The assembly process involves steps that create a trigger for later activities. During precursor formation the C domain folds into a protease that cleaves itself off from the growing peptide chain. This frees a signal peptide domain that inserts the growing envelope protein precursor into the ER membrane. On the luminal side the sequence is glycosylated, which probably explains why it is expelled from the membrane and folds with the growing peptide chain into the E3 domain of pE2, the precursor of the E2 and E3 found in the mature virus. The membrane proteins pE2, 6K and E1 are formed in sequence and cut apart by ER signalase. Other early posttranslational modifications include fatty acylation at the submembrane tails of the glycoproteins, and glycosylation.
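The processing order described above can be summarized as a small bookkeeping sketch. This is an illustration only, not a model from the chapter: the class and function names are hypothetical, each cleavage is reduced to a substrate and its products, and timing and cellular compartment are ignored. The furin step is taken from the later description of envelope maturation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cleavage:
    enzyme: str
    substrate: str
    products: tuple


# Gene order of the structural open reading frame, as given in the text.
STRUCTURAL_ORF = ("C", "E3", "E2", "6K", "E1")

# Simplified cleavage steps (labels are shorthand chosen for this sketch).
PROCESSING_STEPS = (
    Cleavage("capsid autoprotease", "C-E3-E2-6K-E1", ("C", "E3-E2-6K-E1")),
    Cleavage("ER signalase", "E3-E2-6K-E1", ("pE2", "6K", "E1")),
    Cleavage("furin", "pE2", ("E3", "E2")),
)


def process(precursor: str, steps) -> list:
    """Apply each cleavage in order, replacing the substrate by its products."""
    pool = [precursor]
    for step in steps:
        if step.substrate not in pool:
            raise ValueError(f"{step.enzyme}: substrate {step.substrate} not present")
        i = pool.index(step.substrate)
        pool[i:i + 1] = step.products
    return pool


if __name__ == "__main__":
    print(process("-".join(STRUCTURAL_ORF), PROCESSING_STEPS))
```

Running the three steps on the full precursor returns the mature products in gene order, which is a quick check that no segment is lost or duplicated in the scheme.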

The Capsid Protein

The pentagonal and hexagonal protomer rings in the capsid that constitute the capsomers (Fig. 2) are only weakly connected to each other, which is why it is assumed that they are stabilized by RNA interaction with the N-terminal domain of the C protein.75 Within the rings the C-terminal domains of the protomers make contact at the turns of β-structures (Fig. 4). This domain carries a chymotryptic active site and a hydrophobic pocket binding a tyrosine of the submembrane tail of glycoprotein E2.51;60;61 By this the capsid and the envelope are linked together in the mature virion.60

Fig. 4. The nucleocapsid protein C of Semliki Forest virus with its structural domains. The N-terminal domain (amino acids 1-118) of this 33-kDa protein lacks a well-defined tertiary structure and is buried in the central RNA core of the virion. It contains positively charged clusters that would control the RNA-protein interaction in the nucleocapsid. The C-terminal domain, with a sequence well conserved among the alphaviruses, folds into a serine protease and is responsible for the autocleavage of the C peptide from the growing structural protein pro-precursor. (PDB: 1DYL)47 (Figure labels: glycoprotein binding pocket; N-terminal domain, aa 1-118.)

Fig. 5. Hydrophilicity plots (Kyte-Doolittle) of Semliki Forest virus glycoprotein sequences. Positions of glycoconjugates and epitopes discussed in this article are indicated. The fusion peptide, fp, is identified by the Mab E1f and the receptor binding domain by the Mab E2r. Mab E1a-1 binds an extreme acid epitope.2 Recognition sites for rabbit serum (Pab) against detergent-isolated E1 and E2 proteins are indicated as blue bars.29 Different levels of gray bars along the E1 sequence represent the domains I-III.44 (Figure legend: E1 domains II (38-129, 169-272) and III (291-381); TM (413-436); Pab recognition sites, high/low; glycosylation sites.)
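A plot of the kind shown in Fig. 5 is a sliding-window average over a per-residue scale. The sketch below is a minimal illustration using the published Kyte-Doolittle hydropathy values; the function name, the default window length and the test sequence are assumptions of this sketch, not the settings used for the figure. The function returns hydropathy (positive values are hydrophobic); a hydrophilicity plot shows the same profile with the sign reversed.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Mean Kyte-Doolittle hydropathy for every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical test peptide: a hydrophobic stretch followed by a charged one.
    for value in hydropathy_profile("LLLLLLLLLKKKKKKKKK", window=9):
        print(f"{value:+.2f}")
```

Longer windows (roughly the length of a membrane-spanning helix) are commonly used to highlight transmembrane segments such as the TM region marked in the figure, while shorter windows emphasize surface-exposed stretches.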

The Glycoproteins and Envelope Maturation

The SFV normally assembles at and buds from the cell plasma membrane. The envelope glycoproteins, derived from the structural proprecursor, have by then passed a series of maturation steps: the transmembrane glycoproteins pE2 (p62) and E1 (49 kDa) form heterodimers in the endoplasmic reticulum before associating into the heterotrimeric format in the Golgi compartment. Then the pE2 sequence is cleaved by cellular furin into the membrane-anchored E2 (52 kDa) and the small external glycoprotein E3 (~10 kDa). Presumably, the E3 domain establishes an essential pE2-E1 interaction during particle assembly. The furin cleavage makes the spike layer structure more labile and more accessible for low pH fusion activation than if the cleavage is avoided, as shown in furin deficient cells.76 Uncleaved pE2 does not prevent the transport of the pre-spike structures to the plasma membrane or the budding process, since the virus can be propagated in furin deficient cell lines.76


90 Lena Hammar et al.

Fig. 6. Fusion protein location in the SFV spike structure. The assumed occupancy of glycoprotein E1 is marked in light surface pattern in this detail of a 14 Å cryoEM reconstruction of SFV and follows the Lescar et al. modeling of the E1 crystal structure.44 The shell is penetrated by the E2 limb, behind domain III in the view. The N-terminal of the E1 is located in domain I (See also Fig. 7), closest to the viewer, while the C-terminal of the ectodomain is in domain III. From there the E1 peptide continues, together with the glycoprotein E2, down through the limb space and the membrane layers. The C-terminal of E1 ends with two arginine residues below the membrane, while the E2 carries a longer submembrane tail and connects to the capsid protein. The domain II of the E1 carries the fusion peptide at the top of the elongated molecule (fpd, fusion peptide domain), shielded from external access under the E2. Rbd is the receptor-binding domain.63 Parts contributed by E1 from adjacent spikes are indicated by I', II', etc.


Mutation of the cleavage site does not prevent budding, which proceeds at about the same rate as for the wt virus.23;24;66;76 Such “non-cleavable” mutants are poor in binding to the target cells and require a much lower pH for fusion than the wild type virus. Structural studies on non-cleavable pE2 mutants imply that the E3 domain is located at the outer rim of the lobes in the spike-like protrusions. The E3 is highly glycosylated considering its small size, and carries one or two complex type glycoconjugates that would contribute to the shielding of the virus protein structures (Fig. 5).

The E2 protein presents a receptor-binding site on the top of the spike lobes, as identified by antibody footprint location63 (Fig. 6). This is located close to one of the two N-linked glycosylation sites in the sequence, assumed to be of complex type (N199, Fig. 5). The other site is occupied by a high mannose type glycoconjugate, externally recognized by mannose-binding lectins, like the one from Galanthus nivalis, with little variation relative to pH.28;29 In both the E1 and the E2 protein there are functional sites related to neurovirulence, which implies that further domains may be involved in cell docking.

The protein 6K is a small, 60 amino acid long transmembrane peptide that separates the pE2 and E1 sequences in the proprecursor sequence (See Fig. 3). Its sequence implies that it may span the membrane twice. Whether or not this is the case in the final configuration is not certain, but it is found to be a viroporin, i.e. capable of forming ion channels.25 It was recently shown that the 6K protein in Ross River virus and Barmah Forest virus forms cation channels when inserted into planar lipid bilayers. The 6K protein then spanned the membrane with a single transmembrane α-helix. SFV mutants with deletion of the 6K bud as the wt, but are more heat labile.45 The 6K protein is formed in the same number as the other envelope proteins. However, only a small portion remains in the virus particle after budding. Thus, in SFV particles released from infected BHK cells only about 3% of the membrane protein mass is represented by the 6K.46

The glycoprotein E1 is the last protein in the proprecursor sequence of structural proteins. As mentioned, it associates with pE2 at an early stage in the biosynthetic pathway, and is part of the heterotrimeric glycoprotein structure formed in the Golgi and constituting the spikes. It carries an endosequence fusion peptide located at residues 79-97.17 That E1

49


(Fig. 7 labels: fusion loop (Garoff, 1980); E381, limb; twisted beta sheets, domain I; E1a-1 acid epitope; SV neutralizing domain (Schmaljohn, 1983).)

Fig. 7. The structure of the fusion glycoprotein E1 ectodomain. Molecular domains I (red), II (yellow), and III (blue) are indicated and the fusion loop is encircled. The beta barrel nature of domains III and I is shown in inserts. Filled small balls show epitopes and mutation sites of functional importance. Locations of Asn141, carrying a complex type glycoconjugate, and Ser379, with an O-linked sugar structure, are indicated. T1 is the N-terminal of E1 and E381 the C-terminal of the ectodomain, linking to the limb region. (PDB: 1I9W)

fulfills its assumed function and inserts the fusion peptide into target liposome membranes at an acidic pH has been well confirmed.

The crystal structure of the neutral ectodomain of SFV E1 shows an elongated molecule44 (Fig. 7), similar in folding to the E protein of the tick-borne encephalitis virus.56 Modeling of the ectodomain structure into cryoEM maps reveals that the E1 is the main constituent of the shell layer, leaving the major portion of the spike for glycoprotein E2.44 A similar organization holds for the Sindbis virus,75 and was also independently implied by cryoEM difference-density mapping between sugar deletion mutants and wt virus. Thus, the Rossmann group54 utilized a panel of such Sindbis virus mutants to show that the two sugar moieties of the Sindbis virus E1 are located in the shell layer. Therefore, the E1 would be oriented essentially parallel to the membrane and occupy the shell layer. In contrast, the difference structures of E2 pointed to an upright orientation in the spike structure, with one glycoconjugate in the center of the spike at the shell level and one in the spike wing domain.54 In the SFV, the E1 carries only one N-linked sugar (N141, Figs 5 and 7), which is of the complex type. An O-linked glycoconjugate in E1 has been revealed by its interaction with Vicia villosa lectin, VVL, and other lectins that selectively bind O-linked sugar structures (Hammar, unpublished). This is probably linked to S379 in the E1 sequence, where an O-linked sugar site has been found.22 That indicates a location at the connection between the shell and the limb in domain III of the structure (Fig. 7). In the virus, the VVL binding structure is not available for sensor surface interaction at neutral pH, but it becomes exposed on acidification and thereby provides a probe for a pH dependent step in prefusion refolding of the virion shell (Chapter 18).

The transmembrane domain, TM, of E1 and E2 has been explored by mutational analyses,57 as well as by molecular modeling of the transmembrane sequences. It is likely that E1 and E2 pass the membrane in concert as two slightly twisted helices at neutral pH.8 It was noted that the single E1 helix might be more flexible than the E2 one. Furthermore, in spite of a great sequence variation in this domain, similarities in TM organization between the alphaviruses were traced.8 Sjoberg & Garoff observed a pattern of conserved glycines in the TM region of E1 and made two mutants where either the glycines only or the whole segment around the glycines was replaced by leucines. Both mutations decreased the stability of the E1-E2 interaction and promoted E1 homotrimer formation at a suboptimal fusion pH, while fusion activity was decreased.57 Thus, the TM domain of the glycoproteins, and their interactions in the membrane, would have a considerable influence on the behavior of the envelope and the correct timing of events during prefusion and fusion.

Cell Encounter

While the alphaviruses usually bud at the plasma membrane, they infect by endocytosis. They also multiply in the vector, which is why the receptor(s) would be cell surface constituents common across these species. Although the SFV shows a rather promiscuous infection pattern in cell cultures, many alphaviruses have a prominent tropism for neuronal tissues. In the in vivo situation, this prepares the ground for encephalitic disorders. However, other tissues may also be targeted.

Described cellular receptors for the closely related alphaviruses Sindbis, Ross River and Semliki Forest virus include proteins assumed to be the high affinity laminin receptor (See review by Strauss & Strauss), cell surface heparan sulfate42 and the cell surface mannose-binding proteins DC-SIGN and L-SIGN.41

A potential receptor binding location in Ross River virus has been approached using neutralizing antibodies. Antibody escape mutants showed changes in E2 residues T216,71;72 N218, and T219,7 with variations in receptor binding and virus tropism. In Sindbis and Ross River viruses this sequence is part of a protective epitope for lethal encephalitis in mice.50;72 The epitope sequence is conserved in its hydropathy profile among the Ross River, Semliki Forest, and Sindbis viruses and lies between the two asparagine-linked glycosylation sites (residues 200 and 262, Fig. 5) in E2. In the structure this potential receptor binding domain was localized to the tip of the spike lobes63 (See Fig. 6).

Alphavirus infection includes endocytosis and cytoplasmic entry via fusion with the membrane of acidic endosomes. Very little is known about the post-binding fate of the receptor, but as part of a common endocytotic pathway it may recycle to the cell surface. There seems to be no strict requirement for a protein receptor to induce fusion and release of the viral genome. The common view is that the acidification in the endosomal vesicle is enough to transform the virion into a fusogenic mode. For fulfillment of fusion, cholesterol and sphingolipids should be present in the target membrane, as reviewed by Kielian et al.38

Observed Fusion Related Events

An acid environment, as developed in the endosome, triggers fusion of SFV with target membranes. Normal fusion of SFV with liposomes or cells requires that the E3 domain has been proteolytically processed from pE2 and presumably also released from the particle. However, virus produced in furin deficient cells, or mutants with defective cleavage that thus retain the immature pE2 in the virion, may fuse with target liposome membranes, but do so only at a much lower pH than the wt. This could be understood as the E3 domain, with a location at the outer spike rim as proposed by Paredes, locking the E1-E2 contacts and forming an extra shield for both the receptor binding domain and the fusion peptide domain of the E1.

(Fig. 8 plot: y-axis 0-100; x-axis pH, 5.5 to 7.4.)

Fig. 8. The pH dependent binding of SFV particles to monoclonal antibodies immobilized at sensor surfaces. The virus, at the pH indicated, was introduced in the flow over Biacore sensor surfaces that were coated with a monoclonal antibody or the mannose-binding lectin GNA. Virus concentrations were titrated for each surface to give about 70% saturation at the optimal condition. Thereby a response reflecting the exposure of epitopes vs. pH was obtained (See Chapter 18). The epitope for the neutralizing antibody E1n is transiently exposed with an optimum at pH 6.6, and that for Mab E2m at pH 6.3. The Mab E2r, binding to the receptor binding domain of E2, shows a two-phase behavior: when the Mab E1n epitope disappears and the fusion loop appears (E1f epitope), it becomes more available for surface interaction.

Among the phenomena observed in response to acidification of SFV particles is that the E1-E2 dimeric interaction is weakened.67 So far, the cryo-EM derived structures of SFV at pH 7.4 to pH 5.9 show only modest morphological changes of the spike heads.27 However, there are qualitative changes. In the same pH range, variation in epitopes and exposure of the fusion peptide can be followed by sensor surface experiments. It was then seen that a neutralizing antibody epitope disappears when the pH drops below 6.6, and some E2 epitopes, presumably close to the spike top, appear as the pH decreases further. A Vicia villosa lectin-binding structure that is hardly seen at neutral pH appears, slightly preceding the fusion peptide, when the pH is lowered (optimal interaction at about pH 6.2, See Chapter 18). The receptor binding domain becomes extensively exposed after the fusion peptide has become available for external interaction (Fig. 8).29 It seems likely that exposed fusion loops around a spike do not protrude extensively. The peptide is only moderately hydrophobic and flanked by polar amino acids (Fig. 5). Considering that the lectin as well as the antibodies mentioned bound the denatured SFV well at neutral pH, the binding profiles would not reflect a pH dependent affinity as such, but rather that these target structures are hidden in the neutral virion and become accessible for surface interaction at more acidic pH.

(Fig. 9 plot: y-axis 0-25; x-axis relative diameter, %.)

Fig. 9. Size distribution of SFV particles at different pH. The samples appeared as well separated particles in EM. Similar material was used as in the Fig. 8 experiments.

During these events, the virion diameter increases (Fig. 9). This seems to originate from relocations in the crawlspace between the virus membrane and the shell. Here the shell moves out from the membrane, concomitant with a torsional, lateral movement of the E1-E2 limbs towards the 3-fold and quasi-3-fold axes under the spikes. The spacing and angle between the TM and limbs in the pH 5.9 structure of the virus seem to locally increase the curvature of the membrane surface under the spike.27 This phenomenon, if expanding, would disturb the shell barrier.

Additional domains of the envelope would be responsible for turning the virion into fusion competence. Part of these triggers would be acid induced; some would be secondary to the relocations at the top of the spikes, such as those that result from membrane contact by the fusion loop and the cholesterol determinant. Structural variation at a pH around 6.0 usually implies involvement of histidyl protonation-deprotonation. There are some conserved histidine residues in the glycoproteins. Some reside in E1 domain II, towards the assumed E2 interface in the spike stem, and in the interface of domains III and I. Judged from the pH range in which it appears, histidyl protonation would be expected to play a role in the release of the E1-E2 contact.
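The link between a transition near pH 6.0 and histidine can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only an illustration: it assumes a side-chain pKa of 6.0, a typical textbook value for a free histidine imidazole, whereas the pKa of a given histidine in a folded glycoprotein can differ considerably with its local environment. The function name is hypothetical.

```python
def fraction_protonated(pH, pKa=6.0):
    """Fraction of an ionizable group in its protonated form.

    Follows from Henderson-Hasselbalch: pH = pKa + log10([base]/[acid]).
    The default pKa of 6.0 is an assumed, generic value for the
    histidine side chain, not a measured value for SFV E1 or E2.
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


if __name__ == "__main__":
    # pH values mentioned in the chapter for the neutral and acidified virion.
    for pH in (7.4, 6.6, 6.2, 5.9):
        print(f"pH {pH}: {fraction_protonated(pH):.2f} protonated")
```

Under this assumption, the protonated fraction rises from a few percent at pH 7.4 to about one half at pH 6.0, so most of the change in charge falls within the pH window where the epitope and shell rearrangements are observed.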

Histidine provides an attractive target for zinc ions. It is noteworthy that zinc ions prevent both membrane insertion and formation of the E1 homotrimer.20 The exposed fusion loop seems to need an additional protonation event in the molecule to be able to insert in the target. Lateral limb movement and lift of the shell, seen at the early stages of acidification, might help and would result from a changed status in local histidine and in other charged residues in the domain.

In the presence of liposomes, acidification experiments show that the fusion peptide, as expected, becomes buried in the target lipid bilayer and eventually E1 will associate into highly stable homotrimers. This requires a lowered pH, as well as the presence of cholesterol in the target membrane, and would happen during the process of fusion completion, as judged from different approaches.19;20;38;67 Very late in the process an “acid” epitope appears,2 apparently buried in domain I, close to the E1-E1' contact domain of the shell, at neutral pH (See Figs. 6 and 7). Again, fusion of wt virus requires, or is highly facilitated by, the presence of cholesterol in the target membrane.38 The determinant for cholesterol dependence is a proline, located in the beta turn close to the fusion peptide in the E1 structure (P226,68 See Fig. 7). The configuration of this domain would therefore affect the insertion of the fusion peptide into the target membrane and possibly be involved in E1 homotrimerization.19;20;67


Experiments with the ectodomain of E1, designated E1*, show a behavior similar to that of the native E1 in that it inserts into cholesterol-containing liposomes in response to acidification, and thereby forms trimers. The E1* appears as monomers in solution and its fusion peptide loop is free to bind the monoclonal antibody E1f even at pH 8!20 Trimerization of E1* occurs on acidification, but only in the presence of cholesterol-containing liposomes. This implies that the exposure of the fusion peptide is a phenomenon separate from both membrane insertion and homotrimerization. Membrane insertion seems to induce a second stage of refolding, after the fusion peptide is first exposed. This may lead to formation of membrane fusion intermediates (Fig. 1), preceding a possible oligomerization. The late appearance of the “acid epitope” from its buried location in the shell domain could reflect such secondary refolding of the molecule, as well as of internal shell contacts that need to open for the virus and target membranes to merge. Of interest for the further discussion of the alphavirus fusion mechanism is that the E1 ectodomain in its acid-induced homotrimer form has been solved at atomic resolution.22

Post-Fusion Structure?

The crystallographic structure of liposome- and acid-induced trimers of the isolated E1* provides a configuration that could be relevant, at least in part, for the E1 in virus fusion.22 It shows three E1* units, assembled with domain I at the base and domain II with the fusion loop at the top. The latter would be inserted through the polar layer of the target membrane, forming contact with its internal hydrophobic domain. The upper portions of the domains II in the trimer spread out from the center axis of the trimer, and the domain III is attached to the sides of domains I-II (Fig. 10). The C-terminal of the ectodomain, which represents the first part of the E1 limb domain, here appears as an elongated peptide “climbing upwards” in the cleft between the subunits (shown as thick backbone structure in Fig. 10). The trimer structure implies that, if relevant for virus-target membrane fusion, the E1 limb and transmembrane regions should end up with the virus membrane moved up to the beta finger domain crowned by the fusion and cholesterol determinant loops. That is, it would represent a post fusion structure (Fig. 1g).


(Fig. 10 label: cholesterol determinant loop.)

Fig. 10. Structure of the acid homotrimer of the E1 ectodomain. The homotrimer was formed by treatment of monomeric E1 ectodomain with cholesterol-containing liposomes at acidic pH (PDB: 1RER).22 Helical structures are in red, and the domains in membrane contact are in gold. Compared to the neutral structure of the E1 ectodomain (Fig. 7), the domain III has moved relative to the domain I, and along with it the “limb” domain (shown here in thick backbone representation). This is folded “upwards” and slides in the cleft between the subunits, along the longer helix of the same subunit, towards the beta finger of the cholesterol determinant loop. Detail at top right shows the organization of the helical structures at the central part of the trimer. Bottom right: a slab section to demonstrate how the limb sequence follows the longer helix with a slight rotation.

A similar structure of suggested “post-fusion” trimers was also seen in the Dengue virus case.52 The Flaviviruses differ from the Togaviruses, among other things, by the double span of the membrane of both the prM and E glycoproteins. The structures of immature Dengue virus and yellow fever virus were recently solved.77 Here, the spike construction in the immature virus resembles that of SFV in that the spikes are composed of heterotrimers (glycoproteins E and prM). A further congruence is that the prM covers the fusion peptide at the distal end of the E glycoprotein, providing a control function similar to that of the E2 over E1 in SFV. On maturation by furin cleavage, the relatively smooth surface of TBV is covered with 90 horizontally arranged E-dimers. Low pH triggers the formation of E-trimers in a T = 3 lattice, and it is assumed that these participate in a fusion process similar to the alphavirus mechanism. The solved structure of a soluble trimeric form of protein E demonstrates an assumed post fusion folding. As in the SFV case, an extensive flip of domain III has oriented the C-terminal (connecting to the virus membrane) “upward”, towards the part of domain II with the fusion peptide loop.52 By such a folding of the fusion protein in the virus, the virus membrane would be brought in close contact with the target membrane in the endosomal environment. However, by what means these homotrimers of the fusion protein finally, or at all, have triggered the membrane close encounter remains a structural puzzle. From postfusion observations of SFV derived homotrimers, Gibbons et al. discuss a pentameric-hexameric association as part of the membrane fusion process.19;22

MECHANISTIC CONSIDERATIONS

At this stage the class 2, alphavirus fusion model may be summarized into a mechanical hypothesis to start a discussion on what can be verified in the available structures of the virus. Let’s assume that the solved structure of the acid- and liposome-induced form of the E1 ectodomain22 is relevant for the post fusion configuration of the full E1 glycopeptide in the virus. This implies that the transmembrane domain, TM, of the E1 would be anchored in the same lipid bilayer as the fusion loop (Fig. 1g), only that the part of E1 that originally anchored it in the membrane, i.e. the bottom of the limb, the TM and the two submembrane arginine residues, is not shown! In that model, the limb connecting to the TM would be located in the trench between adjacent subunits (Fig. 10). To reach such a configuration, the E1 glycopeptide in the neutral virus must be released from its scaffold in the spike and in the shell regions of the virus envelope. The contact between domains III and I in the neutral virus would be released and rearranged. The separate E1 molecule needs to come in close contact with two other E1 molecules. Furthermore, the molecule should have passed the stage of membrane close encounter (Fig. 1f), formation of a membrane stalk (Fig. 1b) and, finally, merging of the meeting lipid bilayers (Fig. 1d and g). To paraphrase Fig. 1f, the E1 fusion protein should have released the hawser between domains III and I to allow it to slide over the end of the lever-like elongated structure (constituted by domains II and I) and stretched the connection to the virus membrane held by the TM bolt. The force that attracts and binds the domain III and the limb to the side of the domains I-II, as implied by the acid trimer structure, should be strong, considering the energy needed to induce super curvature in the two meeting membranes and splice them together into a stalk intermediate (Fig. 1f).

Data on the sequence of prefusion events reveal that the fusion loop would be exposed before E1 appears in a homotrimer configuration.20;27;29 With the ectodomain, liposomes and low pH were both needed. However, does the reorganization of the domain III occur in the single molecule before, during, or after trimerization? As reported, the E1 homotrimer represents a structurally stable formation and would be a potential end product, with or without the TM. One can only speculate whether trimerization, and possibly higher oligomerization, would be the generator of, or a secondary phenomenon in, such a process.

For structure determination by cryoEM, the material is instantly frozen in its hydrated state and images are recorded at a low electron dose. Thereby a close to native structure is accessed. So far, the cryoEM structure data with virions do not provide strong support either for or against the described mechanistic pathway. Free, non-fused particles dominated down to pH 5.8. Although the whole extramembrane domain expanded transiently as the pH was lowered to 5.9, the shell remained, only that the holes around the twofold axes became smaller.27 The movements of the limbs in the deeper regions of the extramembrane domain, and of the TM domains of the envelope, would be of utmost importance for the process, which fits with mutational studies on the TM domain.57 However, with the present structural information it is not easy to find out how the post-fusion structure envisioned by the E1 ectodomain trimer may occur. Again, the alphavirus envelope is a framework construct, a tensegrity sphere (See Chapters 1&2). According to the Caspar-Klug theory of quasi-equivalence this is a dynamic resource.9;10 Therefore, what happens at phase transitions like fusion may be difficult to catch!
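The Caspar-Klug construction referred to here can be made concrete with a short calculation. In that theory an icosahedral lattice is indexed by two non-negative integers (h, k), the triangulation number is T = h² + hk + k², and the shell holds 60T quasi-equivalent subunits. The sketch below, with hypothetical function names, checks that the T = 3 lattice mentioned for the flavivirus E protein corresponds to 180 subunits, that is, the 90 E-dimers quoted in the text.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunit_count(t_number):
    """Number of quasi-equivalent subunits in an icosahedral shell, 60T."""
    return 60 * t_number


if __name__ == "__main__":
    t = triangulation_number(1, 1)   # the T = 3 lattice
    subunits = subunit_count(t)
    print("T =", t, "subunits =", subunits, "dimers =", subunits // 2)
```

The same two functions give the allowed series T = 1, 3, 4, 7, ... for other (h, k) pairs, which is a convenient check when comparing copy numbers reported for different enveloped viruses.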

A rod exposing the fusion loop at the top could not be raised much above the spike head, as judged from cryoEM structures of the virus down to pH 5.9. Nevertheless, the fusion loop was maximally available for external binding to sensor surfaces at pH 6.0.29 At that stage a neutralizing epitope was declining and qualitative surface changes were seen, represented by sequentially appearing epitopes. This strongly suggests tuning of the spike function. At the corresponding step, the influenza fusion protein would have flipped its fusion peptide upward on top of a rigid triple helix bundle, protruding above the spike. With the SFV, a similar protruding rod formation is thus far not seen. However, E1 domain I-II is a rigid structure22;44 and a considerable portion of it would, at least initially, be located in the shell layer44 (Fig. 6). After all, provided only that the target membrane can be reached and the flexible fusion loop inserted, it would not at all be necessary for a lever to rise to a radial orientation. Rather, it might be handled tangentially. Theoretically, a slight rotating motion along the long axis of a lever could wind the hawser and lift the anchor on the boat. Here is a gap in our comprehension of how the virus acquires its post fusion configuration, envisioned from high-resolution studies on the E1 ectodomain. With eagerness, we are awaiting the cryoEM whole-virus structure, solved at successively lower pH, to reveal the pathways in the alphavirus model of fusion class 2!

ACKNOWLEDGMENTS

We thankfully acknowledge the Swedish Board for Science, the Karolinska Institute Research Foundations and the Sweden Japan STINT organization for supporting this study. Special thanks are due to Bomu Wu, KI, for providing the 13 Å map as the base for the Semliki Forest virus illustrations. We would also like to thank Leif Bergman and Sunny Wu for help with program items and molecular fitting.


REFERENCES

1. Ahn A, Gibbons DL, and Kielian M. The fusion peptide of Semliki Forest virus associates with sterol-rich membrane domains. J Virol, 2002; 76:3267-75.

2. Ahn A, Klimjack MR, Chatterjee PK, and Kielian M. An epitope of the Semliki Forest virus fusion protein exposed during virus-membrane fusion. J Virol, 1999; 73:10029-39.

3. Allison SL, Schalich J, Stiasny K, Mandl CW, and Heinz FX. Mutational evidence for an internal fusion peptide in flavivirus envelope protein E. J Virol, 2001; 75:4268-75.

4. Bentz J. Membrane fusion mediated by coiled coils: a hypothesis. Biophys J, 2000; 78:886-900.

5. Brasseur R, Pillot T, Lins L, Vandekerchkhove, and Rosseneu. Peptides in membranes: tipping the balance. Trends in Biochemical Sciences, 1997; 22:167-171.

6. Bullough PA, Hughson FM, Skehel JJ, and Wiley DC. Structure of influenza haemagglutinin at the pH of membrane fusion. Nature, 1994; 371:37-43.

7. Burness AT, Pardoe I, Faragher SG, Vrati S, and Dalgarno L. Genetic stability of Ross River virus during epidemic spread in nonimmune humans. Virology, 1988; 167:639-43.

8. Caballero-Herrera A, and Nilsson L. Molecular dynamics simulations of the E1/E2 transmembrane domain of the Semliki Forest virus. Biophys J, 2003; 85:3646-58.

9. Caspar D, and Klug A. Physical principles in the construction of regular viruses. Cold Spring Harbor Symposia on Quantitative Biology, 1962; XXVII:1-24.

10. Caspar DL. Movement and self-control in protein assemblies. Quasi-equivalence revisited. Biophys J, 1980; 32:103-38.

11. Cheng RH, Kuhn RJ, Olson NH, Rossmann MG, Choi HK, Smith TJ, and Baker TS. Nucleocapsid and glycoprotein organization in an enveloped virus. Cell, 1995; 80:621-30.

12. Chernomordik LV, and Kozlov MM. Protein-lipid interplay in fusion and fission of biological membranes. Annu Rev Biochem, 2003; 72:175-207.

13. Corver J, Bron R, Snippe H, Kraaijeveld C, and Wilschut J. Membrane fusion activity of Semliki Forest virus in a liposomal model system: specific inhibition by Zn2+ ions. Virology, 1997; 238:14-21.

14. Epand RM. Fusion peptides and the mechanism of viral fusion. Biochim Biophys Acta, 2003; 1614:116-21.

15. Epand RM, and Epand RF. Modulation of membrane curvature by peptides. Biopolymers, 2000; 55:358-63.


16. Ferlenghi I, Gowen B, de Haas F, Mancini EJ, Garoff H, Sjoberg M, and Fuller SD. The first step: activation of the Semliki Forest virus spike protein precursor causes a localized conformational change in the trimeric spike. J Mol Biol, 1998; 283:71-81.

17. Garoff H, Frischauf AM, Simons K, Lehrach H, and Delius H. Nucleotide sequence of cDNA coding for Semliki Forest virus membrane glycoproteins. Nature, 1980; 288:236-41.

18. Gibbons D, Ahn A, Chatterjee P, and Kielian M. Formation and characterization of the trimeric form of the fusion protein of Semliki Forest virus. J Virol, 2000; 74:7772-80.

19. Gibbons D, Erk I, Reilly B, Navaza J, Kielian M, Rey F, and Lepault J. Visualization of the target-membrane-inserted fusion protein of Semliki Forest virus by combined electron microscopy and crystallography. Cell, 2003; 114.

20. Gibbons DL, Ahn A, Liao M, Hammar L, Cheng RH, and Kielian M. Multistep regulation of membrane insertion of the fusion peptide of Semliki Forest virus. J Virol, 2004; 78:3312-3318.

21. Gibbons DL, and Kielian M. Molecular dissection of the Semliki Forest virus homotrimer reveals two functionally distinct regions of the fusion protein. J Virol, 2002; 76:1194-205.

22. Gibbons DL, Vaney MC, Roussel A, Vigouroux A, Reilly B, Lepault J, Kielian M, and Rey FA. Conformational change and protein-protein interactions of the fusion protein of Semliki Forest virus. Nature, 2004; 427:320-5.

23. Glomb-Reinmund S, and Kielian M. fus-1, a pH shift mutant of Semliki Forest virus, acts by altering spike subunit interactions via a mutation in the E2 subunit. J Virol, 1998; 72:4281-7.

24. Glomb-Reinmund S, and Kielian M. The role of low pH and disulfide shuffling in the entry and fusion of Semliki Forest virus and Sindbis virus. Virology, 1998; 248:372-81.

25. Gonzalez ME, and Carrasco L. Viroporins. FEBS Lett, 2003; 552:28-34.

26. Griffin DE. Roles and reactivities of antibodies to alphaviruses. Seminars in Virology, 1995; 6:249-255.

27. Haag L, Garoff H, Xing L, Hammar L, Kan ST, and Cheng RH. Acid-induced movements in the glycoprotein shell of an alphavirus turn the spikes into membrane fusion mode. Embo J, 2002; 21:4402-10.

28. Hammar L, Markarian S, and Cheng RH. Exploring virus surface structure. BIAJournal, 1998; 5:22-23.

29. Hammar L, Markarian S, Haag L, Lankinen H, Salmi A, and Cheng RH. Prefusion rearrangements resulting in fusion peptide exposure in Semliki Forest virus. J Biol Chem, 2003; 278:7189-98.

30. Heinz FX, and Allison SL. Structures and mechanisms in Flavivirus fusion. Adv in Virus Res, 2000; 55:231-269.

Page 118: Conformational Proteomics of Macromolecular Architecture

Prejiusion Dynamics in an Alphavirus 105

3 1. Huang Q, Opitz R, Knapp EW, and Herrmann A. Protonation and stability of the globular domain of influenza virus hemagglutinin. Biophys 1, 2002;

32. Huang Q, Sivaramakrishna RP, Ludwig K, Korte T, Bottcher C, and Herrmann A. Early steps of the conformational change of influenza virus hemagglutinin to a fusion active state: stability and energetics of the he- magglutinin. Biochim Biophys Actu, 2003; 1614:3-13.

33. Hughson FM. Molecular mechanisms of protein-mediated membrane fu- sion. Curr Opin Struct B i d , 1995; 5507-13.

34. Hughson FM. Structural characterization of viral fusion proteins. Curr B i d , 1995; 5:265-74.

35. Jahn R, and Sudhof TC. Membrane fusion and exocytosis. Annu Rev Bio- chem, 1999; 68:863-9 1 1.

36. Justman J, Klimjack MR, and Kielian M. Role of spike protein conforma- tional changes in fusion of Semliki Forest virus. J Virol, 1993; 67:7597- 607.

37. Kerr PJ, Weir RC, and Dalgarno L. Ross River virus variants selected dur- ing passage in chick embryo fibroblasts: serological, genetic, and biological changes. Virology, 1993; 193:446-9.

38. Kielian M, Chatterjee PK, Gibbons DL, and Lu YE. Specific roles for lipids in virus fusion and exit. Examples from the alphaviruses. Subcell Biochem,

39. Kielian M, Klimjack MR, Ghosh S, and Duffus WA. Mechanisms of muta- tions inhibiting fusion and infection by Semliki Forest virus. J Cell B i d , 1996; 1342363-72.

40. Klimjack MR, Jeffrey S, and Kielian M. Membrane and protein interactions of a soluble form of the Semliki Forest virus fusion protein. J Virol, 1994; 68: 6940-6.

41. Klimstra WB, Nangle EM, Smith MS, Yurochko AD, and Ryman KD. DC- SIGN and L-SIGN can act as attachment receptors for alphaviruses and dis- tinguish between mosquito cell- and mammalian cell-derived viruses. J Vi- rol, 2003; 77: 12022-32.

42. Klimstra WB, Ryman KD, and Johnston RE. Adaptation of Sindbis virus to BHK cells selects for use of heparan sulfate as an attachment receptor. J Vi- rol, 1998; 72:7357-66.

43. Kozlovsky Y, Chernomordik LV, and Kozlov MM. Lipid intermediates in membrane fusion: formation, structure, and decay of hemifusion diaphragm. Biophys J , 2002; 83:2634-5 1.

44. Lescar J, Roussel A, Wien MW, Navaza J, Fuller SD, Wengler G, and Rey FA. The Fusion glycoprotein shell of Semliki Forest virus: an icosahedral assembly primed for fusogenic activation at endosomal pH. Cell, 2001;

82:1050-8.

2000; 34:409-55.

1051137-48.

Page 119: Conformational Proteomics of Macromolecular Architecture

106 Lena Hammar et al.

45. Loewy A, Smyth J, von Bonsdorff CH, Liljestrom P, and Schlesinger MJ. The 6-kilodalton membrane protein of Semliki Forest virus is involved in the budding process. Journal of Virology, 1995; 69:469-75.

46. Lusa S, Garoff H, and Liljestrom P. Fate of the 6K membrane protein of Semliki Forest virus during virus assembly. Virology, 1991 ; 185:843-6.

47. Mancini EJ, Clarke M, Gowen BE, Rutten T, and Fuller SD. Cryo-electron microscopy reveals the functional organization of an enveloped virus, Sem- liki Forrest virus. Molecular Cell, 2000; 5:255-266.

48. Mancini EJ, and Fuller SD. Supplanting crystallography or supplementing microscopy? A combined approach to the study of an enveloped virus. Actu Crystallogr D Biol Crystullogr, 2000; 56: 1278-87.

49. Melton JV, Ewart GD, Weir RC, Board PG, Lee E, and Gage PW. Alphavi- rus 6K proteins form ion channels. J Biol Chem, 2002; 277:46923-3 1.

50. Mendoza QP, Stanley J, and Griffin DE. Monoclonal antibodies to the El and E2 glycoproteins of Sindbis virus: definition of epitopes and efficiency of protection from fatal encephalitis. J Gen Virol, 1988; 69:3015-22.

51. Metsikko K, and Garoff H. Oligomers of the cytoplasmic domain of the ~62432 membrane protein of Semliki Forest virus bind to the nucleocapsid in vitro. J Virol, 1990; 64:4678-83.

52. Modis Y, Ogata S, Clements D, and Harrison SC. Structure of the dengue virus envelope protein after membrane fusion. Nature, 2004; 427:3 13-9.

53. Paredes AM, Heidner H, Thuman-Commike P, Prasad BV, Johnston RE, and Chiu W. Structural localization of the E3 glycoprotein in attenuated Sindbis virus mutants. J Virol, 1998; 72: 1534-41.

54. Pletnev SV, Zhang W, Mukhopadhyay S, Fisher BR, Hernandez R, Brown DT, Baker TS, Rossmann MG, and Kuhn RJ. Locations of Carbohydrate Sites on Alphavirus Glycoproteins Show that El Forms an Icosahedral Scaffold. Cell, 2001; 1051 27-36.

55. Rey F. Dengue virus envelope glycoprotein structure: new insight into its interactions during viral entry. Proc Nut1 Acad Sci USA, 2003; 100:6899- 901.

56. Rey FA, Heinz FX, Mandl C, Kunz C, and Harrison SC. The envelope glycoprotein from tick-borne encephalitis virus at 2 A resolution. Nature,

57. Sjoberg M, and Garoff H. Interactions between the transmembrane seg- ments of the alphavirus E l and E2 proteins play a role in virus budding and fusion. J Virol, 2003; 77:3441-50.

58. Skehel JJ, Cross K, Steinhauer D, and Wiley DC. Influenza fusion peptides. Biochem Soc Trans, 2001; 29:623-6.

59. Skehel JJ, and Wiley DC. Receptor binding and membrane fusion in virus entry: the influenza hemagglutinin. Annu Rev Biochem, 2000; 6953 1-69.

1995; 375:291-8.

Page 120: Conformational Proteomics of Macromolecular Architecture

Prefusion Dynamics in an Alphavirus 107

60. Skoging U, and Liljestrom P. Role of the C-terminal tryptophan residue for the structure-function of the alphavirus capsid protein. J Mol Biol, 1998;

61. Skoging U, Vihinen M, Nilsson L, and Liljestrom P. Aromatic interactions define the binding of the alphavirus spike to its nucleocapsid. Structure,

62. Smit J, Waarts B, Bittman R, and Wilschut J. Liposomes as target mem- branes in the study of virus receptor interaction and membrane fusion. Methods Enzymol., 2002; 372:374-92.

63. Smith TJ, Cheng RH, Olson NH, Peterson P, Chase E, Kuhn RJ, and Baker TS. Putative receptor binding sites on alphaviruses as visualized by cryoelectron microscopy. Proc Nut1 Acad Sci USA, 1995; 92: 10648-52.

64. Stiasny K, Koessl C, and Heinz FX. Involvement of lipids in different steps of the flavivirus fusion mechanism. J Virol, 2003; 77:7856-62.

65. Strauss JH, and Strauss EG. The alphaviruses: Gene expression, replication and evolution. Microbiological Reviews, 1994; 58:49 1-562.

66. Tubulekas I, and Liljestrom P. Suppressors of cleavage-site mutations in the p62 envelope protein of Semliki Forest virus reveal dynamics in spike struc- ture and function. J Virol, 1998; 72:2825-31.

67. Wahlberg JM, Bron R, Wilschut J, and Garoff H. Membrane fusion of Sem- liki Forest virus involves homotrimers of the fusion protein. J Virol, 1992;

68. Vashishtha M, Phalen T, Marquardt MT, Ryu JS, Ng AC, and Kielian M. A single point mutation controls the cholesterol dependence of Semliki Forest virus entry and exit. J Cell Biol, 1998; 140:91-9.

69. Weissenhorn W, Dessen A, Harrison S, Skehel JJ, and Wiley DC. Atomic structure of the ectodomain from HIV-1 gp41. Nature (London), 1997:

70. White JM, Hoffman LR, Arevalo JH, and Wilson IA. Attachment and entry of influenza virus into host cells; Pivotal roles of the hemagglutinin. In "Structural Biology of viruses." (Chiu W, RM Burnett, and RL Garcea, Eds.), 1997: Vol.: pp. 80-104. Oxford University press, New York, Oxford.

71. Vrati S, Fernon CA, Dalgarno L, and Weir RC. Location of a major anti- genic site involved in Ross River virus neutralization. Virology, 1988:

72. Vrati S, Kerr PJ, Weir RC, and Dalgarno L. Entry kinetics and mouse viru- lence of Ross River virus mutants altered in neutralization epitopes. J Virol,

73. Yang R, Yang J, and Weliky DP. Synthesis, enhanced fusogenicity, and solid state NMR measurements of cross-linked HIV-1 fusion peptides. Bio- chemistry, 2003 ; 42:3527-35.

279: 865-72.

1996; 4:519-29.

66: 73 09- 1 8.

387 :426-430.

162:346-53.

1996; 7011745-50.

Page 121: Conformational Proteomics of Macromolecular Architecture

108 Lena Hammar et al.

74. Yang ZN, Mueser TC, Kaufman J, Stahl SJ, Wingfield PT, and Hyde CC. The crystal structure of the SIV gp41 ectodomain at 1.47 A resolution. J Struct Biol, 1999; 126:131-44.

75. Zhang W, Mukhopadhyay S, Pletnev SV, Baker TS, Kuhn RJ, and Rossmann MG. Placement of the structural proteins in Sindbis virus. J Vi- rol, 2002; 76:11645-58.

76. Zhang X, Fugere M, Day R, and Kielian M. Furin processing and prote- olytic activation of Semliki Forest virus. J Virol, 2003; 77:2981-9.

77. Zhang Y, Corver J, Chipman PR, Zhang W, Pletnev SV, Sedlak D, Baker TS, Strauss JH, Kuhn RJ, and Rossmann MG. Structures of immature flavivirus particles. Embo J , 2003; 22:2604-13.


PART II: APPROACHING LARGE ASSEMBLIES IN MEMBRANES


Chapter 5

STRATEGY TO OBTAIN HIGH RESOLUTION STRUCTURE OF MEMBRANE PROTEINS BY X-RAY CRYSTALLOGRAPHY

Hiroshi Aoyama*, Eiki Yamashita†, Keisuke Sakurai† and Tomitake Tsukihara†,‡

*RIKEN, Harima Institute, Kouto, Mikazuki-cho, Sayo-gun, Hyogo, 679-5148, Japan. †Institute for Protein Research, Osaka University, 3-2, Yamada-oka, Suita, Osaka, 565-0871, Japan. ‡Corresponding author.

Membrane proteins, which play key roles in biological processes such as photosynthesis, respiration and ion pumping, are estimated to account for 25-30% of all protein species. However, the membrane proteins whose tertiary structures are known number less than 1% of the water-soluble proteins of known structure. This is because both purification and crystallization are more difficult for membrane proteins than for other proteins. The purification and crystallization procedures are summarized here for the membrane proteins whose structures have been determined by the X-ray method. Each membrane protein has its own suitable detergents for purification and for crystallization. There are two types of membrane protein crystals that diffract to high resolution. In type II crystals, hydrophilic surfaces of the molecules contribute mainly to the intermolecular interactions, while in type I crystals, hydrophobic transmembrane surfaces contact each other. The quality of membrane protein crystals can be improved by adjusting the solvent content, as for water-soluble proteins.

Keywords: Membrane protein structure, X-ray crystallography

INTRODUCTION
Structures of proteins determined by X-ray crystallography, two-dimensional electron crystallography, and NMR spectroscopy have provided the field of biology with fundamental discoveries. Among these structure determination methods, X-ray crystallography covers the widest range of molecular weights and gives atomic parameters of the highest quality. The crystallographic analyses of the oxygen-carrying proteins myoglobin and hemoglobin³⁶ opened a new perspective: structural biology. Since these pioneering structure determinations, almost 20,000 protein structures have been deposited in the Protein Data Bank (PDB)². Recombinant DNA technology, synchrotron radiation and the development of computers have increased the number of protein structures solved in atomic detail, and have enabled the structure determination of complex molecular assemblies with molecular weights exceeding 1 MDa. In spite of this remarkable progress in protein crystallography, the membrane proteins whose atomic parameters have been deposited in the PDB amount to less than 1% of the total number of proteins in the PDB. More than 30,000 proteins are expressed in human cells, of which membrane proteins are estimated to account for 25-30%. Membrane proteins have key roles in various biological processes such as photosynthesis, respiration, ion pumping, molecular transport and signal transduction, making their structural study extremely desirable.

The difficulty of both crystallographic and biochemical studies of membrane proteins lies in their isolation, because of their high affinity for the membrane. Separating intact protein from the membrane is the principal problem for structural study. Once the membrane protein has been purified in the presence of a suitable detergent with its conformation preserved, it is crystallized by adding precipitants, as are water-soluble proteins. Usually, however, the crystals first obtained diffract X-rays too weakly at high resolution for atomic coordinates to be determined. We describe strategies for obtaining high-quality membrane protein crystals and crystallographic experiments for acquiring diffraction data to high resolution.


RESOLUTION OF X-RAY DIFFRACTION DEPENDS ON DEGREE OF ORDER WITHIN THE CRYSTALLINE LATTICE
The observed intensity $I(hkl)$ of a reflection is proportional to the square of the structure amplitude $|F(hkl)|$, where the structure factor is given by the following equation:

$$F(hkl) = \sum_j f_j \exp\left[-B_j \sin^2\theta(hkl)/\lambda^2\right] \exp\left[2\pi i\,(hx_j + ky_j + lz_j)\right] \tag{1}$$

where the summation is performed over all atoms in the unit cell, $B_j$ and $(x_j, y_j, z_j)$ are respectively the temperature factor and the fractional coordinates of atom $j$ with atomic scattering factor $f_j$, and $\theta(hkl)$ is the Bragg angle of reflection $(hkl)$. The interplanar spacing $d(hkl)$ is expressed as follows:

$$d(hkl) = \lambda / \left[2 \sin\theta(hkl)\right] \tag{2}$$

where $\lambda$ is the X-ray wavelength. The temperature factor $B$ is related to the mean square atomic displacement $\langle u^2 \rangle$:

$$B = 8\pi^2 \langle u^2 \rangle \tag{3}$$
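As a rough numerical illustration of how the temperature factor limits resolution, the sketch below converts a Bragg angle into a lattice spacing, d = λ/(2 sin θ), and evaluates the attenuation term exp[-B sin²θ/λ²], which becomes exp[-B/(4d²)] once the Bragg relation is substituted. The function names, the 1.0 Å wavelength and the chosen B-values are illustrative assumptions, not values prescribed by the text.

```python
import math

def bragg_spacing(wavelength, theta_deg):
    """Lattice spacing d (in Å) from the Bragg relation d = λ / (2 sin θ)."""
    return wavelength / (2.0 * math.sin(math.radians(theta_deg)))

def amplitude_attenuation(b_factor, d_spacing):
    """Temperature-factor term exp[-B sin²θ/λ²], rewritten as exp[-B / (4 d²)]."""
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))

if __name__ == "__main__":
    # Hypothetical example: 1.0 Å X-rays, B = 20 and 60 Å² (assumed values).
    print("d at theta = 14.5 deg:", bragg_spacing(1.0, 14.5))
    for d in (4.0, 3.0, 2.0):
        print(d, amplitude_attenuation(20.0, d), amplitude_attenuation(60.0, d))
```

Running the loop shows the amplitude falling off much faster with decreasing d for the larger B-value, which is the sense in which a high averaged B restricts the observable resolution.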

In addition to atomic vibration, or dynamic disorder, any atom in a protein crystal exhibits static disorder: equivalent atoms in different unit cells are not at exactly the same position relative to the origin of their respective unit cells. The averaged temperature factors and the resolution limits of observed intensity data are plotted for crystals of membrane proteins in Fig. 1. The B-factors of protein crystals are much higher than those of organic crystals (about 3 Å² in general). The resolution limit is restricted by the averaged B-value, since the structure amplitude $F(hkl)$ is proportional to $\exp[-B \sin^2\theta(hkl)/\lambda^2]$, as deduced from equation (1). Static disorder cannot be strictly distinguished from dynamic disorder in its contribution to the temperature factor. However, different crystals of the same protein often have widely different B-values, and freezing protein crystals at 100 K does not drastically decrease the B-factor. This suggests that dynamic disorder is not the main component of the high temperature factor of protein crystals. The root mean square displacements of an atom from its equilibrium position were evaluated for two different B-values: 0.50 Å for B = 20 Å² and 0.87 Å for B = 60 Å². These displacements are too large to be attributed to dynamic disorder at around 300 K. Thus, static disorder must be the main component of the higher temperature factors of protein crystals in comparison with those of organic crystals. Reducing static disorder in the crystalline lattice is essential to improve the resolution of diffraction.

Fig. 1. Averaged temperature factors of the membrane protein crystals determined by the X-ray method, plotted against the resolution limits of their observed reflections. The dotted line shows the empirical highest temperature factor at each observed resolution limit, or the empirical highest resolution at each temperature factor. The highest resolution of observable reflections of each crystal depends not only on its averaged temperature factor but also on the method of intensity data collection. Experiments using synchrotron X-rays with high flux and low emittance improve the resolution.
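The two displacement values quoted above follow from the relation B = 8π²⟨u²⟩. A minimal sketch of that arithmetic is given below; the function name is an illustrative assumption.

```python
import math

def rms_displacement(b_factor):
    """Root-mean-square displacement (Å) from B = 8π²⟨u²⟩, with B in Å²."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

if __name__ == "__main__":
    for b in (20.0, 60.0):
        print(f"B = {b:.0f} Å^2 -> rms displacement = {rms_displacement(b):.2f} Å")
```

For B = 20 Å² and 60 Å² this gives about 0.50 Å and 0.87 Å, matching the values in the text.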

QUALITY OF ELECTRON DENSITY DISTRIBUTION
The electron density in a crystal is the Fourier transform of the structure factors:

$$\rho(xyz) = V^{-1} \sum_h \sum_k \sum_l |F(hkl)| \exp\left[-2\pi i\,(hx + ky + lz) + i\alpha(hkl)\right] \tag{4}$$

where $V$ and $\alpha(hkl)$ are the volume of the unit cell and the phase angle of the structure factor $F(hkl)$, respectively, and the summation is performed over the reflections up to the highest resolution obtained. The molecular structure is determined on the basis of the electron density distribution. The quality of the electron density calculated by the Fourier transform of the structure factors clearly depends on the resolution limit and on the accuracy of both the observed amplitudes and the estimated phase angles of the structure factors. The phase angles are not obtained directly from the experiment but are derived from the observed structure amplitudes, and the higher the quality of the structure amplitudes used, the higher the accuracy of the estimated phase angles. Thus, besides the quality of the crystal, a high-performance X-ray experiment that allows accurate data collection at high resolution is of primary importance. Since the intensity of reflections generally decreases with increasing resolution, it is essential to measure the intensity of weak reflections accurately to obtain high-quality data at high resolution. This is accomplished by focusing the X-rays and by lowering the background scattering of X-rays, and synchrotron radiation, with its high flux and low divergence, meets these requirements.
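To make the dependence on the resolution limit concrete, the sketch below evaluates a one-dimensional analogue of the Fourier synthesis for a toy cell. The two "atoms", their scattering weights and all function names are hypothetical, and temperature factors are omitted for brevity; it is meant only as an illustration of the summation, not as a crystallographic program.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j exp(2πi h x_j) for a 1D cell; atoms = [(f_j, x_j), ...]."""
    return {h: sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
            for h in range(-h_max, h_max + 1)}

def electron_density(factors, x, cell_length=1.0):
    """1D Fourier synthesis: (1/V) sum_h |F(h)| exp[-2πi h x + i α(h)]."""
    total = sum(abs(F) * cmath.exp(-2j * math.pi * h * x + 1j * cmath.phase(F))
                for h, F in factors.items())
    return total.real / cell_length

def peak_position(factors, n_grid=200):
    """Fractional coordinate with the highest density on a regular grid."""
    return max((i / n_grid for i in range(n_grid)),
               key=lambda x: electron_density(factors, x))

if __name__ == "__main__":
    toy_atoms = [(8.0, 0.25), (6.0, 0.60)]  # hypothetical weights and positions
    for h_max in (2, 10):  # low versus high resolution cut-off
        F = structure_factors(toy_atoms, h_max)
        print(h_max, peak_position(F), electron_density(F, 0.25))
```

Including more terms (larger h_max) sharpens the peaks at the atomic positions, which mirrors the role of the resolution limit in the three-dimensional synthesis.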

Solvent and detergent molecules with disordered structures occupy more than 65% of the volume of membrane protein crystals, compared with around 50% in water-soluble protein crystals. The high volume fraction of disordered structure tends to prevent the lattices of membrane protein crystals from becoming highly ordered. Obtaining a highly ordered membrane protein crystal therefore requires a finer search of crystallization conditions than for water-soluble proteins. Once a highly ordered crystal is obtained, however, the high volume fraction of disordered regions results in a high-quality electron density map. The disordered region of a membrane protein crystal, consisting of solvent and detergent molecules, has a uniform electron density. The solvent flattening method developed by Wang⁵¹ refines the phase angles of the structure factors of membrane proteins more efficiently than those of water-soluble proteins, since it applies the uniform electron density of the non-protein regions to its target function.
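The idea of using the flat solvent region as a constraint can be sketched in one dimension. The code below is a heavily simplified illustration under stated assumptions: the solvent mask is supplied by the caller (a real implementation must determine it from the map), only a single cycle is shown, and the new phases simply replace the old ones rather than being statistically combined. All names are hypothetical.

```python
import cmath
import math

def density_from_factors(factors, n_grid):
    """Real-space density on n_grid points from {h: F(h)} (1D cell, V = 1)."""
    return [sum(F * cmath.exp(-2j * math.pi * h * n / n_grid)
                for h, F in factors.items()).real
            for n in range(n_grid)]

def factors_from_density(density, h_values):
    """Inverse of the synthesis above: F(h) = (1/N) sum_n rho_n exp(2πi h n / N)."""
    n_grid = len(density)
    return {h: sum(rho * cmath.exp(2j * math.pi * h * n / n_grid)
                   for n, rho in enumerate(density)) / n_grid
            for h in h_values}

def flatten_cycle(amplitudes, phases, solvent_mask):
    """One cycle: build the map, flatten the solvent, return the new phases.

    amplitudes and phases are dicts keyed by h; the grid size len(solvent_mask)
    must exceed twice the largest |h| so that the transform pair is exact.
    """
    trial = {h: amplitudes[h] * cmath.exp(1j * phases[h]) for h in amplitudes}
    density = density_from_factors(trial, len(solvent_mask))
    solvent = [rho for rho, is_solvent in zip(density, solvent_mask) if is_solvent]
    if solvent:
        mean_solvent = sum(solvent) / len(solvent)
        density = [mean_solvent if is_solvent else rho
                   for rho, is_solvent in zip(density, solvent_mask)]
    updated = factors_from_density(density, list(amplitudes))
    # The observed amplitudes are kept; only the phase estimates are updated.
    return {h: cmath.phase(updated[h]) for h in amplitudes}
```

In use, the returned phases would be paired again with the observed amplitudes and the cycle repeated. The larger the fraction of grid points in the solvent mask, the more strongly each cycle constrains the phases, which is the intuition behind the efficiency noted for membrane protein crystals.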

The electron density map of bovine cytochrome c oxidase was calculated at 2.3 Å resolution with the observed structure factors, phased by the multiple isomorphous replacement method and refined by solvent flattening. The crystal contains more than 70% solvent or detergent regions. The electron density map obtained was so clear that an unexpected covalent bond between the side chains of a tyrosine and a histidine was clearly detected, as shown in Fig. 2. A membrane protein crystal can thus yield a high-quality structure through efficient phase refinement with accurately measured structure factors.

Fig. 2. An electron density map around the dioxygen reduction center of fully oxidized cytochrome c oxidase from bovine heart muscle. The structural model was built into the map. Heme a3 and peroxide molecules are colored red, and CuB is shown as a small light blue sphere. Amino acid residues are drawn as sticks, each atom with a different color. The initial phases were determined by the multiple isomorphous replacement method at 5.0 Å resolution, and they were refined and extended to 2.3 Å resolution by solvent flattening⁵¹. The electron density was calculated with the refined phases at 2.3 Å resolution. The electron density cages are drawn at the 1.2 σ level. The refined electron density map is so clear that a novel covalent bond between His240 and Tyr244 is definitely identified.

DETERGENTS USED FOR PURIFICATION AND CRYSTALLIZATION
Membrane proteins have defined tertiary structures in the context of a biological membrane, which are lost in an organic or aqueous solution owing to the amphipathic properties of their molecular surfaces. Purified membrane proteins can preserve stable conformations even in aqueous solution when their external hydrophobic surfaces are covered with detergent molecules. Thus, the initial step in biochemical research on a membrane protein consists in finding an adequate detergent with which the protein can be isolated from the membrane while maintaining the conformation it has in the membrane. An adequate detergent is generally chosen by trial and error. Since a large amount of detergent is required to extract the membrane protein from the membrane, relatively inexpensive detergents such as Triton X-100, cholate and Brij-35 are chosen in the earliest stages of extraction and purification prior to crystallization. Since these detergents used for isolation contain detectable amounts of impurities, they are replaced in the course of purification with other detergents of uniform molecular weight. After the protein is extracted from the membrane with a detergent, it is purified by chromatographic separations and other methods developed for the purification of water-soluble proteins. The purification of the membrane protein is monitored by measuring its biochemical activity or by spectroscopic methods such as UV-VIS spectroscopy and CD spectroscopy.

For crystallization, the homogeneity of the tertiary and quaternary structures of a protein is more important than its chemical purity. Sometimes, chemically pure proteins obtained by repeated chromatographic separations fail to crystallize, probably because the conformation of the protein is perturbed by repeated interactions with the resin during chromatographic separation. The purification and crystallization of membrane proteins whose crystal structures have been determined by the X-ray method are summarized in Table 1. Once the proteins are purified from the membrane, they are crystallized using techniques developed for water-soluble proteins: batch crystallization, liquid-liquid diffusion, vapor diffusion in hanging and sitting drops, and dialysis.

Though sometimes crystals obtained in the first screening diffract at high resolution, most membrane protein crystals obtained at the initial stage of the research diffract X-rays only at modest resolution. There are two reasons why the crystals do not diffract X-rays at high resolution. The first is that the crystals exhibit too high B-factors to diffract X-rays at a high resolution. The second reason is that the crystals are too fragile to be manipulated for the X-ray experiment. Modifications of the isolation, purification and crystallization methods overcome both problems.

Table 1. Membrane proteins whose structures have been determined by the X-ray diffraction method

| Protein | Source | Resolution (Å) | PDB ID | Reference |
|---|---|---|---|---|
| Photosynthetic reaction center | R.s. | 2.60 | 1AIG | Stowell et al., 1997 |
| Photosynthetic reaction center | R.v. | 2.30 | 1PRC | Deisenhofer et al., 1995 |
| Photosynthetic reaction center | T.tep | 2.80 | 1EYS | Nogi et al., 2000 |
| Photosystem I | S.e. | 2.50 | 1JB0 | Jordan et al., 2001 |
| Photosystem II | S.e. | 3.80 | 1FE1 | Zouni et al., 2001 |
| Photosystem II | S.v. | 3.70 | 1IZL | Kamiya et al., 2003 |
| Bacteriorhodopsin | H.s. | 1.43 | 1M0K | Schobert et al., 2002 |
| Halorhodopsin | H.s. | 1.80 | 1E12 | Kolbe et al., 2000 |
| Rhodopsin | Bovine | 2.80 | 1F88 | Palczewski et al., 2000 |
| Sensory rhodopsin II | N.p. | 2.40 | 1JGJ | Luecke et al., 2001 |
| Light-harvesting complex II | R.m. | 2.40 | 1LGH | Koepke et al., 1996 |
| KcsA K+ channel | S.l. | 2.00 | 1K4C | Zhou et al., 2001 |
| Mechanosensitive ion channel | M.t. | 3.50 | 1MSL | Chang et al., 1998 |
| Cytochrome c oxidase | T.t. | 2.40 | 1EHK | Soulimane et al., 2000 |
| Cytochrome c oxidase | R.s. | 2.30 | 1M56 | Svensson-Ek et al., 2002 |
| Cytochrome c oxidase | Bovine | 2.30 | 2OCC | Yoshikawa et al., 1998 |
| Cytochrome c oxidase | P.d. | 2.70 | 1AR1 | |
| Ubiquinol oxidase | E.coli | 3.50 | 1FFT | Abramson et al., 2000 |
| Cytochrome bc1 complex | Chicken | 3.16 | 1BCC | Zhang et al., 1998 |
| Cytochrome bc1 complex | Bovine | 3.00 | 1BE3 | Iwata et al., 1998 |
| Cytochrome bc1 complex | Yeast | 2.30 | 1EZV | Hunte et al., 2000 |
| Fumarate reductase | E.coli | 2.70 | 1KF6 | Iverson et al., 2002 |
| Fumarate reductase | W.s. | 2.20 | 1QLA | Lancaster et al., 1999 |
| Formate dehydrogenase N | E.coli | 1.60 | 1KQF | Jormakka et al., 2002 |
| Glycerol facilitator (GlpF) | E.coli | 2.20 | 1FX8 | Fu et al., 2000 |
| Sucrose-specific porin | S.t. | 2.40 | 1A0S | Forst et al., 1998 |
| Maltoporin | E.coli | 2.40 | 1AF6 | Wang et al., 1997 |
| Maltoporin | S.t. | 2.80 | 1MPR | |
| OmpF porin | E.coli | 3.00 | 1BT9 | Phale et al., 1998 |
| Ca2+ ATPase | S.r. | 2.60 | 1EUL | Toyoshima et al., 2000 |
| FhuA | E.coli | 2.74 | 1BY3 | Locher et al., 1998 |
| Ferric enterobactin receptor | E.coli | 2.40 | 1FEP | Buchanan et al., 1999 |
| Prostaglandin H2 synthase-1 | Ovis aries | 3.10 | 1CQE | Picot et al., 1994 |
| Prostaglandin H2 synthase-1 | Ovis aries | 3.50 | 1PRH | Picot et al., 1994 |
| Squalene-hopene cyclase | A.a. | 2.20 | 3SQC | Wendt et al., 1999 |
| Aquaporin 1 | Bovine | 2.20 | 1J4N | Sui et al., 2001 |
| TolC | E.coli | 2.10 | 1EK9 | Koronakis et al., 2000 |
| AcrB | E.coli | 3.50 | 1IWG | Murakami et al., 2002 |

A.a.: Alicyclobacillus acidocaldarius; B.b.: Bacillus brevis; B.v.: Blastochloris viridis; E.coli: Escherichia coli; H.h.: Halobacterium halobium; H.s.: Halobacterium salinarum; M.t.: Mycobacterium tuberculosis; N.p.: Natronobacterium pharaonis; P.d.: Paracoccus denitrificans; R.m.: Rhodospirillum molischianum; R.s.: Rhodobacter sphaeroides; R.v.: Rhodopseudomonas viridis; S.e.: Synechococcus elongatus; S.v.: Synechococcus vulcanus; S.l.: Streptomyces lividans; S.t.: Salmonella typhimurium; S.r.: sarcoplasmic reticulum; T.tep: Thermochromatium tepidum; T.t.: Thermus thermophilus; W.s.: Wolinella succinogenes.

A.S.: ammonium sulfate; C8E4: n-octyltetraoxyethylene; C12E8: polyoxyethylene(8) dodecyl ether; C12E9: polyoxyethylene(9) dodecyl ether; LCP: lipid cubic phase; DDM: n-dodecyl-β-D-maltoside; DM: n-decyl-β-D-maltoside; DG: n-dodecyl-D-glucopyranoside; HPTO: heptane-1,2,3-triol; HG: n-hexyl-D-glucopyranoside; HPG: n-heptyl-D-glucopyranoside; HTG: n-heptyl thioglucoside; LDAO: lauryl-N,N-dimethylamine-N-oxide; NG: n-nonyl-β-D-glucoside; MC: microcapillary; MD: microdialysis; MOG: 1-monooleoyl-rac-glycerol; Octyl-POE: octyl-polyoxyethylene; OG: n-octyl-β-D-thioglucoside; β-OG: n-octyl-β-D-glucoside; OHES: n-octyl-2-hydroxyethyl-sulfoxide; UDAO: N,N-dimethylundecylamine-N-oxide; UDM: undecyl-D-maltoside; PEG550MEE: PEG monomethylether 550; PEG2KMEE: PEG2000 monomethylether; PEG5KMEE: PEG5000 monomethylether; TEG: triethylene glycol.


Mitochondrial cytochrome c oxidase is extracted from the mitochondrial inner membrane of bovine heart using cholate. The extracted enzyme is purified by ammonium sulfate fractionation, repeated three times in the presence of cholate and another three times in the presence of Brij-35, a polydisperse non-ionic detergent, to replace the cholate. The main component of Brij-35 has the chemical formula CH3(CH2)11(OCH2CH2)23OH. Hexagonal bipyramidal crystals are grown at low ionic strength by adding solid Brij-35 to the enzyme solution at a high protein concentration of about 70 mg/ml. No polyethyleneglycol monoether with a shorter polyethyleneglycol chain than Brij-35 yields crystals at low ionic strength. This is consistent with the finding that larger detergents are more effective for crystallization of large membrane protein complexes²³. The detergent also yields tetragonal plate crystals at higher ionic strength⁴². The hexagonal and the tetragonal crystals diffract to 7 Å and 10 Å resolution, respectively. To obtain crystals diffracting X-rays to higher resolution, the enzyme is stabilized with dodecyl octaethyleneglycol monoether, CH3(CH2)11(OCH2CH2)8OH, instead of Brij-35, and crystallized using ammonium sulfate as the precipitant and hexamethyleneglycol as an additive. Batch crystallization produces tetragonal prism crystals diffracting to 5 Å. Neither a longer ethyleneglycol chain of 23 units nor a shorter one of six units produced crystals⁴². After various crystallization conditions had been examined, crystals diffracting to better than 2.8 Å resolution were obtained when decylmaltoside was used as a stabilizing agent and polyethyleneglycol as a precipitant⁴⁹. These results indicate that an optimum detergent size is critical for crystallization of membrane proteins.
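To give a feel for the size differences among the polyethyleneglycol monoethers compared above, the following sketch estimates molar masses for dodecyl ethers with 6, 8 and 23 ethyleneglycol units, written as CH3(CH2)11(OCH2CH2)nOH. The function name and the rounded atomic masses are assumptions for illustration; Brij-35 is polydisperse, so the n = 23 value refers only to its main component.

```python
# Rounded standard atomic masses in g/mol (assumed precision for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def alkyl_peg_ether_mass(alkyl_carbons, eg_units):
    """Molar mass of an alkyl chain of alkyl_carbons carbons linked to (OCH2CH2)n-OH."""
    carbons = alkyl_carbons + 2 * eg_units
    hydrogens = (2 * alkyl_carbons + 1) + 4 * eg_units + 1
    oxygens = eg_units + 1
    return (carbons * ATOMIC_MASS["C"]
            + hydrogens * ATOMIC_MASS["H"]
            + oxygens * ATOMIC_MASS["O"])

if __name__ == "__main__":
    for n in (6, 8, 23):
        print(f"C12E{n}: {alkyl_peg_ether_mass(12, n):.1f} g/mol")
```

The head group of the n = 23 species is roughly three times the mass of the n = 8 species, which illustrates how widely detergent size varies across the series that was screened.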

CRYSTALLIZATION METHODS OF MEMBRANE PROTEINS YIELDING TYPE I CRYSTALS DIFFRACTING X-RAYS AT HIGH RESOLUTION

The crystallization methods developed for water-soluble proteins are applicable to membrane proteins solubilized with detergent. Most of the crystals obtained from detergent solutions are of type II, in which the hydrophilic regions almost exclusively make the inter-molecular contacts.28

In addition to these methods, special techniques have been developed for crystallization of membrane proteins. Landau and Rosenbusch24 devised a rational approach to obtain highly ordered crystals of bacteriorhodopsin (BR) using a quasi-solid lipid cubic phase. The bicontinuous cubic phases, whose membrane bilayer extends three-dimensionally, consisted of monoolein (1-monooleoyl-rac-glycerol) and aqueous buffer. Components of the cubic phase were added in the following sequence: (i) monoolein, (ii) phosphate buffer to yield 3.3 M and pH 5.6, (iii) BR to give a final protein concentration of 3.5 mg/ml, and (iv) buffer containing methylpentandiol (0.05%) and a-octylglycopyranoside (1.2%). The BR-containing bicontinuous cubic phases, initially with a homogeneous protein distribution, yielded hexagonal crystals after several days. The crystals diffracted X-rays to 2.0 Å at a synchrotron X-ray source.24

A crystal of the K intermediate of BR grown from the cubic phase diffracted to a resolution of 1.43 Å, which is the highest resolution obtained for membrane protein crystals.41 The advantage of cubic phase crystallization is that the protein is stable in a bilayer environment and can diffuse throughout the three-dimensional bilayer to reach and associate with the crystal nuclei. This method has been applied successfully to crystallize and solve the structures of several membrane proteins at high resolution.21,27,34 These crystals are of type I, in which two-dimensional arrays of proteins are stacked on top of each other, while most crystals from detergent solution are of type II.

Though the lipid cubic phase is quite efficient and has suggested that membrane proteins can be crystallized from lipid bilayers, it has technical disadvantages due to its extremely high viscosity. Bicelles are small bilayer disks that form in certain lipid-amphiphile mixtures. Since bicelles are not viscous at low temperature and turn into a lamellar structure at high temperature, a new crystallization method using bicelles has been developed.7 The bicelle crystallization method is outlined as follows: (i) the protein is mixed with bicelle components and kept on ice to maintain the liquid state, (ii) the bicelles containing the protein are mixed with precipitant on a cover slip and the drop is sealed for equilibration by vapor diffusion; if a gel-like phase is desired, (iii) the crystallization plate is moved to an incubator at a high temperature such as 37°C. A monoclinic BR crystal was obtained by bicelle crystallization and its structure was determined at 2.0 Å resolution.7 Judging from the molecular packing, the crystal is of type I.

A unique crystallization method for BR that requires no step for complete solubilization of the protein was developed by Takeda et al.47 When purple membrane containing a two-dimensional array of BR is incubated at high temperature (37°C) with a small amount of the detergent octylthioglucoside in the presence of the precipitant ammonium sulfate, a large fraction of the membrane fragments is converted into spherical vesicles with a diameter of 50 nm. A suspension of the spherical vesicles is concentrated, and hexagonal crystals are produced on cooling to 4°C. These diffracted X-rays to 2.5 Å. The crystal exhibits type I packing, like the crystals obtained from the lipid cubic phase. This method may be applicable to membrane proteins that are prone to denaturation when isolated from the membrane with an excess of detergent.

The crystals obtained by the lipid cubic phase, the bicelle and the spherical vesicle methods are of type I. Each layer in these crystals is constructed by hydrophobic interactions among the molecular surfaces of transmembrane regions. Not only type II packing but also type I packing can yield highly ordered crystals.

CHOICE OF CRYOPROTECTANTS

Protein crystals are sensitive to free radicals generated by X-rays, especially at room temperature. Radiation-induced decay of protein crystals is a serious problem when the crystals are irradiated by strong synchrotron X-rays. Cooling the crystals to cryogenic temperature reduces radiation damage. Since protein crystals are hydrated, ice formation upon freezing tends to destroy the ordered structure of the crystal lattice. In order to avoid ice formation, the crystal is soaked in a solution containing a cryoprotectant, which replaces water molecules both inside and outside the crystal. A crystal containing cryoprotectant in its solvent channels can be flash-cooled, with the solvent turning into a vitreous glass that preserves lattice order.

Table 2. List of cryoprotectants used successfully in flash-cooling macromolecular crystals

Inorganic Salts      Alcohols                    Carbohydrates    Oils
Lithium acetate      Methanol                    Glucose          Paraffin oil
Lithium chloride     Ethanol                     Inositol         Paratone-N
Lithium formate      2-propanol                  Sucrose
Lithium nitrate      Polyethylene glycol 400     Trehalose
Lithium sulfate      Ethylene glycol             Raffinose
Magnesium acetate    Propylene glycol
Sodium chloride      2,3-(R,R)-butanediol
Sodium nitrate       Diethylene glycol
                     Triethylene glycol
                     2-methyl-2,4-pentanediol
                     Glycerol
                     Erythritol
                     Xylitol

Since membrane protein crystals generally have a higher solvent content and fewer inter-molecular interactions than crystals of water-soluble proteins, they suffer seriously from several problems upon freezing. Freezing membrane protein crystals frequently results in unacceptable lattice distortion and in increases in B-factor and mosaic spread. The increase in B-factor directly degrades the resolution, and the increase in mosaic spread lowers the signal-to-noise ratio of the observed intensity data. Several crystals are sometimes required to acquire intensity data at high resolution, even in a cryogenic experiment. It is, however, not easy to reproduce isomorphism when freezing membrane protein crystals, because their unit cells shrink more than those of water-soluble protein crystals. Isomorphism of frozen crystals is achieved by accurately adjusting the experimental conditions of freezing.
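The link between B-factor and resolution can be illustrated with the standard Debye-Waller relation, in which the intensity of a reflection at spacing d is scaled by exp(-B/(2d²)). The following is a minimal Python sketch of that relation only; the B-factor values and the function name are illustrative assumptions, not measurements reported in this chapter.

```python
import math

def intensity_attenuation(b_factor, d_spacing):
    """Debye-Waller scaling of a reflection intensity.

    The amplitude is scaled by exp(-B * (sin(theta)/lambda)^2); with
    sin(theta)/lambda = 1/(2d) the intensity factor is exp(-B / (2 d^2)).
    b_factor is in Å^2 and d_spacing in Å.
    """
    return math.exp(-b_factor / (2.0 * d_spacing ** 2))

# Hypothetical B-factors (Å^2), chosen only to show the trend.
for b in (20.0, 40.0, 80.0):
    row = ", ".join(
        f"d={d:.1f} Å: {intensity_attenuation(b, d):.4f}" for d in (4.0, 3.0, 2.0)
    )
    print(f"B={b:5.1f} Å^2 -> {row}")
```

Under these assumed values, doubling B affects a 2 Å reflection far more than a 4 Å one, which is why an increase in B-factor on freezing shows up first as a loss of the high-resolution data.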

Cryoprotectants that have been used for cryogenic experiments are listed in Table 2. The cryoprotectant is generally added to the mother liquor of crystallization. It is, however, possible to transfer crystals from the mother liquor of crystallization to an unrelated solution containing cryoprotectant.36,39,55 Though glycerol is the most popular cryoprotectant for membrane proteins, as shown in Table 1, various reagents are surveyed for each protein crystal in order to find the most suitable cryoprotectant. To avoid damaging the crystal while soaking it in a solution containing cryoprotectant, it is usually necessary to introduce the cryoprotectant slowly, either by stepwise dialysis or by serial transfer.

ADJUSTING WATER CONTENT IN THE CRYSTAL TO IMPROVE RESOLUTION

An effective approach to obtaining crystals that diffract to high resolution is to manipulate crystals that have already been grown. In the early years of protein crystallography, crystals with a lattice different from that of the original crystal were prepared by changing the relative humidity surrounding the crystal, in order to develop a phasing method utilizing the intensity changes that accompany changes in lattice dimensions.3,11 Monoclinic hen egg white lysozyme crystals diffracting to 1.75 Å were obtained by lowering the humidity surrounding crystals that originally diffracted to 2.5 Å.28

A similar transformation of a crystal can be performed by changing the storage buffer. The crystal of the complex of EF-Tu and EF-Ts was transformed to a highly ordered form by increasing the polyethylene glycol (PEG) concentration in the mother liquor after crystallization.40 The icosahedral crystals of pyruvate dehydrogenase core were rehydrated after they had been completely dehydrated. The rehydration/dehydration transformation improved the diffraction quality of the crystal from 7 Å to 4 Å.14 Kiefersauer et al.19 developed a mounting system for protein crystals that allows accurate adjustment of the humidity in the crystal environment. The CO dehydrogenase crystal was transformed successively to 84%, 81% and 90% humidity under accurate control with the mounting system. The original crystal diffracted to 2.9 Å. At 84% humidity the diffraction improved to about 2.0 Å. A further reduction of the humidity to 81% destroyed the crystal order; a humidity jump to 90% restored the diffraction, to 1.8 Å.

Kiefersauer et al.19 successfully manipulated crystals of the membrane protein ba3-cytochrome c oxidase to improve their diffraction power. These crystals belong to the space group P43212 and show a large variation, of up to 10 Å, in their cell constants as well as in their diffraction power, which is not rare for membrane protein crystals. The crystals did not diffract X-rays to better than 7 Å using a laboratory X-ray source. The oxidase crystal was enclosed in paraffin oil and left for a sufficient amount of time in the cryo-loop. Dehydration occurred by evaporation of solvent through the oil layer and was accompanied by changes in the crystal lattice and crystal order, which were monitored by the X-ray diffraction pattern. The crystal was frozen once the optimum dehydration state had been reached. The crystal diffracted X-rays to 2.4 Å at a synchrotron X-ray source, and its structure was determined using MAD phasing.43 This experiment strongly suggests that the manipulation techniques developed for water-soluble protein crystals are applicable to membrane protein crystals.

CONCLUSIONS

Each membrane protein is purified from the membrane and crystallized in the presence of a suitable detergent. A universal detergent applicable to the purification or crystallization of any membrane protein has not been found, probably because the transmembrane surfaces that interact with the membrane differ in nature from protein to protein. There are two types of crystal packing of membrane proteins. Almost all the packing interactions in crystals obtained from detergent solution are of type II, while the novel crystallization methods, which grow crystals from lipid layers containing the protein molecules, yield type I crystals in which the transmembrane surfaces of the proteins make inter-molecular contacts. Both type I and type II crystals can diffract X-rays to high resolution. Though the inter-molecular interactions in a membrane protein crystal are weak, the crystal can be frozen without destroying the ordered lattice by choosing an adequate cryoprotectant. The quality of membrane protein crystals can be improved by adjusting the solvent content, as for crystals of water-soluble proteins.

REFERENCES

1. Abramson J, Riistama S, Larsson G, Jasaitis A, Svensson-Ek M, Puustinen A, Iwata S, and Wikstrom M. The structure of the ubiquinol oxidase from Escherichia coli and its ubiquinone binding site. Nat Struct Biol, 2000; 7: 910-917.

2. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, and Bourne PE. The Protein Data Bank. Nucl Acids Res, 2000; 28: 235-244.

3. Bragg WL and Perutz MF. The external form of the haemoglobin molecule. Acta Cryst, 1952; 5: 323-328.

4. Buchanan SK, Smith BS, Venkatramani L, Xia D, Esser L, Palnitkar M, Chakraborty R, van der Helm D, and Deisenhofer J. Crystal structure of the outer membrane active transporter FepA from Escherichia coli. Nat Struct Biol, 1999; 6: 56-63.

5. Chang G, Spencer RH, Lee AT, Barclay MT, and Rees DC. Structure of the MscL homolog from Mycobacterium tuberculosis: a gated mechanosensitive ion channel. Science, 1998; 282: 2220-2226.

6. Deisenhofer J, Epp O, Sinning I, and Michel H. Crystallographic refinement at 2.3 Å resolution and refined model of the photosynthetic reaction centre from Rhodopseudomonas viridis. J Mol Biol, 1995; 246: 449-457.

7. Faham S and Bowie JU. Bicelle crystallization: A new method for crystallizing membrane proteins yields a monomeric bacteriorhodopsin structure. J Mol Biol, 2002; 316: 1-6.

8. Forst D, Welte W, Wacker T, and Diederichs K. Structure of the sucrose-specific porin ScrY from Salmonella typhimurium and its complex with sucrose. Nat Struct Biol, 1998; 5: 37-46.

9. Fu D, Libson A, Miercke LJW, Weitzman C, Nollert P, Krucinski J, and Stroud RM. Structure of a glycerol conducting channel and the basis for its selectivity. Science, 2000; 290: 481-486.

10. Hunte C, Koepke J, Lange C, Rossmanith T, and Michel H. Structure at 2.3 Å resolution of the cytochrome bc1 complex from the yeast Saccharomyces cerevisiae co-crystallized with an antibody Fv fragment. Structure, 2000; 8: 669-684.

11. Huxley HE and Kendrew JC. Discontinuous lattice changes in haemoglobin crystals. Acta Cryst, 1953; 6: 76-80.

12. Iverson TM, Luna-Chavez C, Croal LR, Cecchini G, and Rees DC. Crystallographic studies of the Escherichia coli quinol-fumarate reductase with inhibitors bound to the quinol-binding site. J Biol Chem, 2002; 277: 16124-16130.

13. Iwata S, Lee JW, Okada K, Lee JK, Iwata M, Rasmussen B, Link TA, Ramaswamy S, and Jap BK. Complete structure of the 11-subunit bovine mitochondrial cytochrome bc1 complex. Science, 1998; 281: 64-71.

14. Izard T, Sarfaty S, Westphal A, De Kok A, and Hol WGJ. Improvement of diffraction quality upon rehydration of dehydrated icosahedral Enterococcus faecalis pyruvate dehydrogenase core crystal. Protein Science, 1997; 6: 913-915.

15. Jormakka M, Tornroth S, Byrne B, and Iwata S. Molecular basis of proton motive force generation: Structure of formate dehydrogenase-N. Science, 2002; 295: 1863-1868.

16. Jordan P, Fromme P, Witt HT, Klukas O, Saenger W, and Krauss N. Three-dimensional structure of cyanobacterial photosystem I at 2.5 Å resolution. Nature, 2001; 411: 909-917.

17. Kamiya N and Shen J-R. Crystal structure of oxygen-evolving photosystem II from Thermosynechococcus vulcanus at 3.7-Å resolution. Proc Natl Acad Sci USA, 2003; 100: 98-103.

18. Kendrew JC, Dickerson RE, Strandberg RJ, Hart RJ, Davies DR, Phillips DC, and Shore VC. Structure of Myoglobin. Nature, 1960; 185: 442-447.

19. Kiefersauer R, Than ME, Dobbek H, Gremer L, Melero M, Strobl S, Dias JM, Soulimane T, and Huber R. A novel free-mounting system for protein crystals: transformation and improvement of diffraction power by accurately controlled humidity changes. J Applied Crystallogr, 2000; 33: 1223-1230.

20. Koepke J, Hu X, Muenke C, Schulten K, and Michel H. The crystal structure of the light-harvesting complex II (B800-850) from Rhodospirillum molischianum. Structure, 1996; 4: 581-597.

21. Kolbe M, Besir H, Essen L-O, and Oesterhelt D. Structure of the light-driven chloride pump halorhodopsin at 1.8 Å resolution. Science, 2000; 288: 1390-1396.

22. Koronakis V, Sharff A, Koronakis E, Luisi B, and Hughes C. Crystal structure of the bacterial membrane protein TolC central to multidrug efflux and protein export. Nature, 2000; 405: 914-919.

23. Kuhlbrandt W. Three-dimensional crystallization of membrane proteins. Q Rev Biophys, 1988; 4: 449-77.

24. Landau EM and Rosenbusch JP. Lipid cubic phase: A novel concept for the crystallization of membrane proteins. Proc Natl Acad Sci USA, 1996; 93: 14532-14535.

25. Lancaster CRD, Kroeger A, Auer M, and Michel H. Structure of fumarate reductase from Wolinella succinogenes at 2.2 Angstroms resolution. Nature, 1999; 402: 377-385.

26. Locher KP, Rees B, Koebnik R, Mitschler A, Moulinier L, Rosenbusch JP, and Moras D. Transmembrane signaling across the ligand-gated FhuA receptor: crystal structures of free and ferrichrome-bound states reveal allosteric changes. Cell, 1998; 95: 771-778.

27. Luecke H, Schobert B, Lanyi JK, Spudich EN, and Spudich JL. Crystal structure of sensory rhodopsin II at 2.4 Å: Insights into color tuning and transducer interaction. Science, 2001; 293: 1499-1503.

28. Madhusudan L, Kodandapani R, and Vijayan M. Protein hydration and water structure: X-ray analysis of closely packed protein crystal with a very low solvent content. Acta Crystallogr, 1993; D49: 234-245.

29. Michel H. Crystallization of membrane proteins. Trends Biochem Sci, 1983; 8: 56-59.

30. Murakami S, Nakashima R, Yamashita E, and Yamaguchi A. Crystal structure of bacterial multidrug efflux transporter AcrB. Nature, 2002; 419: 587-593.

31. Nogi T, Fathir I, Kobayashi M, Nozawa T, and Miki K. Crystal structures of photosynthetic reaction center and high-potential iron-sulfur protein from Thermochromatium tepidum: Thermostability and electron transfer. Proc Natl Acad Sci USA, 2000; 97: 13561-13565.

32. Ostermeier C, Harrenga A, Ermler U, and Michel H. Structure at 2.7 Å resolution of the Paracoccus denitrificans two-subunit cytochrome c oxidase complexed with an antibody Fv fragment. Proc Natl Acad Sci USA, 1997; 94: 10547-10553.

33. Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, Yamamoto M, and Miyano M. Crystal structure of rhodopsin: A G protein-coupled receptor. Science, 2000; 289: 739-745.

34. Pebay-Peyroula E, Rummel G, Rosenbusch JP, and Landau EM. X-ray structure of bacteriorhodopsin at 2.5 Å from microcrystals grown in lipid cubic phases. Science, 1997; 277: 1676-1681.

35. Perutz MF, Muirhead H, Cox JM, and Goaman LCG. Three dimensional Fourier synthesis of horse oxyhaemoglobin at 2.8 Å resolution: The atomic model. Nature, 1968; 219: 131-139.

36. Petsko GA. Protein crystallography at sub-zero temperature: cryo-protective mother liquors for protein crystal. J Mol Biol, 1975; 96: 381-392.

37. Phale PS, Philippsen A, Kiefhaber T, Koebnik R, Phale VP, Schirmer T, and Rosenbusch JP. Stability of trimeric OmpF porin: the contributions of the latching loop L2. Biochemistry, 1998; 37: 15663-15670.

38. Picot D, Loll PJ, and Garavito RM. The X-ray crystal structure of the membrane protein prostaglandin H2 synthase-1. Nature, 1994; 367: 243-249.

39. Ray WJ Jr, Bolin JT, Puvathingal JM, Minor W, Liu U, and Muchmore SW. Removal of salt from a salt-induced protein crystal without cross-linking. Biochemistry, 1991; 30: 6866-6875.

40. Schick B and Jurnak F. Extension of diffraction resolution of crystals. Acta Crystallogr, 1994; D50: 563-568.

41. Schobert B, Cupp-Vickery J, Hornak V, Smith SO, and Lanyi JK. Crystallographic structure of the K intermediate of bacteriorhodopsin: Conservation of free energy after photoisomerization of the retinal. J Mol Biol, 2002; 321: 715-726.

42. Shinzawa-Itoh K, Ueda H, Yoshikawa S, Aoyama H, Yamashita E, and Tsukihara T. Effects of ethyleneglycol chain length of dodecyl polyethyleneglycol monoether on the crystallization of bovine heart cytochrome c oxidase. J Mol Biol, 1995; 246: 572-5.

43. Soulimane T, Buse G, Bourenkov GP, Bartunik HD, Huber R, and Than ME. EMBO J, 2000; 19: 1766-1776.

44. Stowell MH, McPhillips TM, Rees DC, Soltis SM, Abresch E, and Feher G. Light-induced structural changes in photosynthetic reaction center: implications for mechanism of electron-proton transfer. Science, 1997; 276: 812-816.

45. Sui H, Han B-G, Lee JK, Walian P, and Jap BK. Structural basis of water-specific transport through the AQP1 water channel. Nature, 2001; 414: 872-878.

46. Svensson-Ek M, Abramson J, Larsson G, Tornroth S, Brzezinski P, and Iwata S. The X-ray crystal structures of wild-type and EQ(I-286) mutant cytochrome c oxidases from Rhodobacter sphaeroides. J Mol Biol, 2002; 321: 329-339.

47. Takeda K, Sato H, Hino T, Kono M, Fukuda K, Sakurai I, Okada T, and Kouyama T. A novel three-dimensional crystal of bacteriorhodopsin obtained by successive fusion of the vesicular assemblies. J Mol Biol, 1998; 283: 463-474.

48. Toyoshima C, Nakasako M, Nomura H, and Ogawa H. Crystal structure of the calcium pump of sarcoplasmic reticulum at 2.6 Å resolution. Nature, 2000; 405: 647-655.

49. Tsukihara T, Aoyama H, Yamashita E, Tomizaki T, Yamaguchi H, Shinzawa-Itoh K, Nakashima R, Yaono R, and Yoshikawa S. Structures of metal sites of oxidized bovine heart cytochrome c oxidase at 2.8 Å. Science, 1995; 269: 1069-74.

50. Wang BC. Resolution of phase ambiguity in macromolecular crystallography. Method Enzymol, 1985; 115: 90-117.

51. Wang YF, Dutzler R, Rizkallah PJ, Rosenbusch JP, and Schirmer T. Channel specificity: structural basis for sugar discrimination and differential flux rates in maltoporin. J Mol Biol, 1997; 272: 56-63.

52. Wendt KU, Lenhart A, and Schulz GE. The structure of the membrane protein squalene-hopene cyclase at 2.0 Å resolution. J Mol Biol, 1999; 286: 175-187.

53. Wierenga RK, Zeelen JPh, and Nobel MEM. Crystal transfer experiments carried out with crystals of trypanosomal triosephosphate isomerase (TIM). J Cryst Growth, 1992; 122: 231-234.

54. Yoshikawa S, Tera T, Takahashi Y, Tsukihara T, and Caughey WS. Crystalline cytochrome c oxidase of bovine heart mitochondrial membrane: composition and x-ray diffraction studies. Proc Natl Acad Sci USA, 1988; 85: 1354-8.

55. Yoshikawa S, Shinzawa-Itoh K, Nakashima R, Yaono R, Yamashita E, Inoue N, Yao M, Fei MJ, Libeu CP, Mizushima T, Yamaguchi H, Tomizaki T, and Tsukihara T. Redox coupled crystal structure changes in bovine heart cytochrome c oxidase. Science, 1998; 280: 1723-1729.

56. Zhang Z, Huang L, Shulmeister VM, Chi YI, Kim KK, Hung LW, Crofts AR, Berry EA, and Kim SH. Electron transfer by domain movement in cytochrome bc1. Nature, 1998; 392: 677-684.

57. Zhou Y, Morais-Cabral JH, Kaufman A, and MacKinnon R. Chemistry of ion coordination and hydration revealed by a K+ channel-Fab complex at 2.0 Å resolution. Nature, 2001; 414: 43-48.

58. Zouni A, Witt HT, Kern J, Fromme P, Krauss N, Saenger W, and Orth P. Crystal structure of photosystem II from Synechococcus elongatus at 3.8 Å resolution. Nature, 2001; 409: 739-743.

Chapter 6

ON THE POSSIBILITY OF DETERMINING STRUCTURES OF MEMBRANE PROTEINS IN TWO-DIMENSIONAL CRYSTALS USING X-RAY FREE ELECTRON LASERS

Michael Becker* and Edgar Weckert†

*Biology Department, Brookhaven National Laboratory, P.O. Box 5000, Upton, NY 11973 U.S.A.; Email address: [email protected] (Corresponding author)
†HASYLAB, DESY, Notkestr. 85, D-22607 Hamburg, Germany. Email address: Edgar.Weckert@desy.de

We consider the possibility that X-ray Free Electron Lasers might be useful for structure determination of membrane proteins in 2-dimensional crystals. In particular, we consider whether it might be possible to collect useful diffraction data from thin sheets of membrane proteins in single-shot experiments using ultrashort, ultrabright X-ray pulses, where the diffraction is expected to be faster than much of the damage. Two things are needed from the X-ray source: high intensity and short pulses. Data could be collected via grazing-incidence diffraction on separate samples in different orientations. Advances in 'nanotechnology' for ordering samples on surfaces will be important for this endeavor.

Keywords: X-ray free electron laser, two-dimensional crystal, membrane protein

INTRODUCTION

Membrane proteins are important in biology. They play central roles in processes as diverse as solar-energy conversion in photosynthesis, nerve-cell ion conduction, cell growth, virus and other pathogen binding, and drug resistance associated with diseases. They constitute roughly 30% of the genes encoded by genomes. However, of the more than 20,000 structures deposited in the Protein Data Bank (as of Feb. 25, 2003), fewer than 0.1% are structures of intact integral membrane proteins.

There are several reasons for this paucity of membrane-protein structures. For one, many membrane proteins are found in low abundance in cells, and methods for their overexpression by molecular-biology techniques are less well developed than for water-soluble proteins. Electron Microscopy (EM) techniques with two-dimensional crystals have provided important structures, but obtaining structures near atomic resolution has so far required heroic efforts. For X-ray crystallography, membrane proteins are notoriously difficult to crystallize in three dimensions. Again, notable successes have occurred, and increasing resources and efforts in both fields will surely provide more successes. Nonetheless, despite advances in screening and methodology, crystallization remains more in the domain of alchemy than science. The time between obtaining initial crystals and obtaining crystals that are well ordered to near-atomic resolution is often painfully long. Furthermore, the finicky nature of membrane-protein crystals tends to make them particularly uncooperative for obtaining high-resolution phasing information. Alternative approaches include expressing water-soluble domains of proteins, which provide valuable but limited information for understanding the behavior of the intact protein. Finally, Nuclear Magnetic Resonance techniques appear to have limited applicability for overall structure determination of large integral membrane proteins. Further advances are needed.

New technologies often enable scientific advances. In the X-ray physics field, plans have evolved since the 1980s to develop pulsed X-ray sources that will provide ultrabright, ultrashort pulses. One type of source is the X-ray Free Electron Laser (FEL). These are based on linear accelerators that may stretch over kilometers, in which the electron beam is accelerated to 15 GeV or more (compared to 2.5 to 8 GeV at synchrotrons), and that lase in a single pass through a long undulator without the need for mirrors. The lasing mechanism that is the basis for the current X-ray FEL plans is called Self-Amplified Spontaneous Emission, or SASE (Bonifacio et al., 1984; Kim, 1986). These sources will provide pulses in the 1-to-15-Å range that are (a) ultrashort (in the range of 10s to 100s of fs), (b) ultrabright (peak brightness specified in photons/s/mm²/mrad²/0.1% bandwidth, with about 10¹² photons per pulse), and (c) coherent. They can also provide spontaneous emission with short pulses and broad energies, but such pulses will be less bright and less coherent. X-ray FELs in the hard X-ray region have been planned for development at Stanford (Arthur, 2002) and at DESY (Materlik and Tschentscher, 2001). Another type of new source might be Energy Recovery Linacs (ERLs), which were first proposed by Tigner of CHESS (Tigner, 1965). These have a linear accelerator, but the spent electron beam is recovered so that almost all of the energy is conserved. These might provide short, bright pulses in the future as well. The pulses will not be as bright as those of an FEL (having ~10⁹ photons per pulse), or as coherent, but they may offer great flexibility in terms of polarization, repetition rate, etc. Other proposals have been put forward elsewhere, and time will tell which technologies emerge.
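To put the quoted wavelength range and photon numbers on a common scale, the standard relation E = hc/λ can be evaluated directly. The sketch below is illustrative only: the function names are assumptions, and 1.5 Å is chosen as the example wavelength for the pulse-energy estimate rather than taken from any source specification.

```python
HC_KEV_ANGSTROM = 12.398419843   # h*c expressed in keV·Å
JOULE_PER_KEV = 1.602176634e-16  # 1 keV in joules

def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Å (E = hc / lambda)."""
    return HC_KEV_ANGSTROM / wavelength_angstrom

def pulse_energy_joule(photons_per_pulse, wavelength_angstrom):
    """Total energy carried by one pulse, in joules."""
    return photons_per_pulse * photon_energy_kev(wavelength_angstrom) * JOULE_PER_KEV

# Limits of the 1-to-15-Å range quoted for the FEL pulses, plus 1.5 Å.
for wavelength in (1.0, 1.5, 15.0):
    print(f"{wavelength:5.1f} Å -> {photon_energy_kev(wavelength):7.3f} keV")

# Photon numbers per pulse quoted in the text: ~1e12 (FEL) and ~1e9 (ERL).
for label, n_photons in (("FEL", 1e12), ("ERL", 1e9)):
    energy = pulse_energy_joule(n_photons, 1.5)
    print(f"{label}: {n_photons:.0e} photons at 1.5 Å carry {energy:.2e} J")
```

Under these assumptions the three orders of magnitude between the two photon numbers carry over directly to the pulse energy, since the energy per photon is fixed by the wavelength.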

To address the question of how such sources might be used in structural biology, we first consider general concepts in data collection. The scattering from a single particle in a given orientation yields a continuous scattering pattern. The diffraction from a crystal consists of the scattering from the single object multiplied by the reciprocal lattice of the crystal, due to the interference of waves scattered from the periodic arrangement of particles. Quantum mechanics tells us that the observable quantity is the square of the amplitude of the wave function (modulus), which results in the loss of phasing information. The relative phases of the scattered photons, electrons, or neutrons contain the information about how the scattering atoms are spatially related to each other. The act of solving a structure consists of determining amplitudes and phases throughout most or all of reciprocal space. Amplitudes are measured by exposing the sample to radiation in enough orientations that all of reciprocal space is probed; in the case of crystals, this means exposing the crystal in such a way that the Bragg condition is satisfied throughout reciprocal space, i.e. that the Ewald sphere intersects all accessible regions of reciprocal space. Many techniques have been developed for crystallographic phasing, including the heavy-atom method, single- or multiple-wavelength anomalous diffraction, molecular replacement, and direct methods. For determining atomic structures of macromolecules, since the ratio of observables to unknown atomic positions is not particularly high, there is experimental uncertainty, and one generally back-calculates an electron density map in X-ray crystallography and fits an atomic model into the density using refinement techniques.
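The two statements about sampling by the reciprocal lattice and loss of phase can be checked numerically in one dimension. The following is a minimal sketch using a naive discrete Fourier transform; the five-point "molecule" and the number of unit cells are hypothetical values chosen only for illustration.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform (O(n^2)), adequate for a tiny demo."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * j / n) for j, x in enumerate(signal))
        for k in range(n)
    ]

# Hypothetical 1-D "molecule": scattering density sampled at five points.
motif = [1.0, 3.0, 0.5, 0.0, 2.0]
n_cells = 4                  # number of unit cells in the 1-D "crystal"
crystal = motif * n_cells    # periodic repetition of the motif

f_motif = dft(motif)
f_crystal = dft(crystal)

# Amplitudes survive only where k is a multiple of n_cells (the reciprocal
# lattice points); there they equal n_cells times the motif transform.
for k, amplitude in enumerate(f_crystal):
    label = "lattice point" if k % n_cells == 0 else "between points"
    print(f"k={k:2d}  |F|={abs(amplitude):8.3f}  ({label})")

# Phase loss: a shifted copy of the motif has different phases but the same
# measurable intensities |F|^2.
shifted = motif[2:] + motif[:2]
intensity = [abs(f) ** 2 for f in f_motif]
intensity_shifted = [abs(f) ** 2 for f in dft(shifted)]
print("intensities equal after shift:",
      all(abs(a - b) < 1e-9 for a, b in zip(intensity, intensity_shifted)))
```

The crystal transform is the motif transform sampled at the lattice points and amplified by the number of cells, and the intensities alone cannot distinguish the motif from its shifted copy. That missing information is what the phasing methods listed above are designed to recover.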

In such diffraction experiments, a key issue is radiation-induced damage. In general, for every hard X-ray photon that is elastically scattered, ten X-ray photons are absorbed. To determine a structure at high resolution (d-spacing; Q), a large number of waves must be scattered. One advantage of crystals is that they provide a large number of scatterers, which gives sufficient signal-to-noise to do the experiments. However, as a sample is exposed to radiation, it becomes ionized by various mechanisms, which degrades its structure. With a smaller sample, more X-rays per atom are needed to determine a structure to an equivalent resolution. In an important paper in 1995, Henderson compared the utility of electrons, X-rays and neutrons for determining structures at atomic resolution. He derived a limit on the dose that a sample can sustain while still allowing a structure to be determined. Cryoprotection of samples provides improved protection from secondary damage (Garman & Nave, 2002; Teng & Moffat, 2002), but primary damage is still a problem. For X-rays from modern undulator sources, Sliz et al. (2003) and others have estimated the minimum volume of a 3-D crystal that can be used for structure determination. The stronger scattering cross section of electrons allows even smaller samples to be investigated, including single particles and two-dimensional crystals (for example, see Frank, 1998; Subramaniam and Henderson, 1999). These studies are very valuable, but they also suffer from damage and other difficulties.

In the 1980s, Solem (1986) suggested that it might be possible to image whole cells via soft X-ray microscopy or holography using very short pulses. As scattering occurs in about 1 attosecond (10⁻¹⁸ s), much damage might be expected to occur more slowly. Thus, through pulsed experiments, it might be possible to partially overcome damage issues. More recently, Hajdu, Neutze, Weckert and co-workers have conducted simulations of single-protein and cluster scattering in the hard X-ray region using sub-picosecond pulses as well (Neutze et al., 2000); they estimated an increased damage barrier.


PROPOSAL: 2-D CRYSTALS & AMPLITUDES

It has been proposed that X-ray Free Electron Lasers might be useful for structure determination of membrane proteins in 2-dimensional crystals (Becker, 1999a,b). The essence of the proposal is that it might be possible to collect diffraction data from thin sheets of membrane proteins in single-shot experiments using ultrashort, ultrabright X-ray pulses, where the diffraction is expected to be faster than much of the damage. For such experiments, the X-ray source needs to provide two things: high intensity and short pulses.

For two-dimensional crystals in general, reciprocal space is characterized as an array of parallel rods oriented normal to the plane of the two-dimensional crystal, called Bragg rods. The following equation considers kinematic scattering from such a sample:

$$I = I_0 \, r_0^2 \, P \, |F_{hk}|^2 \, \frac{N}{A} \, \lambda^2 \, L_{zf} \qquad (1)$$

where $I$ = the two-dimensional scattering cross-section (photons/sec), $I_0$ = photons/unit area, $r_0$ = Thomson scattering length, $P$ = polarization factor, $F_{hk}$ = Bragg-rod structure factor, $N$ = number of unit cells, $A$ = area of a unit cell, $\lambda$ = wavelength, and $L_{zf}$ = Lorentz factor.

Weckert (2001) has given an initial estimate of elastic scattering from lysozyme in a 2D lattice, and similar calculations are being conducted on bacteriorhodopsin trimers (Becker & Weckert, work in progress). In this calculation, bacteriorhodopsin trimers (Luecke et al., 1999) were placed in a 2-D lattice of dimensions 10 μm × 10 μm, with a unit cell of 61 Å × 61 Å × 54 Å. It was initially assumed that 400 photons/Å² is the maximum flux density that a sample can withstand in a ‘single-shot’ experiment (based on an adjusted estimate from Neutze et al. (2000)), using X-rays at 1.5 Å (8 keV). Over these dimensions, this corresponds to 4 × 10¹² photons per pulse, which is in the range of SASE-based FEL proposals. Several calculated Bragg rods are shown in Fig. 1. Elastically scattered photon numbers are estimated to peak between about 10 and 100 in the 3-to-4 Å resolution range from this single pulse.
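The photon budget quoted above follows from simple unit conversion, and can be checked with a minimal sketch. The function names are hypothetical, and the constant hc ≈ 12.398 keV·Å is a rounded physical constant rather than a value taken from the text.

```python
ANGSTROM_PER_MICRON = 1.0e4
HC_KEV_ANGSTROM = 12.398  # h*c in keV*Angstrom (rounded; not from the text)

def photons_per_pulse(flux_density_per_A2, side_x_um, side_y_um):
    """Total photons needed to illuminate a rectangular sheet at a given flux density."""
    area_A2 = (side_x_um * ANGSTROM_PER_MICRON) * (side_y_um * ANGSTROM_PER_MICRON)
    return flux_density_per_A2 * area_A2

def photon_energy_keV(wavelength_A):
    """Photon energy corresponding to a wavelength given in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_A

pulse = photons_per_pulse(400.0, 10.0, 10.0)   # 400 photons/A^2 over 10 um x 10 um
energy = photon_energy_keV(1.5)                # 1.5 A X-rays

print(f"{pulse:.1e} photons per pulse")  # 4.0e+12 photons per pulse
print(f"{energy:.2f} keV")               # 8.27 keV
```

The result reproduces the 4 × 10¹² photons per pulse stated in the text, and shows that the quoted 8 keV is a rounded value for 1.5 Å radiation.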


Fig. 1. Calculated Bragg rods for 10 μm × 10 μm bacteriorhodopsin crystals, for (h k) = (3 0) (red, solid line), (6 0) (green, hatched) and (9 0) (blue, dotted). Horizontal axis: rod coordinate [1/54 Å].

The ERLs or the spontaneous emission of FELs would have about 10⁹ photons per pulse at 1.5 Å, which requires a sample of about 300 μm × 300 μm to give similar counts. However, larger samples would result in lower flux densities in one shot, and it might be possible to get more than one shot per sample. The volume of a 300 μm × 300 μm × 0.005 μm crystal (~450 μm³) is roughly a factor of 10 less than that of a 20-μm diameter 3-D crystal, which Sliz et al. (2003) have recently proposed is the smallest crystal volume (having a larger unit cell) for which a 3.5-Å resolution structure can be determined at a 3rd generation synchrotron. That study was based on experiments where samples accumulated damage over the course of many synchrotron pulses. Further calculations are being pursued at increasing levels of sophistication (Weckert & Becker).
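The volume comparison in this paragraph can be reproduced with a few lines. This is only a sketch: treating the 20-μm diameter 3-D crystal as a sphere is an assumption made here for illustration, since the text does not specify the crystal shape.

```python
import math

def sheet_volume_um3(side_x_um, side_y_um, thickness_um):
    """Volume of a thin rectangular 2-D crystal."""
    return side_x_um * side_y_um * thickness_um

def sphere_volume_um3(diameter_um):
    """Volume of a sphere (assumed shape for the 3-D crystal)."""
    return (4.0 / 3.0) * math.pi * (diameter_um / 2.0) ** 3

sheet = sheet_volume_um3(300.0, 300.0, 0.005)   # single-layer 2-D crystal
sphere = sphere_volume_um3(20.0)                # 20-um diameter 3-D crystal
ratio = sphere / sheet

print(f"sheet  = {sheet:.0f} um^3")
print(f"sphere = {sphere:.0f} um^3")
print(f"ratio  = {ratio:.1f}")
```

Under the spherical assumption the ratio comes out a little above 9, consistent with the "roughly a factor of 10" quoted above. The sheet thickness of 0.005 μm is 50 Å, close to the 54 Å cell dimension along the membrane normal.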

The above calculations give estimates of absolute scattering. Data could be collected via grazing-incidence diffraction on separate samples in different orientations, or perhaps as a tilt series on separate samples.


The preferred method will likely be to use grazing-incidence diffraction (Als-Nielsen et al., 1994). In grazing-incidence experiments, Snell's Law defines α_c as the critical angle for total external reflection. If the incident beam impinges upon a sample at an angle α smaller than α_c, then penetration of the incident beam into the supporting material is exponentially reduced. Separate samples oriented around the normal to the membrane plane could be exposed to single shots to obtain mostly complete data. Above the critical angle, experiments on weakly scattering samples like proteins are more challenging, but small tilts might be sufficient to fill in the missing zone of data generated as a result of processing the Ewald sphere around the membrane normal. Scaling and merging the data could be challenging. If the data are collected mainly as a tilt series, scattering from the support material would need to be low, so the support must be very thin and of extreme crystalline perfection in order to avoid diffuse scatter.
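As a rough illustration of the critical angle, Snell's law for X-rays with refractive index n = 1 − δ gives cos α_c = n, so that α_c ≈ √(2δ) for small δ. The sketch below evaluates both forms. The value of δ is an illustrative assumption of the order expected for a light, water-like substrate near 8 keV; it is not a number from the text, and a real experiment would use the tabulated value for the actual support material.

```python
import math

def critical_angle_exact(delta):
    """Critical angle (radians) from cos(alpha_c) = 1 - delta."""
    return math.acos(1.0 - delta)

def critical_angle_small(delta):
    """Small-angle approximation alpha_c ~ sqrt(2 * delta), in radians."""
    return math.sqrt(2.0 * delta)

# Assumed, illustrative refractive-index decrement (hypothetical input).
delta = 3.6e-6

alpha_exact = critical_angle_exact(delta)
alpha_small = critical_angle_small(delta)
alpha_deg = math.degrees(alpha_exact)

print(f"alpha_c = {alpha_exact * 1e3:.2f} mrad = {alpha_deg:.3f} deg")
```

With a decrement of this size the critical angle is a fraction of a degree, which is why the footprint of the beam on the sample becomes large in grazing-incidence geometry.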

In fact, Verclas et al. (1999) have conducted grazing-incidence diffraction on bacteriorhodopsin monolayers. These experiments provided measurements of Bragg rods. However, the samples were large and rotationally disordered, yielding Ewald cuts through a powder-like diffraction pattern. When the sample is large, i.e. the footprint of the beam on the sample is large, the Bragg rod reflections overlap, and one typically uses Soller slits to select only one Bragg rod at a time, which is time consuming and inefficient for data collection. On the other hand, if the sample is extremely small, i.e. less than about 1 μm or so, one might observe broadening as a result of finite-size effects. In the limiting case, this will give the continuous scattering patterns of single particles.

Phases:

Phases might be determined by a variety of methods, including adaptation of a range of standard X-ray crystallography techniques, or by application of hybrid methods using low-resolution Electron Microscopy (EM) phases. If absorption edges respond linearly, anomalous scattering techniques might be attractive. Rapid wavelength tunability at high flux will be necessary for convenient adaptations of the Multi-wavelength Anomalous Dispersion (MAD) phasing techniques that are now common in crystallography (Hendrickson, 1991). Due to the lower overall flux, spontaneous emission of a SASE FEL would require relatively large samples, but it could provide relative ease of MAD or 'pink beam' Laue crystallography if overlap problems are not too severe. With the brighter coherent beam, however, Single-Wavelength Anomalous Dispersion (Dauter, 2002) might be particularly attractive.

Phasing by holographic or microscopic methods has been reviewed in Sayre and Chapman (1995). In holography, there is an object beam that scatters off the molecule, and there is a reference beam; the interference between these beams provides the phasing information. Sayre and Chapman contrast Gabor-style, in-line holography with Fourier-style holography. Gabor-style holography seems challenging, as interference fringes separated by about 1 Å on a detector, i.e. less than the size of an atom, would need to be measured in order to obtain phasing information to that resolution. Fourier-style holography offers a range of techniques. In some cases, the resolution is limited by disorder of reflecting or expanding optics. For some examples of ideas to use supporting material for providing phasing via holography, see Xu (1996) and Doniach (1996). In other Fourier-style techniques, where a reference scatterer is inside each unit cell of a crystal, phasing information can be extracted despite some mosaicity of the crystals. For example, see the work of Szoke (1993), where a known part of a structure inside the unit cell is used to reference the unknown part of the unit cell. Indeed, it has been shown that the heavy-atom method in crystallography is essentially a holographic method, i.e. scattering off the heavy atoms generates the reference beam (Tollin et al., 1966). Also, oversampling techniques might be applied, as pioneered by Sayre. Some examples include the works of Miao et al. (2001), Millane (1990), and Robinson (2001).

Notice that many of these techniques do not necessarily require that the incident beam be coherent. Interference can be detected between scattered waves of the reference scatterer and from the molecule of interest, i.e. the scattered rays from an ordered sample may be coherent, even though the incident beam may be incoherent. However, having a coherent beam, where the sample is well ordered within the coherence volume of the beam, would provide improved signal-to-noise. Motion and disorder in the crystal will result in a speckle pattern; beneficial or counterproductive nuances of this must be explored further.


Why?

1) Grazing-incidence diffraction with X-rays is well suited for allowing techniques in ‘nanotechnology’ to be used to generate samples.

Membrane proteins have a natural tendency to arrange in two dimensions. Despite advances in crystallization techniques in two and three dimensions, crystallization of proteins remains more in the realm of alchemy than science. Probably the greatest challenge of the experiments will be to obtain two-dimensional crystals that are reproducible and nearly atomically ordered over the dimensions of the crystals needed to perform the experiments. With the current emphasis on ‘nanotechnology’, there is hope that someday well-ordered samples might be rationally designed. Such efforts might be useful for EM experiments as well, but due to the stronger scattering of electrons, grazing-incidence diffraction with electrons does not appear to be readily feasible. With X-rays, there is less restriction on what types of supporting material might be used. Also, it might be desirable to incorporate phasing information into the supporting material. For some examples of two-dimensional sample preparations, see: Darst et al. (1991), Jap et al. (1992), Reviakine et al. (1998), Edwards et al. (2000), and Lebeau et al. (2001). X-ray crystallography on two-dimensional crystals could become a potentially important area for water-soluble proteins as well.

2) X-ray diffraction experiments and electron microscopy experiments are complementary.

Electrostatic effects are of fundamental importance to protein function. In X-ray experiments, programs are regularly used to calculate electrostatics (for example, see Sheinerman et al., 2000; Warshel, 1991), yet detailed experimental numbers to test the validity of the calculations on proteins are limited. In EM experiments on proteins, charge effects have been reported (Kuhlbrandt et al., 1994; Mitsuoka et al., 1999).

Although X-ray and EM maps are commonly used together to model large complexes, such as with viruses, the limits of map interpretation have not been fully explored. Experimentally, electron microscopy or diffraction experiments yield a Coulomb-potential map; X-ray crystallography experiments yield an electron-density map. These are related by the Poisson equation (Chang et al., 1999),

$$\nabla^2 \Phi(x,y,z) = -4\pi\rho(x,y,z)/\varepsilon \qquad (2)$$

where $\varepsilon$ = the dielectric constant, $\Phi(x,y,z)$ = the Coulomb potential, and $\rho(x,y,z)$ = the electron density.

Considering $\varepsilon$ explicitly as a function of position, and rearranging, we obtain

$$\varepsilon(x,y,z) = [\nabla^2 \Phi(x,y,z)]^{-1}\,[-4\pi\rho(x,y,z)] \qquad (3)$$

which could represent a ‘dielectric map’ of a molecule.

Great care must be taken in this interpretation. The two maps must be normalized appropriately. Also, both X-ray and EM maps are averaged over time. With an FEL, the nature of the electron-density averaging in the X-ray experiment may be different. This may require taking into account the complex dielectric constant, including high-frequency dielectric terms. Note that the dielectric constant is a macroscopic approximation that is meaningful within a limited range. At high resolution, it may be preferable to think in terms of atomic or group polarizabilities. Also, note that the reported charge-effect signal in EM experiments (Kuhlbrandt et al., 1994; Mitsuoka et al., 1999) is strongest in the low-resolution range, i.e. below 6 Å, which means that an atomic-resolution EM map is not necessarily needed to attempt a fruitful comparison. However, there is currently controversy about the interpretation of charge effects, and the signal may be small compared to systematic problems. In any case, it is desirable to use two-dimensional crystals for both X-ray and EM experiments to minimize systematic differences between the maps for any type of structural and functional comparison.
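The rearranged Poisson relation can be sketched numerically. The example below is a one-dimensional toy version under stated assumptions: the potential and density are synthetic profiles on a regular grid (not experimental maps), the Laplacian is approximated by a central difference, and Gaussian-style units with the 4π factor are used as in the equations above. The function names are hypothetical.

```python
import math

def second_derivative(values, spacing):
    """Central-difference second derivative at the interior grid points."""
    return [
        (values[i - 1] - 2.0 * values[i] + values[i + 1]) / spacing ** 2
        for i in range(1, len(values) - 1)
    ]

def dielectric_profile(potential, density, spacing, tiny=1e-12):
    """Pointwise epsilon = -4*pi*rho / (d2 Phi / dx2) on the interior points.

    Returns None where the second derivative is too small to divide by.
    """
    laplacian = second_derivative(potential, spacing)
    interior_density = density[1:-1]
    return [
        (-4.0 * math.pi * rho_i / lap_i) if abs(lap_i) > tiny else None
        for rho_i, lap_i in zip(interior_density, laplacian)
    ]

# Synthetic check: Phi = x^2 has second derivative 2 everywhere, and rho is
# chosen so that the recovered dielectric value should be 4.
spacing = 0.5
xs = [i * spacing for i in range(9)]
phi = [x * x for x in xs]
eps_true = 4.0
rho = [-eps_true * 2.0 / (4.0 * math.pi)] * len(xs)

eps_map = dielectric_profile(phi, rho, spacing)
print(eps_map)
```

The guard against a vanishing second derivative reflects the caution in the text: wherever the Laplacian of the potential is near zero, the quotient is ill-defined and noise in either map dominates.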

3) There are advantages for observation of time-resolved, dynamic states in crystals of bio-molecules.

The direct observation of how molecules move and interact is a biochemist’s dream. Crystallography is very powerful in that it yields detailed, 3-dimensional snapshots of molecules in well-defined states. To understand transitions between states, such as in catalysis, most efforts today focus on crystallizing molecules under different conditions, or on generating trapped ‘intermediate’ states. Much can be learned from crystallography of molecules in such states. However, by definition, trapped states are not necessarily identical to the true intermediate states along a reaction path. To investigate true intermediate states more directly, it is necessary to perform time-resolved experiments. Such time-resolved experiments typically employ a light flash to trigger a reaction, using bio-molecules with intrinsic pigments, such as in photosynthesis, vision, or phototaxis, or using artificial caged compounds as the trigger (for example, see Ren et al., 2001).

These experiments are difficult. One major difficulty is collecting data rapidly enough to determine a structure with sufficient signal-to-noise to address the problem of interest. Another major problem is in rapidly, synchronously triggering reactions in high yield, which can be very difficult in crystals. Whereas diffusion-dependent reactions tend to be slower in crystals than in solution, excited states tend to decay faster in the crystalline state. Saturation effects in picosecond experiments can occur well below 100% with photosynthetic pigments and proteins, even in solution (Becker et al., 1991). Exciton annihilations, which reduce yields of excited states and complicate kinetics, will likely be the rule rather than the exception with photo-excited crystalline states. These types of effects can obscure signals of interest. Also, dynamics of bio-molecules are often ‘non-exponential’; i.e. intermediate states may have a range of significantly different conformations. Over the course of an experiment, X-ray photolysis of water in the crystals can create undesirable side reactions that interfere with the reaction of interest.

Despite these challenges, it is important to pursue such studies. Ultrafast X-ray experiments on two-dimensional crystals might help to alleviate some of the problems. For example, data could be collected more rapidly, and synchronization of triggering reactions might be relatively improved.

SUMMARY

To make these experiments a reality, various detailed experimental problems must be addressed. For example, for single-shot experiments, it is assumed that rapid, automatic sample mounting will be necessary; thus, the repetition rate of the X-ray FEL can be low. However, for samples where multiple shots can be taken, it would be advantageous to have a high frequency of shots. Of course, detectors must be developed that will allow efficient data collection. With regard to the timescale of damage, cryo-cooling may be somewhat irrelevant; however, cryo-cooling may be desirable for ease of sample manipulation. Much of the scattering path after the sample should probably be in vacuum.

In conclusion, it seems likely that the main impediment to success will be to obtain two-dimensional crystals that are well ordered over the dimensions required to do the experiments. Compared to two-dimensional crystals in EM, where a crystal of 10-μm diameter is near the upper size limit, the desired crystal size seems rather large. However, compared to supported bilayers that are currently used for diffraction at synchrotrons (Verclas et al., 1999), this is very small. Although EM has some advantages (Henderson, 1995), the ability to perform grazing-incidence diffraction experiments with X-rays would allow a broader range of support materials for two-dimensional crystals to be tried. The recent emphasis on ‘nanotechnology’ gives cause for hope that this hurdle might be overcome.

ACKNOWLEDGEMENTS

Michael Becker is grateful to a large number of colleagues for helpful comments. The support and comments of former and current colleagues at Brookhaven (Cathy Lawson, Lonny Berman, Bob Sweet, Dieter Schneider, and Malcolm Capel) are greatly appreciated. The National Institutes of Health and the Department of Energy are acknowledged for financial support, and the Deutsches Elektronen Synchrotron is acknowledged for sponsoring a visit to the lab of E.W.

REFERENCES

1. Als-Nielsen J, Jacquemain D, Kjaer K, Leveiller F, Lahav M, and Leiserowitz L. Principles and applications of grazing incidence X-ray and neutron scattering from ordered molecular monolayers at the air-water interface. Physics Reports, 1994; 246; 251-313.


2. Arthur J. Status of the LCLS x-ray FEL program (invited). Rev.Sci.Inst., 2002; 73; 1393-1395.

3. Becker M. Preliminary considerations on the possibility of using hard X-rays from a Free Electron Laser to determine structures of membrane proteins in 2-dimensional crystals. Biophysical J, 1999; 76; A121.

4. Becker M. Considerations on the possibility of using hard X-rays from a Free Electron Laser to determine structures of membrane proteins in 2-dimensional crystals, in Transparencies from the EMBO Workshop: Potential Future Applications in Structural Biology of an X-ray Free Electron Laser at DESY, EMBL, Hamburg, 1999, pp. 184-198. (Note: there was an error in this early proposal regarding the wavelength dependence of the scattering.)

5. Becker M, Nagarajan V, and Parson WW. Properties of the excited singlet states of bacteriochlorophyll a and bacteriopheophytin a in polar solvents. J.Am.Chem.Soc., 1991; 113; 6840-6848.

6. Bonifacio R, Pellegrini C, and Narducci L. Collective instabilities and high gain regime in a free-electron laser. Optics Comm., 1984; 50; 373.

7. Chang S, Head-Gordon T, Glaeser RM, and Downing KH. Chemical bonding effects in the determination of protein structures by electron crystallography. Acta Cryst. A, 1999; 55; 305-313.

8. Darst SA, Ahlers M, Meller PH, Kubalek EW, Blankenburg R, Ribi HO, Ringsdorf H, and Kornberg RD. Two-dimensional crystals of streptavidin on biotinylated lipid layers and their interactions with biotinylated macromolecules. Biophys. J., 1991; 59; 387-396.

9. Dauter Z. New approaches to high-throughput phasing. Curr.Opin.Struct.Biol., 2002; 12; 674-678.

10. Doniach S. J.Synchrotron Rad., 1996; 3; 260-267.

11. Edwards AM, Zhang K, Nordgren CE, and Blasie JK. Heme structure and orientation in single monolayers of cytochrome c on polar and nonpolar soft surfaces. Biophys. J., 2000; 79; 3105-3117.

12. Frank J. The ribosome structure and functional ligand-binding experiments using cryo-electron microscopy. J.Struct.Biol., 1998; 124; 142-150.

13. Garman E, and Nave C. Radiation damage to crystalline biological molecules: current view. J.Synchrotron Radiat., 2002; 9; 327-328.

14. Henderson R. The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Quart Rev Biophys, 1995; 28; 171-193.


15. Hendrickson WA. Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science, 1991; 254; 51-58.

16. Jap B, Zulauf M, Scheybani T, Hefti A, Baumeister W, Aebi U, and Engel A. 2D crystallization: from art to science. Ultramicroscopy, 1992; 46; 45-84.

17. Kim K-J. Three-dimensional analysis of coherent amplification and self-amplified spontaneous emission in free-electron lasers. Phys.Rev.Letters, 1986; 57; 1871-1874.

18. Kuhlbrandt W, Wang DN, and Fujiyoshi Y. Atomic model of plant light-harvesting complex by electron crystallography. Nature, 1994; 367; 614-621.

19. Lebeau L, Lach F, Venien-Bryan C, Renault A, Dietrich J, Jahn T, Palmgren MG, Kuhlbrandt W, and Mioskowski C. Two-dimensional crystallization of a membrane protein on a detergent-resistant lipid monolayer. J.Mol.Biol., 2001; 308; 639-647.

20. Luecke H, Schobert B, Richter HT, Cartailler JP, and Lanyi JK. Structure of bacteriorhodopsin at 1.55 Å resolution. J.Mol.Biol., 1999; 291; 899-911.

21. Materlik G, and Tschentscher T. in TESLA: Technical Design Report. Part V: The X-Ray Free Electron Laser, DESY, Hamburg, 2001.

22. Miao J, Hodgson KO, and Sayre D. An approach to three-dimensional structures of biomolecules by using single-molecule diffraction images. Proc.Natl.Acad.Sci.U.S.A., 2001; 98; 6535-6536.

23. Millane RP. Phase retrieval in crystallography and optics. J.Opt.Soc.Am.A, 1990; 7; 394-411.

24. Mitsuoka K, Hirai T, Murata K, Miyazawa A, Kidera A, Kimura Y, and Fujiyoshi Y. The structure of bacteriorhodopsin at 3.0 Å resolution based on electron crystallography: implication of the charge distribution. J.Mol.Biol., 1999; 286; 861-882.

25. Neutze R, Wouts R, van der Spoel D, Weckert E, and Hajdu J. Potential for biomolecular imaging with femtosecond X-ray pulses. Nature, 2000; 406; 752-757.

26. Ren Z, Perman B, Srajer V, Teng TY, Pradervand C, Bourgeois D, Schotte F, Ursby T, Kort R, Wulff M, and Moffat K. A molecular movie at 1.8 Å resolution displays the photocycle of photoactive yellow protein, a eubacterial blue-light receptor, from nanoseconds to seconds. Biochemistry, 2001; 40; 13788-13801.


27. Reviakine I, Bergsma-Schutter W, and Brisson A. Growth of protein 2-D crystals on supported planar lipid bilayers imaged in situ by AFM. J Struct Biol, 1998; 121; 356-361.

28. Robinson IK, Vartanyants IA, Williams GJ, Pfeifer MA, and Pitney JA. Reconstruction of the shapes of gold nanocrystals using coherent X-ray diffraction. Phys.Rev.Lett., 2001; 87; 195505.

29. Sayre D, and Chapman HN. X-ray microscopy. Acta Cryst, 1995; A51; 237-252.

30. Sheinerman F, Norel R, and Honig B. Electrostatic aspects of protein-protein interactions. Curr.Opinion in Struc.Biol., 2000; 10; 153-159.

31. Sliz P, Harrison SC, and Rosenbaum G. How does radiation damage in protein crystals depend on X-ray dose? Structure, 2003; 11; 13-19.

32. Solem JC. Imaging biological specimens with high-intensity soft x-rays. J Opt Soc Am B, 1986; 3; 1551-1565.

33. Subramaniam S, and Henderson R. Electron crystallography of bacteriorhodopsin with millisecond time resolution. J.Struct.Biol., 1999; 128; 19-25.

34. Szoke A. Holographic methods in X-ray crystallography II. Detailed theory and connection to other methods of crystallography. Acta Cryst, 1993; A49; 853-866.

35. Teng TY, and Moffat K. Radiation damage of protein crystals at cryogenic temperatures between 40 K and 150 K. J.Synchrotron Radiat., 2002; 9; 198-201.

36. Tigner M. A possible apparatus for electron clashing-beam experiments. Nuovo Cimento, 1965; 37; 1228-1231.

37. Tollin P, Main P, Rossman MG, Stroke GW, and Restrick RC. Holography and its crystallographic equivalent. Nature, 1966; 209; 603-604.

38. Verclas SAW, Howes PB, Kjaer K, Wurlitzer A, Weygand M, Buldt G, Dencher NA, and Losche M. X-ray diffraction from a single layer of purple membrane at the air/water interface. J Mol Biol, 1999; 287; 837-843.

39. Warshel A. Computer Simulations of Chemical Reactions in Enzymes and Solutions. 1991, John Wiley & Sons, New York.

40. Weckert E. in TESLA Technical Design Report. Part V: The Free Electron Laser, 2001, pp. 160-161, Ed. Materlik, G., Tschentscher, T., DESY, Hamburg, Germany.

41. Xu. Atomic resolution X-ray hologram. Appl.Phys.Lett., 1996; 68; 1901-1903.


Chapter 7

FUNCTIONAL DETAILS ON MEMBRANE PROTEINS OBSERVED THROUGH AN ELECTRON BEAM

Yoshinori Fujiyoshi*

Water is the most abundant molecule in cells on the earth. The cell membranes are therefore required to provide an effective water channel function. By electron crystallography using electron cryo-microscopy with a super-cooled stage, the structure of the water channel protein aquaporin-1 was analyzed. The resolved structure reveals a unique fold that allows an intriguing mechanism for water channeling. This exemplifies how improvement in technical design, based on theoretical considerations, has successfully advanced scientific insight.

Keywords: cryo-electron microscopy, helium cooled specimen stage, structure analysis, electron crystallography, water channel

*Department of Biophysics, Faculty of Science, Kyoto University, Oiwake, Kitashirakawa, Sakyo-ku, Kyoto, 606-8502 JAPAN. Email address: [email protected]

HELIUM-COOLED SPECIMEN STAGE FOR HIGH-RESOLUTION ELECTRON MICROSCOPY

We earlier developed a superfluid helium stage for the electron microscope, which then achieved a resolution of 2.6 Å. A paper describing details of the design and construction of that stage was published in 1991⁵. However, for structure analysis of biological molecules at atomic resolution, an instrument yielding a resolution better than 2.6 Å would be highly beneficial, because biological molecules consist mainly of light atoms, which have very small values for their atomic scattering factors in the high-resolution range. Thus, the deterioration of the contrast transfer function strongly affects the signal-to-noise ratio (S/N) of the Fourier components of the images, especially in the high-resolution range. The achievable resolution in the structure analysis of biological molecules is therefore limited by the image quality (S/N ratio). Better phase information can only be obtained by evaluation of higher resolution images, which, in turn, can only be recorded by a higher resolution electron cryo-microscope. Furthermore, resolutions higher than 2.0 Å enable us to observe the lattice lines of gold crystals, which are helpful for the calibration of the instrument. Therefore, the specimen stage and the pole pieces of the objective lens of our initial helium-cooled electron microscope were modified to reduce the spherical aberration coefficient of the lens from 2.6 mm to 1.9 mm at an acceleration voltage of 400 kV. These modifications improved Scherzer's limit of the second version of our electron cryo-microscope from 2.6 Å to 2.0 Å.
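The link between the spherical aberration coefficient and the Scherzer limit can be sketched with the commonly used estimate d ≈ 0.66 Cs^(1/4) λ^(3/4). The prefactor 0.66 is an assumption (conventions differ slightly between texts) and is not taken from this chapter; the electron wavelength is computed relativistically from standard physical constants. Only the modified lens (1.9 mm at 400 kV) is evaluated, since the operating conditions behind the earlier figure are not fully specified here.

```python
import math

PLANCK = 6.62607015e-34              # J s
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 2.99792458e8           # m/s

def electron_wavelength_A(voltage_V):
    """Relativistic electron wavelength in Angstrom for an accelerating voltage in volts."""
    energy = ELEMENTARY_CHARGE * voltage_V
    rest_energy = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * energy * (1.0 + energy / (2.0 * rest_energy))
    )
    return PLANCK / momentum * 1.0e10

def scherzer_resolution_A(cs_mm, voltage_V, prefactor=0.66):
    """Scherzer point-resolution estimate in Angstrom (prefactor is an assumed convention)."""
    cs_A = cs_mm * 1.0e7
    wavelength = electron_wavelength_A(voltage_V)
    return prefactor * cs_A ** 0.25 * wavelength ** 0.75

wavelength = electron_wavelength_A(400e3)
limit = scherzer_resolution_A(1.9, 400e3)

print(f"wavelength at 400 kV = {wavelength:.4f} A")
print(f"Scherzer estimate for Cs = 1.9 mm: {limit:.2f} A")
```

Under these assumptions the estimate for the modified lens comes out close to the 2.0 Å quoted above. Because the limit scales only as the fourth root of Cs, reducing the aberration coefficient gives a modest gain, so the accelerating voltage (through λ) matters at least as much.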

High-resolution structure analysis of proteins based on electron microscopy requires high-resolution images, including many images from highly tilted specimens. So far, the success rate in collecting high quality images from highly tilted 2D crystals is relatively low. Often the image quality is deteriorated by image drift and/or shift (very rapid drift), which are attributed to the electric charge-up of the irradiated area of the specimen. The low yield of high-resolution images forces us to spend many days at the microscope and to look at many specimens. Therefore, ease of operation of an electron cryo-microscope is a tremendous help in the tedious task of electron microscopic data collection and thus very important for the 3D structure analysis of biological molecules.

The superfluid helium stage of our electron cryo-microscope enabled us to collect high-resolution images at a stage temperature of 1.5 K and from 4.2 K up to room temperature. However, as described previously⁵, at a temperature of 1.5 K we were confronted with a serious problem of high-frequency vibration of the stage.

The main cause of the stage vibration was overcome by cutting off the turbulent flow of liquid helium through the capillary by heating a part of the capillary just above the BeCu wire inside the capillary near the helium tank. This modification stabilized the stage at 1.5 K and enabled us to record high-resolution images at this temperature. To collect images at 4.2 K, the evacuation of the pot through the exhaust pipe had to be stopped, but a fully charged pot of liquid helium was sufficient to work for 15 min; during this time high-resolution images could reproducibly be obtained. However, this method required time to manipulate two valves in order to cool down the pot and to allow the pressure to rise again to atmospheric pressure after evacuation of the pot. This inconvenience was resolved by optimizing the helium flow through the capillary, which now enables us to record high-resolution images at 4.2 K without further manipulation for up to 5 hr.

The third version of our superfluid helium stage consists of 4 major parts: the liquid nitrogen tank, the liquid helium tank, the 1.5 K pot, and the specimen stage. The specimen stage is connected to the liquid nitrogen tank by flexible copper braids. The helium tank is kept at 4.2 K. The capillary and exhaust pipe are connected to the pot. Both pot and specimen holder are cooled down to 1.5 K when the helium in the pot is evacuated through the exhaust pipe by means of a big rotary pump, because the liquid helium in the pot changes to the superfluid phase when it is cooled below 2.17 K under a saturated vapor pressure of 4.16 × 10³ Pa. There is no temperature gradient across the liquid helium under a finite heat input. Evaporation of helium occurs without formation of bubbles, because the effective thermal conductivity of superfluid helium is extraordinarily high. Due to the convective type of heat conduction, it may become about 1,000 times higher than that of copper at room temperature. The pot is mounted on a thermal insulator made of fiber-reinforced plastic (FRP) that is fixed on the specimen stage. The specimen stage is also made of FRP and is firmly connected to the lower part of the thermal shield, which is cooled with liquid nitrogen by flexible copper braids. While the stage is thermally very strongly coupled to the nitrogen tank, the mechanical coupling is only very weak. This mechanical uncoupling of our specimen stage from vibration sources such as the nitrogen and helium tanks is a very important feature of the stage.

It should be noted that we constructed and arranged all parts of our cryo-stage, namely, the nitrogen tank, helium tank, the pot and pot support, upper and lower thermal shields, and the flexible copper braids, with an axial symmetry in order to minimize the drift induced by any temperature change in an individual part of the stage. This is one of the main reasons why a top-entry design was chosen for our specimen stage instead of the more commonly used side-entry design. Only a top-entry stage can be constructed symmetrically about the electron-optical axis, and this symmetrical construction of the cryo-stage now enables us to record, easily and reproducibly, images of gold crystals displaying a lattice resolution better than 1 Å at a stage temperature of 4.2 K.

The low consumption rates of liquid nitrogen and liquid helium, which are 170 ml/hr and 140 ml/hr, respectively, are another important feature of our stage. After refilling the two coolants, the working time of our cryo-stage is about 5 hr, during which no maintenance is required. From our experimental perspective, these low consumption rates are very important for efficient and successful high-resolution electron cryo-microscopy. With the prototype of the cryo-electron microscope thus developed, both the resolution and the operational difficulties were improved⁶.
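The consumption figures above translate directly into the coolant volume used during one maintenance-free session. The short calculation below is only an illustrative sketch: the rates and the 5 hr working time are taken from the text, while the function and variable names are hypothetical, and the result is a lower bound on what the tanks must hold rather than a statement of their actual capacity.

```python
# Coolant used during one maintenance-free working session.
# Rates and session length are the values quoted in the text;
# all names here are illustrative only.

LN2_RATE_ML_PER_HR = 170   # liquid nitrogen consumption
LHE_RATE_ML_PER_HR = 140   # liquid helium consumption
SESSION_HR = 5             # working time after refilling


def coolant_used_ml(rate_ml_per_hr, hours):
    """Volume of coolant consumed at a constant rate over a given time."""
    return rate_ml_per_hr * hours


ln2_ml = coolant_used_ml(LN2_RATE_ML_PER_HR, SESSION_HR)
lhe_ml = coolant_used_ml(LHE_RATE_ML_PER_HR, SESSION_HR)

print(f"Liquid nitrogen per session: {ln2_ml} ml")
print(f"Liquid helium per session:   {lhe_ml} ml")
```

Under these figures a session uses less than one litre of each coolant.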

MOLECULAR CONTRIVANCE OF WATER SELECTIVE CHANNELS

By electron crystallography using electron cryo-microscopy, the structure of the water channel protein aquaporin-1 was analyzed, revealing molecular contrivances such as the peculiar aquaporin fold. The structure analyses answered puzzling questions about the effective water selectivity that is such an essential basic function for signal transduction and homeostasis in all life forms.

Water is the most abundant molecule in any cell on earth, and because a simple lipid bilayer lets only a limited amount of water pass, cell membranes are required to provide an effective water channel function. Almost ten years ago, a 28 kDa membrane protein, eventually named aquaporin-1 (AQP1), was identified in red blood cells, and its water channel function was clearly shown in Xenopus oocytes¹⁰. The cell membrane exquisitely regulates the entry and exit of ions, because ion concentrations and their dynamic changes are closely related to cell signaling functions. Water channels therefore need to preserve the ionic conditions in a cell while permeating a large amount of water. pH regulation in the cell is also well known to be crucially important for cell functions. The aquaporin-1 molecule attains effective water-selective transport while maintaining this strict selectivity, which posed puzzling questions. An atomic model analyzed at 3.8 Å resolution by electron crystallography answered these questions and revealed the molecular contrivance of the water-selective channel. Interestingly, this atomic model is the first structure of a membrane protein from a human source. To accomplish effective water channel function, the structure shows peculiar structural determinants, including an unusual fold (the AQP fold), as shown in Fig. 1.

The handedness of the aquaporin-1 structure was carefully examined, and the right-handed helical bundle structure of AQP1 was confirmed, before publication of the paper¹², by using the bacteriorhodopsin structure, which had been analyzed at atomic resolution⁸. This confirmation was required because a relatively low resolution, such as 6 Å, allows no one to build an atomic model but only to assign a helix to a rod-shaped density. The handedness was directly confirmed at the higher resolution of 3.8 Å, at which an atomic model was constructed in 2000⁹. While the right-handed helical bundle of the water channel was thought to be less usual, it is suitable for the water-selective channeling function, especially for high-speed water permeation and for formation of a dielectric barrier for ions, because the right-handed helical interaction stabilizes a larger angle in the helical bundle. Such a large interaction angle enables the molecule to form a large inner space, separated from the lipid environment, with only a small number of helices, namely six. The large inner space is important for function for at least the following two reasons. First, the large space enables the two prominent loops, LB and LE, with their short helices to enter the inside of the six-helix bundle. The two short helices are very important for the formation of the electrostatic field at the membrane center, where the conserved NPA sequences on both loops face each other. The electrostatic field is responsible for the formation of the largest orientation vector at the facing position of the two short helices⁷. Secondly, the highly tilted helical interaction gives the widely opened channel entrances that form the so-called hourglass shape of the pore. The channel shape and the arrangement of the characteristic two short helices in the water channel differ from those in both the KcsA K⁺ channel³ and the ClC-type Cl⁻ channel⁴. Both of these channels were also revealed to have right-handed helical bundles.

Importantly, the shape of AQP1 is opposite to that of a potassium channel³: a constriction rather than a cavity at the membrane center, unlike in the potassium channel, could ensure a high dielectric barrier, so that any ions are repelled while neutral molecules such as water might be allowed through.

The secondary structure with two tandem repeats, and the structure analyzed by electron crystallography, suggested a two-fold symmetry of the AQP1 protomer. The pseudo two-fold symmetry, based on the peculiar helical arrangement of the molecule, coincides nicely with the symmetrical channel function, which transfers water molecules from either side of the channel according to the osmotic pressure, and with the molecular symmetry of the water molecule, which has two mirror planes (Fig. 1). Near the NPA motifs, which are highly conserved in the aquaporin family, the symmetrical constriction, including the electrostatic field formed by the two short helices, is functionally important for orienting a symmetrical water molecule.

The three helices in the first tandem repeat of the aquaporin protomer form a roughly linear arrangement, but not in the order of their position in the primary structure. Instead, the first helix, named H1, is sandwiched between the other helices of the bundle, that is, 2-1-3. Loop B, between helix 2 (H2) and helix 3 (H3), folds back into the membrane from the cytoplasmic side and forms a short helix named HB after the NPA sequence. The short helix HB is located adjacent to helix 6 (H6) in the second AQP1 repeat. The second AQP1 repeat adopts essentially the same fold as the first repeat, but it is incorporated into the membrane in the opposite orientation. The two repeats are connected by loop C, which spans the entire width of the molecule on the extracellular surface. In the second repeat, loop E folds back into the membrane from the extracellular side. After the NPA motif, loop E forms the second short helix, HE, which interacts with H3 of the first AQP1 repeat and brings the polypeptide chain back to the extracellular surface. H6 crosses the membrane adjacent to helix 4 (H4), on the opposite side from helix 5 (H5).

Stable helix-helix interactions are used for stabilizing the unusual aquaporin fold. The interaction between H2 and H5 might be especially important for the folding process of the molecule, because the two helical bundles 1-3 and 4-6 might meet through the interaction between H2 and H5 after each bundle has formed during folding of the unique aquaporin fold. The highly conserved Gly residues G57 and G173, on H2 and H5 of AQP1, stabilize the helical interaction. The two glycine residues form a stable contact between helices H2 and H5, in the manner of a ridges-into-grooves structure.

Fig. 1: Structure of the AQP1 tetramer. In one protomer, the six transmembrane helices and also the two short helices are represented in different colors (a). Each helix is indicated by its helical number.

The critical function of AQP1 is its exceptionally high water permeability: 2 billion water molecules per monomer per second. Loops B and E form part of the surface of the aquaporin pore. Helices H2 and H5, together with the C-terminal halves of H1 and H4 but not H3 and H6, form the remaining surface of the aqueous pore. Almost all residues within the central 20 Å zone of the pore are highly hydrophobic, although one might expect AQP1, as a water channel, to have a hydrophilic pore. High-resolution X-ray crystallography combined with low-temperature measurements revealed that the water molecules that are mobile at room temperature are not in a hydrophilic environment but in a hydrophobic one. Data collected at room temperature showed weaker density for water molecules surrounded by hydrophobic amino acids, while the same water molecules gave clear density at lower temperature. By contrast, water molecules surrounded by hydrophilic amino acids showed no significant change in density on warming to room temperature [Tsukihara, private communications]. The results suggest that the hydrophobic pore wall might help water to travel through the pore at high speed at the temperatures of ordinary living systems.
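To give a feeling for the permeation rate quoted above, the figure of 2 billion water molecules per monomer per second can be converted into a mean time per molecule and into a rate for the whole tetramer. This is a minimal sketch: the per-monomer rate comes from the text, the assumption that all four protomers of the tetramer conduct independently is ours, and all names are illustrative.

```python
# Back-of-the-envelope numbers for AQP1 water permeation.
# The per-monomer rate is the value quoted in the text; treating the
# four protomers of the tetramer as independent pores is an assumption.

RATE_PER_MONOMER_PER_S = 2e9   # water molecules per monomer per second
MONOMERS_PER_TETRAMER = 4

# Mean time between successive water molecules leaving one pore.
mean_interval_ns = 1e9 / RATE_PER_MONOMER_PER_S

# Aggregate rate if every protomer conducts at the quoted rate.
tetramer_rate_per_s = RATE_PER_MONOMER_PER_S * MONOMERS_PER_TETRAMER

print(f"Mean interval per molecule: {mean_interval_ns} ns")
print(f"Rate per tetramer: {tetramer_rate_per_s:.1e} molecules/s")
```

On these assumptions one water molecule leaves each pore every half nanosecond on average.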

A narrow part of the pore, about 3 Å in diameter, is located at the middle of the membrane, where loops B and E interact with each other, especially through prolines 77 and 193 of the conserved NPA sequences. Despite its enormous capacity for water conductance, the AQP1 pore also exhibits marked selectivity. Water molecules were found to be strongly oriented in the channel interior, through alignment of their dipoles with the electric field exerted by the protein, causing water molecules to rotate by 180 degrees upon passage⁷. Hydrogen bond competition between water molecules and polar groups in the protein interior was found to dominate the permeation process. Two major interaction sites for water molecules were identified inside the channel: the NPA and ar/R regions. The two highest enthalpic barriers for water molecules are located directly adjacent to the NPA region. This, together with the water rotation that is also centered here, renders the NPA region a major selectivity filter. Contiguous hydrogen-bonded water chains are known to be efficient proton conductors. Aquaporin must prevent proton conduction along its pore in order to maintain the proton gradient across the cell membrane, which serves as a major energy storage mechanism.

Water regulation is crucially important for every cell and therefore for all life forms on earth. Water channels have been identified in almost every living organism, from plants to animals and from prokaryotes to eukaryotes, including humans. Evolutionary processes developed molecular contrivances for very effective water channels, presumably based on a unique AQP fold. The folding instability, as well as the complexity of the folding process, was overcome by stabilizing mechanisms in aquaporins, such as stable helix-helix interactions and tetramer formation. Unexpected structural features, such as the right-handed helical bundle and the hydrophobic channel wall, were revealed to facilitate water transport through the channel. None of these could have been predicted without structure analysis at high resolution.

ACKNOWLEDGEMENTS

We express our sincere thanks to Drs. K. Murata, K. Mitsuoka, T. Hirai, T. Walz, P. Agre, J.B. Heymann and A. Engel for their collaboration in the work on aquaporin-1.

REFERENCES

1. Agre, P., King, L.S., Yasui, M., Guggino, W.B., Ottersen, O.P., Fujiyoshi, Y., Engel, A. and Nielsen, S.: Aquaporin water channels: from atomic structure to clinical medicine, J. Physiology 542, 3-16 (2002).

2. Branden, C. and Tooze, J.: Introduction to Protein Structure, Garland Publishing, Inc. New York, 40-42 (1999).

3. Doyle, D., Cabral, J.M., Pfuetzner, R.A., Kuo, A., Gulbis, J.M., Cohen, S.L., Chait, B.T. and MacKinnon, R.: The structure of the potassium channel: molecular basis of K+ conduction and selectivity, Science 280, 69-77 (1998).

4. Dutzler, R., Campbell, E.B., Cadene, M., Chait, B.T. and MacKinnon, R.: X-ray structure of a ClC chloride channel at 3.0 Å reveals the molecular basis of anion selectivity, Nature 415, 287-294 (2002).

5. Fujiyoshi, Y., Mizusaki, T., Morikawa, K. et al.: Development of a Superfluid Helium Stage for High-Resolution Electron Microscopy, Ultramicroscopy 38, 241-251 (1991).

6. Fujiyoshi, Y.: The Structural Study of Membrane Proteins by Electron Crystallography, Adv. Biophys. 35, 25-80 (1998).

7. de Groot, B.L. and Grubmueller, H.: Water permeation across biological membranes: mechanism and dynamics of aquaporin-1 and GlpF, Science 294, 2353-2357 (2001).

8. Kimura, Y., Vassylyev, D.G., Miyazawa, A., Kidera, A., Matsushima, M., Mitsuoka, K., Murata, K., Hirai, T. and Fujiyoshi, Y.: Surface of Bacteriorhodopsin Revealed by High-Resolution Electron Crystallography, Nature 389, 206-211 (1997).

9. Murata, K., Mitsuoka, K., Hirai, T., Walz, T., Agre, P., Heymann, J.B., Engel, A. and Fujiyoshi, Y.: Structural determinants of water permeation through aquaporin-1, Nature 407, 599-605 (2000).

10. Preston, G.M., Carroll, T.P., Guggino, W.B. and Agre, P.: Appearance of water channels in Xenopus oocytes expressing red cell CHIP28 protein, Science 256, 385-387 (1992).

11. Scherzer, O.: The Theoretical Resolution Limit of the Electron Microscope, J. Appl. Phys. 20, 20-29 (1949).

12. Walz, T., Hirai, T., Murata, K., Heymann, J.B., Mitsuoka, K., Fujiyoshi, Y., Smith, B.L., Agre, P. and Engel, A.: Three-Dimensional Structure of Aquaporin 1, Nature 387, 624-627 (1997).


PART III

PROTEIN SHUTTLE


Chapter 8

CLATHRIN AND COMPANIONS

Barbara Pearse*

The present understanding of clathrin and its functions in cellular contexts comes from the combined efforts in different fields. These include strikingly diverse examples: the developing chicken oocyte as the yolk proteins are internalized, human placenta, nerve synapses and, unfortunately, viral infection. Clathrin cage assemblies with adaptor protein AP2 and other components are here described, based on studies by electron cryomicroscopy and X-ray crystallography.

Keywords: Clathrin; coated vesicles; electron cryomicroscopy; clathrin adaptor AP2; AP180; auxilin

THE CAGE

Electron micrographs of glutaraldehyde-fixed thin sections of cells first revealed coated membrane and the process of coated vesicle formation. Based on such studies in cells, the steps of the endocytic cycle in nutrient uptake and during synaptic transmission were described: coated pit assembly on the cytoplasmic surface of the plasma membrane, invagination and budding, scission of the coated vesicle containing concentrated cargo, uncoating, and fusion of the vesicle with an acceptor compartment.

When extracts of particles were made from homogenized brain, structures, negatively stained in 1% uranyl acetate, were observed in electron micrographs and were thought to represent these endocytic coated profiles. The particles were purified and the major complexes characterized, now well known as clathrin and the AP2 adaptor complex, in association with lipid vesicles. The particles were heterogeneous in size and shape, but a small number of symmetrical examples were identified, initially by eye, indicating that the outer cage was constructed from a lattice of 12 pentagons and a variable number of hexagons depending on the size. One of these was the hexagonal barrel, shown in Fig. 1, with a height of 650 Å and exhibiting 622 symmetry¹.

*Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, U.K. Email address: <[email protected]> <http://www.mrc-lmb.cam.ac.uk>
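The rule of exactly 12 pentagons plus a variable number of hexagons follows from Euler's formula for a closed polyhedron (V - E + F = 2) when three edges meet at every vertex. The sketch below, with hypothetical function names, counts vertices, edges and faces for a given number of hexagons and checks Euler's formula. The value of 8 hexagons used for the hexagonal barrel is not stated in the text and is supplied here as an assumption from the general literature on clathrin cages.

```python
# Vertex, edge and face counts for a closed cage built only from
# pentagons and hexagons with three edges meeting at every vertex.
# Function and variable names are illustrative.

def cage_counts(hexagons, pentagons=12):
    """Return (vertices, edges, faces) for a trivalent polyhedral cage."""
    polygon_sides = 5 * pentagons + 6 * hexagons
    if polygon_sides % 6 != 0:
        raise ValueError("face counts do not give a closed trivalent cage")
    vertices = polygon_sides // 3   # each vertex is shared by 3 faces
    edges = polygon_sides // 2      # each edge is shared by 2 faces
    faces = pentagons + hexagons
    return vertices, edges, faces


def satisfies_euler(vertices, edges, faces):
    """Euler's formula for a closed convex polyhedron."""
    return vertices - edges + faces == 2


# Assumption: the hexagonal barrel has 8 hexagons (not given in the text).
barrel = cage_counts(hexagons=8)
print("hexagonal barrel (V, E, F):", barrel)
print("Euler's formula holds:", satisfies_euler(*barrel))
```

With one triskelion centred on each vertex, the vertex count is also the number of clathrin triskelions in the cage under this assumption.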

As purification protocols for the individual protein complexes developed⁵, it became possible to reunite them to form these smaller particles with a significant proportion of hexagonal barrels. These were then examined by cryoelectron microscopy, the frozen specimens being tilted as in the previous study, to obtain a series of micrographs. The 3-dimensional LEGO model could then be generated by image reconstruction of the density observed. These experiments showed that the coats displayed three shells of density: the outermost representing the familiar clathrin polyhedral cage; the second contributed by the N-terminal regions of the clathrin heavy chain; and the third by the AP2 complexes. In projections of isolated coated vesicles, a fourth shell of high density correlated with the membrane of the internal vesicle and its cargo. Meanwhile the location of the proteins in cells was investigated by immunofluorescence and immunoEM, and their sequences were determined by analysis of the cDNAs encoding them⁵,⁶.

Figure 1. The picture illustrates features of our current structural information on the clathrin molecule. A single triskelion is traced in the cryoEM map, showing how clathrin is built into its characteristic coat assembly with its companion complex, the AP2 adaptor. Fitting the available crystal structures of domains of the clathrin heavy chain results in an almost complete molecular model of clathrin within its natural lattice.

The next stage of higher-resolution image reconstruction relied on this earlier LEGO model as a reference. In this work, over 1000 images of the hexagonal barrel were examined by FREALIGN, based on the SPIDER computer-imaging package, and used to construct the 3D map at 21 Å resolution presented here⁷. At this definition it is possible to distinguish all of the clathrin molecules, each heavy chain of 450 Å length, in their packing arrangement within the coat. The X-ray structures of the N-terminal β-propeller domain and linker, and of a portion of the proximal rod domain (confirming a repeating structural unit within the heavy chain), dock neatly into the EM map to reach the current picture³,⁵,¹⁰. In addition, analysis of the AP2 complex is well under way.

THE SHUTTLE

However, this is only the beginning of the story: a static image representing a structural basis for understanding the endocytic cycle. In the living brain, this is a highly dynamic process under the exquisite control of a myriad of accessory and regulatory proteins that act sequentially in the cycle. This metastable coated scaffolding, assembled on the cytoplasmic surface of the specialized synaptic membrane, integrates signals from the exterior and not only retrieves synaptic vesicle containers, for refilling with transmitter, but also helps modulate the integrity of the synapse depending on the cargo gathered there.

The different structural layers display binding sites that are crucial to these functions, as shown in the schematic diagram (Fig. 2). Receptors, which span the membrane of the vesicle, generally have tight, highly specific binding sites for their ligands (e.g. nutrients and growth factors) in order to concentrate them from the external medium. These receptors carry an endocytic motif (YxxΦ) on their cytoplasmic portions, which slots into a receiving socket on the μ2 subunit of the AP2 adaptor rather like a two-pin plug. Other types of receptor, spanning the membrane 7 times like G-protein-coupled receptors, bear other motifs recognized by monomeric adaptors, including arrestin, which contribute to a network of interactions in the inner shell of density surrounding the membrane and modulate the signaling capabilities of the receptor. In turn, the AP2 adaptor promotes the assembly of the structure by clipping onto an available clathrin heavy chain N-terminal hook with two distinct regions of its β2 subunit. One of these is the LLNLD amino acid sequence, found in its 2-300 Å long flexible portion, which occupies a groove in the clathrin β-propeller. The second is a hydrophobic cup formed by the β2-appendage domain, thought to bind weakly to a DVF motif in the first canonical structural repeat of the clathrin distal rod. The clathrin polyhedral lattice is formed on the membrane as the terminal domains are gathered and group together, making possible further contacts with other triskelions, one per vertex, over a considerable area.

Figure 2. The clathrin and companions in their functional organization. Derived from Musacchio et al.³ (Diagram labels: extracellular ligand; YxxΦ-bearing receptor; AP-2.)

THE COMPANIONS

How is this assembly, and also disassembly, accomplished in the cell? Two other major structural proteins form part of the coat, in addition to clathrin and AP2. These are AP180 and auxilin, members of the list of accessory and regulatory proteins that bind to the AP2 adaptor α- and β-appendage domains by virtue of the DPW/DPF motifs in their respective sequences. These additional proteins, including epsin, Eps15, amphiphysin etc., are thus recruited as they are required during the CCV cycle. AP180 and auxilin are thought to act as co-chaperones of clathrin throughout the cycle as it assembles and is released from the

AP180, in its N-terminal region, has a binding site for phosphatidylinositol-4,5-bisphosphate (PI(4,5)P2). This enables it to bind transiently to lipid monolayers and bilayers rich in this molecule². Beyond this structured domain, AP180 continues as a very long flexible polypeptide reminiscent of the β2-hinge region, containing similar clathrin-box binding motifs. Thus it is able to recruit clathrin to the membrane and promote its assembly, as is epsin. Auxilin qualifies as a third member of this group, based on overall features of its sequence and properties and on its location in the cage structure observed by cryoEM. Thus one can imagine that the remarkable, 3-dimensional, extended clathrin molecules are caught by their terminal domains and, in an orderly fashion, tethered to the membrane surface by a concerted action of these proteins, and are thus ready, in the right orientation, to join a forming assembly. At the same time AP2, also recognizing PI(4,5)P2, is enriched at the membrane, and as it becomes activated and binds its cargo, it encourages invagination of a coated pit by clipping onto the clathrin lattice via its β2-appendage + hinge portions, an assembly protein being displaced.


Auxilin is also characterized by its C-terminal J-domain, which is shown to interact with the chaperone Hsc70 by a conserved HPD motif in the presence of ATP. Together, apparently, they form an ATP-dependent chaperone complex positioned as required in the dynamic clathrin assembly/disassembly cycle, the maintenance of which is crucial to synaptic transmission.

Companion biophysical techniques, together with cryoelectron microscopy, X-ray crystallography and NMR studies of domains, will help to unravel the delicate kinetic balance amongst the interacting facets of this metastable scaffolding. On this depend our memory, mood and behaviour. The same system operating in every cell, responsive to the receptor cargo, tells that particular cell how to respond to its environment and interacting cells, in order to carry out its required physiological function. Mistakes in the machinery have been linked with both neurological conditions and cancers. Viruses hijack the process of internalization in a variety of ways, causing disease, and the immune system depends on endocytosis to combat them.

REFERENCES

1. Crowther, RA, JT Finch, and BM Pearse. 1976. On the structure of coated vesicles. J Mol Biol. 103:785-98.

2. Ford, MG, BM Pearse, MK Higgins, Y Vallis, DJ Owen, A Gibson, CR Hopkins, PR Evans, and HT McMahon. 2001. Simultaneous binding of PtdIns(4,5)P2 and clathrin by AP180 in the nucleation of clathrin lattices on membranes. Science. 291:1051-5.

3. Musacchio, A, CJ Smith, AM Roseman, SC Harrison, T Kirchhausen, and BM Pearse. 1999. Functional organization of clathrin in coats: combining electron cryomicroscopy and X-ray crystallography. Mol Cell. 3:761-70.

4. Owen, DJ, Y Vallis, BM Pearse, HT McMahon, and PR Evans. 2000. The structure and function of the beta 2-adaptin appendage domain. Embo J. 19:4216-27.

5. Pearse, BM, CJ Smith, and DJ Owen. 2000. Clathrin coat construction in endocytosis. Curr Opin Struct Biol. 10:220-8.

6. Ponnambalam, S, MS Robinson, AP Jackson, L Peiperl, and P Parham. 1990. Conservation and diversity in families of coated vesicle adaptins. J Biol Chem. 265:4814-20.

7. Smith, CJ, N Grigorieff, and BM Pearse. 1998. Clathrin coats at 21 Å resolution: a cellular assembly designed to recycle multiple membrane receptors. Embo J. 17:4943-53.

8. Ungewickell, E, and D Branton. 1981. Assembly units of clathrin coats. Nature. 289:420-2.

9. Vigers, GP, RA Crowther, and BM Pearse. 1986. Location of the 100 kd-50 kd accessory proteins in clathrin coats. Embo J. 5:2079-85.

10. Ybe, JA, FM Brodsky, K Hofmann, K Lin, SH Liu, L Chen, TN Earnest, RJ Fletterick, and PK Hwang. 1999. Clathrin self-assembly is mediated by a tandemly repeated superhelix. Nature. 399:371-5.


PART IV

GIANT ENZYMES


Chapter 9

MULTIFUNCTIONAL ENZYME COMPLEXES: MULTISTEP CATALYSIS BY MOLECULAR MACHINES

Richard N. Perham*

Multifunctional enzyme complexes can reach molecular masses of MDa and above, and exceed the ribosome in size. Covalently attached swinging arms, such as lipoyl groups, biotinyl groups and phosphopantetheinyl groups, are essential to their reaction mechanisms. Unexpectedly, it turns out that protein domains contribute to the processes of molecular recognition that define, channel and protect the substrates and catalytic intermediates. The crucial part played by the mechanical motion of protein domains and the role of the molecular architecture that underlies the interactions of the component enzymes can now be identified and assessed. Such enzymes are now better regarded as sophisticated biological nanomachines.

Keywords: Multienzyme complexes, multifunctional proteins, substrate channeling, self-assembly, nanomachines

*Cambridge Centre for Molecular Recognition, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, England, UK. Email address: [email protected]

INTRODUCTION

The book of life is written in the language of DNA, but the story is made possible only by the activities of myriads of protein molecules. Among the most important of these proteins are the enzymes; without catalysis, a biologically useful time would be years rather than seconds or milliseconds. The normal biological milieu of neutral pH and aqueous solution is inimical to many chemical reactions. Enzymes make them possible. The basic principles of enzyme catalysis were first outlined some 100 years ago, with the proposal of the enzyme-substrate complex, the advent of the lock-and-key hypothesis to explain the remarkable specificity of enzyme action, and the concepts of saturation kinetics. Key steps along the way can be summarized roughly as follows:

Binding and specificity: lock-and-key hypothesis of the active site
Michaelis-Menten kinetics
Briggs-Haldane: transition state theory
Free energy of binding and stabilization of the transition state
Tunnelling: quantum effects
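The saturation kinetics mentioned among these steps is usually expressed by the Michaelis-Menten rate law, v = Vmax[S] / (Km + [S]). The sketch below evaluates this standard textbook equation for a few substrate concentrations; the numerical values of Vmax and Km are arbitrary illustrative choices, not data from this chapter, and the function name is hypothetical.

```python
# Michaelis-Menten saturation kinetics: v = Vmax * [S] / (Km + [S]).
# Vmax and Km below are arbitrary example values (assumptions).

def michaelis_menten_rate(substrate, vmax, km):
    """Initial reaction rate at a given substrate concentration."""
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


VMAX = 100.0   # maximal rate, arbitrary units
KM = 2.0       # Michaelis constant, same units as the substrate

rates = {s: michaelis_menten_rate(s, VMAX, KM) for s in (0.0, 2.0, 20.0, 200.0)}

for substrate, rate in rates.items():
    print(f"[S] = {substrate:6.1f}  ->  v = {rate:6.2f}")
```

The rate is half-maximal when [S] equals Km and approaches Vmax as the enzyme saturates, which is the behaviour the enzyme-substrate complex was proposed to explain.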

In the past few years, and increasingly so with the information that is flowing from the multiple genome sequencing projects on numerous organisms, the importance of protein-protein interaction and protein-protein aggregation in catalysis and signal transduction processes has come to be recognised. Soon, it will no longer be enough to be aware that protein A interacts with protein B; it will be necessary to know how the relevant proteins recognise one another, how strong the interaction is, how long the interaction persists, what structure is formed, and what occurs as a result of the interaction that is beyond the capability of the individual proteins.

Among the earliest to be detected and the best documented of these synergistic protein interactions are the multienzyme systems: multienzyme complexes and multifunctional proteins. Concerted multistep reactions require different solutions to the problems of catalysis, generally a bringing together of the enzymes that catalyse the individual steps and some means of promoting the passage of reaction intermediates between the contributing active sites. Early on (Reed, 1974; Perham, 1975), various possible advantages of multienzyme complexes were envisaged:

1. Enhancement of catalytic activity

This might be due to the limited distance required of an intermediate to diffuse between adjacent active sites (though with small molecules, diffusion of an intermediate could easily be seen as unlikely to limit the turnover rate of a typical enzymic reaction) or to some protein-protein interaction that enhanced catalytic performance (perhaps a protein conformational change).
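The remark that diffusion of a small intermediate is unlikely to limit turnover can be checked with an order-of-magnitude estimate, using the standard relation for three-dimensional diffusion, t = x² / (6D). The sketch below is illustrative only: the 10 nm separation between active sites, the diffusion coefficient of 5 × 10⁻¹⁰ m²/s for a small molecule in water, and the turnover number of 100 per second are all assumed values, not figures from this chapter.

```python
# Order-of-magnitude comparison of diffusion time and turnover time.
# All three numerical inputs are assumptions chosen for illustration.

def mean_diffusion_time_s(distance_m, diffusion_coeff_m2_s):
    """Mean time to diffuse a given distance in 3-D: t = x^2 / (6 D)."""
    return distance_m ** 2 / (6.0 * diffusion_coeff_m2_s)


DISTANCE_M = 10e-9            # assumed separation of two active sites
DIFFUSION_COEFF_M2_S = 5e-10  # assumed value for a small molecule in water
TURNOVER_PER_S = 100.0        # assumed turnover number of the enzyme

diffusion_time_s = mean_diffusion_time_s(DISTANCE_M, DIFFUSION_COEFF_M2_S)
turnover_time_s = 1.0 / TURNOVER_PER_S

print(f"diffusion time: {diffusion_time_s:.2e} s")
print(f"turnover time:  {turnover_time_s:.2e} s")
print(f"ratio turnover/diffusion: {turnover_time_s / diffusion_time_s:.1e}")
```

With these assumed inputs the diffusion step is several orders of magnitude faster than a catalytic cycle, which is consistent with the caveat in the text.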

2. Substrate channelling

Given that some intermediates are common to more than one enzyme, or even (say) lipid-soluble and therefore potentially permeable to the cell membrane, the direction of metabolic flow through a pathway could be directed by requiring the intermediate to go to a partner enzyme. The means of channelling might be mechanical (literally a tunnel between active sites) or a more sophisticated chemical delivery (‘swinging arms’ - see below).

3. Protection of intermediates: the ‘hot potato’ hypothesis

The concept of channelling could be taken to the limit in envisaging a situation where the product of enzyme 1 would be too reactive to survive passage by free diffusion (in aqueous solution at neutral pH) to the active site of enzyme 2. Some form of protected delivery would be required.

4. Prosthetic groups as swinging arms

The covalent attachment of certain prosthetic groups makes possible the passage of reaction intermediates between active sites within multien- zyme complexes and multifunctional enzymes. The discovery of the swinging arms constituted by the lipoyl-lysine residues of the complexes that catalyse oxidative decarboxylation of 2-0x0 acids (Reed, 1998 and references therein) was soon followed by that of the biotinyl-lysine resi- dues of carboxylases and the phosphopantetheinyl-serine residues of fatty acid synthases (Lynen, 1967).

It is with these concepts, and with the potentiation of multistep reactions by the involvement of swinging arms, that this article is chiefly concerned. The principal topic of investigation has been the 2-oxo acid dehydrogenase multienzyme complexes, but the story extends more widely and leads to some general conclusions of widespread applicability.

SWINGING ARMS

The three swinging arms of major interest are biotin, lipoic acid and phosphopantetheine, which are illustrated in Fig. 1. Each has a business end comprising a heterocyclic ring attached to a spacer region culminating in a functional group that permits attachment to the polypeptide chain of the parent protein: a carboxyl group in the case of biotin and lipoic acid, that makes an amide link with the N6-amino group of specific lysine residues, and a phosphate that can generate a phosphodiester link with the hydroxyl group of a serine residue in the enzymes that require phosphopantetheine. In each instance, the attachment is highly specific for a given residue in the target protein and is catalysed by one or more dedicated enzymes [reviewed by Perham (2000)].

Fig. 1. Structures of swinging arms found in multifunctional enzymes. A, biotin; B, its biosynthetic precursor before the insertion of the sulphur atom; C, lipoic acid; D, its biosynthetic precursor before the insertion of the two sulphur atoms; E, phosphopantetheine.

The purpose of the swinging arm is to provide a flexible covalent link for the substrate and to convey it or an intermediate between the participating active sites in the relevant complex. This is vividly portrayed by the lipoyl-lysine residue in the 2-oxo acid dehydrogenase complexes, where the acyl group that remains after the initial decarboxylation of the 2-oxo acid is retained in thioester linkage with a lipoyl-lysine residue that is part of the second of the three enzymes involved. The reoxidation of the dihydrolipoyl group that is left after the transfer of the acyl group to CoA is then carried out by the third enzyme, to complete the reaction cycle, as outlined in Fig. 2 below.

Net reaction: RCOCOOH + NAD⁺ + CoASH → RCOSCoA + NADH + H⁺ + CO₂

Fig. 2. Reaction mechanism of 2-oxo acid dehydrogenase complexes. ThDP, thiamin diphosphate; Lip, lipoic acid attached to E2.

THE 2-OXO ACID DEHYDROGENASE COMPLEXES: A PARADIGM OF MULTISTEP CATALYSIS

Three distinct multienzyme complexes catalyse the oxidative decarboxylation of 2-oxo acids: pyruvate and 2-oxoglutarate in glycolysis and the citric acid cycle, and the branched chain 2-oxo acids that are formed by transamination in the metabolism of valine, leucine and isoleucine (see Fig. 2). In a pyruvate dehydrogenase (PDH) complex, the first enzyme, E1p (EC 1.2.4.1), is a thiamin diphosphate-dependent decarboxylase, which catalyses the decarboxylation of pyruvate and the reductive acetylation of the lipoyl group attached to the second enzyme, dihydrolipoyl acetyltransferase, E2p (EC 2.3.1.12). E2p transfers the acyl group to coenzyme A, after which the resulting dihydrolipoyl group is reoxidized by a flavoprotein, dihydrolipoyl dehydrogenase (E3; EC 1.8.1.4). Comparable enzymes are found in the 2-oxoglutarate dehydrogenase (OGDH) and branched chain 2-oxo acid dehydrogenase (BCDH) complexes and a similar complex exists for the catabolism of acetoin in bacteria. E1 and E2 are specific for the particular 2-oxo acid undergoing oxidative decarboxylation, whereas E3, which carries out an identical reaction in each complex, is normally the same enzyme in the different complexes. 2-Oxo acid dehydrogenase complexes are virtually ubiquitous, and in eukaryotes are located in the mitochondria [for reviews, see Patel et al. (1996); de Kok et al. (1998); Perham (2000)].
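The reaction cycle just described can be summarised as a small state machine for the lipoyl group. This is only a bookkeeping sketch: the state names and the `run_cycle` helper are invented for illustration, and the chemistry is reduced to the three steps named in the text.

```python
from enum import Enum


class LipoylState(Enum):
    OXIDIZED = "lipoyl (dithiolane ring)"
    ACYLATED = "S-acyl-dihydrolipoyl"
    REDUCED = "dihydrolipoyl"


# (enzyme, state required, state produced, substrates in / products out)
CYCLE = [
    ("E1", LipoylState.OXIDIZED, LipoylState.ACYLATED, "2-oxo acid in, CO2 out"),
    ("E2", LipoylState.ACYLATED, LipoylState.REDUCED, "CoA in, acyl-CoA out"),
    ("E3", LipoylState.REDUCED, LipoylState.OXIDIZED, "NAD+ in, NADH + H+ out"),
]


def run_cycle(start=LipoylState.OXIDIZED):
    """Walk the lipoyl group through E1, E2 and E3; return the states visited."""
    state = start
    visited = [state]
    for enzyme, required, produced, _note in CYCLE:
        if state is not required:
            raise ValueError(f"{enzyme} cannot act on {state.name}")
        state = produced
        visited.append(state)
    return visited


states = run_cycle()
print(" -> ".join(s.name for s in states))
```

The cycle ends in the state in which it began, reflecting the fact that the lipoyl group is regenerated by E3 and so acts catalytically.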

The intact complexes are of enormous size, with molecular masses of 5-10 × 10⁶ and diameters of up to 50 nm, significantly bigger than a ribosome. In the PDH complex from Bacillus stearothermophilus and most other Gram-positive bacteria, the E2p core comprises 60 E2p chains arranged with icosahedral symmetry, whereas in the PDH complex from Escherichia coli and most Gram-negative bacteria, E2p contains 24 polypeptide chains in octahedral symmetry (Reed & Hackert, 1990; Perham, 1991; Perham, 2000). Multiple copies of the E1 and E3 components are bound tightly but non-covalently to and around the outside of the E2 core, with many more copies of E1 than E3 being present. The OGDH and BCDH complexes follow the same structural pattern, with E2 cores (E2o and E2b) that also exhibit octahedral symmetry.

The structural and mechanistic analysis of 2-oxo acid dehydrogenase complexes has been a long and at times arduous process. The route adopted has become a classical one in such circumstances: learn how to take the complexes apart, analyse the structures and mechanisms of the individual pieces, discover how they go together, and then attempt a synthesis of the whole. Latterly it has been much aided by the ability to generate recombinant proteins in large quantities.

The PDH complex of B. stearothermophilus is the one about which we know most in terms of structure and molecular mechanism. Its E2 chain is itself a multi-domain-and-linker structure. The lipoyl group is attached to a specific lysine residue of an independently folded domain of about 80 residues which forms the N-terminal part of the E2 chain. The solution structure of the domain, determined by means of nuclear magnetic resonance (NMR) spectroscopy, revealed it as a β-barrel formed from two four-stranded β-sheets, arranged with a two-fold axis of quasi-symmetry. The lipoyl-lysine is located at the tip of a tight, type I β-turn protruding from one β-sheet, and the N- and C-termini are close in space on the other β-sheet on the opposite side of the domain (Dardel et al., 1993). The lipoyl domain is followed by a much smaller domain, the so-called peripheral subunit-binding domain (PSBD) of only 35 amino acids, to which E1 and E3 can bind in a mutually exclusive way (Hipps et al., 1994; Lessard & Perham, 1995; Lessard et al., 1996). The stoichiometry of one E1 or one E3 per PSBD (effectively one per E2 chain) raises the possibility of multiple structural isomers occurring in the assembly of the intact complex (Domingo et al., 1999), which may be important mechanistically, as we shall see later. The structures of the PSBD on its own (Kalia et al., 1993) and in association with the dimeric E3 (Mande et al., 1996) have been determined by means of NMR spectroscopy and X-ray crystallography, respectively. The domains are joined together by long linker regions (25-40 residues), rich in alanine, proline and charged amino acids.

The 28 kDa C-terminal domain of E2 has at least two functions: it houses the acetyltransferase active site and it associates with icosahedral symmetry to form the 60-mer inner core of the intact PDH complex, whose structure has been solved by X-ray crystallography (Izard et al., 1999). Although there is no crystal structure as yet for the B. stearothermophilus E1 component, a plausible model can be built from the crystal structure of the highly similar heterotetrameric (α2β2) E1 of the Pseudomonas putida BCDH complex (Ævarsson et al., 1999). There is clear biochemical evidence that E1 binds to the PSBD close to its 2-fold axis and to the β-subunits of the α2β2 heterotetramer in particular (Lessard & Perham, 1995). Putting all these data together, it proved possible to propose a schematic structure for the intact B. stearothermophilus PDH complex, which immediately gave a different perspective of the 2-oxo acid dehydrogenase complexes as molecular machines as much as enzymes (Perham, 2000).

Fig. 3. Schematic model of the icosahedral B. stearothermophilus PDH complex. Only one acetyltransferase trimer is shown with its lipoyl domains and PSBDs projecting from the E2 inner core. Also for simplicity, only one E3 dimer (top) and one E1 heterotetramer (bottom) are shown bound to the PSBD. The middle PSBD is shown as unoccupied. CAT, acetyltransferase core domain in the form of 20 trimers. (After Perham, 2000.)

This structure is depicted in Fig. 3. It is likely to be essentially the same for all 2-oxo acid dehydrogenase complexes, with some important differences in detail. Thus, the number of lipoyl domains per E2 chain can vary, e.g. from one in the E2p chain of B. stearothermophilus PDH complex and the E2o chain of E. coli OGDH complex, to two in the E2p chains of the human and Enterococcus faecalis PDH complexes and three in the E2p of the E. coli PDH complex (Reed & Hackert, 1990; Perham, 1991). Likewise, the E2 core may be octahedral rather than icosahedral in symmetry, though it is clear that the differences are more superficial than might at first be apparent. Both types of structure are built of essentially the same trimer of E2 chains (Mattevi et al., 1993; Izard et al., 1999), a total of eight in the octahedral (24-mer) and twenty in the icosahedral (60-mer) structures, respectively. Moreover, in the octahedral E2s, E3 is bound by its attraction to the PSBD, but a major part at least of the binding site for E1 resides in the acyltransferase inner core domain and linker region (Perham, 1991; de Kok et al., 1998; Perham, 2000). And in the PDH complexes of eukaryotes, the E3 dimer is bound to an additional polypeptide chain (nowadays referred to as the E3-binding protein) in the E2 core (Harris et al., 1997; McCartney et al., 1997 and references therein). E1 too can vary; for example, in the PDH complexes of Gram-negative organisms and all OGDH complexes, the E1 is a dimer (α2) of 100 kDa polypeptide chains (Perham, 1991). Nonetheless, the structure depicted in Fig. 3 serves well as the starting point for considerations of the mechanism, in particular that of active site coupling.
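The arithmetic behind these variations is simple enough to tabulate. The sketch below takes the trimer counts and the numbers of lipoyl domains per E2 chain quoted above and derives the number of E2 chains and lipoyl domains per core; the dictionary layout and function name are illustrative only, and only complexes whose core symmetry is stated in the text are included.

```python
CHAINS_PER_TRIMER = 3

# Trimers per E2 core, as stated in the text.
TRIMERS_PER_CORE = {"octahedral": 8, "icosahedral": 20}

# (core symmetry, lipoyl domains per E2 chain), from the examples in the text.
COMPLEXES = {
    "B. stearothermophilus PDH": ("icosahedral", 1),
    "E. coli PDH": ("octahedral", 3),
    "E. coli OGDH": ("octahedral", 1),
}


def core_summary(symmetry, lipoyl_per_chain):
    """Return (E2 chains per core, lipoyl domains per core)."""
    chains = TRIMERS_PER_CORE[symmetry] * CHAINS_PER_TRIMER
    return chains, chains * lipoyl_per_chain


for name, (symmetry, n_lipoyl) in COMPLEXES.items():
    chains, lipoyl = core_summary(symmetry, n_lipoyl)
    print(f"{name}: {chains} E2 chains, {lipoyl} lipoyl domains per core")
```

On these figures the smaller octahedral core of the E. coli PDH complex carries more lipoyl domains than the icosahedral core of the B. stearothermophilus complex.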

FUNCTIONS OF THE LIPOYL DOMAIN: THE SIGNIFICANCE OF COVALENT ATTACHMENT OF THE SWINGING ARM

It will be apparent even from a simple inspection of Fig. 3 that considerable movement of the substrate is required between three physically remote active sites. Reductive acetylation of the lipoyl group on the lipoyl domain of E2 takes place at the active site of E1; acyl transfer to CoA takes place in the active site of E2; and reoxidation of the dihydrolipoyl group takes place at the active site of E3. The requisite flexibility of the long linker region that tethers the lipoyl domain to the PSBD has long been evident from NMR spectroscopy (Perham, 2000). Add to this that the active site of E2 closely resembles that of the homologous enzyme, chloramphenicol acetyltransferase, which (a trimer) closely resembles the trimeric building blocks of E2 and has been studied in detail (Russell et al., 1992). The three active sites per trimer lie in the interfaces between the subunits; the acylated lipoyl domain (the counterpart of acetylated chloramphenicol) must, it is argued, approach the E2 active site from the ‘outside’ of the trimer on the surface of the assembled E2, whereas the CoA must approach from ‘inside’ the protein shell. This instantly makes sense of the giant holes up to 50 Å across on the 4-fold and 5-fold faces of the cubes and pentagonal dodecahedra that constitute E2s, as entry and exit points for CoA and acetyl-CoA, respectively.

Why then is the lipoyl group attached to the E2 protein? No reason can be derived from the chemical reaction mechanism as such, but it turns out that the lipoyl domain is a crucial part of the system. As shown first with the PDH complex of E. coli, free lipoic acid is acted upon rapidly by E2p and E3 but is a surprisingly poor substrate for E1. In contrast, the lipoyl domain of E2p is an excellent substrate for E1p, associated with a dramatic rise in kcat/Km by a factor of 10⁴. Moreover, the lipoyl domains from the PDH and OGDH complexes of E. coli (Graham et al., 1989; Jones et al., 2000) and Azotobacter vinelandii (Berg et al., 1998) function as substrates only for their natural partner E1s. Thus, the true substrate in these complexes is not lipoic acid or even lipoyl-lysine but the lipoylated domain of the E2 chain. This is the molecular basis of substrate channelling, which determines that only a lipoyl group covalently attached to a specific lysine residue in a domain of the intended E2 component can undergo reductive acylation. However, it should be noted that it reawakens the question of diffusion as a barrier to active site coupling, since a 10 kDa protein domain is going to diffuse much more slowly than free lipoic acid. En passant, it is important to note that, as in a conventional metabolic pathway, the multistep process of oxidative decarboxylation becomes committed at the first enzyme-catalysed reaction, which thereby dictates the course of the subsequent flow of substrate.

Fig. 4. The lipoyl-lysine swinging arm modelled in a fully extended form from the lipoyl domain into the active site of the E1 component of the P. putida BCDH complex. His312α and His131β are key residues in the active site. (After Ævarsson et al., 1999.)

Fig. 5. Cryo-EM reconstruction of an E1E2 subcomplex of the B. stearothermophilus PDH complex. The E1 subunits are located in a shell 9 nm from the inner E2 core. The linker region between the CD and PSBD domain crosses the gap only once. On the left are instances of fitting the electron density to that of the E1 crystal structure. (After Milne et al., 2002.)

The ThDP in the active sites of the heterotetrameric (α2β2) E1 from P. putida BCDH complex (Ævarsson et al., 1999) lies buried at the bottom of a 2 nm-deep funnel-shaped hole at the interface between the α- and β-subunits (Fig. 4). This is well beyond the reach (1.4 nm) of the fully extended lipoyl-lysine swinging arm. A prominent surface loop in the lipoyl domain, linking β-strands 1 and 2, lies close in space to the lipoyl-lysine β-turn (Fig. 3). As indicated by NMR and directed mutagenesis experiments with the homologous B. stearothermophilus E1p, this loop and certain residues flanking the lipoyl-lysine residue in its β-turn and elsewhere more remote from the lipoyl group are likely to make transient contact with both E1α and E1β subunits during catalysis and to be of critical importance in directing the interaction (Wallis et al., 1996; Howard et al., 2000). In the crystal structure of the dimeric (α2) E1p from the E. coli PDH complex (Arjunan et al., 2002), the ThDP is situated similarly and a comparable interaction with the relevant E. coli lipoyl domain must apply (Jones et al., 2001). The mechanistic imperative is that the protruding β-turn housing the lipoyl-lysine residue in a lipoyl domain must enter the active site funnel in E1, bringing the nearby surface loop among other parts of the lipoyl domain into close contact with E1. Reductive acylation of the dithiolane ring at the end of the extended lipoyl-lysine residue will then ensue only if the domain to which the lipoyl group is attached is specifically recognised by E1.

THE 2-OXO ACID DEHYDROGENASE COMPLEX AS A MOLECULAR MACHINE

The innate flexibility in the E2 chains and lipoyl domains and also perhaps the lack of structural uniformity observed in purified PDH complexes, referred to above, may well be responsible for the failure thus far to crystallize an intact assembly. However, recent advances in cryo-electron microscopy (cryo-EM) offer real hope of a detailed structural analysis. The overall structure of the B. stearothermophilus E1E2 subcomplex, reassembled from its constituent parts in vitro such that all 60 binding sites on E2 are occupied by E1 heterotetramers, has been reconstructed from multiple cryo-EM images, as shown in Fig. 5. Far from the lipoyl domains interdigitating between the E1 and E3 subunits (Fig. 3), the E1 subunits form a surprisingly uniform shell, concentric with the icosahedral E2 inner core of acetyltransferase domains, about 9 nm above the surface of the inner core. The linker region of some 30 amino acids between the PSBD (to which the E1s are bound) and the acetyltransferase domain is sufficient to cross the gap only once in a more or less extended conformation (which probably explains why it is too faint to be seen in the electron density map). Most importantly, the lipoyl domains, tethered to the PSBD by a linker region of some 45 amino acids, must move about within the annular space between the E1 and the E2 inner core (Milne et al., 2002). The E2E3 subcomplex is thought to have a comparable structure but it remains to be determined how the E1 and E3 cohabit on the surface of the E2 core (J. S. Milne, S. Subramaniam and R. N. Perham, unpublished work).
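The statement that a linker of some 30 residues can cross the 9 nm gap only once follows from a simple length estimate. The sketch below assumes a rise of about 0.35 nm per residue for a fully extended polypeptide chain, a textbook approximation rather than a value from Milne et al. (2002); the function names are illustrative.

```python
import math

# Assumed rise per residue for a fully extended chain (hypothetical round value).
RISE_PER_RESIDUE_NM = 0.35


def max_span_nm(n_residues, rise_nm=RISE_PER_RESIDUE_NM):
    """Upper bound on the end-to-end length of a fully extended linker."""
    return n_residues * rise_nm


def max_crossings(n_residues, gap_nm):
    """How many times a fully extended linker could traverse a gap of given width."""
    return math.floor(max_span_nm(n_residues) / gap_nm)


GAP_NM = 9.0            # E1 shell to E2 inner core, from the text
INNER_LINKER = 30       # residues, PSBD to acetyltransferase domain
LIPOYL_LINKER = 45      # residues, lipoyl domain to PSBD

print(f"inner linker span  <= {max_span_nm(INNER_LINKER):.1f} nm, "
      f"crossings <= {max_crossings(INNER_LINKER, GAP_NM)}")
print(f"lipoyl linker span <= {max_span_nm(LIPOYL_LINKER):.1f} nm")
```

Under this assumption the inner linker barely exceeds the width of the gap, so it must be nearly fully extended, whereas the longer lipoyl linker leaves the lipoyl domain room to explore the annular space.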

Similar work has been reported on a partly assembled bovine kidney PDH complex containing both E1 and E3 (Zhou et al., 2001). Various differences from the work of Milne et al. (2002) remain to be resolved, some of which may be due to the fact that the E2 chain of the bovine PDH complex has two lipoyl domains and that the E3 subunits are bound to the E2 core by virtue of the E3-binding protein, which has no counterpart in the B. stearothermophilus assembly (see above). Even so, it is reasonable to infer that the basic concept of an inner E2 core, an outer shell of E1 and perhaps E3 subunits, and the reactions involving the lipoyl domain taking place in the confined annular space thus created, will be common to all 2-oxo acid dehydrogenase complexes (Milne et al., 2002).

Constraining the lipoyl domains in this way means that their local concentration is high. It is difficult to estimate it with certainty without knowing more about the constraints imposed by the linker regions that tether them to the PSBDs, but it must be of the order of 1 mM and certainly well above the Km for the reductive acylation catalysed by E1, estimated to be something like 20 µM (Graham et al., 1989). Since the true substrate for E1 is the lipoyl domain, which will diffuse much more slowly than free lipoic acid, the confinement of the lipoyl domains to the annular shell is obviously highly advantageous. The fall in rate by a factor of 100 or more that accompanies the detachment of the lipoyl domains from E2 (Perham, 2000; Berg et al., 1998) is readily explained.

A remarkable feature of the model of the B. stearothermophilus E1E2 complex (Milne et al., 2002) is that the lipoyl domain of any one E2 chain is capable of reaching all three E2 active sites in the trimer of acetyltransferase domains directly below the relevant PSBD (within ~105 Å) and a further three neighbouring E2 sites (within ~140 Å of the same PSBD). Vice versa, one of the active sites of each of six E1 heterotetramers is within ~120 Å of the same PSBD, and an additional three E1 active sites are located within a distance of ~140 Å, making a total of nine potentially accessible to any one lipoyl domain. This may be a conservative estimate because it omits the extra range afforded by the length of the lipoyl domain itself and the possibility of alternative fits to the electron density map. Such a structure clearly obviates any need for a strict stoichiometric relation between the numbers of E1, E2 and E3 catalytic sites, as observed (Perham, 1991; de Kok et al., 1998; Perham, 2000).
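The estimate of a local lipoyl domain concentration of the order of 1 mM can be reproduced roughly by dividing the number of lipoyl domains by the volume of the annular shell. In the sketch below the 9 nm gap, the 60 lipoyl domains and the Km of about 20 µM come from the text, whereas the 12 nm radius taken for the E2 inner core is an assumption made purely for illustration, and the constraints imposed by the linkers are ignored.

```python
import math

AVOGADRO = 6.022e23        # mol^-1
LITRES_PER_NM3 = 1e-24     # 1 nm^3 = 1e-24 L


def shell_volume_litres(r_inner_nm, r_outer_nm):
    """Volume of a spherical shell between two radii, in litres."""
    volume_nm3 = 4.0 / 3.0 * math.pi * (r_outer_nm ** 3 - r_inner_nm ** 3)
    return volume_nm3 * LITRES_PER_NM3


def local_concentration_molar(n_molecules, r_inner_nm, r_outer_nm):
    """Molar concentration of n molecules confined to the shell."""
    moles = n_molecules / AVOGADRO
    return moles / shell_volume_litres(r_inner_nm, r_outer_nm)


N_LIPOYL = 60        # one lipoyl domain per E2 chain, 60 chains (from the text)
GAP_NM = 9.0         # width of the annular space (from the text)
KM_MOLAR = 20e-6     # approximate Km for reductive acylation (from the text)
R_CORE_NM = 12.0     # ASSUMED outer radius of the E2 inner core (hypothetical)

conc = local_concentration_molar(N_LIPOYL, R_CORE_NM, R_CORE_NM + GAP_NM)
print(f"local concentration ~ {conc * 1e3:.1f} mM")
print(f"concentration / Km  ~ {conc / KM_MOLAR:.0f}")
```

With this assumed core radius the result is a few millimolar, in line with the order-of-magnitude figure given above and some two orders of magnitude above the Km.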

It also puts structural flesh on the bones of a mechanistic paradox. A single E1 molecule on an E2 core can catalyse the reductive acetylation of many, if not all, of the lipoyl domains in the E2 core (Bates et al., 1977; Collins & Reed, 1997; Packman et al., 1983). In the same vein, many of the lipoyl domains, identified as an essential catalytic intermediate, can be excised from the E2 core without a corresponding decrease in overall complex activity (Reed & Hackert, 1990; Perham, 1991). The answer, curious at first sight, must be that there is a superfluity of lipoyl domains, but this is put to good effect. Given that E1 catalyses the rate-limiting reaction in the complex (Danson et al., 1978; Cate et al., 1980), optimal activity will be achieved when there is an excess of E1 over E3, as exemplified by the B. stearothermophilus PDH complex, for which a stoichiometry of ~42-48 E1 and ~6-12 E3 for each E2 icosahedral core offers maximal activity. The multiplicity of interactions of a single lipoyl domain with different E1 and E3 molecules on the surface of the same E2 core provides a fitting structural basis, of particular interest in that what makes it possible is a mechanical rather than a chemical property of the protein complex. In short, we have a nanomachine.
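The quoted stoichiometry can be checked against the number of binding sites. Assuming, as described earlier, one PSBD per E2 chain with E1 and E3 binding in a mutually exclusive way, the sketch below enumerates the combinations within the quoted ranges; the variable names are illustrative.

```python
PSBD_SITES = 60              # one PSBD per E2 chain in the icosahedral core

E1_RANGE = range(42, 49)     # ~42-48 E1 per core (from the text)
E3_RANGE = range(6, 13)      # ~6-12 E3 per core (from the text)


def feasible(n_e1, n_e3, sites=PSBD_SITES):
    """Binding is mutually exclusive, so E1 + E3 cannot exceed the PSBD count."""
    return n_e1 + n_e3 <= sites


combos = [(e1, e3) for e1 in E1_RANGE for e3 in E3_RANGE if feasible(e1, e3)]
saturated = [(e1, e3) for e1, e3 in combos if e1 + e3 == PSBD_SITES]
ratios = [e1 / e3 for e1, e3 in combos]

print(f"feasible combinations: {len(combos)}")
print(f"combinations filling every PSBD: {saturated}")
print(f"E1:E3 ratio from {min(ratios):.1f} to {max(ratios):.1f}")
```

Every combination in the quoted ranges fits on the 60 available sites, and the upper ends of the two ranges together fill them exactly.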

GLYCINE CLEAVAGE AND THE HOT POTATO HYPOTHESIS

The glycine cleavage (glycine decarboxylase) system comprises four proteins, with more than superficial similarity to the components of a 2-oxo acid dehydrogenase complex: the dimeric P-protein, a pyridoxal phosphate-dependent decarboxylase; the monomeric H-protein with a lipoyl-lysine residue that becomes reductively aminomethylated; the T-protein, another monomer, requiring tetrahydrofolate as cofactor to accept the one-carbon methylene fragment from the charged H-protein with the simultaneous release of NH₃; and the dimeric L-protein, dihydrolipoyl dehydrogenase again, to reoxidize the dihydrolipoyl group. Together they catalyse a coordinated set of reactions (Fig. 6), but their interactions appear to be weak and transient. They are widely distributed in bacteria and the mitochondria of plants and mammals (Fujiwara et al., 1992; Douce et al., 1994).

The lipoylated pea leaf H-protein (131 residues) is essentially an 80-residue lipoyl domain with 20-residue extensions at both ends, but incorporating a new short helix (residues 29-35) between β-strands 1 and 2, no β-strand 7, and an additional C-terminal helix (residues 121-131). The lipoyl-lysine swinging arm on the surface of the protein, however, is localized, albeit in a somewhat flexible conformation, and is not apparently free to move (Pares et al., 1994), as shown in Fig. 7. Even more intriguingly, in the reductively aminomethylated form, the swinging arm pivots into a nearby cleft on the surface (for which there is no counterpart on a lipoyl domain) and becomes more tightly immobilized in the process (Fig. 7). The unstable aminomethylated dihydrolipoyl group is left buried in a predominantly hydrophobic pocket (Cohen-Addad et al., 1995; Guilhaudis et al., 1999) which renders it stable until the approach of the T-protein causes a rapid unloading (Guilhaudis et al., 1999), thereby making it a clear-cut example of the ‘hot potato’ hypothesis (Perham, 1975). The T-protein would appear to prise the swinging arm from its resting place by a protein-protein interaction. The C-terminal helix of the H-protein (Fig. 7) may act as a lever on the surface pocket, given its rigidity in the aminomethylated domain and greater flexibility in the unloaded domain (Guilhaudis et al., 1999). The mating surface on the T-protein appears to be its N-terminal region, interacting with the H-protein to induce the necessary conformational change (Okamura-Ikeda et al., 1999).

Fig. 6. Reaction mechanism of the glycine cleavage system. H, the lipoylated H protein; SHMT, serine hydroxymethyltransferase. (After S. Pares.)

Fig. 7. Structures of the H protein of the pea leaf glycine cleavage system. A, the oxidized lipoylated protein; B, the reductively aminomethylated protein. The helix is at the C-terminal end of the protein. Note the movement of the lipoyl-lysine residue shown in ball-and-stick mode between forms A and B. (After Cohen-Addad et al., 1995.)

BIOTIN AS A SWINGING ARM IN CARBOXYLASES

Biotinyl-lysine acts as the swinging arm in three different types of enzyme reaction: ATP-dependent carboxylations of acceptors such as acetyl-CoA or pyruvate; sodium transport coupled to the decarboxylation of β-keto acids and their thioesters; and the inter-conversion of oxaloacetate and propionyl-CoA into pyruvate and methylmalonyl-CoA catalysed by a specific transcarboxylase (Knowles, 1989; Jitrapakdee & Wallace, 1999). In acetyl-CoA carboxylase, one of the best studied of the biotin-dependent enzymes, the biotinyl-lysine residue is located in a biotin carboxyl carrier protein (BCCP) and the carboxylation of the biotin moiety is catalysed by a separate enzyme, biotin carboxylase, in an ATP-dependent reaction:

ATP + HCO₃⁻ + biotin = N1'-carboxy-biotin + ADP + Pᵢ + H⁺

The carboxylation of acetyl-CoA is then catalysed at a second active site in a further enzyme, acetyl-CoA carboxyltransferase:

N1'-carboxy-biotin + acetyl-CoA = malonyl-CoA + biotin
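Adding the two half-reactions shows that biotin and its carboxylated form cancel, so that the swinging arm acts catalytically. The sketch below does this bookkeeping with species names as plain strings; the `net_reaction` helper is illustrative and does not check mass or charge balance.

```python
from collections import Counter


def net_reaction(*steps):
    """Sum (reactants, products) pairs and cancel species found on both sides."""
    left, right = Counter(), Counter()
    for reactants, products in steps:
        left.update(reactants)
        right.update(products)
    common = left & right          # species appearing on both sides
    return left - common, right - common


# Half-reactions as written in the text.
carboxylation = (
    ["ATP", "HCO3-", "biotin"],
    ["N1'-carboxy-biotin", "ADP", "Pi", "H+"],
)
carboxyl_transfer = (
    ["N1'-carboxy-biotin", "acetyl-CoA"],
    ["malonyl-CoA", "biotin"],
)

lhs, rhs = net_reaction(carboxylation, carboxyl_transfer)
print(" + ".join(sorted(lhs)), "=", " + ".join(sorted(rhs)))
```

The net result is the ATP-dependent carboxylation of acetyl-CoA to malonyl-CoA, with neither form of biotin appearing in the overall equation.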

The biotin-dependent enzymes exhibit several different quaternary structures. For example, in E. coli acetyl-CoA carboxylase, the BCCP, carboxylase and carboxyltransferase associate non-covalently in the intact enzyme, but the pyruvate carboxylases commonly appear as homotetramers, with each polypeptide chain housing (from the N-terminus) the carboxylase, carboxyltransferase and BCCP (Jitrapakdee & Wallace, 1999). The biotinyl-lysine is thought to visit successive active sites, first for carboxylation and then for transcarboxylation.

The BCCP of E. coli acetyl-CoA carboxylase is a homodimer of two polypeptide chains of 156 amino acids. The biotinylated part forms an independently folded C-terminal domain and the structure of this fragment has been determined by means of X-ray crystallography (Athappilly & Hendrickson, 1995) and NMR spectroscopy (Roberts et al., 1999). It turns out to be highly similar to that of a lipoyl domain, comprising two four-stranded β-sheets with the biotinyl-lysine in a tight type I β-turn (Fig. 8). As with the H-protein, the swinging arm surprisingly is immobilized, resting in a ‘thumb-like’ protrusion comprising residues 94-101 between β-strands 2 and 3 that is not observed in lipoyl domains (Fig. 8). The fact that the N1' atom of the biotin, the site of carboxylation in the intact acetyl-CoA carboxylase, is oriented towards the protein means that the biotin ring must reorientate to become carboxylated. In contrast, in the transcarboxylase of Propionibacterium shermanii, the BCCP is part of a multi-subunit enzyme of 1.2 MDa. In addition to twelve 1.3S monomers, there are six 12S subunits in a cylindrical central unit and twelve 5S subunits in six outer dimers. The 123-residue 1.3S subunit houses the biotinyl-lysine residue (Lys89) and binds the central and outer subunits together (Shenoy et al., 1993). Its biotinyl domain (Ala52-Gly123) has a fold that is essentially the same as that of E. coli BCCP (Reddy et al., 1998) but there is no evidence of any significant interaction between the biotinyl group and the protein. This is consistent with the fact that the protruding thumb region associated with immobilizing the biotinyl-lysine in the biotinyl domain of E. coli BCCP (Fig. 8) is lacking.

Fig. 8. Structure of the biotinyl domain of the BCCP from E. coli acetyl-CoA carboxylase. A, holodomain, determined by means of X-ray crystallography and NMR spectroscopy; B, structure of holodomain with β-strands numbered from the N-terminus; C, apodomain, determined by means of NMR spectroscopy, with the position of the biotin in the holodomain still indicated. (After Roberts et al., 1999.)

Immobilization of the biotinyl-lysine residue in the BCCP domain may be an unusual event, associated perhaps with the particular reaction catalysed by acetyl-CoA carboxylase. Just as free lipoic acid is not a good substrate for the E1 component of 2-oxo acid dehydrogenase complexes, free biotin will serve as a substrate for only two biotin-dependent enzymes, acetyl-CoA carboxylase and β-methylcrotonyl-CoA decarboxylase (Knowles, 1989; Sun et al., 1997; and references therein). Once again there is an issue here perhaps of substrate channelling dictated by the biotinyl domain; and in this regard it may be significant that acetyl-CoA carboxylase is the only biotinylated enzyme in E. coli and that a requirement for channelling would be otiose. NMR spectroscopy indicates that amino acid residues 70-79 preceding the biotinyl domain in E. coli BCCP are highly flexible, suggesting that they form part of a tether that allows the biotinyl domain to move within the intact acetyl-CoA carboxylase (Roberts et al., 1999). Similar flexibility is also present in the N-terminal 50 residues of P. shermanii 1.3S protein, which likewise may form part of a flexible linker region anchoring the biotinyl domain to the rest of the assembled transcarboxylase (Reddy et al., 1998). Thus, it is reasonable to speculate that the biotin-dependent enzymes resemble the 2-oxo acid dehydrogenase complexes, not just in the need for a swinging arm but in the necessity of substantial movements of the relevant domain in active site coupling.

PHOSPHOPANTETHEINYL GROUPS AS SWINGING ARMS IN MULTIFUNCTIONAL SYNTHASES The phosphopantetheinyl group is found attached in phosphodiester link- age to serine residues in a variety of multifunctional enzymes that cata- lyse the biosynthesis of complex products, such as fatty acids (Schweizer et al., 1970), polyketides (Hopwood, 1997) and non-ribosomal peptides (Kleinkauf & von Dohren, 1996). The phosphopanthetheinyl group holds the biosynthetic intermediate in thioester linkage while it is modified and extended in some specific way by the other enzymes in the relevant sys- tem. In E. coli the phosphopantetheinyl-serine residue is part of a small (77-residue) acyl carrier protein (ACP) (Prescott & Vagelos, 1972) and fatty acid synthesis is carried out by a series of separate enzymes (type I1 fatty acid synthase, or FAS). In many other organisms the ACP, like the BCCP of some biotin-dependent enzymes, forms part of a multifunc- tional fatty acid synthase (type I FAS). The type I FASs vary widely in quaternary structure (McCarthy & Hardie, 1984); the yeast FAS is an a& heterododecamer, with three of the 7 required enzyme activities in one chain, and four in the other, whereas the homodimeric human FAS has all 7 enzymes fused into each polypeptide chain, with the ACP as the penultimate domain (Brink et al., 2002; Smith et al., 2003; and refer- ences therein). The ACP domain recurs in the polyketide and non- ribosomal peptide synthases, some of which are collections of separate


Fig. 9. Structure of the ACP from the E. coli FAS. The dark helix is at the C-terminal end and the phosphopantetheinyl swinging arm is in phosphodiester linkage with a serine residue in the loop at the bottom of the structure. (Based on Crump et al., 1997.)

enzymes that function successively (type II), and others of which (type I) reach gothic complexity in the numbers and arrangements of the various catalytic activities in multifunctional polypeptide chains (Carreras et al., 1997). For example, five multifunctional polypeptides, 10 biosynthetic modules and 11 ACP-like domains are required for the biosynthesis of rifamycin (Tang et al., 1998).

The structure of the holo-ACP from the E. coli (type II) FAS is quite different from that of a lipoyl or biotinyl domain, consisting primarily of three helices surrounding a hydrophobic core (Holack et al., 1988). The ‘hollow’ core forms a presumptive site to accommodate the growing fatty acid chain in thioester linkage with the phosphopantetheinyl group, as synthesis proceeds (Fig. 9). Like the lipoyl and biotinyl groups, the phosphopantetheinyl group is attached to a loop region of its protein domain. The 86-residue type II apo-ACP from the actinorhodin PKS of S. coelicolor A3(2) is very similar, in keeping with the fact that it shares a sequence identity of 47% with the E. coli FAS ACP (Crump et al., 1997). It differs in that it contains several buried hydrophilic groups, notably Arg72 and Asn79. They may play some part in stabilizing the growing polyketide chain, by creating a more polar environment that includes suitable hydrogen bond donors, compared with the hydrophobic core region of the FAS ACP (Crump et al., 1997). In this sense, it is another example of the ‘hot potato hypothesis’.


Given these similarities in ACP structure across a wide range of enzymes, it is perhaps not surprising that the ACP domain from the multifunctional rat fatty acid synthase (type I) can replace the ACP in the actinorhodin polyketide synthase (type II) of Streptomyces coelicolor A3(2) (Tropf et al., 1998). However, in many other instances, ACPs fail to function when recruited into heterologous systems (Shen & Hutchinson, 1993). Substrate channelling dictated by the protein domain can further be inferred from the fact that the FAS ACP of S. coelicolor A3(2) is a very poor substitute for its counterpart in the actinorhodin polyketide synthase in the same cell (Revill et al., 1996). The importance of ACP domain recognition by its partner enzymes is buttressed by the observation that the rhizobial protein NodF cannot be substituted by the E. coli FAS ACP despite a 25% sequence identity, but that a hybrid ACP, comprising residues 1-33 of E. coli ACP fused to residues 43-93 of NodF, is functional (Ritsema et al., 1998). This would imply that the necessary signals are to be found in the C-terminal region of the protein in this instance.
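The residue ranges quoted for the hybrid ACP can be made concrete with a small sketch. It only illustrates the bookkeeping of 1-based, inclusive residue numbering; the two sequences are placeholder strings (hypothetical, not the real E. coli ACP or NodF sequences), the NodF length of 93 is assumed from the quoted range, and the helper name `residues` is likewise invented for this example.

```python
def residues(sequence: str, first: int, last: int) -> str:
    """Return residues first..last using 1-based, inclusive numbering."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("residue range lies outside the sequence")
    return sequence[first - 1:last]

# Placeholder sequences (hypothetical): only the lengths matter here.
ECOLI_ACP = "E" * 77   # the text gives 77 residues for E. coli ACP
NODF = "N" * 93        # assumed length, taken from the quoted range 43-93

# Hybrid described in the text: residues 1-33 of E. coli ACP
# fused to residues 43-93 of NodF.
hybrid = residues(ECOLI_ACP, 1, 33) + residues(NODF, 43, 93)

n_from_acp = hybrid.count("E")
n_from_nodf = hybrid.count("N")
print(len(hybrid), n_from_acp, n_from_nodf)
```

Under these assumptions the chimera carries 33 residues from the N-terminal part of the ACP and 51 from the C-terminal part of NodF, which is why function of the hybrid points to recognition signals in the C-terminal region.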

The ACP is the point of entry into the multistep reaction catalysed by the FAS or PKS, and thus is the logical point at which to exert any molecular recognition required for separation of potentially competing pathways. In this, as in other ways, there is a striking parallel with the lipoyl domains of 2-oxo acid dehydrogenase complexes.

CONCLUSIONS

Many features remain to be investigated further and put in a structural context. For example, there is some evidence that the lipoyl group may adopt preferred orientations on the surface of the lipoyl domain (Jones et al., 2000b) and that this may be accompanied by preferred trajectories of the lipoyl domains in their approach to E1 active sites (Chauhan et al., 2000). Whatever remains to be discovered, it is abundantly clear that the 2-oxo acid dehydrogenase complexes must now be viewed as multifunctional catalytic machines, which rely on the mechanical properties of the molecular assembly as much as the active site chemistry of the component enzymes. Enough has been learned of the biotin- and pantetheine-dependent enzymes to identify similar properties there too and propose


them as advantages acquired by multifunctional enzymes that rely on swinging arms to bring about multistep reactions inside the living cell.

REFERENCES

1. Aevarsson A, Seger K, Turley S, Sokatch JR, Hol WGJ. Crystal structure of 2-oxoisovalerate dehydrogenase and the architecture of 2-oxo acid dehydrogenase multienzyme complexes. Nature Struct Biol, 1999; 6:785-792.

2. Arjunan P, Nemeria N, Brunskill A, Chandrasekhar K, Sax M, Yan Y, Jordan F, Guest JR, Furey W. Structure of the pyruvate dehydrogenase multienzyme complex E1 component from Escherichia coli at 1.85 Å resolution. Biochemistry, 2002; 41:5213-5221.

3. Athappilly FK, Hendrickson WA. Structure of the biotinyl domain of acetyl-coenzyme A carboxylase determined by MAD phasing. Structure, 1995; 3:1407-1419.

4. Bates DL, Danson MJ, Hale G, Hooper EA, Perham RN. Self-assembly and catalytic activity of the pyruvate dehydrogenase multienzyme complex of Escherichia coli. Nature, 1977; 268:313-316.

5. Berg A, Westphal AH, Bosma HJ, de Kok A. Kinetics and specificity of reductive acylation of wild-type and mutated lipoyl domains of 2-oxo acid dehydrogenase complexes from Azotobacter vinelandii. Eur J Biochem, 1998; 252:45-50.

6. Brink J, Ludtke SJ, Yang C-Y, Gu Z-W, Wakil SJ, Chiu W. Quaternary structure of human fatty acid synthase by electron cryomicroscopy. Proc Natl Acad Sci USA, 2002; 99:139-143.

7. Carreras CW, Pieper R, Khosla C. The chemistry and biology of fatty acid, polyketide, and nonribosomal peptide biosynthesis. Topics Curr Chem, 1997; 188:85-126.

8. Cate RL, Roche TE, Davis LC. Rapid intersite transfer of acetyl groups and movement of pyruvate dehydrogenase component in the kidney pyruvate dehydrogenase complex. J Biol Chem, 1980; 225:7556-7562.

9. Chauhan HJ, Domingo GJ, Jung H-I, Perham RN. Sites of limited proteolysis in the pyruvate decarboxylase component of the pyruvate dehydrogenase multienzyme complex of Bacillus stearothermophilus and their role in catalysis. Eur J Biochem, 2000; 267:7158-7169.

10. Cohen-Addad C, Pares S, Sieker L, Neuburger M, Douce R. The lipoamide arm in the glycine decarboxylase complex is not freely swinging. Nature Struct Biol, 1995; 2:63-68.


11. Collins JH, Reed LJ. Acyl group and electron pair relay system: a network of interacting lipoyl moieties in the pyruvate and alpha-ketoglutarate complexes from Escherichia coli. Proc Natl Acad Sci USA, 1977; 74:4223-4227.

12. Crump MP, Crosby J, Dempsey CE, Parkinson JA, Murray M, Hopwood DA, Simpson TJ. Solution structure of the actinorhodin polyketide synthase acyl carrier protein from Streptomyces coelicolor A3(2). Biochemistry, 1997; 36:6000-6008.

13. Danson MJ, Fersht AR, Perham RN. Rapid intramolecular coupling of active sites in the pyruvate dehydrogenase complex of Escherichia coli: mechanism for rate enhancement in a multimeric structure. Proc Natl Acad Sci USA, 1978; 75:5386-5390.

14. Dardel F, Davis AL, Laue ED, Perham RN. The three-dimensional structure of the lipoyl domain from Bacillus stearothermophilus pyruvate dehydrogenase multienzyme complex. J Mol Biol, 1993; 229:1037-1048.

15. de Kok A, Hengeveld AF, Martin A, Westphal AH. The pyruvate dehydrogenase multi-enzyme complex from Gram-negative bacteria. Biochim Biophys Acta, 1998; 1385:353-366.

16. Domingo GJ, Chauhan H, Lessard IAD, Fuller C, Perham RN. Self-assembly and catalytic activity of the pyruvate dehydrogenase multienzyme complex from Bacillus stearothermophilus. Eur J Biochem, 1999; 266:1136-1146.

17. Douce R, Bourguignon J, Macherel D, Neuburger M. The glycine decarboxylase system in higher plant mitochondria: structure, function and biogenesis. Biochem Soc Trans, 1994; 22:184-188.

18. Fujiwara K, Okamura-Ikeda K, Motokawa Y. Expression of mature bovine H-protein of the glycine cleavage system in Escherichia coli and in vitro lipoylation of the apoform. J Biol Chem, 1992; 267:20001-20016.

19. Graham LD, Packman LC, Perham RN. Kinetics and specificity of reductive acylation of lipoyl domains from 2-oxo acid dehydrogenase multienzyme complexes. Biochemistry, 1989; 28:1574-1581.

20. Guilhaudis L, Simorre J-P, Blackledge M, Neuburger M, Bourguignon J, Douce R, Marion D, Gans P. Investigation of the local structure and dynamics of the H subunit of the mitochondrial glycine decarboxylase using heteronuclear NMR spectroscopy. Biochemistry, 1999; 38:8334-8346.

21. Harris RA, Bowker-Kinley MM, Wu PF, Jeng JJ, Popov KM. Dihydrolipoamide dehydrogenase-binding protein of the human pyruvate dehydrogenase complex: DNA-derived amino acid sequence, expression, and reconstitution of the pyruvate dehydrogenase complex. J Biol Chem, 1997; 272:19746-19751.

22. Hipps DS, Packman LC, Allen MD, Fuller C, Sakaguchi K, Appella E, Perham RN. The peripheral subunit-binding domain of the dihydrolipoyl acetyltransferase component of the pyruvate dehydrogenase complex of Bacillus stearothermophilus: preparation and characterization of its binding to the dihydrolipoyl dehydrogenase component. Biochem J, 1994; 297:137-143.

23. Holack TA, Nilges M, Prestegard JH, Gronenborn AM, Clore GM. Three-dimensional structure of acyl carrier protein in solution determined by nuclear magnetic resonance and the combined use of dynamical simulated annealing and distance geometry. Eur J Biochem, 1988; 175:9-15.

24. Hopwood DA. Genetic contributions to understanding polyketide synthases. Chem Rev, 1997; 97:2465-2497.

25. Howard MJ, Chauhan HJ, Domingo GJ, Fuller C, Perham RN. Protein-protein interaction revealed by NMR T2 relaxation experiments. The lipoyl domain and E1 component of the pyruvate dehydrogenase multienzyme complex of Bacillus stearothermophilus. J Mol Biol, 2000; 295:1023-1047.

26. Izard T, Aevarsson A, Allen MD, Westphal AH, Perham RN, de Kok A, Hol WGJ. Principles of quasi-equivalence and Euclidean geometry govern the assembly of cubic and dodecahedral cores of pyruvate dehydrogenase complexes. Proc Natl Acad Sci USA, 1999; 96:1240-1245.

27. Jitrapakdee S, Wallace JC. Structure, function and regulation of pyruvate carboxylase. Biochem J, 1999; 340:1-16.

28. Jones DD, Horne JH, Reche PA, Perham RN. Structural determinants of post-translational modification and catalytic specificity for the lipoyl domains of the pyruvate dehydrogenase complex of Escherichia coli. J Mol Biol, 2000a; 295:289-306.

29. Jones DD, Stott KM, Howard MJ, Perham RN. Restricted motion of the lipoyl-lysine swinging arm in the pyruvate dehydrogenase complex of Escherichia coli. Biochemistry, 2000b; 39:8448-8459.

30. Jones DD, Stott KM, Reche PA, Perham RN. Recognition of the lipoyl domain is the ultimate determinant of substrate channelling in the pyruvate dehydrogenase multienzyme complex. J Mol Biol, 2001; 305:49-60.

31. Kalia YN, Brocklehurst SM, Hipps DS, Appella E, Sakaguchi K, Perham RN. The high resolution structure of the peripheral subunit-binding domain of dihydrolipoamide acetyltransferase from the pyruvate dehydrogenase multienzyme complex of Bacillus stearothermophilus. J Mol Biol, 1993; 230:323-341.

32. Kleinkauf H, von Döhren H. A nonribosomal system of peptide biosynthesis. Eur J Biochem, 1996; 236:335-351.

33. Knowles JR. The mechanism of biotin-dependent enzymes. Annu Rev Biochem, 1989; 58:195-221.

34. Lessard IAD, Perham RN. Interaction of component enzymes with the peripheral subunit-binding domain of the pyruvate dehydrogenase multienzyme complex of Bacillus stearothermophilus: stoicheiometry and specificity in self-assembly. Biochem J, 1995; 306:727-733.

35. Lessard IAD, Fuller C, Perham RN. Competitive interaction of component enzymes with the peripheral subunit-binding domain of the pyruvate dehydrogenase multienzyme complex of Bacillus stearothermophilus: Kinetic analysis using surface plasmon resonance detection. Biochemistry, 1996; 35:16863-16870.

36. Lynen F. The role of biotin-dependent carboxylations in biosynthetic reactions. Biochem J, 1967; 102:381-400.

37. Mande SS, Sarfaty S, Allen MD, Perham RN, Hol WGJ. Protein-protein interactions in the pyruvate dehydrogenase multienzyme complex: Dihydrolipoamide dehydrogenase complexed with the binding domain of dihydrolipoamide acetyltransferase. Structure, 1996; 4:277-286.

38. Mattevi A, Obmolova G, Schulze E, Kalk KH, Westphal A, de Kok A, Hol WGJ. Atomic structure of the cubic core of the pyruvate dehydrogenase multienzyme complex. Science, 1992; 255:1544-1550.

39. McCarthy AD, Hardie DG. Fatty acid synthase: an example of protein evolution by gene fusion. Trends Biochem Sci, 1984; 9:60-63.

40. McCartney RG, Sanderson SJ, Lindsay JG. Refolding and reconstitution studies on the transacetylase protein X (E2/X) subcomplex of the mammalian pyruvate dehydrogenase complex: Evidence for specific binding of the dihydrolipoamide dehydrogenase component to sites on reassembled E2. Biochemistry, 1997; 36:6819-6826.

41. Milne JLS, Shi D, Rosenthal PB, Sunshine JS, Domingo GJ, Wu X, Brooks BR, Perham RN, Henderson R, Subramaniam S. Molecular architecture and mechanism of an icosahedral pyruvate dehydrogenase complex: a multifunctional catalytic machine. EMBO J, 2002; 21:1-12.

42. Okamura-Ikeda K, Fujiwara K, Motokawa Y. The amino-terminal region of the Escherichia coli T-protein of the glycine cleavage system is essential for proper association with H-protein. Eur J Biochem, 1999; 264:446-452.


43. Packman LC, Stanley CJ, Perham RN. Temperature-dependence of intramolecular coupling of active sites in pyruvate dehydrogenase multienzyme complexes. Biochem J, 1983; 213:331-338.

44. Pares S, Cohen-Addad C, Sieker L, Neuburger M, Douce R. X-ray structure determination at 2.6-Å resolution of a lipoate-containing protein: the H-protein of the glycine decarboxylase complex from pea leaves. Proc Natl Acad Sci USA, 1994; 91:4850-4853.

45. Patel MS, Roche TE, Harris RA (eds). Alpha-Keto Acid Dehydrogenase Complexes, Birkhäuser-Verlag, Basel, 1996.

46. Perham RN. Self-assembly of biological macromolecules. Phil Trans Roy Soc Lond B, 1975; 272:123-136.

47. Perham RN. Domains, motifs and linkers in 2-oxo acid dehydrogenase multienzyme complexes: a paradigm in the design of a multifunctional protein. Biochemistry, 1991; 30:8501-8512.

48. Perham RN. Swinging arms and swinging domains in multifunctional enzymes: catalytic machines for multistep reactions. Annu Rev Biochem, 2000; 69:963-1006.

49. Prescott DJ, Vagelos PR. Acyl carrier protein. Adv Enzymol, 1972; 36:269-311.

50. Reddy DV, Rothemund S, Shenoy BC, Carey PR, Sonnichsen FD. Structural characterization of the entire 1.3S subunit of transcarboxylase from Propionibacterium shermanii. Protein Sci, 1998; 7:2156-2163.

51. Reed LJ. Multienzyme complexes. Accounts Chem Res, 1974; 7:40-46.

52. Reed LJ. From lipoic acid to multi-enzyme complexes. Protein Sci, 1998; 7:220-224.

53. Reed LJ, Hackert ML. Structure-function relationships in dihydrolipoamide acyltransferase. J Biol Chem, 1990; 265:8971-8974.

54. Revill WP, Bibb MJ, Hopwood DA. Relationships between fatty acid and polyketide synthases from Streptomyces coelicolor A3(2): Characterization of the fatty acid synthase acyl carrier protein. J Bacteriol, 1996; 178:5660-5667.

55. Ritsema T, Gehring AM, Stuitje AR, van der Drift KMGM, Dandal I, Lambalot RH, Walsh CT, Thomas-Oates JE, Lugtenberg BJJ, Spaink HP. Functional analysis of an interspecies chimera of acyl carrier proteins indicates a specialized domain for protein recognition. Mol Gen Genet, 1998; 287:641-648.

56. Roberts EM, Shu N, Howard MJ, Broadhurst RW, Chapman-Smith A, Wallace JC, Morris T, Cronan JE Jr, Perham RN. Solution structures of apo and holo biotinyl domains from acetyl-coenzyme A carboxylase of Escherichia coli determined by triple-resonance nuclear magnetic resonance spectroscopy. Biochemistry, 1999; 38:5045-5053.

57. Russell GC, Machado RS, Guest JR. Overproduction of the pyruvate dehydrogenase multienzyme complex of Escherichia coli and site-directed substitutions in the E1p and E2p subunits. Biochem J, 1992; 287:611-619.

58. Schweizer E, Willecke K, Winnewisser W, Lynen F. The role of phosphopantetheine in the yeast fatty acid synthetase complex. Vitam Horm, 1970; 28:329-43.

59. Shen B, Hutchinson CR. Enzymatic synthesis of a bacterial polyketide from acetyl and malonyl coenzyme A. Science, 1993; 262:1535-1540.

60. Shenoy BC, Kumar GK, Samols D. Dissection of the biotinyl subunit of transcarboxylase into regions essential for activity and assembly. J Biol Chem, 1993; 268:2232-2238.

61. Smith S, Witkowski A, Joshi AK. Structural and functional organization of the animal fatty acid synthase. Prog Lipid Res, 2003; 4:289-317.

62. Sun JD, Ke JS, Johnson JL, Nikolau BJ, Wurtele ES. Biochemical and molecular biological characterization of CAC2, the Arabidopsis thaliana gene coding for the biotin carboxylase subunit of the plastidic acetyl-coenzyme A carboxylase. Plant Physiol, 1997; 115:1371-1383.

63. Tang L, Yoon YJ, Choi CY, Hutchinson CR. Characterization of the enzymatic domains in the modular polyketide synthase involved in rifamycin B biosynthesis by Amycolatopsis mediterranei. Gene, 1998; 216:255-265.

64. Tropf S, Revill WP, Bibb MJ, Hopwood DA, Schweizer M. Heterologously expressed acyl carrier protein domain of rat fatty acid synthase functions in Escherichia coli fatty acid synthase and Streptomyces coelicolor polyketide synthase systems. Chem Biol, 1998; 5:135-146.

65. Wallis NG, Allen MD, Broadhurst RW, Lessard IAD, Perham RN. Recognition of a surface loop of the lipoyl domain underlies substrate channeling in the pyruvate dehydrogenase multienzyme complex. J Mol Biol, 1996; 263:463-474.

66. Zhou H, McCarthy DB, O’Connor CM, Reed LJ, Stoops JK. The remarkable structural and functional organization of the eukaryotic pyruvate dehydrogenase complexes. Proc Natl Acad Sci USA, 2001; 98:14802-14807.


Chapter 10

METAMORPHOSIS OF AN ENZYME

Rudolf Ladenstein*, Winfried Meining, Xiaofeng Zhang, Markus Fischer and Adelbert Bacher

The protein shells of the bifunctional lumazine/riboflavin synthase complex found in bacteria, archaea and plants show some similarity to the assembly of small spherical viruses. Sixty lumazine synthase subunits form a T = 1 icosahedral capsid, which, instead of nucleic acids, contains a trimer of riboflavin synthase in the central core. Lumazine synthases from fungi, yeasts and some bacteria, however, exist only in pentameric form. Capsid formation in icosahedral lumazine synthases is dependent on the presence of certain substrate-analogous ligands, on pH and on phosphate concentration. The experimental background from X-ray crystallography, X-ray small angle scattering and electron microscopy will be discussed. Different active assemblies of the enzyme are observed in vivo and in vitro. There is experimental evidence for the formation of large capsids, obtained spontaneously or after certain mutations to the sequence of the lumazine synthase subunit. Those presumably metastable T = 3 capsids can be reassembled into T = 1 capsids by ligand-driven reassembly in vitro. The active site of lumazine synthase is relatively resilient to point mutations. One lethal mutation to the binding site for the phosphate-substrate, however, has a strong influence on both capsid stability and enzymatic activity.

Keywords: lumazine synthase, riboflavin synthase, protein dissociation and association, vitamin biosynthesis, X-ray crystallography

*To whom correspondence should be addressed; E-mail: [email protected]

Karolinska Institute, NOVUM, Department of Biosciences, Centre for Structural Biochemistry, S-14157 Huddinge, Sweden

Södertörns Högskola, S-14157, Sweden

Lehrstuhl für Organische Chemie und Biochemie, Technische Universität München, Lichtenbergstr. 4, D-85747 Garching, Federal Republic of Germany


... 1980 AT TU MUNICH ...

A surprisingly large enzyme complex was isolated and purified from the cell contents of Bacillus subtilis in 1980 at the Technical University of Munich (Mailander & Bacher, 1976; Bacher et al., 1980a). The complex, which was initially called “heavy riboflavin synthase”, was a 10⁶ Dalton protein (26 S) that consisted of two types of subunits, designated α (M = 23500 Da) and β (M = 16000 Da). The enzyme was found to catalyse two consecutive reactions of the riboflavin biosynthesis pathway, shown in Fig 1 (Nielsen et al., 1984).

Fig. 1. Reaction catalyzed by the bifunctional lumazine synthase (A)/riboflavin synthase (B) complex.


Fig. 2. Negative staining electron micrograph of “Heavy Riboflavin Synthase”

At this time, the second substrate (2) was still unknown, but a four-carbon compound was suggested (Le Van et al., 1985). The molecular structure of “heavy riboflavin synthase” showed a number of unexpected features. The subunit stoichiometry was striking, with 3 α subunits and 60 β subunits. Electron micrographs obtained by negative staining showed approximately spherical particles with a diameter of 150 Å, shown in Fig 2 (Bacher, 1980b). Binding experiments with antibodies raised against α subunits showed that the immunological determinants of the α subunits were inaccessible in the native enzyme complex (Bacher, 1986). On the basis of these preliminary results, it was suggested that the quaternary structure of the bifunctional enzyme complex is a spherical capsid of 60 β subunits with icosahedral symmetry, which contains a trimer of α subunits in the central cavity (Bacher, 1980b).
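As a quick consistency check, the quoted stoichiometry can be combined with the subunit masses given earlier in the chapter (23500 Da for the α subunit, 16000 Da for the β subunit). The sketch below is only this arithmetic; the function and constant names are invented for the illustration.

```python
ALPHA_MASS_DA = 23_500   # alpha subunit mass quoted in the text
BETA_MASS_DA = 16_000    # beta subunit mass quoted in the text

def complex_mass_da(n_alpha: int, n_beta: int) -> int:
    """Sum of subunit masses for a given alpha/beta stoichiometry."""
    return n_alpha * ALPHA_MASS_DA + n_beta * BETA_MASS_DA

# Stoichiometry reported for "heavy riboflavin synthase": 3 alpha + 60 beta.
native_mass = complex_mass_da(3, 60)
beta_shell_fraction = 60 * BETA_MASS_DA / native_mass
print(native_mass, round(beta_shell_fraction, 3))
```

The sum comes to 1,030,500 Da, in line with the roughly one-megadalton mass quoted for the native complex, and the 60 β subunits account for well over 90% of it.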

With this assumption a vast field of ideas was opened, which were extremely inspiring and needed to be verified by structure-analytical techniques. At almost the same time, novel and powerful crystallographic methods were developed that made it possible to use the redundancy of structure information in the icosahedral tomato bushy stunt virus for phase determination (Harrison, 1978). The successful crystallization of the bifunctional enzyme complex (Ladenstein et al., 1983), today designated as the lumazine synthase (LS)/riboflavin synthase (RS) complex,


initiated an exciting scientific adventure into structural biology and, no less important, long-standing personal relations between scientists from Germany, Sweden, Bulgaria, the United States and China.

DEALING WITH ICOSAHEDRAL PARTICLES

In the absence of specific ligands (substrate analogs of 1 and 3) the enzyme complex was stable only in the presence of phosphate ions in a narrow pH range close to neutrality. Dissociation occurred between pH 7 and 8 and led to the formation of α subunit trimers. The β subunits reaggregated to form a polydisperse mixture of large oligomers with the shape of hollow or massive spheres and molecular weights, determined by sedimentation analysis in the analytical ultracentrifuge, of 3 × 10⁶ Daltons (26 S) and 6 × 10⁶ Daltons (48 S) (Bacher, 1980a,b).

The native and the reaggregated 26 S and 48 S particles were studied by X-ray small angle scattering on a synchrotron X-ray source (Ladenstein et al., 1986, see Fig 3 for a typical scattering curve). The scattering curves of the 26 S particles had a characteristic appearance and were well interpretable in terms of a hollow sphere model with a ratio of inner to outer radius of 0.3:1. For the native 26 S complex a diameter of 160 Å was estimated from the scattering curves, in good

Fig. 3. Typical scattering curve of the 26 S (T = 1 icosahedral) particles of lumazine synthase from B. subtilis derived from X-ray small angle scattering.


Fig. 4. Electron micrographs of freeze-etched crystals of heavy riboflavin synthase. (a) ab plane; (b) ac plane; (c) silver decoration of the ab plane of a B. subtilis LS crystal with an overlay of an icosahedral model.

agreement with the value from electron micrographs. For the 48 S aggregate a maximum diameter of about 330 Å was observed.

In parallel to the initial crystallographic studies the crystals were studied by electron microscopy. Electron micrographs of freeze-etched crystals of the native complex showed approximately spherical molecules, which were arranged in hexagonal layer packing (Fig 4a, b). The lattice constants (156 Å) found from the micrographs were in excellent agreement with the values derived from X-ray diffraction data (156.4 Å). From the electron micrographs a packing model was derived which, together with the space group symmetry of the crystals (P6₃22), allowed an unambiguous determination of the translation component of the particles (Ladenstein et al., 1986). Silver and gold decoration patterns mapped by electron microscopy in conjunction with image processing were used to determine approximate values of the orientation of the icosahedral symmetry axes with respect to the crystal cell axes (Fig 4c) (Weinkauf et al., 1991).

X-ray intensity data to 3.2 Å resolution were obtained at the EMBL Outstation, Hamburg, Germany. Patterson self rotation functions (Rossmann & Blow, 1962) calculated from these data showed a set of peaks for two-fold, three-fold and five-fold local symmetry axes accurately consistent with icosahedral symmetry and with one of the two particle orientations allowed by the crystal packing model deduced from electron microscopy. With the help of these data it was clear that the


Fig. 5. Folding pattern of a β subunit of lumazine synthase from B. subtilis.

structure of the native complex can be described as an icosahedral capsid of 60 β subunits with the triangulation number T = 1 (Ladenstein et al., 1986). The spatial orientation of the α subunits located in the central core, however, was incompletely understood at this time. The LS/RS complex of Bacillus subtilis was thus the first bifunctional enzyme with icosahedral symmetry to be investigated by X-ray crystallography.
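The link between the two-, three- and five-fold axes seen in the self rotation function and a capsid of exactly 60 subunits can be illustrated numerically. The sketch below is not the authors' analysis: it assumes a standard icosahedron orientation (a five-fold axis along (0, 1, φ) and a two-fold axis along z), builds the rotation group generated by those two rotations, and counts its elements by rotation angle. All function names are invented for this example.

```python
import numpy as np

PHI = (1 + 5 ** 0.5) / 2  # golden ratio

def rotation_matrix(axis, angle):
    """Rodrigues formula: rotation by `angle` (radians) about `axis`."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    k = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * k + (1 - np.cos(angle)) * (k @ k)

def generate_group(generators, limit=120):
    """Close a set of rotation matrices under multiplication."""
    elements = [np.eye(3)]
    frontier = [np.eye(3)]
    while frontier:
        next_frontier = []
        for g in frontier:
            for s in generators:
                h = s @ g
                if not any(np.allclose(h, e, atol=1e-8) for e in elements):
                    elements.append(h)
                    next_frontier.append(h)
        if len(elements) > limit:
            raise RuntimeError("generators do not close into a small group")
        frontier = next_frontier
    return elements

def count_by_fold(elements):
    """Count rotations by the fold of their axis, using the rotation angle."""
    fold_angles = {1: (0.0,), 2: (180.0,), 3: (120.0,), 5: (72.0, 144.0)}
    counts = {fold: 0 for fold in fold_angles}
    for m in elements:
        cos_theta = np.clip((np.trace(m) - 1) / 2, -1.0, 1.0)
        angle = np.degrees(np.arccos(cos_theta))
        for fold, angles in fold_angles.items():
            if any(abs(angle - a) < 1e-3 for a in angles):
                counts[fold] += 1
    return counts

five_fold = rotation_matrix((0.0, 1.0, PHI), 2 * np.pi / 5)
two_fold = rotation_matrix((0.0, 0.0, 1.0), np.pi)
group = generate_group([five_fold, two_fold])
counts = count_by_fold(group)

# Each n-fold axis contributes n - 1 non-identity rotations.
axes = {2: counts[2] // 1, 3: counts[3] // 2, 5: counts[5] // 4}
print(len(group), counts, axes)
```

The group closes at 60 rotations, corresponding to 15 two-fold, 10 three-fold and 6 five-fold axes. One subunit in a general position is therefore copied into 60 equivalent positions, which is the T = 1 arrangement described for the β subunits.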

SUBUNIT FOLD RELATED TO FLAVODOXIN

The structure of the icosahedral capsid was determined by multiple isomorphous replacement, icosahedral symmetry averaging (Bricogne, 1974; Bricogne, 1976) and phase extension to 3.3 Å resolution (Ladenstein et al., 1988). Due to the limitations in computer memory and speed, phase extension in conjunction with symmetry averaging was still an enormous undertaking with respect to computing time, but was already extensively used by other colleagues in the structure analysis of spherical viruses (Rossmann et al., 1985). The averaged electron density map was of high quality, but due to the static disorder of the capsids in the unit cell it contained no information on the structure of the α subunits. The observed fold of the β subunit, a four-stranded parallel β-sheet flanked on both sides by pairs of α helices, resembled the folding


Fig. 6. (a) Cα model of a B. subtilis LS β subunit pentamer; (b) Schematic representation of β strands in a B. subtilis LS β subunit pentamer.

pattern of flavodoxins, although no significant sequence homology could be detected (Fig 5). The fold of the β subunit has no structural resemblance to the subunit structures of the known icosahedral plant or animal viruses.

Fig. 7. Surface representations of β60 capsids from (a) B. subtilis (b) A. aeolicus lumazine synthase. Color codes: red for Asp and Glu; blue for Lys, His and Arg; green for Asn, Gln, Ser, Thr, Cys, Tyr and Gly; white for Ala, Val, Leu, Ile, Met, Pro, Phe and Trp.


Fig. 8. Ligand binding at the active sites in a β subunit pentamer from B. subtilis LS. (a) Scheme; (b) Structural representation.

The structural organization of the pentamers in the capsid is striking, due to its beauty and obvious (pH dependent!) structural stability (Karshikoff & Ladenstein, 1989) (Fig 6). The N-terminal segment of each monomer is attached to the β sheet of the neighbouring monomer, thus forming the fifth strand of that β sheet. Helix α3 and its four symmetry-related neighbours form a channel. The five helices are twisted into a stable super-helical motif comprising pores (diameter 9 Å, length 30 Å) in the capsid wall, which are parallel to the five-fold local symmetry axes. Owing to their stability, the pentamers were suggested to constitute the building blocks in the capsid assembly (Karshikoff & Ladenstein, 1989). Experimental evidence, however, e.g. from assembly studies by X-ray or neutron scattering, is still lacking.

A solvent accessible surface representation of entire β60 capsids from two LS homologs is shown in Fig 7. The capsid of Bacillus subtilis LS has a spherical shape and is characterized by a radially dipolar electrostatic potential distribution with a positive electrostatic potential covering the inner capsid surface and a negative potential on the outer surface (Karshikoff & Ladenstein, 1989). Crystallographic binding studies of substrate analogous ligands have revealed the positions of the active sites at the subunit interfaces in a pentamer (Fig 8 a,b) close to the inner capsid surface (Ladenstein et al., 1988). Ligand binding increases the stability of the capsid remarkably and drives the reassembly of β60 capsids from large T = 3 capsids (see below).
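The difference in size between the β60 capsid and the large T = 3 capsids follows from the triangulation number. A minimal sketch, assuming the Caspar-Klug relation T = h² + hk + k² with 60T subunits per capsid (a standard result that the text itself does not spell out); the function names are hypothetical.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k

def subunits_in_capsid(h: int, k: int) -> int:
    """An icosahedral capsid with triangulation number T holds 60*T subunits."""
    return 60 * triangulation_number(h, k)

# T = 1 corresponds to (h, k) = (1, 0); T = 3 corresponds to (1, 1).
t1_subunits = subunits_in_capsid(1, 0)
t3_subunits = subunits_in_capsid(1, 1)
print(t1_subunits, t3_subunits)
```

On this assumption a T = 3 shell would hold 180 subunits, so ligand-driven reassembly into T = 1 capsids would yield three β60 capsids per large capsid if every subunit were reused.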


MODELLING OF THE BIFUNCTIONAL ENZYME COMPLEX

In spite of extensive crystallization screening and modification of the protein by mutation, we have never obtained diffraction quality crystals of Bacillus subtilis riboflavin synthase, the α subunit trimers, which fill the core space of the icosahedral capsid. A 2.0 Å resolution structure of Escherichia coli RS, obtained by multiwavelength anomalous dispersion methods from the Se-Met derivative of the protein, was described by Liao et al. (2001). The homotrimer consisted of an asymmetric assembly of monomers, each of which comprised two structurally similar three-stranded β barrels with the same topology and a five-turn C-terminal α helix. The similar β barrels, in fact N- and C-terminal domains, within the monomer confirm a prediction of pseudo-twofold symmetry inferred from the sequence similarity between the two halves of the monomer (Meining et al., 1998; Schott et al., 1990). The trimer interface comprised hydrophobic interactions from residues of the C-terminal α helices of the three monomers. Only two of the three monomers were involved in a tightly associated inter-molecular surface between β barrels. Consequently, the three active sites of the trimer were suggested to lie between pairs of monomers, where residues conserved among species reside. In the trimer, only one active site is formed and the other two active sites appear wide open and exposed to solvent. The nature of the trimer configuration as well as the imperfect local threefold symmetry suggested that only one active site could be formed at a time.

The structure of riboflavin synthase from Schizosaccharomyces pombe in complex with the substrate analogue 6-carboxyethyl-7-oxo-8-ribityllumazine has been determined at 2.1 Å resolution. In contrast to the homotrimeric solution state of native riboflavin synthase, the enzyme was found to be monomeric in the crystal structure. Structural comparison of the riboflavin synthases of S. pombe and Escherichia coli suggested oligomer contact sites and delineated the catalytic site for dimerization of the substrate and subsequent fragmentation of the pentacyclic intermediate. The pentacyclic substrate dimer was modelled into the proposed active site. The model suggests that the substrate molecule at the C-terminal domain donates a four-carbon unit to the substrate molecule bound at the N-terminal domain of an adjacent subunit in the oligomer (Gerhardt et al., 2002).

Fig. 9. Computer-generated Cα model of the LS/RS complex.

The catalytic formation of riboflavin molecules from lumazine precursors (Fig 1) in the core space of the capsid seems to occur under rather dynamic circumstances, accompanied by large domain movements needed for the opening and closing of the active sites. Fig 9 shows a computer model obtained by fitting the Cα structure of Escherichia coli RS into the icosahedral LS capsid from Bacillus subtilis (Meining et al., unpublished). The trimer axis of RS was aligned such that it is parallel with one of the icosahedral trimer axes. The RS trimer fits into the core space of the capsid remarkably well; however, large solvent-filled cavities can be recognized. It may be speculated that specific RS trimer conformations can be induced by interactions of RS with residues residing on the inner surface of the LS capsid as well as by interaction with the dimethyl-ribityl lumazine substrate (3, Fig 1). Association of RS and LS of Bacillus subtilis was reported to enhance catalytic efficiency through a substrate channeling mechanism (Kis et al., 1995). On a protomer basis, the kcat (pH 7, 25°C) for E. coli RS is about tenfold that of E. coli LS (Zheng et al., 2000). In this scenario the RS product (1) could bind to LS active sites and thus relieve product inhibition of RS (Kis, Volk et al., 1995). There are six lumazine binding sites in an RS trimer. Interestingly, only two of them form a productive active site, while the other four binding sites are only capable of binding lumazine.

WHY ONLY PENTAMERS IN YEAST LUMAZINE SYNTHASE?

Several of the hitherto known lumazine synthases, among them LS from Brucella abortus (Goldbaum et al., 1998), Magnaporthe grisea (Persson et al., 1999), Saccharomyces cerevisiae (Meining et al., 2000) and Schizosaccharomyces pombe (Fischer et al., 2002; Gerhardt et al., 2002), exist only as pentameric complexes. It was also shown that Saccharomyces cerevisiae LS did not associate with RS (Mortl et al., 1996). The intriguing question why only pentamers can be formed by yeast LS could be solved by X-ray crystallography. The structure in complex with a substrate-analogous ligand (5-(6-D-ribitylamino-2,4-dihydroxypyrimidine-5-yl)-1-pentyl-phosphonic acid) was solved at 1.85 Å resolution (see Fig 10a) by molecular replacement using the pentamer structure of the Bacillus subtilis LS capsid as a search model (Meining et al., 2000).

Fig. 10. (a) Model of a yeast lumazine synthase pentamer; (b) structural alignment of a yeast LS subunit (blue) and a B. subtilis LS subunit (green).

Structural determinants responsible for the inability of yeast LS to form an icosahedral capsid were detected by sequence and structural alignments of the subunits of LS from Saccharomyces cerevisiae and Bacillus subtilis (Fig 10b): (1) an increase of the length of the loop in between helix α4 and helix α5 by four residues (sequence insertion: IDEA) and (2) conformational differences in the N-termini, which in the LS from Saccharomyces cerevisiae are completely flexible. In fact, up to four amino acid residues could be deleted from the N-terminus with not more than 40% loss of the enzymatic activity. However, the removal of 17 residues from the N-terminus resulted in a still soluble protein, but with significantly reduced activity (to 5%) (Meining et al., 2000). As reasons for the inability of certain lumazine synthases to form icosahedral assemblies, an insertion of two to four residues, which extends the loop in between helix α4 and helix α5, and a proline residue in the N-terminus have been discussed. When the yeast LS pentamers were assembled into an icosahedral structure by computer modeling, several clashes of main-chain parts and side-chains close to the trimer interfaces were seen (Fig 11).

Fig. 11. Computer-generated assembly of the yeast lumazine synthase pentamers in an icosahedral capsid.

In conclusion, formation of icosahedral capsids in lumazine synthases is perturbed by capping of the N-termini (impairment of the 5-stranded β-sheets in the pentamer), by disorder in the conformation of the N-termini and by sequence insertions in the loop close to the trimer interface (causing clashes of portions of the main chain that prevent icosahedral arrangement).

TOWARDS HIGH RESOLUTION AND EXTREME HEAT TOLERANCE - LS FROM Aquifex aeolicus

In the framework of our research on protein stability, the X-ray structure analysis of LS from the hyperthermophilic bacterium Aquifex aeolicus was initiated in 1999. Aquifex, belonging to the genus Thermotogales, is a marine hyperthermophile living close to volcanic hot water sources and grows optimally at T = 85°C. The structure analysis of this hyperthermostable form of LS not only revealed a surprising surface change of the icosahedral capsid, but also opened up possibilities to study ligand binding and the structure of the active site at high resolution (1.6 Å). The moderate resolution of 3.3 Å obtained with LS from Bacillus subtilis did not, in spite of well-defined averaged electron density maps, allow crystallographic refinement or the analysis of the solvent structure and small side chain movements necessary to study the catalytic process in detail.

An open reading frame optimized for expression of LS from Aquifex aeolicus was expressed in Escherichia coli and the structure of the recombinant enzyme was solved by molecular replacement using LS from Bacillus subtilis as a search model. The structure was refined by maximum likelihood refinement (Murshudov et al., 1997) to 1.6 Å resolution and Rfree = 23.6% (Zhang et al., 2001). The subunit fold was closely related to that of Bacillus subtilis LS, with an rmsd of Cα carbons of 0.8 Å. The icosahedral LS complex from Aquifex aeolicus showed an apparent melting temperature of 120°C in a calorimetric scan and therefore belongs among the most thermotolerant proteins known. A comparison of the accessible surface comprised by charged residues with that of Bacillus subtilis LS revealed a doubling of charged residues on the surface of the Aquifex enzyme (see Fig 7) and the smallest fraction presented by energetically unfavourable hydrophobic surface residues. A relative increase in charged surface area and surface ionic networks is a characteristic determinant of proteins from hyperthermophiles and is related to their extreme heat tolerance. Owing to solvation effects and the reduction of the dielectric constant of water, the strength of ionic interactions is increased at high temperature (Elcock, 1998; Karshikoff & Ladenstein, 2001). By formation of ionic networks the entropic penalty is reduced relative to pairwise interactions, an important contribution which increases the free stabilization energy of a protein according to the Gibbs-Helmholtz equation ΔGstab = ΔH − TΔS.
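The role of the entropic term in the relation ΔG = ΔH − TΔS can be illustrated numerically. The following minimal Python sketch is an illustration only: the enthalpy and entropy values are hypothetical placeholders, not measurements from this chapter, and it assumes the sign convention in which a more negative ΔG corresponds to a more stable assembly.

```python
def delta_g(delta_h: float, delta_s: float, temperature: float) -> float:
    """Return dG = dH - T*dS.

    delta_h in kcal/mol, delta_s in kcal/(mol K), temperature in kelvin.
    """
    return delta_h - temperature * delta_s


# All thermodynamic values below are hypothetical and only show the trend.
T_GROWTH = 358.15      # 85 degrees C (growth optimum quoted in the text), in kelvin
DH = -100.0            # hypothetical enthalpy change, kcal/mol
DS_PAIRWISE = -0.25    # hypothetical entropy change with isolated ion pairs
DS_NETWORK = -0.24     # hypothetical, smaller entropic penalty with ionic networks

if __name__ == "__main__":
    for label, ds in (("pairwise", DS_PAIRWISE), ("network", DS_NETWORK)):
        print(f"{label:9s} dG = {delta_g(DH, ds, T_GROWTH):8.3f} kcal/mol")
```

With these placeholder numbers the smaller entropic penalty gives the more negative ΔG, which is the qualitative point made in the paragraph above.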

NATURE IS NOT ALWAYS PREDICTABLE

The observation of the extended loop with the sequence insertion IDEA in between helix α4 and helix α5 in Saccharomyces cerevisiae LS has inspired us to test experimentally whether formation of an icosahedral capsid would be perturbed after the introduction of this insertion into the loop sequence of Aquifex aeolicus LS. We expected to observe an Aquifex aeolicus LS which could only form pentamers. Very surprisingly, however, the modified Aquifex LS formed large 48 S hollow sphere aggregates with a diameter of 300 Å (Fig 12a), indicating that the aggregation behaviour of the subunits was indeed changed, but not in the direction we expected. Similar 48 S particles were observed earlier by pH-dependent disaggregation of Bacillus subtilis LS capsids. The molecular weight determined from sedimentation data suggested particles with 180 subunits and T = 3 icosahedral symmetry.
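The subunit count quoted here follows from the Caspar & Klug triangulation number, T = h² + hk + k², with 60T subunits per icosahedral shell arranged as 12 pentamers and 10(T − 1) hexamers. The short Python sketch below reproduces the 60-subunit (T = 1) and 180-subunit (T = 3) cases; the function names are illustrative and are not taken from any software used by the authors.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsid_composition(t: int) -> dict:
    """Subunit and capsomer counts for an icosahedral shell with triangulation number t."""
    return {
        "subunits": 60 * t,
        "pentamers": 12,
        "hexamers": 10 * (t - 1),
    }


if __name__ == "__main__":
    for h, k in ((1, 0), (1, 1)):
        t = triangulation_number(h, k)
        print(f"h={h}, k={k}: T={t}, {capsid_composition(t)}")
```

For T = 3 the counts are self-consistent: 12 pentamers and 20 hexamers account for 12 × 5 + 20 × 6 = 180 subunits.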

By applying the concepts of quasi-equivalence, implying that binding geometry and binding interactions are similar in pentamers and hexamers (Caspar & Klug, 1962), we have obtained a computer-generated model of the presumptive T = 3 capsid of Aquifex LS (Fig 12b, Zhang et al., unpublished). This model, which represents the experimentally observed diameter of the large capsids surprisingly well, was obtained by extending the interactions in the hexamers and by decreasing the curvature of the capsid. This model, however, still needs to be verified either by cryo-electron microscopy in conjunction with image reconstruction or by X-ray crystallography. Preliminary results from cryo-electron microscopy suggest a diameter of 290 Å and a hexamer organisation which is different from the hexamer structure obtained by simply extending the interfaces (Li et al., unpublished). A careful inspection of the electron micrograph (Fig 12a) shows a number of filled particles (black arrow). The structural nature of these particles is unclear. The existence of double capsids, which are abundant in spherical viruses, is suggested, i.e. a T = 3 capsid which contains a T = 1 capsid in its core. The comparison in Fig 12b shows that the relative diameters of the particles would allow such an arrangement.

Fig. 12. (a) 48 S hollow capsids of A. aeolicus lumazine synthase with sequence insertion IDEA; (b) size comparison of the models for small (26 S) and large (48 S) capsids.

LIGAND BINDING - A NUMBER OF UNSOLVED PROBLEMS

Early ligand binding studies by equilibrium dialysis have shown that a variety of 8-ribityl-substituted lumazines and 5-ribitylaminopyrimidines, among them substrate 1 (see Fig 1), bind to Bacillus subtilis LS with a stoichiometry of one molecule per monomer, corresponding to 60 molecules per capsid (Bacher & Mailander, 1978; Bacher & Ludwig, 1982). There is, however, also some binding affinity of the LS active site towards the product riboflavin (4, Fig 1) of the reaction catalyzed by RS.

Fig. 13. Pocket at the twofold local axes of A. aeolicus lumazine synthase.

The enzyme from S. pombe was the first described lumazine synthase that was found to bind riboflavin with relatively high affinity (Kd of 1.2 mM) (Fischer et al., 2002). The position and conformational orientation of bound riboflavin are similar to those in already known complex structures with substrate analogues. A comparison of the lumazine synthase crystal structures, of either the isolated enzymes or the enzyme-inhibitor complexes, from B. subtilis (Ladenstein et al., 1994), M. grisea, spinach (Persson et al., 1999) and S. cerevisiae (Meining et al., 2000) with the crystal structure from S. pombe clearly shows that riboflavin would act as a competitive inhibitor (Ki of 17 µM, Fischer et al., 2002) that binds in the same manner as the inhibitors of the other enzymes.
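The effect of a competitive inhibitor on the reaction rate can be sketched with the standard Michaelis-Menten expression v = Vmax[S] / (Km(1 + [I]/Ki) + [S]). In the Python sketch below only the Ki of 17 (read here as micromolar) comes from the text; Km, Vmax and the concentrations are hypothetical values chosen for illustration.

```python
def apparent_km(km: float, inhibitor: float, ki: float) -> float:
    """Apparent Km in the presence of a competitive inhibitor (same concentration units throughout)."""
    return km * (1.0 + inhibitor / ki)


def rate(substrate: float, vmax: float, km: float,
         inhibitor: float = 0.0, ki: float = 1.0) -> float:
    """Michaelis-Menten rate with competitive inhibition."""
    return vmax * substrate / (apparent_km(km, inhibitor, ki) + substrate)


KI_RIBOFLAVIN = 17.0   # Ki quoted in the text, taken here as micromolar
KM = 5.0               # hypothetical Km, micromolar
VMAX = 1.0             # hypothetical, arbitrary units

if __name__ == "__main__":
    for inhibitor in (0.0, 17.0, 170.0):
        v = rate(10.0, VMAX, KM, inhibitor, KI_RIBOFLAVIN)
        print(f"[I] = {inhibitor:6.1f} uM  ->  v = {v:.3f}")
```

When the inhibitor concentration equals Ki, the apparent Km is exactly doubled while Vmax is unchanged, which is the signature of competitive inhibition.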

The icosahedral capsid has 60 identical binding sites for both substrates 2 and 3 (Fig 1). Each of these sites is located at the interface of two respective β subunits within a pentamer, in close proximity to the inner capsid surface (Ladenstein et al., 1988; 1994). The capsid structure, with a thickness of the protein wall of 39 Å, is rather densely packed, and the routes for the entry of the substrates and the exit of products are still unclear. Channels with a diameter of about 9 Å in Bacillus subtilis LS running along the 5-fold axes of the icosahedron (Fig 6a, above) would allow the passage of 1 and 2, but appear too narrow for the exit of enzymatically produced riboflavin (4). An attempt to block the entrance of the five-fold channels by binding of five-fold symmetric tungsten compounds resulted in unperturbed enzymatic activity of the complex from Bacillus subtilis (Ladenstein et al., 1987).

Fig. 14. Substrate binding site of A. aeolicus lumazine synthase in complex with inhibitor 3-(7-hydroxy-8-ribityllumazine-6-yl) propionic acid.

Another route of substrate entry was suggested more recently: remarkably large pores were detected at the icosahedral twofold axes of Aquifex aeolicus LS (Zhang et al., 2003), constituting the interfaces in between pentamers (Fig 13). The pores at the twofold axes are formed by two symmetry equivalent tetrads of charged side-chains, comprising Arg 127, His 132, Glu 126 and Lys 131. The substrate binding sites were found close to the pore entrance. The pore appears large enough to allow diffusion of the substrate molecules 1 and 2 from solvent space to their binding sites. Single-site mutation studies have suggested that the catalytic activity of the enzyme is critically dependent on the presence of Arg127, which is a part of the charged tetrad, and on His88 as a proton donor/acceptor (Fischer et al., 2003). In the substrate-free enzyme forms a bound phosphate ion is usually found involved in a charge-charge interaction with Arg127, which was thus defined as the phosphate-binding site. Capsid stability is strongly influenced by pH, phosphate concentration and binding of substrate-analogous ligands (see following section). The question of how substrates, ligands and products penetrate the capsid wall might therefore be answered by assuming a substrate-controlled capsid assembly/disassembly, which is coupled to the enzymatic function of the complex.

Fig. 15. pH dependence of the electrostatic subunit interaction energy in β subunit dimers, trimers and pentamers of Bacillus subtilis LS: a, pentamer; b, dimer; c, trimer. Curves a′, b′, c′ correspond to the respective ligand-free forms.

LIGAND BINDING, CAPSID STABILITY AND ASSEMBLY

A number of experimental studies have shown that the stability of the T = 1 capsid is dependent on changes in pH, phosphate concentration and the presence of the substrates or certain substrate-analogous ligands (Fig 14, Bacher et al., 1986). The electrostatic part of the subunit interaction energy has a pronounced minimum at pH ≈ 8 and is dependent on the presence or absence of substrate-analogous ligands. It was suggested that the pentamer is the most stable structure (−4 kcal/mol pentamer at pH 8) among the three possible subunit assemblies, and the trimer stability was characterized as repulsive with a negligible pH dependence (+2 kcal/mol trimer at pH 8). The electrostatic stabilities of pentamers and trimers appeared to be independent of the presence of the ligands, see Fig 15 (Karshikoff & Ladenstein, 1989).

Fig. 16. Disaggregation and ligand-driven reaggregation of B. subtilis LS capsids.

Fig. 17. Negative staining electron micrographs of B. subtilis LS wild type (a), LS mutants R127T (b), R127H (c).

The analysis has further shown that the aggregation/disaggregation equilibrium seems to be regulated by electrostatic interactions between β subunits forming dimers, which connect the relatively stable pentamers in the β60 capsid. The capsid may thus be considered as an assembly of 12 pentamers connected by the ligand-sensitive electrostatic dimer interactions. pH- or phosphate-dependent discharging of the tetrads at the twofold axes or the release of the ligand could lead to a weakening of the pentamer contacts via reduction of the electrostatic attraction within dimers, which is followed by dissociation of the entire capsid.

T = 1 capsids of Bacillus subtilis LS may be dissociated into α subunit trimers and β subunits by treatment with buffers at pH > 8.5 and/or a decrease of the phosphate concentration, as shown schematically in Fig 16 and supported by experimental results from ultracentrifugation, X-ray scattering, electron microscopy and gel electrophoresis. Subsequent reduction of the pH led to the formation of large 48 S hollow sphere assemblies, which are best characterised by T = 3 icosahedral symmetry (see above). T = 3 assemblies obtained in this way can be rearranged to form 26 S hollow T = 1 particles in a ligand-driven reaction in the presence of urea. The binding of the ligand thus seems, in agreement with the electrostatic calculations, to add a favourable contribution to the subunit interaction energy and to stabilize the T = 1 capsid.

CATALYSIS AND ASSEMBLY

The only lethal single site mutation in the LS active site is the exchange of R127, the binding site for the phosphate group of the substrate 1, to an uncharged residue, e.g. R127T. The mutants R127H and R127K still showed an enzymatic activity of 62% and 9% compared to the wildtype enzyme, respectively (Fischer et al., 2003). R127 and R127′ were identified as parts of the symmetry-related ionic tetrads at the local twofolds, comprising contacts in between pentamers (see above). It was found that mutations at the phosphate binding site R127 are able to perturb enzymatic activity and, surprisingly enough, also capsid assembly. Electron micrographs obtained by negative staining and native acrylamide gel electrophoresis showed that the mutant protein R127H consisted of a mixture of T = 1 and T = 3 capsids, whereas in the mutant proteins R127T and R127A almost exclusively large T = 3 capsids were visible (Fig 17, Fischer et al., unpublished). It is thus tempting to speculate that mutations of the residues comprising the ionic tetrads close to the icosahedral twofolds will show an impaired capsid assembly and possibly also changes in the enzymatic activity. It appears, therefore, that enzymatic activity, capsid stability and assembly are coupled in the icosahedral lumazine synthase complex.

REFERENCES

1. Bacher, A., Baur, R., Eggers, U., Harders, H. D., Otto, M. K. & Schnepple, H. Riboflavin synthases of Bacillus subtilis. Purification and properties. J. Biol. Chem. 1980; 255: 632-637.

2. Bacher, A. & Ludwig, H. C. Ligand-binding studies on heavy riboflavin synthase of Bacillus subtilis. Eur. J. Biochem. 1982; 127: 539-545.

3. Bacher, A., Ludwig, H. C., Schnepple, H. & Ben-Shaul, Y. Heavy riboflavin synthase from Bacillus subtilis. Quaternary structure and reaggregation. J. Mol. Biol. 1986; 187: 75-86.

4. Bacher, A. & Mailander, B. Biosynthesis of riboflavin in Bacillus subtilis: function and genetic control of the riboflavin synthase complex. J. Bact. 1978; 134: 476-482.

5. Bacher, A., Schnepple, H., Mailander, B., Otto, M. K. & Ben-Shaul, Y. (1980). Flavins and Flavoproteins (Yagi, K. Y., T., Ed.), Japan Scientific Societies Press, Tokyo.

6. Bricogne, G. Geometric sources of redundancy in intensity data and their use for phase determination. Acta Cryst. Section A. 1974; 30: 395-405.

7. Bricogne, G. Methods and programs for direct-space exploitation of geometric redundancies. Acta Cryst. Section A. 1976; 32: 832-847.

8. Caspar, D. L. D. & Klug, A. Physical principles in the construction of regular viruses. Cold Spring Harbor Symp. Quant. Biol. 1962; 27: 1-24.

9. Elcock, A. H. The stability of salt bridges at high temperatures: implications for hyperthermophilic proteins. J. Mol. Biol. 1998; 284: 489-502.

10. Fischer, M., Haase, I., Kis, K., Meining, W., Ladenstein, R., Cushman, M., Schramek, N., Huber, R. & Bacher, A. Enzyme Catalysis via Control of Activation Entropy: Site-directed Mutagenesis of 6,7-Dimethyl-8-ribityllumazine Synthase. J. Mol. Biol. 2003; 326: 783-793.

11. Fischer, M., Haase, I., Feicht, R., Gerhardt, S., Changeux, J. P., Huber, R. & Bacher, A. Biosynthesis of riboflavin. 6,7-Dimethyl-8-ribityllumazine synthase of Schizosaccharomyces pombe. Eur. J. Biochem. 2002; 269: 519-526.

12. Gerhardt, S., Haase, I., Kaiser, J., Huber, R., Bacher, A. & Fischer, M. Crystal structure of 6,7-dimethyl-8-ribityllumazine synthase of Schizosaccharomyces pombe in complex with riboflavin. J. Mol. Biol. 2002; 318: 1317-1329.

13. Goldbaum, F. A., Polikarpov, I., Cauerhff, A. A., Velikovsky, C. A., Braden, B. C. & Poljak, R. J. Crystallization and preliminary x-ray diffraction analysis of the lumazine synthase from Brucella abortus. J. Struct. Biol. 1998; 123: 175-178.

14. Harrison, S. C., Olson, A. J., Schutt, C. E., Winkler, F. K. & Bricogne, G. Tomato bushy stunt virus at 2.9 Å resolution. Nature. 1978; 276: 368-373.

15. Karshikoff, A. & Ladenstein, R. Electrostatic effects in a large enzyme complex: subunit interactions and electrostatic potential field of the icosahedral β60 capsid of heavy riboflavin synthase. Proteins. 1989; 5: 248-257.

16. Karshikoff, A. & Ladenstein, R. Ion pairs and the thermotolerance of proteins from hyperthermophiles: a "traffic rule" for hot roads. Trends Biochem. Sci. 2001; 26: 550-556.

17. Kis, K., Volk, R. & Bacher, A. Biosynthesis of riboflavin. Studies on the reaction mechanism of 6,7-dimethyl-8-ribityllumazine synthase. Biochem. 1995; 34: 2883-2892.

18. Ladenstein, R., Bacher, A. & Huber, R. Some observations of a correlation between the symmetry of large heavy-atom complexes and their binding sites on proteins. J. Mol. Biol. 1987; 195: 751-753.

19. Ladenstein, R., Ludwig, H. C. & Bacher, A. Crystallization and preliminary X-ray diffraction study of heavy riboflavin synthase from Bacillus subtilis. J. Biol. Chem. 1983; 258: 11981-11983.

20. Ladenstein, R., Meyer, B., Huber, R., Labischinski, H., Bartels, K., Bartunik, H. D., Bachmann, L., Ludwig, H. C. & Bacher, A. Heavy riboflavin synthase from Bacillus subtilis. Particle dimensions, crystal packing and molecular symmetry. J. Mol. Biol. 1986; 187: 87-100.

21. Ladenstein, R., Ritsert, K., Huber, R., Richter, G. & Bacher, A. The lumazine synthase/riboflavin synthase complex of Bacillus subtilis. X-ray structure analysis of hollow reconstituted β-subunit capsids. Eur. J. Biochem. 1994; 223: 1007-1017.

22. Ladenstein, R., Schneider, M., Huber, R., Bartunik, H. D., Wilson, K., Schott, K. & Bacher, A. Heavy riboflavin synthase from Bacillus subtilis. Crystal structure analysis of the icosahedral β60 capsid at 3.3 Å resolution. J. Mol. Biol. 1988; 203: 1045-1070.

23. Liao, D., Calabrese, J., Wawrzak, Z., Viitanen, P. & Jordan, D. Crystal structure of 3,4-dihydroxy-2-butanone 4-phosphate synthase of riboflavin biosynthesis. Structure. 2001; 9: 11-18.

24. Mailander, B. & Bacher, A. Biosynthesis of riboflavin. Structure of the purine precursor and origin of the ribityl side chain. J. Biol. Chem. 1976; 251: 3623-3628.

25. Meining, W., Mortl, S., Fischer, M., Cushman, M., Bacher, A. & Ladenstein, R. The atomic structure of pentameric Lumazine Synthase from Saccharomyces cerevisiae at 1.85 Å resolution reveals the binding mode of a phosphonate intermediate analogue. J. Mol. Biol. 2000; 299: 181-197.

26. Meining, W., Tibbelin, G., Ladenstein, R., Eberhardt, S., Fischer, M. & Bacher, A. Evidence for local 32 symmetry in homotrimeric riboflavin synthase of Escherichia coli. J. Struct. Biol. 1998; 121: 53-60.

27. Mortl, S., Fischer, M., Richter, G., Tack, J., Weinkauf, S. & Bacher, A. Biosynthesis of riboflavin. Lumazine synthase of Escherichia coli. J. Biol. Chem. 1996; 271: 33201-33207.

28. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. Refinement of Macromolecular Structures by the Maximum-Likelihood Method. Acta Cryst. Section D. 1997; 53: 240-255.

29. Nielsen, P., Neuberger, G., Floss, H. G. & Bacher, A. Biosynthesis of riboflavin. Enzymatic formation of the xylene moiety from [14C]ribulose 5-phosphate. Biochemical & Biophysical Research Communications. 1984; 118: 814-820.

30. Persson, K., Schneider, G., Jordan, D. B., Viitanen, P. V. & Sandalova, T. Crystal structure analysis of a pentameric fungal and an icosahedral plant lumazine synthase reveals the structural basis for differences in assembly. Protein Science. 1999; 8: 2355-2365.

31. Rossmann, M. & Blow, D. The detection of subunits within the crystallographic asymmetric unit. Acta Cryst. 1962; 15: 24-31.

32. Rossmann, M. G., Arnold, E., Erickson, J. W., Frankenberger, E. A., Griffith, J. P., Hecht, H. J., Johnson, J. E., Kamer, G., Luo, M., Mosser, A. G. et al. Structure of a human common cold virus and functional relationship to other picornaviruses. Nature. 1985; 317: 145-153.

33. Van, Q. L., Keller, P. J., Bown, D. H., Floss, H. G. & Bacher, A. Biosynthesis of riboflavin in Bacillus subtilis: origin of the four-carbon moiety. J. Bact. 1985; 162: 1280-1284.

34. Weinkauf, S., Bacher, A., Baumeister, W., Ladenstein, R., Huber, R. & Bachmann, L. Correlation of metal decoration and topochemistry on protein surfaces. J. Mol. Biol. 1991; 221: 637-645.

35. Zhang, X., Meining, W., Fischer, M., Bacher, A. & Ladenstein, R. X-ray Structure Analysis and Crystallographic Refinement of Lumazine Synthase from the Hyperthermophile Aquifex aeolicus at 1.6 Å Resolution: Determinants of Thermostability revealed from Structural Comparisons. J. Mol. Biol. 2001; 306: 1099-1114.

36. Zhang, X., Meining, W., Cushman, M., Haase, I., Fischer, M., Bacher, A. & Ladenstein, R. A Structure-Based Model of the Reaction Catalyzed by Lumazine Synthase from Aquifex aeolicus. J. Mol. Biol. 2003; 328: 167-182.

37. Zheng, Y.-J., Viitanen, P. V. & Jordan, D. B. Rate Limitations in the Lumazine Synthase Mechanism. Bioorg. Chem. 2000; 28: 89-97.

Chapter 11

OPTIMIZING AN ENZYME FOR ITS PHYSIOLOGICAL ROLE: STRUCTURAL AND FUNCTIONAL COMPARISONS OF ATP SULFURYLASES FROM THREE DIFFERENT ORGANISMS

Andrew J. Fisher*,†,‡, Ian J. MacRae‡, John D. Beynon‡, Eric B. Lansdon‡ and Irwin H. Segel‡

The crystal structures of homooligomeric ATP sulfurylase from three different organisms have recently been determined. The enzymes are (a) the allosteric hexamer from the filamentous fungus, Penicillium chrysogenum, (b) the non-allosteric hexamer from the yeast, Saccharomyces cerevisiae (both of which function in vivo to produce adenylylsulfate and PPi from MgATP and inorganic sulfate), and (c) the dimeric enzyme from the Riftia (tubeworm) symbiont (a sulfur chemolithotroph, in which the enzyme runs in the reverse mode in vivo). The subunits of all three enzymes have the same overall fold and active site residues. The structures suggest that the different kinetic behavior of these enzymes relies not on differences in active site residues per se, but rather on “second layer” residues that “tune” the properties of the active site. One important “second layer” is a mobile loop (“switch”) modulating the charge of an active site arginyl residue.

Keywords: ATP sulfurylase; allosteric; crystal structure; cooperativity; evolution; APS; PAPS; structural comparison

*Corresponding author. Email address: [email protected]
†Section of Molecular and Cellular Biology, University of California, One Shields Avenue, Davis, California 95616.
‡Department of Chemistry, University of California, One Shields Avenue, Davis, California 95616.

INTRODUCTION

Much of the biologically accessible sulfur on Earth is present as inorganic sulfate and it is in this form that the element is assimilated by aerobic organisms. Sulfate, however, is quite non-reactive at cellular temperatures and pH, so the anion must first be “activated” in order to enter the mainstream of metabolism. Activation proceeds in two steps. These are catalyzed, in order, by the enzymes ATP sulfurylase (MgATP: sulfate adenylyltransferase, EC 2.7.7.4) and APS kinase (MgATP: APS 3′-phosphotransferase, EC 2.7.1.25). The reactions produce the sulfonucleotides APS (adenylylsulfate; adenosine 5′-phosphosulfate) and PAPS (3′-phosphoadenylylsulfate; 3′-phosphoadenosine 5′-phosphosulfate):

MgATP + SO₄²⁻ ⇌ MgPPi + APS (ATP sulfurylase)

MgATP + APS → PAPS + MgADP (APS kinase)

The hexavalent sulfur of APS (in algae, plants, and some bacteria) or PAPS (in yeast, fungi, and most bacteria) is reduced to sulfite, then to sulfide. Sulfide condenses with O-acetylserine to form cysteine and, subsequently, nearly all other sulfur-containing biomolecules. Thus in these organisms, ATP sulfurylase plays a role in sulfate utilization analogous to that played by nitrogenase (or nitrate reductase) in nitrogen assimilation, or ribulose bisphosphate carboxylase in CO2 fixation. In all organisms, PAPS serves as the sulfuryl donor for sulfate ester biosynthesis, e.g., chondroitin sulfate, heparin, and protein tyrosyl sulfate in animals (which do not reduce APS or PAPS), choline-O-sulfate in filamentous fungi, flavonol sulfates in plants, and sulfolipids in animals and some bacteria. In anaerobic sulfate reducing bacteria (e.g., Desulfovibrio), the physiological reaction of ATP sulfurylase is in the same direction as in sulfate assimilators, but the APS produced plays a completely different role - that of terminal electron acceptor of anaerobic metabolism. In certain chemo- and photolithotrophic bacteria (e.g., Aquifex, Thiobacillus, Chromatium), ATP sulfurylase catalyzes the last reaction in the oxidation of reduced inorganic sulfur compounds to sulfate, i.e., the physiological reaction is in the opposite direction compared to that in sulfate assimilators. In these sulfur bacteria, the ATP sulfurylase reaction provides a direct and efficient route for recycling AMP and PPi produced by biosynthetic reactions back to ATP at the expense of energy provided by sulfite oxidation. Sulfate (i.e., sulfuric acid) produced by sulfur chemolithotrophs is believed to be responsible for shaping many subsurface caverns and caves.

The thermodynamics of the ATP sulfurylase reaction are the same regardless of the physiologically relevant direction. However, ligand affinity, Michaelis constants, and kcat values can be optimized for a particular direction (within the constraint of the Haldane relationship). Because a variation in function must have a structural basis, a comparison of the structures of ATP sulfurylase from sulfate assimilators with that of the enzyme from a sulfur chemolithotroph might disclose the evolutionary adaptations that optimize each class of ATP sulfurylase for a given direction.
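The constraint imposed by the Haldane relationship can be made concrete for the simplest reversible case, a one-substrate, one-product reaction, where Keq = (kcat,f / Km,S) / (kcat,r / Km,P). This is a deliberate simplification (ATP sulfurylase has two substrates and two products, and its Haldane expressions contain additional terms), and all numbers in the Python sketch below are hypothetical.

```python
import math


def haldane_keq(kcat_f: float, km_s: float, kcat_r: float, km_p: float) -> float:
    """Equilibrium constant implied by the kinetic constants of a uni-uni reversible enzyme."""
    return (kcat_f / km_s) / (kcat_r / km_p)


def kcat_reverse_from_haldane(keq: float, kcat_f: float, km_s: float, km_p: float) -> float:
    """Reverse kcat that is forced by Keq once the other three constants are chosen."""
    return (kcat_f / km_s) * km_p / keq


KEQ = 4.0  # hypothetical equilibrium constant, identical for both enzyme variants

# Two hypothetical enzyme variants with different forward constants and product Km values.
VARIANTS = {
    "variant A": {"kcat_f": 10.0, "km_s": 2.0, "km_p": 4.0},
    "variant B": {"kcat_f": 1.0, "km_s": 2.0, "km_p": 40.0},
}

if __name__ == "__main__":
    for name, p in VARIANTS.items():
        kcat_r = kcat_reverse_from_haldane(KEQ, **p)
        check = haldane_keq(p["kcat_f"], p["km_s"], kcat_r, p["km_p"])
        print(f"{name}: kcat_r = {kcat_r:.2f}, Keq check = {check:.2f}")
        assert math.isclose(check, KEQ)
```

Both hypothetical variants obey the same Keq while distributing kcat and Km values differently, which is the sense in which kinetic constants can be tuned for one direction without changing the thermodynamics.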

To date, the crystal structures of three different ATP sulfurylases have been determined. These are the enzymes from (a) the filamentous fungus Penicillium chrysogenum (MacRae et al., 2001; MacRae et al., 2001), (b) baker's yeast, Saccharomyces cerevisiae (Ullrich et al., 2001; Ullrich and Huber, 2001), and (c) the unnamed sulfide-oxidizing bacterial symbiont of the deep-sea hydrothermal vent giant tubeworm, Riftia pachyptila (Beynon et al., 2001). In this report, we compare the structures of the three enzymes.

P. CHRYSOGENUM ATP SULFURYLASE

ATP sulfurylase from the filamentous fungus Penicillium chrysogenum is a homooligomer composed of six 64 kDa subunits (573 residues). Several years ago, we found that covalent modification of one free sulfhydryl per subunit (designated SH-1) changed the initial velocity kinetics at pH 8, 30 °C from normal (hyperbolic) to sigmoidal with a concomitant decrease in the substrate affinities (Renosto et al., 1987). PAPS, a reversibly bound ligand (and branch-point metabolite in filamentous fungi), induced the same change (Renosto et al., 1990). The sigmoidal curves were shown to reflect true cooperative binding (Renosto et al., 1987; Martin et al., 1989). ATP sulfurylases from other filamentous fungi behaved the same way. However, the enzymes from yeast, plant, or animal sources did not contain a Cys residue whose modification induced cooperativity and were not allosterically inhibited by PAPS (Renosto et al., 1990). The results indicated that (a) P. chrysogenum ATP sulfurylase possesses an allosteric binding site for PAPS that is not present in the enzyme from other sources and (b) SH-1 was either in the region of, or in communication with, this site. Sequence studies (Foster et al., 1994) subsequently revealed that the enzyme contains a C-terminal region that is 42% identical to the conserved core of APS kinase, the second enzyme in the sulfate activation pathway, which has a high affinity for PAPS (Renosto et al., 1984; Renosto et al., 1985). SH-1 was identified as Cys-509 (Martin et al., 1989; Foster et al., 1994). It was clear that residues 396-573 of P. chrysogenum ATP sulfurylase evolved from true APS kinase to provide the allosteric binding site for PAPS.
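
The difference between hyperbolic and sigmoidal velocity curves can be sketched with the empirical Hill equation. This is an illustration only: the Hill equation is a phenomenological description, not necessarily the model fitted in the cited kinetic studies, and the parameter values (Vmax, half-saturation concentration, Hill coefficient) are hypothetical.

```python
def hill_velocity(s, vmax, s_half, n_h):
    """Hill equation: v = Vmax * S**n / (S0.5**n + S**n)."""
    return vmax * s**n_h / (s_half**n_h + s**n_h)


# Hypothetical parameters: same Vmax and half-saturation point,
# Hill coefficient 1 (hyperbolic) versus 2 (sigmoidal).
for s in (0.1, 0.5, 1.0, 2.0, 10.0):
    v_hyperbolic = hill_velocity(s, vmax=1.0, s_half=1.0, n_h=1.0)
    v_sigmoidal = hill_velocity(s, vmax=1.0, s_half=1.0, n_h=2.0)
    print(f"[S] = {s:5.1f}  hyperbolic v = {v_hyperbolic:.3f}  sigmoidal v = {v_sigmoidal:.3f}")
```

With a Hill coefficient above 1, the velocity is lower than the hyperbolic curve below the half-saturation point and higher above it, which produces the characteristic S shape.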

To date, we have obtained two different crystal structures of P. chrysogenum ATP sulfurylase: one with two molecules of APS (product) bound per subunit, and one with one molecule of PAPS (allosteric effector) bound per subunit. In the presence of 5 mM APS, the enzyme crystallized in the orthorhombic space group I222 with unit cell parameters of a = 135.7 Å, b = 162.1 Å, and c = 273.0 Å (MacRae et al., 2001; MacRae et al., 2001). The x-ray structure at 2.8 Å resolution was solved by Se-Met multiwavelength anomalous dispersion and single isomorphous replacement methods. That structure was refined to a crystallographic R-factor of 20.7% and R-free of 24.5%.

Subunit Domains

As shown in the ribbon diagram (Fig. 1), each subunit of the homooligomer is composed of three structurally distinct globular regions. Residues 1-170 compose an N-terminal domain which folds into a partial five-stranded β-barrel surrounded by seven helices. While this domain is present in all ATP sulfurylases and contains stretches of conserved residues, its exact function is unknown. (The N-terminal domain has no homologs in the structural database (Holm and Sander, 1993).)

Figure 1. Ribbon diagram (stereo view) of the P. chrysogenum ATP sulfurylase subunit (protomer; monomer). The subunit is divided into three distinct domains: N-terminal (light gray), catalytic (medium gray), and allosteric or regulatory (dark gray). Bound APS is shown as a stick model.

Residues 171-395 compose the central catalytic domain (identified by bound APS). The core of the domain is composed of a five-stranded parallel β-sheet sandwiched between five helices. Topologically, the domain is similar to the Rossmann fold, or di-nucleotide binding domain, as seen in dehydrogenases (Rossmann et al., 1974) and other adenylyl transferases (Rould et al., 1991; Izard and Geerlof, 1999). All of the residues that have been identified as essential for activity are located in this domain (Deyrup et al., 1999; Venkatachalam et al., 1999). Residues 331-389 form a small sub-domain, called Domain III in the yeast structure (Ullrich et al., 2001; Ullrich and Huber, 2001).

A distinct C-terminal domain (residues 396-573) also contained bound APS. This domain consists of a central five-stranded parallel β-sheet sandwiched between two α-helical bundles and contains a classic α/β purine nucleotide binding motif (Schulz and Schirmer, 1974). The high degree of sequence and topological similarity of the C-terminal domain to true APS kinase leaves no doubt that this region provides the allosteric (regulatory) site for PAPS.

Oligomer Structure

Native ATP sulfurylase from P. chrysogenum is organized as a dimer of triads in the shape of a flattened ellipsoid, 134 Å in diameter × 73 Å (Fig. 2). The majority of the inter-subunit contacts are mediated through the allosteric domain. Within each triad, the subunits are arranged in a head-to-tail manner with the catalytic domain of one subunit contacting the allosteric domain of the next. Each subunit makes contact with four others: two in the same triad and two in the opposing triad. These interlocking linkages provide the means of transmitting a structural change in one subunit to all the others, i.e., a way to propagate a concerted allosteric transition (Monod et al., 1965; Rubin and Changeux, 1966; Segel, 1993).
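
The concerted transition invoked here can be made concrete with the textbook saturation function of the Monod-Wyman-Changeux (MWC) model. The sketch below is a generic illustration, not a fit to ATP sulfurylase data: it assumes one substrate site per subunit of the hexamer (n = 6), and the allosteric constant `l0` and the affinity ratio `c` are hypothetical values that the text does not report.

```python
def mwc_saturation(s, k_r, c, l0, n=6):
    """Fractional saturation of an n-site oligomer in the concerted (MWC) model.

    s   : free ligand concentration
    k_r : ligand dissociation constant of the R state
    c   : K_R / K_T (c < 1 means the T state binds ligand more weakly)
    l0  : [T]/[R] in the absence of ligand (allosteric constant)
    n   : number of equivalent binding sites (6 assumed for a hexamer)
    """
    alpha = s / k_r
    r_term = alpha * (1.0 + alpha) ** (n - 1)
    t_term = l0 * c * alpha * (1.0 + c * alpha) ** (n - 1)
    return (r_term + t_term) / ((1.0 + alpha) ** n + l0 * (1.0 + c * alpha) ** n)


# Hypothetical comparison: no T state at all (l0 = 0) versus a strongly
# T-favoring equilibrium (l0 = 1000), both with c = 0.01.
for s in (0.1, 1.0, 10.0, 100.0):
    y_r_only = mwc_saturation(s, k_r=1.0, c=0.01, l0=0.0)
    y_t_favored = mwc_saturation(s, k_r=1.0, c=0.01, l0=1000.0)
    print(f"[S]/K_R = {s:6.1f}  l0=0: {y_r_only:.3f}  l0=1000: {y_t_favored:.3f}")
```

With `l0 = 0` the function reduces to a simple hyperbola; a large `l0` makes the curve sigmoidal and shifts it to higher ligand concentrations, which is how an effector that stabilizes the T state is usually represented in this model.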

Figure 2. Ribbon diagram of P. chrysogenum ATP sulfurylase viewed along the non-crystallographic 3-fold. One subunit of the hexamer is shown in dark gray. Bound APS molecules are shown as stick models.

A solvent channel, 15 to 70 Å wide, exists along the three-fold axis, but substrates have access to the catalytic site only from the external medium. On the other hand, a deep surface trench links each catalytic site in one triad with an allosteric site in the other triad. This trench may be a vestigial feature of a bifunctional (“PAPS synthetase”) ancestor of fungal ATP sulfurylase in which APS could be channeled from the sulfurylase site to the kinase site.

Active Site

Key residues of the phosphosulfate subsite include a 197QTRN200 motif and the main chain amide-N of Ala-295. The former is included within a highly conserved stretch (193VVAFQTRNPMHRAHREL209 in fungi) that forms a part of the active site pocket. The side chain amide N of Gln-197, the guanido nitrogen of Arg-199, and the main-chain amide nitrogen of Ala-295 each interact with a different non-bridging oxygen atom of the sulfuryl group (Fig. 3). Asn-200 hydrogen bonds to a non-bridging oxygen of the phosphate group. The hydroxyl of Thr-198 is also within hydrogen bonding distance of one of the phosphate non-bridging oxygens, although this residue is not conserved in all ATP sulfurylases (Deyrup et al., 1999).

Figure 3. Active site of P. chrysogenum ATP sulfurylase. The stereo view ribbon diagram shows residues that interact with bound APS. Dashed lines represent hydrogen bonds.

His-224, located just before the start of the active site switch (see below), hydrogen bonds to Arg-280, which in turn hydrogen bonds with Gln-197 of the 197QTRN200 motif. Thus, the oriented amide group can serve only as a hydrogen bond donor to sulfate. The fact that all three hydrogen bonding partners of sulfate are H-donating groups may be the basis of the sulfate/phosphate discrimination (Wang et al., 1994; Quiocho, 1996; Quiocho and Ledvina, 1996). If so, replacement of Gln-197 with Glu might permit phosphate to bind in place of sulfate. Other important residues in the active site include His-203 and His-206, which, as HXXH, are a common feature of nucleotidyl transferases (Bork et al., 1995; Veitch and Cornell, 1996). These residues probably interact with the β and γ phosphates of bound ATP. The Arg-199 and the His residues have been shown to be essential for activity (Deyrup et al., 1999; Venkatachalam et al., 1999).
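
The residue numbering of the two motifs can be checked directly against the conserved fungal stretch quoted above (residues 193-209). The short sketch below scans that stretch for the QXRN and HXXH patterns with regular expressions; the helper name `find_motifs` is hypothetical, and `.` in each pattern stands for the X (any residue) of the motif notation.

```python
import re

STRETCH = "VVAFQTRNPMHRAHREL"  # conserved fungal stretch quoted in the text
FIRST_RESIDUE = 193            # residue number of the first letter of STRETCH


def find_motifs(seq, pattern, first_residue=1):
    """Return (start_residue, end_residue, matched_text) for each non-overlapping match."""
    hits = []
    for m in re.finditer(pattern, seq):
        start = first_residue + m.start()
        end = first_residue + m.end() - 1
        hits.append((start, end, m.group()))
    return hits


qxrn = find_motifs(STRETCH, r"Q.RN", FIRST_RESIDUE)
hxxh = find_motifs(STRETCH, r"H..H", FIRST_RESIDUE)
print(qxrn)  # [(197, 200, 'QTRN')]
print(hxxh)  # [(203, 206, 'HRAH')]
```

The matches reproduce the numbering used in the text: the QXRN motif spans residues 197-200 and the HXXH motif spans His-203 to His-206.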

The structure discloses that several highly conserved regions of ATP sulfurylases participate in shaping the active site. One of these “second layer” regions is 265M~φ~GPREAφWHA~~R~N282 (φ = hydrophobic residue). This stretch contributes to the hydrophobic character of the phosphosulfate subsite and to the cohesion of the N-terminal and catalytic domains. Additionally, His-276 hydrogen bonds to His-294, which, in turn, binds to the ribose of APS. His-294 also helps to position Gln-197. Another conserved sequence is 288FIVGRDHAG~G298. This stretch forms a hydrophobic core “below” the active site pocket. The short sequence 307YGPY310 forms the top of a helix behind 291GRDHA295 and sterically positions the latter. The Arg and Asp residues have been shown to be essential for ATP sulfurylase activity (Deyrup et al., 1999) and the entire stretch may be part of a larger PPi binding loop (Bork and Koonin, 1994). The stretch 357ISGT~~R363 forms a turn and the start of a helix “above” the active site. The 379PEV381 helps to position His-203.

ATP sulfurylase prefers ATP as a substrate to other nucleoside triphosphates. The structure explains this preference by revealing that the N1 nitrogen of the adenine ring hydrogen bonds to the main chain amide of Val-333 (Fig. 3). Pyrimidine nucleotides would lie too far from Val-333, and guanine nucleotides would be protonated on N1 and thus unable to form that hydrogen bond. The 2′-hydroxyl of the APS ribose hydrogen bonds with the main chain amide nitrogen of Arg-292, the main chain carbonyl oxygen of Arg-292, and the main chain amide nitrogen of His-294. These interactions probably account for the enzyme’s preference for ATP over 2′-deoxy-ATP (Tweedie and Segel, 1971).

Active Site “Switch”

The crystal structures brought to our attention a loop positioned over the active site. This loop (228GLTKPGDIDHF238 in the P. chrysogenum enzyme) is designated the active site “switch”. The switch was in a “down” position in the structure of fungal sulfurylase containing bound APS, with Asp-234 interacting directly with Arg-199 of the phosphosulfate subsite. Asp-234 probably functions to attenuate the charge on Arg-199, allowing the latter to interact with a sulfate oxygen via a preferred hydrogen bond, as opposed to a salt linkage (Quiocho, 1996; Quiocho and Ledvina, 1996). (A full electrostatic interaction might not allow optimal placement of all three H-bonding oxygens of sulfate.) If Asp-234 does indeed orient Arg-199 in this manner, mutation of the former to Ala should have a major effect on the affinity of the enzyme for sulfate or APS.

In the crystal structure of the E·PAPS complex (to be reported elsewhere) the active site was empty and the switch was in an “up” position, as it was in the structure of the ligand-free Riftia symbiont enzyme (Beynon et al., 2001). This might mean that the position of the switch correlates with occupancy or affinity of the active site. The high thermal B-factors observed for switch residues and the presence of a flexible (and highly conserved) Gly residue at hinge position 228 are consistent with mobility of the switch. The E·PAPS complex of the P. chrysogenum enzyme is almost certainly in the low substrate affinity T state. Consequently, it is also possible that the switch position correlates with the allosteric state of the fungal enzyme. In fact, both correlations may be true. That is, the allosteric effector may not actually induce a new or unique enzyme conformation, but rather, may act to stabilize or exaggerate a low affinity transient conformation that is already utilized in the catalytic cycle. Thus the “down” switch position may correspond to the R state; the “up” position may correspond to the T state of the enzyme. (Normal catalysis may involve only small shifts in the position of the switch such that each subunit can go through a catalytic cycle independently without triggering a concerted transition of the others.) The active site switch resides in a longer sequence that is conserved in homooligomeric ATP sulfurylases from all sources as φ~φHxφxGxxKxxDφxxxxR. Sequence differences between the switches of ATP sulfurylases from sulfate assimilators and sulfur chemolithotrophs might contribute to the kinetic differences between these two classes of homooligomeric sulfurylases.

Allosteric Domain

Given the high degree of sequence homology between the C-terminal domain of P. chrysogenum ATP sulfurylase and true APS kinase, it was not surprising that they also have a high degree of structural identity. The alpha-carbon backbones of the allosteric domain and APS kinase superimpose with an rms deviation of 1.09 Å for 172 equivalent α-carbons (Fig. 4). Also, the binding interactions and the conformation of bound APS in the allosteric domain of ATP sulfurylase are identical to those in APS kinase. For example, at the allosteric site, the adenine ring of bound APS is sandwiched between Phe-446 and Phe-529 with N1 hydrogen bonded to Arg-451 (Lansdon et al., 2002). The phosphosulfate moiety is bound in a bidentate salt linkage with the guanidinium group of Arg-437. These four residues are strictly conserved in true APS kinases from fungi, plants, bacteria, and animals. On the other hand, the C-terminal domain has no detectable APS kinase activity. One modification that may explain the absence of activity occurs in the P-loop region: the sequence 32GLSASGKST40 that is highly conserved among APS kinases is modified to 403GYMNSGKDA411 in the C-terminal domain of ATP sulfurylase. While the glycine residues are conserved in the P-loop required for phosphate binding, the serine that normally binds to the Mg2+ ion is mutated to aspartate in the allosteric domain, thus blocking phosphate binding. APS kinase that had been mutated to contain either the YMN or the DA sequence was inactive (MacRae et al., 1998). Also, Asn-184, which hydrogen bonds to the adenine ring of ATP in APS kinase, is altered to Phe-548 in the C-terminal domain. The aromatic ring of the Phe occupies the space that the adenine ring of bound ATP would occupy in true APS kinase.

Figure 4. Superposition of the allosteric domain of P. chrysogenum ATP sulfurylase (dark gray) with APS kinase from the same organism (light gray). Bound APS (at the top, center) and ADP (middle, left) are shown as ball and stick models. APS binds in a nearly identical manner to APS kinase and the C-terminal domain.
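
The rms deviation quoted for superimposed α-carbons is the root-mean-square distance over the matched atom pairs. The sketch below computes that quantity for two coordinate sets that are assumed to be already superimposed and paired one-to-one; it does not perform the optimal superposition itself (which would need, for example, a least-squares fitting step), and the coordinates are made-up toy values rather than data from the structures discussed here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates, in input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three "alpha-carbons" displaced by 1.0, 2.0, and 2.0 Angstroms.
domain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
domain_b = [(0.0, 0.0, 1.0), (3.8, 2.0, 0.0), (5.6, 0.0, 0.0)]
print(f"rmsd = {rmsd(domain_a, domain_b):.3f} A")
```

Because the deviations are squared before averaging, a few poorly matched atoms raise the value more than many well-matched ones lower it, which is why the number of equivalent α-carbons is reported alongside each rms deviation.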

Given the high concentration of APS used in the crystallization medium and the structural similarity of APS and PAPS, it was not surprising to find APS bound at the allosteric site. However, under normal assay conditions, the kinetics indicate that micromolar APS drives the enzyme to the R state, whereas PAPS drives the enzyme to the T state (MacRae and Segel, 1997). This pattern means that APS has a higher affinity for the active site than for the allosteric site, while the opposite is true for PAPS. Thus, the presence of APS at the allosteric site may be a crystallization artifact. In the E·APS·APS structure, Asp-343 forms a hydrogen bond to the 3′-OH of the APS ribose, an interaction that could not occur when PAPS is bound at the allosteric site.

The allosteric domain in the fungal ATP sulfurylase makes extensive contacts with other subunits within the hexamer. These interactions not only serve to propagate a concerted allosteric transition upon PAPS binding, but also help to stabilize the hexamer. Deletion of this domain by site-directed mutagenesis yields a catalytically active, but thermally labile, monomeric enzyme (to be reported elsewhere).

Intertriad Trench

The high degree of sequence and structural homology between the allosteric domain of fungal ATP sulfurylase and APS kinase suggests that present day fungal ATP sulfurylase may have evolved from an ancestral bifunctional “PAPS synthetase”. This possibility is supported by the observations that (a) the structurally similar enzyme from Aquifex aeolicus has some APS kinase activity (Hanna et al., 2002) and (b) a trench in the P. chrysogenum enzyme connects the active site with a trans-triad allosteric site (Fig. 5). The structure and kinetic properties of the fungal enzyme with its allosteric domain replaced by true APS kinase would be of great interest. (The enzyme from animal cells is bifunctional and has been reported to channel APS between the ATP sulfurylase and APS kinase sites (Lyle et al., 1994). However, in that enzyme, the APS kinase domain is N-terminal to the ATP sulfurylase domain, so its structure is probably different from that of the fungal enzyme.)

Figure 5. Surface diagram (stereo view) showing the “trench” that links the catalytic site of a subunit in one triad with the allosteric site of a symmetry-related subunit in the opposite triad. APS molecules at each site are shown as CPK space-filling models.

S. CEREVISIAE ATP SULFURYLASE

ATP sulfurylase from yeast (Ullrich et al., 2001; Ullrich and Huber, 2001) has a hexameric structure that is very similar to that of the P. chrysogenum enzyme (Fig. 6A). The N-terminal and catalytic domains of the two enzymes (residues 1-395) are 67% identical in sequence and superimpose with an rms deviation of 0.72 Å for 363 equivalent α-carbons. In the absence of PAPS, the yeast and P. chrysogenum enzymes have almost identical kinetic properties (Foster et al., 1994) and, not surprisingly, the active sites of the two enzymes are almost identical. All key residues involved in substrate binding and/or catalysis are conserved.

At first glance, there appear to be no similarities between the C-terminal domains of the P. chrysogenum and yeast ATP sulfurylases. The sequences do not align and the latter is about 50 residues shorter. Nevertheless, the topology of the yeast C-terminal domain reveals that it too must have evolved from APS kinase. In fact, the cores of the yeast and P. chrysogenum C-terminal domains superimpose with an rms deviation of 0.84 Å for 83 equivalent α-carbons (Fig. 6B). Yeast ATP sulfurylase is not allosterically inhibited by PAPS (Renosto et al., 1990). This is not surprising considering that many residues responsible for binding PAPS are missing. For example, the mobile lid element which forms half the binding pocket for (P)APS in true APS kinase (Lansdon et al., 2002) and in the allosteric domain of the P. chrysogenum enzyme is completely deleted in the yeast enzyme. The yeast C-terminal domain may be a remnant of the APS kinase domain of an ancestral “PAPS synthetase”. Parts of the domain may have been retained in order to stabilize the hexameric structure. While the domain cores are similar in structure, the yeast enzyme is missing all the residues used by the P. chrysogenum enzyme to bind PAPS.

Figure 6A. Hexameric ATP sulfurylases from P. chrysogenum (left) and from yeast (right). The two enzymes have structurally similar N-terminal and catalytic domains, but differ in their C-terminal domains. Each subunit is shown in a different shade of gray.

Figure 6B. Superposition of C-terminal domains of ATP sulfurylase from P. chrysogenum (dark gray) and the yeast enzyme (light gray).

ATP SULFURYLASE FROM A CHEMOLITHOTROPHIC BACTERIUM

Sulfur chemolithotrophs inhabit an environment that is rich in reduced inorganic sulfur compounds and, consequently, have no need of a sulfate activation and reduction pathway. Nevertheless, many of these organisms possess very high levels of ATP sulfurylase. The enzyme functions in the ATP synthesis direction: APS (formed by the AMP-coupled oxidation of inorganic sulfite) reacts with PPi (produced by ATP-dependent biosynthetic reactions) to regenerate ATP in a reaction that is extremely favorable thermodynamically. Compared to the ATP sulfurylases from sulfate assimilators, the enzymes from sulfur chemolithotrophs have markedly increased Michaelis constants for sulfate (Renosto et al., 1991; Hanna et al., 2002). This feature is consistent with a physiological role of releasing sulfate as a product into a high sulfate environment. In order to explore the structural basis of the kinetic difference between ATP sulfurylases from sulfate assimilators and sulfur chemolithotrophs, the structure of the enzyme from the Riftia symbiont (Cavanaugh et al., 1981) was determined at 2.2 Å (Beynon et al., 2001). That enzyme has the same overall structure and topology as the fungal enzyme except that (a) there is no analogous C-terminal domain and (b) the Riftia symbiont enzyme is a dimer (Fig. 7).

Figure 7. Structure of ATP sulfurylase from the Riftia symbiont. The enzyme crystallized with one subunit per asymmetric unit. The crystallographic 2-fold axis is vertical in the plane of the page. The N-terminal and catalytic domains of the subunit on the left are shown in medium and light gray, respectively. The 2-fold related subunit is shown in dark gray.

The N-terminal and catalytic domains of the Riftia symbiont and fungal enzymes superimpose with rms deviations of 1.77 Å and 1.64 Å, respectively. Compared to the interface between catalytic domains of P. chrysogenum or yeast ATP sulfurylases, one subunit of the Riftia symbiont enzyme is rotated 90° (Fig. 8). This results in a much greater buried surface area compared to the other enzymes (1576 Å² in the Riftia enzyme compared to 332 Å² in the fungal enzyme). The rotation places the active site switch in close proximity to helix α1 of the partner subunit, where the switch is held in an “open” position by a hydrophobic interaction. In that open position the switch cannot interact with the phosphosulfate subsite QXRN residues. This small structural difference might explain the high Km for sulfate exhibited by ATP sulfurylase of sulfur chemolithotrophs (Renosto et al., 1991; Hanna et al., 2002). (It may also be the reason why sulfate was not bound at the phosphosulfate subsite in the crystal grown in 2.3 molar ammonium sulfate.) Catalytic dimer partners within the fungal hexamer are rotated differently and unable to enter into the Riftia type of hydrophobic interaction.

Figure 8. Comparison of the dimer interface of the Riftia symbiont ATP sulfurylase (left) with the interface between catalytic domains of the P. chrysogenum enzyme (right). Opposing subunits are shown in different shades of gray. One partner of the P. chrysogenum catalytic domain “dimer” (shown in light gray) is shown in the same orientation as the light gray subunit of the Riftia symbiont enzyme. The crystallographic 2-fold axes that relate both sets of dimer partners are shown as black lines. The views show the proximity of the active site switch in the Riftia symbiont (left subunit) to helix α1 of the 2-fold related subunit. There is no equivalent helix in the P. chrysogenum enzyme.

Figure 9. Superposition of the active sites from three ATP sulfurylases whose structures are known. In this stereo view, the enzymes from the Riftia symbiont, P. chrysogenum, and yeast are shown (in order) in light, dark, and medium gray. Bound APS, as observed in the P. chrysogenum and yeast enzymes, is shown with white carbon bonds. The active site switch is in the “open” position in the Riftia symbiont structure. Residues that participate in binding sulfate or the phosphosulfate group are shown with numbering corresponding to the Riftia symbiont enzyme.

CONCLUDING REMARKS

As shown in Fig. 9, the crystal structures of ATP sulfurylases from three different organisms reveal a high degree of structural homology at their active sites despite differences in kinetic behavior. With such closely related structures, the imposition of an allosteric response, or kinetic optimization for a particular direction, does not rely on differences in active site residues, but rather on the status of “second layer” residues. “Second layer” residues are those that do not interact directly with bound substrates or products, but rather, interact with and “tune” the properties of the active site residues. One important “second layer” region is the mobile active site switch that interacts with the conserved phosphosulfate subsite, QXRN. Movement of this switch may accompany or control substrate binding and product release during the normal catalytic cycle. The structures of the fungal enzymes that we have determined suggest that PAPS binding to the allosteric domain may act by changing the steady state distribution of enzyme molecules between the “closed” and “open” switch conformations. In effect, the allosteric transition may have evolved by appropriating or exaggerating the normal movement of this switch. Similarly, an intrinsically altered switch position might account for differences in sulfate Km values between ATP sulfurylases from sulfate assimilators and sulfur chemolithotrophs.

ACKNOWLEDGEMENTS

Research on the P. chrysogenum and Riftia symbiont enzymes was supported by NSF Grant MCB-9904003 to I.H.S. and A.J.F. and by the facilities of the W.M. Keck Foundation Center for Structural Biology at the University of California, Davis.

REFERENCES

1. Beynon, J. D., MacRae, I. J., Huston, S. L., Nelson, D. C., Segel, I. H. and Fisher, A. J. Crystal structure of ATP sulfurylase from the bacterial symbiont of the hydrothermal vent tubeworm, Riftia pachyptila. Biochemistry, 2001, 40: 14509 - 14517.

2. Bork, P., Holm, L., Koonin, E. V. and Sander, C. The cytidyltransferase superfamily: Identification of the nucleotide-binding site and fold prediction. Proteins: Structure, Function, and Genetics, 1995, 22: 259 - 266.

3. Bork, P. and Koonin, E. V. A P-loop-like motif in a widespread ATP pyrophosphatase domain: Implications for the evolution of sequence motifs and enzyme activity. Proteins: Structure, Function, and Genetics, 1994, 20: 347 - 355.

4. Cavanaugh, C. M., Gardiner, S. L., Jones, M. L., Jannasch, H. W. and Waterbury, J. B. Prokaryotic cells in the hydrothermal vent tube worm Riftia pachyptila Jones: Possible chemoautotrophic symbionts. Science, 1981, 213: 340 - 342.

5. Deyrup, A. T., Krishnan, S., Singh, B. and Schwartz, N. B. Activity and stability of recombinant bifunctional rearranged and monofunctional domains of ATP sulfurylase and adenosine 5'-phosphosulfate kinase. J. Biol. Chem., 1999, 274: 10751 - 10757.

6. Deyrup, A. T., Singh, B., Krishnan, S., Lyle, S. and Schwartz, N. B. Chemical modification and site-directed mutagenesis of conserved HXXH and PP-loop motif arginines and histidines in murine bifunctional ATP sulfurylase/adenosine 5'-phosphosulfate kinase. J. Biol. Chem., 1999, 274: 28929 - 28936.

7. Foster, B. A., Thomas, S. M., Mahr, J. A., Renosto, F., Patel, H. and Segel, I. H. Cloning and sequencing of ATP sulfurylase from Penicillium chrysogenum: Identification of a likely allosteric domain. J. Biol. Chem., 1994, 269: 19777 - 19786.

8. Hanna, E., MacRae, I. J., Medina, D. C., Fisher, A. J. and Segel, I. H. ATP sulfurylase from the hyperthermophilic chemolithotroph Aquifex aeolicus. Arch. Biochem. Biophys., 2002, 406: 275 - 288.

9. Holm, L. and Sander, C. Protein structure comparison by alignment of distance matrices. J. Mol. Biol., 1993, 233: 123 - 138.

10. Izard, T. and Geerlof, A. The crystal structure of a novel bacterial adenylyltransferase reveals half of sites reactivity. EMBO J., 1999, 18: 2021 - 2030.

11. Lansdon, E. B., Segel, I. H. and Fisher, A. J. Ligand-induced structural changes in Adenosine 5'-phosphosulfate (APS) kinase from Penicillium chrysogenum. Biochemistry, 2002, 41: 13672 - 13680.

12. Lyle, S., Ozeran, J. D., Stanczak, J., Westley, J. and Schwartz, N. B. Intermediate channeling between ATP sulfurylase and adenosine 5'-phosphosulfate kinase from rat chondrosarcoma. Biochemistry, 1994, 33: 6822 - 6827.

13. MacRae, I., Rose, A. B. and Segel, I. H. Adenosine 5'-Phosphosulfate (APS) Kinase from Penicillium chrysogenum: Site Directed Mutagenesis at Putative Phosphoryl-Accepting and ATP P-loop Residues. J. Biol. Chem., 1998, 273: 28583 - 28589.

14. MacRae, I. and Segel, I. H. ATP sulfurylase from filamentous fungi: Which sulfonucleotide is the true allosteric effector? Arch. Biochem. Biophys., 1997, 337: 17 - 26.

15. MacRae, I. J., Segel, I. H. and Fisher, A. J. Crystal structure of ATP sulfurylase from Penicillium chrysogenum. FASEB J., 2001, 15: Abstract No. 192.199; p. A203.

16. MacRae, I. J., Segel, I. H. and Fisher, A. J. Crystal structure of ATP sulfurylase from Penicillium chrysogenum: Insights into the allosteric regulation of sulfate assimilation. Biochemistry, 2001, 40: 6795 - 6804.

17. Martin, R. L., Daley, L. A., Lovric, Z., Wailes, L. M., Renosto, F. and Segel, I. H. The "Regulatory" Sulfhydryl Group of Penicillium chrysogenum ATP Sulfurylase: Cooperative Ligand Binding After SH Modification; Chemical and Thermodynamic Properties. J. Biol. Chem., 1989, 264: 11768 - 11775.

18. Monod, J., Wyman, J. and Changeux, J.-P. On the nature of allosteric transitions: A plausible model. J. Molec. Biol., 1965, 12: 88 - 118.

19. Quiocho, F. A. Atomic basis of the exquisite specificity of phosphate and sulfate transport receptors. Kidney Int., 1996, 49: 943 - 946.

20. Quiocho, F. A. and Ledvina, P. S. Atomic structure and specificity of bacterial periplasmic receptors for active transport and chemotaxis: variation of common themes. Mol. Microbiol., 1996, 20: 17 - 25.

21. Renosto, F., Martin, R. L., Borrell, J. L., Nelson, D. C. and Segel, I. H. ATP Sulfurylase from Trophosome Tissue of Riftia pachyptila (Hydrothermal Vent Tube Worm). Arch. Biochem. Biophys., 1991, 290: 66 - 78.

22. Renosto, F., Martin, R. L. and Segel, I. H. ATP Sulfurylase from Penicillium chrysogenum: Molecular Basis of the Sigmoidal Velocity Curves Induced by Sulfhydryl Group Modification. J. Biol. Chem., 1987, 262: 16279 - 16288.

23. Renosto, F., Martin, R. L., Wailes, L. M., Daley, L. A. and Segel, I. H. Regulation of Inorganic Sulfate Activation in Filamentous Fungi: Allosteric Inhibition of ATP Sulfurylase by 3'-Phosphoadenosine-5'-Phosphosulfate. J. Biol. Chem., 1990, 265: 10300 - 10308.

24. Renosto, F., Seubert, P. A., Knudson, P. and Segel, I. H. Adenosine 5'-Phosphosulfate kinase from Penicillium chrysogenum: Determining ligand dissociation constants of binary and ternary complexes from the kinetics of enzyme inactivation. J. Biol. Chem., 1985, 260: 11903 - 11913.

25. Renosto, F., Seubert, P. A. and Segel, I. H. Adenosine-5'-Phosphosulfate Kinase from Penicillium chrysogenum: Purification and Kinetic Characterization. J. Biol. Chem., 1984, 259: 2113 - 2123.

26. Rossmann, M. G., Moras, D. and Olsen, K. W. Chemical and biological evolution of a nucleotide-binding protein. Nature, 1974, 250: 194 - 199.

27. Rould, M. A., Perona, J. J. and Steitz, T. A. Structural basis of anticodon loop recognition by glutaminyl-tRNA synthetase. Nature, 1991, 352: 213 - 218.

28. Rubin, M. M. and Changeux, J.-P. On the nature of allosteric transitions: Implications of non-exclusive ligand binding. J. Mol. Biol., 1966, 21: 265 - 274.

29. Schulz, G. E. and Schirmer, R. H. Topological comparison of adenylyl kinase with other proteins. Nature, 1974, 250: 142 - 144.

Page 254: Conformational Proteomics of Macromolecular Architecture

Optimizing an Enzyme for Its Physiological Role 24 1

30. Segel, 1. H. Enzyme Kinetics: Behavior and Analysis of Rapid Equilibrium and Steady-State Enzyme Systems. New York, Wiley-Interscience, 1993.

31. Tweedie, J. W. and Segel, I. H. ATP sulfurylase from Penicillium chry- sogenum. I. Purification and characterization. Prep. Biochem., 1971, 1: 91 - 117.

32. Ullrich, T. C., Blaesse, M. and Huber, R. Crystal structure of ATP sulfury- lase from Saccharomyces cerevisiae, a key enzyme in sulfate activation.

33. Ullrich, T. C. and Huber, R. The complex structures of ATP sulfurylase with thiosulfate, ADP, and chlorate reveal new insights in inhibitory effects and the catalytic cycle. J. Mol. Biol., 2001, 313: 11 17 - 1125.

34. Veitch, D. P. and Cornell, R. B. Substitution of serine for gly-91 in the HXGH motif of CTP:phosphocholine cytidyltransferase implicates this mo- tif in CTP binding. Biochemistry, 1996, 35: 10743 - 10750.

35. Venkatachalam, K. V., Fuda, H., Koonin, E. V. and Strott, C. A. Site- directed mutagenesis of a conserved nucleotide binding HXGH motif lo- cated in the ATP sulfurylase domain of human bifunctional 3'- phosphoadenosine 5'-phosphosulfate synthase. J. Biol. Chem., 1999, 274: 2601 - 2604.

36. Wang, Z., Choudhary, A., Ledvina, P. S. and Quiocho, F. A. Fine tuning the specificity of the periplasmic phosphate transport receptor. Site-directed mutagenesis, ligand binding, and crystallographic studies. J Biol Chem,

EMBO J , 2001, 20: 316-329.

1994,269: 2509 1-25094.


PART V RIBOSOMES


Chapter 12

RIBOSOMAL CRYSTALLOGRAPHY: DYNAMICS, FLEXIBILITY AND PEPTIDE BOND FORMATION

Ada Yonath*

High-resolution crystal structures of functionally active ribosomal particles provide unique tools for understanding key questions concerning ribosomal function, mobility, dynamics, and involvement in cellular regulation. Structure analysis of complexes of ribosomal particles with substrate analogs and universal drugs indicated that ribosomes provide the structural frame for precise positioning of the tRNA molecules rather than participate in the catalytic event, and that the peptide bond is formed by a nucleophilic attack of the amino moiety of the residue bound to the A-site tRNA on the carbonyl carbon at the P-site. Clinically relevant antibiotics interact almost exclusively with rRNA. They interfere with substrate binding, limit conformational mobility, block the nascent chain exit tunnel or hinder the progression of growing peptide chains.

Keywords: Protein synthesis, ribosomes, translation factors, GTP hydrolysis, L12.

INTRODUCTION

In rapidly growing cells, the compounds involved in the translation of the genetic code into proteins constitute about half of the cell's dry weight and consume up to 80% of the cell's energy. This fundamental life process involves the participation of more than a hundred components; among them is the ribosome, the largest known macromolecular enzyme.

*Department of Structural Biology, Weizmann Institute of Science, 76100 Rehovot, Israel. Email address: yonath@mpgars.desy.de


Ribosomes are the universal cellular organelles built of two subunits of unequal size. The prokaryotic small ribosomal subunit (called 30S) has a molecular weight of 8.5 × 10^5 Dalton and contains one RNA chain of over 1500 nucleotides and 20 proteins. The prokaryotic large ribosomal subunit (called 50S) has a molecular weight of 1.5 × 10^6 Dalton and contains two RNA chains with a total of about 3000 nucleotides and around 35 proteins.
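As a rough consistency check on these figures, the share of each subunit's mass contributed by RNA can be estimated. The sketch below is illustrative only: the subunit masses and nucleotide counts are the ones quoted above, whereas the average mass per RNA nucleotide (about 330 Dalton) is an assumed round value that does not come from the text.

```python
# Rough mass bookkeeping for the prokaryotic ribosomal subunits.
# Subunit masses and nucleotide counts are the figures quoted in the text.
# AVG_NT_MASS_DA is an ASSUMED average mass per RNA nucleotide (illustrative).
AVG_NT_MASS_DA = 330.0

SUBUNITS = {
    "30S": {"mass_da": 8.5e5, "nucleotides": 1500, "proteins": 20},
    "50S": {"mass_da": 1.5e6, "nucleotides": 3000, "proteins": 35},
}


def rna_mass_fraction(mass_da, nucleotides, nt_mass_da=AVG_NT_MASS_DA):
    """Return the estimated fraction of the subunit mass that is RNA."""
    return nucleotides * nt_mass_da / mass_da


for name, subunit in SUBUNITS.items():
    fraction = rna_mass_fraction(subunit["mass_da"], subunit["nucleotides"])
    print(f"{name}: RNA is roughly {fraction:.0%} of the subunit mass")
```

Under this assumption RNA accounts for well over half of the mass of either subunit, in line with the later observation that ribosomal RNA dominates the structure of both particles.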

The smaller subunit has key roles in the initiation of the translation process, in decoding the genetic message, in discriminating against non- and near-cognate amino-acylated tRNA molecules, and in controlling the fidelity of codon-anticodon interactions. The larger subunit contains the peptidyl transferase center, the site where the peptide bonds are created. Upon initiation of protein synthesis, the two ribosomal subunits associate to form the functionally active 70S ribosome, utilizing amino acids brought to it by amino-acylated tRNA molecules. Within the ribosome there are three binding sites for transfer RNA (tRNA), designated the P (peptidyl), A (aminoacyl) and E (exit) sites, each of which is located partly on the small and partly on the large subunit. The anticodon loops of the three tRNA molecules bind to the small subunit, whereas the acceptor stems bind to the large subunit. Both subunits work together to translocate all three tRNA molecules and the associated mRNA chain by precisely one codon with respect to the ribosome. The entire process depends on an energy source, the hydrolysis of GTP, and on several extrinsic cellular protein factors.

The ribosome is a precisely engineered molecular machine that performs an intricate multi-step process requiring smooth and rapid switches between different conformations. Both ribosomal subunits can undergo reversible alterations and contain structural elements that participate in global motions together with local rearrangements. One of the major events in protein biosynthesis that requires significant mobility of both ribosomal subunits is the GTPase-dependent translocation. In the course of protein biosynthesis, once a peptide bond is formed, the P-site tRNA is deacylated and its acceptor end moves to the E (exit) site, while the A-site tRNA, carrying the nascent chain, moves into the P-site. This fundamental act in the elongation cycle of protein synthesis is called translocation. It may be performed by a simple translation of entire tRNA molecules, according to the classical three-site model (Rheinberger et al., 1981, Lill and Wintermeyer, 1987), or it may incorporate an additional intermediate hybrid state. According to the latter proposal, translocation occurs in two discrete steps. In the first, spontaneous step, which occurs right after peptide bond formation, the tRNA acceptor end moves relative to the large subunit. In the second step, which is promoted by EF-G, the anticodon moves relative to the small subunit (Moazed and Noller, 1989, Wilson and Noller, 1998).
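The difference between the two translocation models can be made concrete with a small bookkeeping sketch. It is a toy illustration, not a mechanistic model: each tRNA is tracked only by the site it occupies on the small and on the large subunit, and the function names are invented for this example.

```python
# Toy bookkeeping of tRNA positions in the two translocation models.
# Each tRNA is a pair: (site on the small subunit, site on the large subunit).
# Function names are hypothetical and chosen only for this illustration.
NEXT_SITE = {"A": "P", "P": "E"}  # direction of movement through the ribosome


def classical_step(trnas):
    """Classical three-site model: each tRNA moves as a whole."""
    return [(NEXT_SITE[small], NEXT_SITE[large]) for small, large in trnas]


def hybrid_step_1(trnas):
    """Spontaneous step: acceptor ends move relative to the large subunit."""
    return [(small, NEXT_SITE[large]) for small, large in trnas]


def hybrid_step_2(trnas):
    """EF-G-promoted step: anticodons move relative to the small subunit."""
    return [(NEXT_SITE[small], large) for small, large in trnas]


# State right after peptide bond formation: one tRNA in A, one in P.
start = [("A", "A"), ("P", "P")]
hybrid = hybrid_step_1(start)   # intermediate hybrid state
final = hybrid_step_2(hybrid)   # state after the second step

print("hybrid state:", hybrid)
print("final state: ", final)
print("same end point as classical model:", final == classical_step(start))
```

Both routes end in the same arrangement; they differ only in whether an intermediate exists in which the two ends of a tRNA occupy different sites on the two subunits.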

Antibiotics are natural or man-made compounds designed to interfere with bacterial metabolism and eliminate bacteria by inhibiting the biosynthesis of protein, DNA or cell-wall components. About 40% of the known antibiotics interfere with protein biosynthesis. The ribosome is one of the main binding targets for a broad range of natural and synthetic antibiotics. Structurally diverse natural as well as synthetic compounds efficiently inhibit ribosomal function (Cundliffe, 1981, Spahn and Prescott, 1996). Theoretically the ribosome offers multiple opportunities for the binding of small compounds, but practically all the known drugs utilize only a few sites. Biochemical information about the binding and action of antibiotics on the ribosome has been accumulated for almost four decades.

Puromycin played a central role in biochemical experiments aimed at understanding the mechanism of peptide bond formation (Pestka, 1977, Vazquez, 1979, Gale et al., 1981, Porse and Garrett, 1995, Rodriguez-Fonseca et al., 2000) since it can bind to the A-site (Moazed and Noller, 1991, Monro et al., 1969, Smith et al., 1965, Traut and Monro, 1964) as well as to the P-site, albeit to a lower extent (Bourd et al., 1983, Kirillov et al., 1997). Puromycin, which was named "antibiotic agent" because it is a product of a microorganism, is a universal ribosome inhibitor, since it binds to all ribosomes. It binds at the peptidyl transferase center (PTC) (Rodriguez-Fonseca et al., 2000) and as such is used for studies on the mechanism of peptide bond formation. Unlike other antibiotics, puromycin does not lead to drug resistance by mutations in the PTC (Garrett and Rodriguez-Fonseca, 1995). Puromycin is partially co-structural with the 3' terminus of aminoacyl-tRNA (Harms et al., 2001), but its aminoacyl residue is linked via an amide bridge rather than an ester bond. Puromycin is known to bind weakly to the large subunit, and a high concentration of methanol or ethanol is required to enhance its binding. Puromycin probing in the presence of an active donor substrate can result in peptide bond formation (Odom et al., 1990), which is uncoupled from movement of the A-site tRNA (Green et al., 1998). No further synthesis can take place since the amide bond of puromycin cannot be cleaved; hence the peptidyl-puromycin so obtained falls off the ribosome.

Among the antibiotics that target ribosomes, the macrolides have the highest clinical usage. They act against gram-positive aerobes and some gram-negative aerobes. Most macrolides have a broad-spectrum antimicrobial activity and are used primarily for respiratory, skin and soft tissue infections. The macrolide family is large and structurally diverse. The central component of the macrolides is a lactone ring. The 14-member ring macrolides are among the most important antibiotics. Better stability and an improved spectrum of activity characterize macrolides of the second generation, such as clarithromycin or roxithromycin. The subsequent rapid spread of antibiotic-resistant strains has stimulated the search for additional novel derivatives. The macrolides of the third generation, the ketolides, show an improved activity profile and are more active against certain macrolide-resistant strains.

The high-resolution structures of the two ribosomal subunits from eubacteria were found suitable to serve as pathogen models. Using them as references allowed unambiguous localization of over a dozen antibiotic drugs, most of which are clinically relevant (Brodersen et al., 2000, Carter et al., 2000, Pioletti et al., 2001, Schluenzen et al., 2001, Hansen et al., 2002). Co-crystals were grown, each containing a complex of one of the ribosomal subunits and an antibiotic agent at a clinically relevant concentration. Alternatively, crystals of ribosomal particles were soaked in solutions containing antibiotics at clinically relevant concentrations. In most cases the co-crystals of antibiotics and the ribosomal subunits yielded crystallographic data of a quality that was sometimes better than that obtained from crystals of free particles (Carter et al., 2000, Schluenzen et al., 2001), presumably because the antibiotics reduce internal motions of flexible regions and increase homogeneity.


Analysis of the structures of the antibiotic complexes showed diversity in the antibiotics' modes of action, such as interference with substrate binding, hindrance of the mobility required for the biosynthetic process and blockage of the tunnel that provides the path of the nascent proteins. Most of the antibiotics studied by us were found to bind primarily to ribosomal RNA and, except for two that caused allosteric effects, their binding did not cause major conformational changes.

SOME HISTORICAL COMMENTS

High-resolution crystal structures of ribosomal particles and of their complexes with substrate analogues, inhibitors and antibiotics, currently emerging at an impressive speed, have led to a quantum jump in our understanding of the translation process. These studies were conceived over two decades ago (Yonath et al., 1980), once we established that the key to high-resolution data is to crystallize highly active, homogeneous preparations of robust ribosomal particles under conditions similar to their in situ environments and to minimize crystal heterogeneity by inducing selected conformations within the crystals. An alternative approach is to design complexes containing ribosomes at defined functional stages, such as complexes of the entire ribosome with tRNA and mRNA molecules (Hansen et al., 1990). This approach was later adopted, refined and extended, and has led to a medium-resolution structure of the ribosome with three tRNA molecules (Yusupov et al., 2001).

Robust ribosomal particles were chosen on the assumption that they would maintain their integrity during preparation and hence should provide suitable material for crystallization. We focused on thermophilic bacteria, Bacillus stearothermophilus and Thermus thermophilus, as well as on Haloarcula marismortui, the archaeon living in the Dead Sea, the lake of the highest salinity worldwide. A recent addition is Deinococcus radiodurans, an extremely robust gram-positive mesophilic eubacterium with a ribosome that shares extensive similarity with the ribosomes of Escherichia coli and T. thermophilus. This species was originally identified as a contaminant of irradiated canned meat, and was later isolated from environments that are either very rich or extremely poor in organic nutrients, ranging from soil and animal feces to weathered granite in a dry Antarctic valley, room dust, wastes of atomic piles and irradiated medical instruments. It is also the organism with the highest level of radiation resistance currently known. It survives under conditions that cause DNA damage, such as hydrogen peroxide and ionizing or ultraviolet radiation. It contains systems for DNA repair, DNA damage export, desiccation and starvation recovery, and genetic redundancy (White et al., 1999).

The first crystals that yielded some crystallographic information were grown from the large subunit of B. stearothermophilus (Yonath et al., 1980, Yonath et al., 1984). The large ribosomal subunit from H. marismortui (Shevack et al., 1985) yielded, after a few years, high-resolution diffraction (Makowski et al., 1987, von Bohlen et al., 1991). Crystals of the large and small subunits from T. thermophilus, T50S (Muessig et al., 1989, Volkmann et al., 1990) and T30S (Yonath et al., 1988), respectively, diffracting to low resolution, were grown in parallel. Microcrystals of the latter were also obtained by the Russian group headed by A. Spirin and B. Weinstein (Trakhanov et al., 1987).

The crystals of the large ribosomal subunit from D. radiodurans and of its complexes with antibiotics and substrate analogs were grown and kept under conditions almost identical to those optimized for maximizing their biological activity (Schluenzen et al., 2001, Harms et al., 2001). These crystals were found to provide an excellent system to investigate peptide bond formation (Bashan et al., 2002); to gain more insight into functional flexibility (Yonath, 2002, Zarivach et al., 2002); to extend the information on antibiotic binding towards rational drug design; and to identify the exit tunnel gate and reveal the structural basis for the involvement of the ribosome in cellular regulation (Berisio et al., 2002).

Over the years it was found that all ribosomal crystals present challenging technical problems, owing to their enormous size, their complexity, their natural tendency to deteriorate and disintegrate, their internal flexibility and their extreme sensitivity to irradiation. Assuming that one of the main reasons for crystal decay is the progression of free radicals produced by the X-ray beam, we pioneered crystallographic data collection at cryogenic temperature (Hope et al., 1989, Yonath et al., 1987a). This procedure was found to minimize dramatically the harm caused by irradiation, and therefore rapidly became the routine way of collecting crystallographic data from biological crystals. The application of cryo crystallography, together with the advances in X-ray sources, namely the installation of third-generation synchrotrons equipped with state-of-the-art detectors, and the increased sophistication of the phasing methods, enabled us, as well as others, to handle most of the technical problems.

THE GLOBAL ORGANIZATION OF THE TWO RIBOSOMAL SUBUNITS

The overall structures of both ribosomal subunits, as determined by us (Schluenzen et al., 2000, Harms et al., 2001), are shown in Figure 1. The two subunits differ in shape and in their global organization. Whereas the small one is built of distinct structural domains, the core of the large subunit seems to be more compact. In both subunits the ribosomal RNA dominates most of the structure. We placed mRNA and tRNA in the ribosomal particles by reference to the structure of the complex of the entire ribosome with three tRNA molecules that was determined at 5.5 Å resolution (Yusupov et al., 2001). This placement reconfirmed that the anticodon loops of the A- and P-site tRNAs, as well as the mRNA, do not contact any ribosomal proteins.

Common to both subunits are the overall structures of the ribosomal proteins and their distribution. Almost all ribosomal proteins contain long tails or extended internal loops. In general, the globular domains are peripheral, located on the particle's surface, at its solvent side. The involvement of proteins in the stabilization of the structure is achieved mainly through their long extensions, which penetrate into rRNA regions and serve as molecular linkers, struts and supports, as observed in viruses (Huang et al., 1998). Another group of proteins have tails pointing towards the solution, similar to their positioning in the nucleosome (Luger et al., 1997), presumably acting as tentacles that enhance the binding of non-ribosomal compounds that attach to the ribosome.

A few proteins that do not have extensions are built of more than a single globular domain. These are located either at the ends of functionally important protuberances (L1, L7L12, L10, L11) or fill a gap between the central protuberance and one of the stalks (Harms et al., 2001). Most of the globular domains of the proteins are located at the periphery, and their long tails that penetrate into the RNA core are believed to stabilize its structure. Protein tails that point into the solution may act as tentacles for enhancing the binding of non-ribosomal factors participating in protein biosynthesis (Gluehmann et al., 2001, Pioletti et al., 2001, Zarivach et al., 2002). The striking architecture of the ribosome allows for substantial domain mobility. Yet, the individual structural elements are rather stable. The features that contribute to the local stability include specific RNA folds, a high G-C content at the rims of strategically located junctions, and the ribosomal proteins.

Fig. 1. The "front views" (interfaces) of the large and the small ribosomal subunits from eubacteria. A, P, and E designate the sites of the interactions of the three tRNA molecules with the small subunit (their anticodon loops with the decoding region) and with the large subunit (at their elbows).

The Small Ribosomal Subunit

The high-resolution structure of the small subunit from Thermus thermophilus has been determined by us (Schluenzen et al., 2000, Pioletti et al., 2001) and by the group of V. Ramakrishnan at MRC, UK (Wimberly et al., 2000). The particles emerging from both electron density maps are similar and contain the morphological features familiar from early electron microscopy studies (Lake, 1985, Stoffler and Stoffler-Meilicke, 1984). The main structural features of this subunit, the "head", "neck" and "body" that contains a "shoulder" and a "platform", radiate from the junction combining the head and the body (Figure 2), a location that hosts the decoding center.

The principal component of the subunit interface region is the long penultimate helix (H44), which is responsible for most of the contacts with the larger subunit within the ribosome. It consists of over 100 nucleotides, of which the only evolutionarily conserved part comprises fewer than two dozen nucleotides that are involved in decoding and in P-site tRNA binding. Helix H44 is one of three long helices that run parallel to the vertical axis of the body and are likely to transmit structural rearrangements, correlating events at the particle's far ends with the cycle of mRNA translocation at the decoding region. Transverse features, placed like ladder rungs between them, link the three longitudinal helices. Principal among these transverse helices is an inclined line extending from the shoulder to the platform. The head contains most of the 3' region of the 16S RNA, arranged mainly in short helices, in marked contrast to the long features of the body. The head has a bi-lobal architecture, with a longer helix (H34) serving as the bridge between hemispheres. It joins the body through a slender connection, made of a single RNA helix, which appears to act as a hinge during translocation.

The shoulder plays a key role in mRNA binding, as it forms the lower side of an elongated, curved channel, which we assigned as the entrance side of the path of the mRNA. A latch (Schluenzen et al., 2000), which can be described as a non-covalent body-head connection formed by the shoulder and the lower part of the head, is the feature that designates the entrance to the mRNA channel. This latch facilitates mRNA threading and provides the special geometry that guarantees processivity and ensures maximized fidelity. It controls the entrance to the mRNA channel by creating a pore of varying diameter, and its relative location may be dictated by the head twist.

The decoding region contains features from the upper part of the body and the lower part of the head. Mapping the conserved nucleotides in the 16S RNA on our structure showed remarkable conservation around this region, in accord with the universality of the decoding process. The most prominent feature in the decoding center is the upper portion of H44, which bends towards the neck and forms most of the intersubunit contacts in the assembled ribosome (Yusupov et al., 2001). Its upper bulge forms the A- and P-tRNA sites for codon-anticodon interactions. A helix, called the "switch helix" or H27, packs groove-to-groove with the upper end of H44. This helix can undergo rearrangements in its base-pairing scheme that may induce global conformational rearrangements (Lodmell and Dahlberg, 1997).

The Large Ribosomal Subunit

The availability of two high-resolution crystal structures of unbound large ribosomal subunits, the archaeal H50S (Ban et al., 2000) and the eubacterial D50S (Harms et al., 2001), as well as a lower-resolution structure of T50S within the T70S ribosome (Yusupov et al., 2001), provides a unique tool for comparative studies. In the particular case of H50S and D50S, such a comparison should shed light on the correlation between the structure, the function and the environment, as well as on phylogenetic aspects.

Both crystal structures of the large subunit are similar to the traditional shape of the large ribosomal subunit, as seen by electron microscopy (Mueller et al., 2000, Penczek et al., 1999). This view, often referred to as the "crown view", which looks like a halved pear with two lateral protuberances called the L1 and L7L12 stalks, is shown in Figures 1 and 2. The core of the large ribosomal subunit is built of interwoven RNA features. Its flat surface faces the small subunit in the 70S ribosome and its round back side faces the solvent.

Fig. 2. Top: The three-dimensional structure of T30S, emphasizing the distribution of RNA and proteins (silver: RNA, blue: proteins). Left: the interface with the large subunit. Right: side view, obtained by rotating the left view by 90 degrees around its long axis. The yellow circle shows the location of protein S2. Two conformations of this protein are shown in the middle. In yellow: the structure of this protein in native ribosomes, and in cyan: the structure of the tungstenated protein. The tungsten atoms bound to this protein are shown in red. Middle right: the two orientations of the head, seen in crystals diffracting to low resolution. Bottom left: the domains of the small subunit RNA are shown in different colors. Bottom right: the detailed view of the binding site of edeine (purple). The 30S platform is represented by two helices involved in its movement. Note the newly formed base pair (in green). The docked P-site tRNA (orange) and E-site tRNA (gold) are also shown.

The gross similarity of the rRNA fold of D50S to the available 50S structures allowed superposition of the model of D50S onto that of the 2.4 Å structure of H50S (Ban et al., 2000) and of the 50S subunit within the 5.5 Å structure of the T70S ribosome (Yusupov et al., 2001). However, we detected significant structural differences even within the conserved regions, which cannot be explained solely by expected phylogenetic variations. In addition, the ribosomal proteins show remarkable differences, even when sharing homology with their counterparts in H50S. Moreover, D50S contains several proteins that have no counterparts in H50S. We detected RNA segments replacing proteins and vice versa. Of structural interest is a three-domain protein (CTC), alongside an extended alpha-helical protein (L20) and two Zn-finger proteins (L32 and L36).

The peptidyl transferase cavity

Peptide bond formation, the principal reaction of protein biosynthesis, was localized to the large subunit over three decades ago (Monro et al., 1968, Cundliffe, 1990, Moazed and Noller, 1991, Noller et al., 1992, Garrett and Rodriguez-Fonseca, 1995, Samaha et al., 1995), in a multi-branched loop in the 23S RNA. Among the 43 nucleotides forming the PT ring, 36 are conserved in H. marismortui and D. radiodurans. Despite the high conservation, the wealth of information accumulated over the years and the availability of crystallographic structures, the molecular mechanism of peptidyl transferase (PT) activity is still not well understood. The only proposal for catalytic involvement of the ribosome that was based on a crystal structure, an acid-base catalysis (Nissen et al., 2000), generated doubts (Barta et al., 2001, Polacek et al., 2001, Thompson et al., 2001, Bayfield et al., 2001). As seen below, our results (Schluenzen et al., 2001, Harms et al., 2001, Yonath, 2002) support alternative suggestions, namely that the ribosome facilitates peptide bond formation by providing the structural frame that allows precise positioning of the tRNA molecules as well as the generation of the energy required for the formation of the peptide bond (Nierhaus et al., 1980, Samaha et al., 1995, Green and Noller, 1997, Pape et al., 1999, Polacek et al., 2001).

Superposition of the backbone of the structures of the PT center (PTC) in the two unbound large subunits, H50S and D50S, on that of the bound large subunit within T70S shows similar, but not identical, folds. The orientations of some of the nucleotides, however, show distinct differences (Yusupov et al., 2001, Harms et al., 2001). It is possible that the different orientations reflect the flexibility needed for the formation of the peptide bond. It is also possible, however, that the different orientations result from differences in the functional states of the 50S subunit in the two crystal forms, consistent with the structural changes that chemical probing with dimethyl sulfate detected at distinct nucleotides of the peptidyl transferase ring upon transition between the active and inactive conformations (Bayfield et al., 2001). In support of this suggestion are experiments performed over three decades ago on the E. coli 50S subunits (Miskin et al., 1968, Vogel et al., 1971, Zamir et al., 1974), which indicated that the relative orientations of several nucleotides within the peptidyl transferase center vary upon alterations in the monovalent ion concentrations of magnitudes much lower than the modifications in the concentrations and types of the monovalent ions that were employed in the course of the determination of the structure of H50S (Ban et al., 2000).

The PTC is situated above the entrance to the polypeptide exit tunnel, a major component of the ribosome that could be detected even by conventional electron microscopy at low resolution (Milligan and Unwin, 1986, Yonath et al., 1987b). Despite the low resolution, these studies showed that this tunnel spans the large subunit from the location assumed to be the peptidyl transferase site to its lower part, and that it is about 100 Å in length and up to 25 Å in diameter (Yonath et al., 1987b), dimensions consistent with the suggestion, made more than three decades ago, that the newest synthesized part of a nascent protein is masked by the ribosome (Malkin and Rich, 1967, Sabatini and Blobel, 1970). The existence of the exit tunnel was confirmed at high resolution in H50S (Nissen et al., 2000) and in D50S (Harms et al., 2001). Based on the structure of H50S it was suggested that the walls of the tunnel have a "nonstick" character (Nissen et al., 2000).
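To give a feeling for what a tunnel length of about 100 Å means for a nascent chain, the following back-of-the-envelope sketch estimates how many residues could span it. Only the tunnel length is taken from the text; the rise-per-residue values (about 3.5 Å for a fully extended chain and about 1.5 Å for an alpha helix) are assumed textbook-style figures introduced here for illustration.

```python
# Back-of-the-envelope estimate of how many nascent-chain residues
# could span the exit tunnel. Only TUNNEL_LENGTH_A comes from the text;
# the rise-per-residue values are ASSUMED approximate figures.
TUNNEL_LENGTH_A = 100.0

RISE_PER_RESIDUE_A = {
    "extended": 3.5,     # assumed rise for a fully extended chain
    "alpha_helix": 1.5,  # assumed rise along an alpha-helical axis
}


def residues_spanning(length_a, rise_a):
    """Return the whole number of residues that fit along the given length."""
    return int(length_a // rise_a)


for conformation, rise in RISE_PER_RESIDUE_A.items():
    count = residues_spanning(TUNNEL_LENGTH_A, rise)
    print(f"{conformation}: about {count} residues span {TUNNEL_LENGTH_A:.0f} Å")
```

Under these assumptions a tunnel of this length would mask a few dozen residues of the newly synthesized chain, the exact number depending on the conformation the chain adopts inside it.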

MOBILITY, FLEXIBILITY AND FUNCTIONAL ACTIVITY

From the initial stage of ribosomal crystallography our aim was to elucidate structures of ribosomal particles trapped at functionally relevant conformations. We developed two approaches: (a) crystallize and maintain the crystals under close to physiological conditions, or (b) activate the crystallized subunits and stabilize the so obtained conformations. Although neither of these approaches is simple or routine, we exploited them for the determination of high-resolution functionally relevant structures.

Conformational Mobility of the Small Ribosomal Subunit

The small subunit is built of loosely attached domains (Figure 2) and contains structural elements that allow local rearrangements as well as the global motions required for its function. Its conformational variability has been detected by cryo electron microscopy (Gabashvili et al., 2001, Stark et al., 1997), by surface RNA probing (Alexander et al., 1994), by monitoring ribosomal activity, and by the analysis of the high resolution structures of the small subunit complexes (Carter et al., 2000, Schluenzen et al., 2000, Wimberly et al., 2000, Pioletti et al., 2001, Clemons et al., 2001, Ogle et al., 2001). The conformational variability also explains why none of the available cryo-EM reconstructions were useful for extracting initial phase sets for the small subunit, whereas similar searches were performed successfully for the whole ribosome and for its large ribosomal subunit (Harms et al., 1999). Our analysis of the 30S structure led us to suggest an interconnected network of features that could allow concerted movements during translocation. This movement includes the formation of a pore of varying diameter between the head and the shoulder, and is associated with the concerted displacement of the platform. It facilitates mRNA threading and progression and provides the special geometry that guarantees processivity and ensures maximized fidelity of the biosynthetic process.


The head makes the upper boundary of the mRNA channel, and its relative location is dictated by the head twist. In addition, internal head axes may be utilized for facilitating global movements associated with protein biosynthesis. Head mobility was confirmed by molecular replacement studies that indicated that the low-resolution crystals contain at least one conformation that differs from that of the crystals diffracting to high resolution. The pivotal point for this movement is likely to be at the connection between the head and the neck, rather close to the binding site of the antibiotic spectinomycin that is known to hamper the head twist by trapping a particular conformation (Carter et al., 2000).

Trapping crystalline small subunit at functionally relevant conformations

The small ribosomal subunit is less stable than the large one. We found that by exposing 70S ribosomes to a potent proteolytic mixture, the 50S subunits remained intact, whereas the 30S subunits were completely digested (Evers et al., 1994). Similarly, large differences in the integrity of the two subunits were observed when attempting crystallization of entire ribosomes assembled from purified subunits. Crystals obtained from these preparations were found to consist only of 50S subunits (Berkovitch-Yellin et al., 1992) and the supernatant of the crystallization drop did not contain intact small subunits, but did show 30S proteins and fragmented 16S RNA chain. Consequently, among the many ribosome sources that were tested, so far only the 30S from T. thermophilus crystallized, and only one crystal-type of the small subunit was found suitable for crystallographic studies. Almost a decade was needed to minimize the severe non-isomorphism of this form, and all the procedures developed for increasing the homogeneity of these crystals are based on post-crystallization treatments.

Our approach (Tocilj et al., 1999, Schluenzen et al., 2000) was to induce a preferred conformation within the crystals, preferably a conformation with functional relevance. We exploited the commonly used heat-activation procedure, developed over 30 years ago (Zamir et al., 1971). We exposed the T30S crystals to elevated temperatures, since we suspected that their specific packing arrangement should allow post-crystallization conformational rearrangements. As the first task of the small ribosomal subunit is to form the initiation complex, we assumed that the heat induced conformation resembles this one. Once functional activation was achieved, the conformation of the particles was stabilized by incubating the crystals with minute amounts of a heteropolytungstate cluster, (NH4)6(P2W18O62)·14H2O, referred to below as W18 (Tocilj et al., 1999). The same procedure was employed for complexes of T30S with compounds that facilitate or inhibit protein biosynthesis: mRNA analogues, initiation factors and antibiotics. Soaking in solutions containing the non-ribosomal compounds in their normal binding buffer was performed at elevated temperatures. Once the functional complex was formed, the crystals were treated with the W18 cluster.

We found in the low resolution crystals of T30S various head conformations (Figure 2), including the conformation seen at high resolution. Head stability was achieved by the interactions of four W18 clusters with protein S2 (Figure 2), a large and flexible ribosomal protein, located on the solvent side of the 30S particle and connecting the head to the body. Since S2 is located on a crystallographic two-fold axis, the W18 clusters "glued" the two symmetry-related particles, hindered the movements of protein S2, and consequently also of the entire head. A similar effect was obtained by binding spectinomycin, an antibiotic agent that locks the head of the small subunit in a particular conformation, and was reported to improve the quality of the T30S crystals (Carter et al., 2000). Thus, although the mechanism for minimizing internal motions differs in the two systems, and although an effort to achieve a functionally relevant conformation was made in only one of them (Tocilj et al., 1999, Schluenzen et al., 2001), the resulting fixation of the desired conformation led to better diffracting crystals.

The W18 cluster played a dual role in the course of structure determination of T30S. In addition to minimizing the conformational heterogeneity and limiting the mobility of the crystallized particles, treatment with this cluster yielded phase information. Thirteen W18 clusters bind to each T30S particle. The individual W atoms of ten of them (total 180 atoms) could be located precisely. Most of the tungsten clusters interact with ribosomal proteins (Figure 2), in positions that may significantly reduce the global mobility of the T30S particles within the crystal network.


Pairing of T30S particles around the crystallographic two-fold axis is one of the main features of the crystallographic network in T30S crystals. The contacts holding these pairs are extremely stable, and many of them were maintained even after the rest of the crystal network was destroyed (Harms et al., 1999).

Edeine - A universal antibiotic limiting platform mobility

The small subunit is the main player in initiation of protein biosynthesis. After binding to the mRNA, the initiation complex moves in the 5' to 3' direction along the mRNA, scanning it in search of the initiator (AUG) codon (Kozak and Shatkin, 1978). Edeine is a peptide-like antibiotic agent, produced by a strain of Bacillus brevis. It contains a spermidine-type moiety at its C-terminal end and a beta-tyrosine residue at its N-terminal end (Kurylo-Borowska, 1975). As early as 1976 (Fresno et al., 1976) it was found that the universal antibiotic edeine blocks mRNA binding to the small ribosomal subunit. Further biochemical studies indicated that edeine inhibits mRNA binding by linking features critical for translocation and E-site tRNA release, and imposes constraints on the ribosomal mobility required for the translation process (Altamura et al., 1988, Odom et al., 1978).

We found that edeine binds to the platform in a position that may affect the binding of the P-site tRNA, alter the mRNA path at the E-site and hamper the interactions between the small and the large subunits (Pioletti et al., 2001). This is consistent with the finding that a subset of the 16S rRNA nucleotides protected by the P-site tRNA (Moazed et al., 1995) overlaps with those protected by edeine, kasugamycin and pactamycin (Mankin, 1997, Woodcock et al., 1991). In addition, the binding of edeine to the 30S subunit induces the formation of a new base pair (Figure 2) that may alter the mRNA path and would impose constraints on the mobility of the platform. Thus, by physically linking the mRNA and four key helices that are critical for tRNA and mRNA binding, edeine locks the small subunit into a fixed configuration and hinders the conformational changes that accompany the initiation process.

The universal effect of edeine on initiation implies that the main structural elements important for the initiation process are conserved in all kingdoms. Analysis of our results shows that all rRNA bases defining the edeine-binding site are conserved in chloroplasts, mitochondria, and the three phylogenetic domains. Among these are two conserved nucleotides along the path of the messenger. Thus, edeine shows a novel mode of action, based on limiting the ribosomal mobility and/or preventing the ribosome from adopting conformations required for its function. Furthermore, it induces an allosteric change by the formation of a new base pair, an important new principle of antibiotic action.

CONFORMATIONAL MOBILITY WITHIN THE LARGE RIBOSOMAL SUBUNIT

The structure of the large ribosomal subunit was reported to be compact and monolithic (Ban et al., 2000). Nevertheless, significant mobility was assigned to the large subunit's features that are directly involved in ribosomal functions, based on cryo electron microscopy studies (Frank and Agrawal, 2000), as well as on comparisons of the crystal structures of the entire ribosome with the structures of its large ribosomal subunit. The latter showed that most of the functionally relevant features of the large subunit assume different conformations in unbound (Harms et al., 2001, Yonath, 2002) and assembled (Yusupov et al., 2001) states. They also may become completely disordered, as in the 2.4 Å crystal structure of the large subunits from Haloarcula marismortui, H50S (Ban et al., 2000).

The conformational variability of the large subunit allows the creation of intersubunit bridges, leads to the formation of peptide bonds, facilitates tRNA release, and enables the involvement of the ribosome in cell regulation. The flexibility of the functionally relevant features is manifested in the variability of their conformations between the unbound D50S subunits and those incorporated into T70S ribosomes, as well as in their disorder in H50S. Thus, almost all of the RNA structural features known to be involved in functional aspects of protein biosynthesis are disordered in the 2.4 Å electron density map of H50S (Ban et al., 2000). These include both lateral protuberances that create the most prominent features in the typical shape of the large subunit, intersubunit bridges, and four ribosomal proteins, all of which match the list of proteins that are loosely held by the core of the particle and hence could be detached selectively from halophilic ribosomes (Franceschi et al., 1994).

A revolving door assisting the release of E-site tRNA

Although the large ribosomal subunit is known to have less conformational variability than the small subunit, it does possess various conformations that can be correlated to the functional activity of the ribosome. The most significant differences between the two structures of the unbound large subunits were found in key features known to participate in the functional activities of the ribosome. Remarkable examples are the 50S hook into the decoding region of the small subunit, and other intersubunit bridges created upon subunit association, the entire L1 arm that acts as the revolving gate for the exiting tRNA molecules, and the GTPase center.

The L1 stalk, which includes the rRNA helices and a ribosomal protein, L1, is well resolved in T70S (Yusupov et al., 2001) and in D50S (Harms et al., 2001). Comparison between the structure of the unbound 50S and the 70S ribosome indicates how the L1 arm facilitates the exit of the tRNA molecules. In the complex of T70S with three tRNA molecules, the L1 stalk interacts with the elbow of E-tRNA. This interaction seems to block the release of the E-site tRNA. In H50S, the entire L1 arm is disordered and therefore could not be traced in the electron density map (Ban et al., 2000), an additional hint of the inherent flexibility of this feature.

The location of protein L1 in D50S does not block the presumed exit path of the E-site tRNA, hence it seems that the mobility of the L1 arm is utilized for facilitating the release of E-site tRNA. Although the orientation of the L1 arm in the 70S ribosome during the release of the E-site tRNA is still not known, the two defined orientations that have been observed indicate that movement of the L1 arm might occur during protein biosynthesis. Superposition of the structure of D50S on that of the T70S ribosome allowed the definition of a pivot point for the possible movement of the L1 arm. Similar differences found in the relative orientation of the L1 stalk have been correlated with the presence or absence of tRNA and elongation factors (Agrawal et al., 2000). Hence it may be assumed that the position of the L1 stalk in the unbound D50S represents the conformational change required for the release of the E-site tRNA.
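As a rough illustration of how a pivot point and two observed orientations define a swing angle, the following sketch computes the angle subtended at a pivot by two tip positions of an arm. The function name `hinge_angle_deg` and all coordinates are hypothetical placeholders chosen for illustration; they are not taken from the D50S or T70S structures.

```python
import math

def hinge_angle_deg(pivot, tip_a, tip_b):
    """Angle (degrees) at `pivot` between the directions to `tip_a` and `tip_b`."""
    # Vectors from the pivot to each tip position.
    va = [t - p for t, p in zip(tip_a, pivot)]
    vb = [t - p for t, p in zip(tip_b, pivot)]
    dot = sum(a * b for a, b in zip(va, vb))
    norm_a = math.sqrt(sum(a * a for a in va))
    norm_b = math.sqrt(sum(b * b for b in vb))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))

# Hypothetical coordinates (in Å) for a pivot and two arm-tip positions.
pivot = (0.0, 0.0, 0.0)
tip_unbound = (30.0, 0.0, 0.0)   # placeholder for the arm tip in one structure
tip_bound = (0.0, 30.0, 0.0)     # placeholder for the arm tip in the other
print(hinge_angle_deg(pivot, tip_unbound, tip_bound))
```

With real data, the two tip positions would be equivalent atoms taken from the two structures after they have been superposed on a common frame.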


An intersubunit bridge with multiple roles

Intersubunit bridges form upon the association of the two ribosomal subunits, once the functionally active ribosome is created. They are the features connecting the two subunits within the assembled ribosome, namely the linkers between the two ribosomal subunits. The correct assembly of the entire ribosome from its two subunits is the key, or one of the major keys, for protein biosynthesis, hence these bridges must be positioned accurately and point in the exact direction. Each intersubunit bridge is formed from two parts, one from the small and one from the large subunit. We found that whereas those of the small subunit are of almost the same conformation in the unbound and bound subunit, those originating from the large one are inherently flexible, and may have different conformations or assume a high level of disorder. Upon subunit association the conformations of these bridges change so that they can participate in the creation of the assembled ribosome. Thus, their structure and the nature of their conformational mobility should show how the ribosome controls its intricate assembly.

Fig. 3. (Figure on facing page) (a) The RNA domains of D50S (color code is shown in the middle). Top: Left - front (interface) view. Right: solvent side. Bottom: side views, obtained by rotating the top views around their long axis by 90 degrees. The interface views are flat, with protruding L1 stalk (in yellow). (b) The upper part of D50S (compared to the view shown in (a), top left). The L1 arm of D50S is highlighted (in gold). Also shown are the docked L1 arm of T70S and protein L1 of T70S (green) and the location of protein L1 in D50S (yellow-gold). The pivot point between these two orientations is marked by a red dot. The docked tRNA molecules are shown in cyan (A), blue (P) and purple (E). (c) Bridge B2a (H69) in the unbound D50S (red) and within the T70S ribosome (gold). H44 (of the small subunit) is shown in gray. P-site tRNA (in cyan) and A-site in green. (d) The modified bases in the tip of H69 are shown. (e) and (f) show overlays of H69 in the unbound D50S subunit (gray) on the corresponding feature in the structure of the whole ribosome (gold). The tRNA acceptor stem mimic (ASM) is shown in red. The docked A- and P-site tRNAs are shown in cyan and dark green (respectively). These figures indicate the proposed movement of H69 towards the decoding center of H44 (light cyan) in T30S.


These bridges could be seen even at 5.5 Å resolution, and are described in detail in (Yusupov et al., 2001). Here we focus on bridge B2a and a few of its functional tasks, since we found that elements involved in bridging the two subunits within the assembled ribosome appear to participate in the functional tasks of the ribosome. The orientation of H69 with its universally conserved stem-loop in D50S is different from that seen in T70S. Both lie on the surface of the intersubunit interface, but in the 70S ribosome it stretches towards the small subunit, whereas in the free 50S it makes more contacts with the large subunit, so that the distance between the tips of their stem-loops is about 13.5 Å. Figure 3c hints at a feasible sequence of events leading to the creation of the bridge. Once the initiation complex, which includes the small subunit and tRNA at the P-site, approaches the large subunit, the tRNA pushes helix H69 towards the decoding center, and the intersubunit bridge is formed.

The specific conformations of H69 in D50S and T50S, and their modes of binding the tRNA and its mimics, implicated H69 as a carrier of the helical part of the A-site tRNA into the P-site. Within the 70S ribosome, H69 interacts with both the A- and the P-site tRNAs (Yusupov et al., 2001). In the complex of D50S with an acceptor stem tRNA mimic (called here ASM), most of the contacts of the helical stem of the ASM, which position it within the A-site, are with H69 (Figure 4). The crucial contribution of H69 to the proper placement of the tRNA mimic is also reflected by the disorder of the helical stem of the tRNA mimic that was bound to H50S crystals, in which H69 itself is disordered (Ban et al., 2000, Nissen et al., 2000).

The displacement and the rotation of a massive helix like H69 require inherent flexibility. It is conceivable that the ribosome benefits from this flexibility beyond bridging the two subunits. The proximity of H69 to both the A- and the P-site tRNAs (Yusupov et al., 2001, Bashan et al., 2002) suggests that besides acting as an intersubunit bridge, H69 participates in translocation. In addition, since it connects the peptidyl transferase center in the large subunit with the decoding region (Figure 3) in the small one, H69 may be the right candidate to provide the machinery needed for the transmission of signals between the two centers. The location of H69 may hint also at its contribution to a sophisticated signaling network over long distances, such as between the GTPase and the PTC centers or between the PTC and the E-site tRNA release mechanism (Harms et al., 2001).

Interestingly, mapping of the E. coli modified nucleotides known to be important for the function of the large ribosomal subunit (Ofengand and Bakin, 1997) onto the D50S structure showed clustering of the positions corresponding to these nucleotides in the vicinity of the active site of D. radiodurans as well as in H69 (Figures 3 and 5). The location of the latter on the stem loop of the H69 intersubunit bridge in the assembled ribosome led us to suggest that the modified bases play a role in the bridging events.

The PTC tolerates various binding modes

Three-dimensional structures of several complexes of D50S with substrate analogs, designed to mimic the tRNA acceptor stem (ASM) or the CCA 3' end of the tRNA bound to puromycin (ACCP), with the universal antibiotic sparsomycin, and with a combination of ASM and sparsomycin (ASMS) were determined by us (Bashan et al., 2002). Analysis of these structures allowed us to elucidate the modes of interactions between the ribosome and the substrate analogs; to illuminate elements of flexibility within the peptidyl transferase cavity, including those facilitating the interplay between the A- and P-sites; to investigate the principles of the action of a P-site ligand; and to identify features that contribute to the dynamics of translocation.

The PTC is highly conserved. Nevertheless, we observed some diversity in its structure in the different crystal systems. The overall structure of the cavity hosting the PT activity in the liganded D50S is similar to that seen in the native (Harms et al., 2001), in the antibiotic bound D50S structures (Schluenzen et al., 2001) and in the complexed 70S ribosome (Yusupov et al., 2001). The orientations of both the conserved and variable bases of the PTC seem to depend on several parameters; among them is the functional state of the ribosome. Thus, the conformation of the key nucleotides in the complex of T70S with three tRNAs differs significantly from the conformations seen in two complexes of the large ribosomal subunit from H50S with compounds believed to be substrate or transition-state analogs (Yusupov et al., 2001). Also, the PTC of H50S undergoes notable conformational changes upon binding ligands (Nissen et al., 2000, Schmeing et al., 2002), including the ordering of the base corresponding to A2602, which is disordered in the 2.4 Å structure of H50S, as are most of the functionally relevant features in this structure (Ban et al., 2000).

Diversity in binding modes of different A-site tRNA analogs may also be connected to the nature of the analog, and the differences in positioning of different analogs appear to be correlated with the amount of


Fig. 4. (a) The PTC and its environment, including ASM and the modeled A- and P-site tRNAs. (b) Three views, showing substrate analogs in the PTC and the backbone of H93 and A2602. Note the hydrated Mg2+ ions, shown as pink dots. (c) The relative orientations of A2602 in different complexes of D50S (Bashan et al., 2002) and of H50S (PDB entries 1FG0 and 1KQS). Sparsomycin (Spar) and chloramphenicol (CAM) are included, to indicate the limits of the rotation of A2602. The dark gray indicates the RNA backbone in the sparsomycin/D50S complex. The light gray shows the backbone in the D50S/CAM complex. (d) The two-fold symmetry in the PTC, together with A2602. (e) The proposed overall mechanism of peptide bond formation.


support given to them by the PTC. An example is ASM, which is held in its position mostly by the interactions that its helical part makes with H69, the loop between H69 and H71 and protein L16 (Figure 4). The tRNA mimics that include features representing the acceptor stem of tRNA were found to be oriented by their interactions with two long ribosomal helical segments, H89 and H69. Such contacts cannot be created either for short analogs or when one of the main supports, helix H69, is disordered, as in the structure of H50S. Indeed, we found that truncated substrate analogs bind to the ribosomal peptidyl transferase center in a large range of conformations that may be similar, but not identical, to the mode of binding of larger RNA constructs that were designed to bind to the large subunit as tRNA mimics.

We found that the compounds mimicking the CCA ends of tRNA, complexed with D50S (Bashan et al., 2002) or H50S (Nissen et al., 2000, Schmeing et al., 2002), are held in their positions by a comparable amount of interactions with their corresponding PTCs. However, variations in binding modes were observed between them, even within the subgroup of short tRNA analogs (Figure 4). Thus, it appears that the lower part of the PTC can tolerate several binding modes that resemble each other, but are not necessarily identical to the precise orientation leading to efficient protein biosynthesis. This is consistent with the finding that, although most of the interactions of ACCP with the ribosome are with universally conserved nucleotides, altered reactivities were observed for puromycin in eubacteria and archaea (Rodriguez-Fonseca et al., 1995). The position of ASM in D50S is similar, but not identical, to that of the acceptor stem of the A-site tRNA in the 5.5 Å structure of T70S (Yusupov et al., 2001). This may reflect the difference between tRNA binding to the unbound large subunit and to the assembled ribosome, in which the tRNA also makes substantial contacts with the small subunit, or the differences in A-site binding in the absence of a P-site substrate (Green et al., 1998). Alternatively, the position of ASM may indicate the existence of an additional binding mode, similar to the suggested "hybrid mode", in which the movement of the acceptor stem is uncoupled from that of the rest of the tRNA (Moazed and Noller, 1991).

The walls of the PT cavity are composed of several RNA features. One of them is the flexible helix H69 that forms the B2a bridge (Harms et al., 2001, Yusupov et al., 2001). The helical stem of ASM interacts with the extended loop of protein L16, and H69 packs groove-to-backbone with it. Hence it seems that H69 and protein L16 are the key factors influencing the positioning of ASM within the PTC. Interestingly, the main chain of protein L16 does not interact directly with the tRNA mimic, although its conformation underwent substantial rearrangements as a result of the binding of the tRNA mimic, presumably to avoid short contacts.

Analysis of the modes of attachment of the tRNA mimics to the peptidyl transferase center in D50S supports the idea that the ribosome provides a frame for the peptide bond formation, rather than being actively involved in the catalytic events, consistent with (Polacek et al., 2001), and with the suggestion that di-metal ions may be instrumental for peptide bond formation (Barta et al., 2001). Our studies also indicate that the peptidyl transferase center contains several flexible regions, some of which may be stabilized by the binding of substrate analogs, while others may be exploited as parts of the translocation machinery.

Striking conformational alterations within the PTC

Sparsomycin is a universal antibiotic agent. Nevertheless, ribosomes from different kingdoms show differences in binding affinities to it. Similar to the PTC antibiotics studied so far by us (Schluenzen et al., 2001), sparsomycin interacts exclusively with 23S RNA. In its single binding site it interacts with the highly conserved base A2602. But, unlike other antibiotics of the large subunit, which make various interactions with the ribosome, sparsomycin interacts only with a single base, A2602. The limited contacts between sparsomycin and the large subunit rationalize its weak binding. These stacking interactions may be sufficient for its firm attachment as long as the ribosome or its large subunit is not actively involved in protein biosynthesis, or in the crystals, owing to the limited mobility of crystalline materials. In active ribosomes, destabilization of sparsomycin binding during protein biosynthesis may be correlated to changes in the orientation of sparsomycin's counterpart, nucleotide A2602, which was implicated to play an active role in protein biosynthesis. Additional interactions with P-site substrates like N-blocked aminoacyl-tRNA, which is known to increase the accessibility of nucleotide A2602 (Porse et al., 1999), should lead to tighter binding. Furthermore, the enhancement of sparsomycin binding by N-blocked aminoacyl-tRNA may indicate that sparsomycin may inhibit protein biosynthesis not only by altering the conformation of the PTC, but also by blocking the P-site and by trapping non-productive intermediate-state compounds.

In contrast to the minor conformational changes induced by the antibiotics studied so far (Schluenzen et al., 2001), sparsomycin appears to significantly alter the conformation of both the P- and the A-sites. A2602 is the base that undergoes the most noticeable conformational rearrangements (Figure 4) upon sparsomycin binding. Interestingly, although chloramphenicol and sparsomycin do not share overlapping positions, they seem to compete with each other in inhibiting peptide bond formation. We found that the base of A2602 in the sparsomycin complex is flipped by 180° compared with its position in the complex of D50S with chloramphenicol (Figure 4), implicating this base as the trigger of the competition between them.

Analysis of our results showed that sparsomycin introduces alterations in the peptidyl transferase center. Figure 3 shows the location of the tRNA mimic in the presence of sparsomycin. Compared with its position in the "empty" PTC, in the presence of sparsomycin ASM is slightly twisted and placed somewhat closer to the P-site. In its position ASMS interacts with protein L16, but loses one of the contacts that ASM makes with this protein. In addition to the interactions of ASMS with the 23S RNA and protein L16, it makes three hydrogen bonds with a putative hydrated Mg2+ ion, located close to its CCA end. This Mg2+ ion, with the water molecules bound to it, is seen clearly in the ASMS map. The same position in the ASM map contains less well-defined features.

We assume that the differences between the binding modes of ASM and ASMS result from alterations in the PTC. Since ASMS crystals were obtained by soaking co-crystals of D50S and sparsomycin, and since sparsomycin was shown to trigger conformational changes in the PTC, it seems that these were sufficient to modify the binding mode of the A-site substrate analogs, hence suggesting interplay between the A- and the P-sites. Based on the binding modes of ASM in the presence and absence of sparsomycin, we conclude that P-site occupation governs the positioning at the A-site. As seen below, whereas the location and orientation of the A-site acceptor stem analog (ASM) seem to be designed for peptide bond formation, the orientation of ASM in the presence of an inhibitor at the P-site would not permit its participation in peptide bond formation. Thus, the PTC seems to possess a mechanism that prevents correct localization of a tRNA molecule at the A-site when the P-site is occupied by an inhibitor rather than a substrate.

Two-fold rotation

We identified a local two-fold rotation axis within the peptidyl transferase cavity that relates two groups of nine nucleotides each, in the A- and P-sites (Figure 4). Conformation, rather than the type of the base, is related by the pseudo two-fold symmetry. This local two-fold symmetry at the PTC of D50S is consistent with the observation that the CCAs bound in the A- and P-sites are related by a two-fold axis (Nissen et al., 2000).

Why does the structure of the ribosome, which lacks any symmetry, possess a local two-fold axis at its active site? Why are the 3'-ends of the A- and P-site tRNAs related by a local two-fold axis, whereas the tRNA molecules are related by translation? A feasible explanation is that the local two-fold symmetry provides similar, albeit not identical, environments for the CCA termini, to allow for a smooth translocation with minor rearrangements (Yusupov et al., 2001) and without being exposed to large energetic differences.

Translocation of the tRNA-mRNA complex involves disruption of existing interactions in one site and the establishment of new interactions in the next site. Owing to the local two-fold symmetry, the environments of the A- and P-sites are similar. Nevertheless, the environments of the 3' ends of the two tRNAs are somewhat different. In T70S crystals, the P-site tRNA seems to make more interactions with the P-loop than the A-site tRNA with the A-loop (Yusupov et al., 2001). In the liganded H50S crystals, the A- and the P-site tRNA make the same number of contacts with the PTC, but the P-site tRNA makes two base pairs whereas the A-site tRNA is involved in only one base pair (Nissen et al., 2000). Hence, in both systems the progression from the A- to the P-site would be energetically favored and should enhance the contacts between the tRNA and the 23S RNA.

The observation of a two-fold symmetry between the 3' termini of the A- and P-site tRNAs implies that, regardless of the translocation mechanism, the CCA end of the A-site tRNA bearing the newly formed polypeptide should rotate by approximately 180° on its way from the A- to the P-site. This rotation may be triggered by the creation of the new peptide bond, and can occur, in principle, while the helical part of the tRNA is still at the A-site, during its translocation to the P-site, or after the tRNA reaches the P-site. In order to exclude rotations that are not permitted owing to space constraints, we modeled the three possibilities for rotation (Agmon et al., to be published). Starting from the location of the tRNA mimic (ASM) in D50S, we found that a 180° rotation of its ACCA end, together with the base bound to it, can occur without steric hindrance while the helical part of ASM is at the A-site. Furthermore, we found that in ASM the P-O3' bond, which corresponds to the bond connecting bases 73 and 74 of tRNA and is located just above the ACCA terminus, almost overlaps the local two-fold axis. Therefore the ACCA-peptidyl rotation may occur around this bond while the tRNA is at a location similar to that of ASM, which may represent an intermediate hybrid state (Moazed and Noller, 1989).
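The geometry of such a two-fold operation can be illustrated with a minimal sketch. It rotates a point by 180° about an arbitrary axis using Rodrigues' formula. The coordinates, the axis and the function name `rotate_about_axis` are hypothetical and chosen only for illustration; they are not taken from the D50S structure or from the modeling described above.

```python
import math

def rotate_about_axis(point, axis_point, axis_dir, angle_deg):
    """Rotate `point` by `angle_deg` about the line that passes through
    `axis_point` with direction `axis_dir` (Rodrigues' rotation formula)."""
    norm = math.sqrt(sum(c * c for c in axis_dir))
    k = [c / norm for c in axis_dir]                   # unit vector along the axis
    v = [p - a for p, a in zip(point, axis_point)]     # point relative to the axis
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [r + a for r, a in zip(rotated, axis_point)]

# Hypothetical coordinates in angstroms (assumptions, not structural data).
axis_point = [0.0, 0.0, 0.0]    # a point on the local two-fold axis
axis_dir = [0.0, 0.0, 1.0]      # direction of the local two-fold axis
a_site_atom = [3.0, 1.0, 2.0]   # an atom of the A-site CCA end

# One two-fold operation maps the A-site position to its P-site mate.
p_site_atom = rotate_about_axis(a_site_atom, axis_point, axis_dir, 180.0)
# Applying the operation twice returns the atom to its starting position.
back_again = rotate_about_axis(p_site_atom, axis_point, axis_dir, 180.0)
# An atom lying on the axis itself is not displaced by the rotation.
on_axis = rotate_about_axis([0.0, 0.0, 5.0], axis_point, axis_dir, 180.0)
```

In this toy setting, atoms on the rotation axis stay in place while everything else swaps sides, which is why a bond that nearly coincides with the axis is a natural pivot for the rotating moiety.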

Performing the two-fold symmetry operation on ASM positioned the carbonyl carbon of the esterified P-site residue at an orientation and distance suitable for a nucleophilic attack by the primary amine of the A-site bound aminoacyl tRNA. At these relative orientations a nucleophilic attack should be spontaneous, especially at basic pH values, as in D50S, since the primary amine is not expected to be protonated. At this orientation a similar mechanism should be possible even at slightly acidic pH values, around 6, as an equilibrium between NH2 and NH3+ is expected. The optimal overall pH value for efficient protein biosynthesis of D50S (similar to many other ribosomes) is around pH 8. Hence, it is logical to expect that the pH of the local environment at the PTC should be between these two values. We therefore propose a mechanism for peptide bond formation that is based on direct donor-acceptor interaction between the A- and P-substrates, and on proton transfer mediated by water or hydrated magnesium that was identified in the vicinity of the two substrates (Bashan et al., 2002). This mechanism is consistent with our earlier observations (Schluenzen et al., 2001, Yonath, 2002) as well as with earlier suggestions that the ribosome provides the frame for accurate orientation of the tRNA molecules and may enhance the rate of peptide bond formation (Nierhaus et al., 1980, Samaha et al., 1995, Green and Noller, 1997, Pape et al., 1999, Polacek et al., 2001) rather than participating in the actual enzymatic activity, as suggested by the Yale group (Nissen et al., 2000).
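The pH argument above rests on the equilibrium between the neutral amine (NH2) and its protonated form (NH3+). A minimal sketch using the Henderson-Hasselbalch relation shows how the unprotonated, nucleophilic fraction changes between pH 6 and pH 8. The pKa value used here is an illustrative assumption and is not given in the text; the function name is likewise hypothetical.

```python
def fraction_unprotonated(pH, pKa):
    """Fraction of an amine present as neutral NH2 at a given pH,
    from the Henderson-Hasselbalch relation: 1 / (1 + 10**(pKa - pH))."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))

# Assumed pKa of the attacking primary amine (illustrative only,
# not a value reported in the chapter).
ASSUMED_PKA = 8.0

for pH in (6.0, 7.0, 8.0):
    share = fraction_unprotonated(pH, ASSUMED_PKA)
    print(f"pH {pH:.1f}: fraction present as NH2 = {share:.3f}")
```

Under this assumption a small but non-zero share of the amine remains unprotonated at pH 6, and the share rises steeply toward pH 8, in line with the expectation that the reaction is possible at slightly acidic pH and favored at basic pH.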

Nucleotides U2585 and A2602 are located approximately on the local two-fold axis, and U2585 is situated right under A2602, in the direction of the protein exit tunnel. This construction hints that the extremely flexible nucleotide A2602 may play a dynamic role in coordinating the tRNA motions, and U2585 may assist in guiding the ACCA during the rotation and in transmitting messages from the tunnel wall to the PTC. This suggested rotation-translation motion could provide benefits not only for translocation but also for the progression of the nascent protein through the tunnel, since it may create a screw motion that demands less force than straight pushing. As the walls of the exit tunnel have bumps and grooves and its diameter is not uniform, the progression of the nascent protein through the tunnel cannot be approximated as a smooth object progressing along smooth walls. The growing proteins move at times through narrow paths, so that their side chains may encounter significant friction. One of the narrowest regions of the tunnel is its entrance. Hence, a screw movement should be beneficial especially for the first step of nascent chain movement - its entry into the tunnel.

THE PROTEIN EXIT TUNNEL - A PASSIVE PATH OR AN ACTIVE DISCRIMINATOR?

The protein exit tunnel was originally assumed to provide a passive path that smoothly exports all protein sequences, although changes in its diameter were observed in correlation with mutations or different functional states (Gabashvili et al., 2001). However, evidence obtained for tunnel participation in regulating intracellular co-translational processes indicated that the tunnel may possess dynamic capabilities, allowing it to function as a discriminating gate and to respond to signals from cellular factors or from nascent proteins. Sequences that presumably interact with the tunnel interior and thereby arrest the protein elongation cycle were identified. This interactive elongation arrest was proposed to provide mechanisms that guarantee critical events, such as sub-cellular localization or subunit assembly (Walter and Johnson, 1994, Nakatogawa and Ito, 2002, Tenson and Ehrenberg, 2002, Young and Andrews, 1996, Stroud and Walter, 1999, Liao et al., 1997, Sarker et al., 2000). Furthermore, it was recently found that short peptides can act as regulatory nascent peptides and render resistance to macrolides (Herr et al., 2000, Lovett and Rogers, 1996, Tenson and Mankin, 2001, Weisblum, 1995), acting on the same ribosome that translates them. The length and the sequence of the peptides are critical for their activity, suggesting direct interaction between the peptide and the drug on the ribosome.

GATING WITHIN THE TUNNEL: A MECHANISM FOR REGULATING SELECTED CELLULAR EVENTS

A semi-synthetic macrolide of no clinical use was found to trigger a striking conformational rearrangement in the walls of the tunnel, by flipping the tip of a highly conserved beta-hairpin of the ribosomal protein L22 across the tunnel (Figure 5). This modulation of the tunnel shape provides the first structural insight into its dynamics. The tunnel gating could be correlated with sequence discrimination and elongation arrest of the SecM (secretion monitor) protein (Sarker et al., 2000), thus paving the way for illuminating the role of the ribosome in regulating intracellular events. This secretory protein monitors protein export. It includes a sequence motif that causes arrest during translation in the absence of the protein export system (also called the "pulling protein"). The arrest can be bypassed by mutations in the ribosomal RNA (rRNA) or in ribosomal protein L22 (Nakatogawa and Ito, 2002), a constituent of the tunnel walls (Nissen et al., 2000, Harms et al., 2001).

L22 consists of a single globular domain and a well-structured, highly conserved beta-hairpin with a unique twisted conformation, which maintains the same length in all species, whereas insertions and deletions exist in other regions of L22. Within the ribosome, protein L22 has an overall conformation similar to that seen in its crystal structure (Unge et al., 1998). Somewhat different, however, is the inclination of the tip of the beta-hairpin. L22 is positioned with its globular domain on the surface of the large subunit, while the beta-hairpin lines the walls of the tunnel and extends approximately 30 Å away from the protein core.

Fig. 5 (a) The interface view of D50S (only the RNA backbone is shown). The position of the tunnel entrance is highlighted in blue (representing erythromycin at its binding site). (b) Left: A view into the ribosomal exit tunnel highlighting all modified nucleotides (yellow: pseudouridines, red: methylations, green: sugars). Right: the same as on the left side, but with erythromycin in the tunnel. (c) View into the ribosomal tunnel from the active site, showing the hindrance of the tunnel by the L22 swung conformation (magenta, right) compared to the native one (cyan, left). A superposition of the two is shown at the bottom. Note how the native and swung double-hooks interact with two sides of the tunnel wall. (d) Side view of the region of the ribosome exit tunnel, showing the contacts of the native (cyan) and swung (magenta) conformations of the L22 hairpin tip. The RNA moieties constructing the tunnel wall at this region are shown in gray.

Both the altered conformation of the L22 beta-hairpin (called here the "swung conformation") and the native one are stabilized mainly by electrostatic interactions and hydrogen bonds with the backbone of rRNA. Two highly conserved arginines of the hairpin may be considered as a "double-hook" that anchors both the native and the swung conformations and modulates the switch between them. The structure of protein L22 appears to be designed for its gating role. Precise positioning of the L22 hairpin stem, required for accurate swinging and anchoring of the double-hook, is presumably achieved by the pronounced positive surface charges of this region.

The observation that sequence-related translational arrest could be suppressed by mutations localized in the double-hook region of protein L22 led us to propose that the observed swing of the tip of the L22 beta-hairpin indicates its intrinsic conformational mobility. Since the swung conformation severely restricts the space available for the passage of nascent proteins through the tunnel, and since the L22 double-hook is highly conserved, it is logical to link the swing of L22 with the putative regulatory role assigned to the tunnel. We propose that L22 is a main player in this task, with its double-hook acting as a conformational switch and providing the molecular tool for the gating and discriminative properties of the ribosome tunnel.

A specific sequence motif that induces the elongation arrest while SecM protein is being formed was found to hinder translation elongation in E. coli even when present within unrelated sequences (Nakatogawa and Ito, 2002). We therefore suggest that the mechanism of the elongation arrest is based on the combination of the conformational rigidity of proline with amino acids that have bulky side chains (Trp and Ile), and on their relative positions. By modeling a poly-alanine nascent chain, kinked to comply with the curvature of the tunnel, we verified that for this specific motif, once the proline has been incorporated into the nascent chain and placed at the tunnel entrance, the two bulky amino acids reach the tip of the L22 hairpin. In order to avoid collisions they may trigger a swing in a manner similar to ACM. This motion will free space for the bulky side chains, but at the same time will jam the tunnel for the progression of the nascent chain. In principle, nascent chains can use their flexibility to progress smoothly through the tunnel even in the proximity of the L22 beta-hairpin, as presumably happens when residues with bulky side chains are incorporated into nascent proteins. However, the inherent conformational rigidity of the proline that is located at the narrow entrance to the tunnel should hinder possible adjustments of the nascent chain.

Under normal conditions the SecM elongation arrest was found to be transient, but in the absence of active export of SecM the arrest is significantly prolonged (Nakatogawa and Ito, 2002). The question still to be investigated concerns the mechanism whereby the cellular signal for alleviating the arrest is transmitted. An intermediate conformation of the swung region may be required, allowing sufficient space both for the bulky side chains and for progression of the nascent protein. Indications for such a conformation were observed in the crystal structure of L22 (Unge et al., 1998). The conformational change of the swung region may be triggered by the C-terminal end of L22, which is positioned in the vicinity of the exit tunnel opening (Nissen et al., 2000, Harms et al., 2001) and therefore may interact with the "pulling protein". This C-terminus of L22 is almost a linear extension of the beta-hairpin. Hence it may trigger allosteric rearrangements in the hairpin. The nascent chain may also play a role, since the arrest motif is located over 150 residues away from the N-terminus. Hence, once the bulky side chains reach the swung region of L22, the N-terminal residues of SecM should have reached the tunnel opening and can interact with the "pulling protein". The outstanding role of L22 and the conservation of the double-hook, and of the hairpin size and sequence, suggest that the gross discriminating mechanism is universal, although the detailed interactions between the nascent protein and the tunnel may vary between prokaryotes and higher organisms. The known dependence of elongation arrest on sequence motifs within nascent peptides, together with the correlation between arrest-suppression mutants and the features involved in L22 gating induced by the modified macrolide, indicates that the tunnel is involved in sequence discrimination and may play active roles in the regulation of intracellular processes.
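The statement that the N-terminal residues have left the tunnel by the time the arrest motif reaches L22 can be checked with simple arithmetic. The sketch below estimates how many residues are needed to span the tunnel. All numbers are assumptions for illustration and are not given in this section: a tunnel length of about 100 Å, a rise of about 3.4 Å per residue for an extended chain and about 1.5 Å per residue for an alpha-helix. The function name is hypothetical.

```python
import math

def residues_to_span(length_angstrom, rise_per_residue):
    """Smallest whole number of residues whose combined rise covers the length."""
    return math.ceil(length_angstrom / rise_per_residue)

# Assumed values (illustrative, not taken from the chapter).
TUNNEL_LENGTH = 100.0    # approximate tunnel length in angstroms
RISE_EXTENDED = 3.4      # rise per residue of an extended chain, in angstroms
RISE_HELIX = 1.5         # rise per residue of an alpha-helix, in angstroms
MOTIF_DISTANCE = 150     # residues between the N-terminus and the arrest motif (from the text)

extended_count = residues_to_span(TUNNEL_LENGTH, RISE_EXTENDED)
helical_count = residues_to_span(TUNNEL_LENGTH, RISE_HELIX)

print(f"extended chain: about {extended_count} residues fill the tunnel")
print(f"helical chain:  about {helical_count} residues fill the tunnel")
print("N-terminus outside the tunnel in both cases:",
      MOTIF_DISTANCE > max(extended_count, helical_count))
```

Under these assumptions even a fully helical chain would need far fewer than 150 residues to traverse the tunnel, so the N-terminal part of SecM would already be exposed when the arrest motif is near the tunnel entrance.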

Our results show that protein L22 has an intrinsic conformational mobility and contains a conserved double-hook feature, capable of interacting with the two sides of the tunnel wall, thus creating a swinging gate within the tunnel. The existence of dynamic features within the ribosomal tunnel and its ability to oscillate between conformations, the known dependence of elongation arrest on sequence motifs within nascent peptides, and the correlation between arrest-suppression mutants and the features involved in L22 gating indicate that the tunnel is involved in sequence discrimination and may play active roles in the regulation of intracellular processes.

The opening of the ribosomal exit tunnel is located at the bottom of the particle. In D50S it is composed of rRNA components as well as of several proteins, including L22. In H50S, two proteins that do not exist in D50S, L31e and L39e, are also part of the lower part of the tunnel (Harms et al., 2001). Interestingly, the space occupied by protein L23 in D50S hosts two proteins in H50S, so that the halophilic L39e replaces the extended loop of L23 in D50S. L39e is a small protein of an extended non-globular conformation, thinner than the extended loop of L23 in D50S. Therefore it penetrates deeper into the tunnel walls than the loop of L23 in D50S. L39e is present in archaea and eukaryotes, but not in eubacteria. Thus, it seems that with the increase in cellular complexity, and perhaps as a consequence of the high salinity, a tighter control of the tunnel's exit was required, hence two proteins replace a single one.

CONCLUDING REMARKS

Ribosomal crystallography, initiated two decades ago, has yielded exciting structural and clinical information. We found that both the decoding center and the peptidyl transferase center are formed of RNA. Proteins seem to serve ancillary functions such as stabilizing the required conformation, binding non-ribosomal factors, assisting the directionality of the translocation, and gating the ribosomal tunnel.

The ribosome is an accurate and intricate machine, and as such it has ample mobile regions. These include the head and the shoulder of the small subunit; the features lining the mRNA path; the peptidyl transferase center; the intersubunit bridges; the exit tunnel, which controls the release of nascent chains; and the L1 stalk, which provides the door for exiting tRNA.

The studies presented here show that the peptidyl transferase center tolerates various binding modes, but precise positioning appears to be crucial for the biosynthesis of protein chains. This precise positioning is determined by the tRNA helical stem, rather than by its 3' end, and the ribosome provides the structural frame for it. Once the substrates are properly positioned, the peptide bond can be formed spontaneously. Ribosomal components appear not to participate directly in the catalytic event. They may, however, be of major importance for cell vitality, as they may increase the efficiency or enhance the rate of the reaction.

Ribosomes are a major target for antibiotics. The therapeutic use of antibiotics has been severely hampered by the emergence of drug resistance in many pathogenic bacteria. With the increased popularity of antibiotics for treating bacterial infections, pathogenic strains have acquired antibiotic resistance, so that many of these drugs have become ineffective. Resistance has posed extremely serious medical problems that have prompted extensive effort in the design of modified or new antibacterial agents. The findings shown here may not only assist rational drug design but also open the door to minimizing drug resistance.

ACKNOWLEDGEMENTS

Thanks are due to J.M. Lehn, M. Lahav, A. Mankin and R. Wimmer for critical discussions, to M. Pope for supplying us with tungsten clusters, to M. Kessler for her superb assistance, and to all the members of the ribosomal crystallography groups at the Weizmann Institute and the Max-Planck Society for contributing to different stages of these studies. These studies could not have been performed without the cooperation and assistance of the staff of station ID19 of the SBC at APS/ANL. The Max-Planck Society, the US National Institutes of Health (GM34360), the German Ministry for Science and Technology (BMBF Grant 05-641EA), and the Kimmelman Center for Macromolecular Assembly at the Weizmann Institute provided support. AY holds the Hellen and Martin Kimmel Professorial Chair.

REFERENCES

1. Agmon I, Auerbach T, Baram D, Bartels H, Bashan A, Berisio R, Fucini P, Hansen HAS, Harms J, Kessler M, Peretz M, Schluenzen F, Yonath A, Zarivach R. On peptide bond formation, translocation, nascent protein progression and the regulatory properties of ribosomes. Eur J Biochem, 2003; 270: 2543.

2. Agrawal RK, Spahn CM, Penczek P, Grassucci RA, Nierhaus KH, Frank J. Visualization of tRNA movements on the Escherichia coli 70S ribosome during the elongation cycle. J Cell Biol, 2000; 150: 447-60.

3. Alexander RW, Muralikrishna P, Cooperman BS. Ribosomal Components Neighboring the Conserved 518-533-Loop of 16S Ribosomal-RNA in 30S Subunits. Biochemistry, 1994; 33: 12109-18.

4. Altamura S, Sanz JL, Amils R, Cammarano P, Londei P. The Antibiotic Sensitivity Spectra of Ribosomes from the Thermoproteales Phylogenetic Depth and Distribution of Antibiotic Binding Sites. Syst App Microbiol, 1988; 10: 218-25.

5. Ban N, Nissen P, Hansen J, Moore PB, Steitz TA. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 2000; 289: 905-20.

6. Barta A, Dorner S, Polacek N. Mechanism of ribosomal peptide bond formation. Science, 2001; 291: 203.

7. Bashan A, Agmon I, Zarivach R, Schluenzen F, Harms J, Berisio R, Bartels H, Franceschi F, Auerbach T, Hansen HAS, Kossoy E, Kessler M, Yonath A. Structural basis for a unified machinery of peptide bond formation, translocation and nascent chain progression. Mol Cell, 2003; 11: 91.

8. Bayfield MA, Dahlberg AE, Schulmeister U, Dorner S, Barta A. A conformational change in the ribosomal peptidyl transferase center upon active/inactive transition. Proc Natl Acad Sci USA, 2001; 98: 10096-101.

9. Berisio R, Schluenzen F, Harms J, Bashan A, Auerbach T, Baram D, Yonath A. Structural insight into the role of the ribosomal tunnel in cellular regulation. Nat Struct Biol, 2003a; 10: 366.

10. Berisio R, Harms J, Schluenzen F, Zarivach R, Hansen HAS, Fucini P, Yonath A. Structural insight into the antibiotic action of telithromycin on resistant mutants. J Bacteriol, 2003b; 185: 4276.

11. Berkovitch-Yellin Z, Bennett WS, Yonath A. Aspects in structural studies on ribosomes. Crit Rev Biochem Mol Biol, 1992; 27: 403-44.

12. Bourd SB, Kukhanova MK, Gottikh BP, Krayevsky AA. Cooperative effects in the peptidyltransferase center of Escherichia coli ribosomes. Eur J Biochem, 1983; 135: 465-70.

13. Brodersen DE, Clemons WM Jr, Carter AP, Morgan-Warren RJ, Wimberly BT, Ramakrishnan V. The structural basis for the action of the antibiotics tetracycline, pactamycin, and hygromycin B on the 30S ribosomal subunit. Cell, 2000; 103: 1143-54.

14. Carter AP, Clemons WM, Brodersen DE, Morgan-Warren RJ, Wimberly BT, Ramakrishnan V. Functional insights from the structure of the 30S ribosomal subunit and its interactions with antibiotics. Nature, 2000; 407: 340-8.

15. Clemons WM Jr, Brodersen DE, McCutcheon JP, May JL, Carter AP, Morgan-Warren RJ, Wimberly BT, Ramakrishnan V. Crystal structure of the 30 S ribosomal subunit from Thermus thermophilus: purification, crystallization and structure determination. J Mol Biol, 2001; 310: 827-43.

16. Cundliffe E. Antibiotic inhibitors of ribosome function, Wiley, London, New York, Sydney, Toronto, 1981.

17. Cundliffe E. In The Ribosome: structure, function and evolution, Eds Hill WE, Dahlberg AE, Garrett RA, Moore PB, Schlessinger D, Warner JR. ASM, Washington DC, 479-490, 1990.

18. Davydova N, Streltsov V, Wilce M, Liljas A, Garber M. L22 Ribosomal Protein and Effect of Its Mutation on Ribosome Resistance to Erythromycin. J Mol Biol, 2002; 322: 635.

19. Evers U, Franceschi F, Boddeker N, Yonath A. Crystallography of halophilic ribosome: the isolation of an internal ribonucleoprotein complex. Biophys Chem, 1994; 50: 3-16.

20. Franceschi F, Sagi I, Boeddeker N, Evers U, Arndt E, Paulke C, Hasenban R, Laschever M, Glotz C, Piefke J, Muessig J, Weinstein S, Yonath A. Crystallography, biochemical and genetics studies on halophilic ribosomes. Syst App Microbiol, 1994; 16: 697.

21. Frank J, Agrawal RK. A ratchet-like inter-subunit reorganization of the ribosome during translocation. Nature, 2000; 406: 318-22.

22. Fresno M, Carrasco L, Vazquez D. Initiation of the polypeptide chain by reticulocyte cell-free systems. Survey of different inhibitors of translation. Eur J Biochem, 1976; 68: 355-64.

23. Gabashvili IS, Gregory ST, Valle M, Grassucci R, Worbs M, Wahl MC, Dahlberg AE, Frank J. The polypeptide tunnel system in the ribosome and its gating in erythromycin resistance mutants of L4 and L22. Mol Cell, 2001; 8: 181-8.

24. Gale EF, Cundliffe E, Reynolds PE, Richmond MH, Waring MJ. Wiley, London, 1981.

25. Garrett RA, Rodriguez-Fonseca C. In Ribosomal RNA: structure, evolution, processing & function, Eds Zimmermann RA, Dahlberg AE, CRC Press, Boca Raton, pp. 327-55, 1995.

26. Gluehmann M, Zarivach R, Bashan A, Harms J, Schluenzen F, Bartels H, Agmon I, Rosenblum G, Pioletti M, Auerbach T, Avila H, Hansen HA, Franceschi F, Yonath A. Ribosomal crystallography: from poorly diffracting microcrystals to high-resolution structures. Methods, 2001; 25: 292-302.

27. Green R, Noller HF. Ribosomes and translation. Annu Rev Biochem, 1997; 66: 679-716.

28. Green R, Switzer C, Noller HF. Ribosome-catalyzed peptide-bond formation with an A-site substrate covalently linked to 23S ribosomal RNA. Science, 1998; 280: 286-9.

29. Hansen HA, Volkmann N, Piefke J, Glotz C, Weinstein S, Makowski I, Meyer S, Wittmann HG, Yonath A. Crystals of complexes mimicking protein biosynthesis are suitable for crystallographic studies. Biochim Biophys Acta, 1990; 1050: 1-7.

30. Hansen JL, Schmeing TM, Moore PB, Steitz TA. Structural insights into peptide bond formation. Proc Natl Acad Sci USA, 2002; 99: 11670.

31. Harms J, Schluenzen F, Zarivach R, Bashan A, Gat S, Agmon I, Bartels H, Franceschi F, Yonath A. High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell, 2001; 107: 679-88.

32. Harms J, Tocilj A, Levin I, Agmon I, Stark H, Kolln I, van Heel M, Cuff M, Schlunzen F, Bashan A, Franceschi F, Yonath A. Elucidating the medium-resolution structure of ribosomal particles: an interplay between electron cryo-microscopy and X-ray crystallography. Structure Fold Des, 1999; 7: 931-941.

33. Herr AJ, Gesteland RF, Atkins JF. One protein from two open reading frames: mechanism of a 50 nt translational bypass. Embo J, 2000; 19: 2671-80.

34. Hope H, Frolow F, von Bohlen K, Makowski I, Kratky C, Halfon Y, Danz H, Webster P, Bartels KS, Wittmann HG, et al. Cryocrystallography of ribosomal particles. Acta Crystallogr B, 1989; 45: 190-9.

35. Huang H, Chopra R, Verdine GL, Harrison SC. Structure of a covalently trapped catalytic complex of HIV-1 reverse transcriptase: implications for drug resistance. Science, 1998; 282: 1669-75.

36. Kirillov S, Porse BT, Vester B, Woolley P, Garrett RA. Movement of the 3'-end of tRNA through the peptidyl transferase centre and its inhibition by antibiotics. FEBS Lett, 1997; 406: 223-33.

37. Kozak M, Shatkin AJ. Migration of 40 S ribosomal subunits on messenger RNA in the presence of edeine. J Biol Chem, 1978; 253: 6568-77.

38. Kurylo-Borowska Z. Biosynthesis of edeine: II. Localization of edeine synthetase within Bacillus brevis Vm4. Biochim Biophys Acta, 1975; 399: 31-41.

39. Lake JA. Evolving ribosome structure: domains in archaebacteria, eubacteria, eocytes and eukaryotes. Annu Rev Biochem, 1985; 54: 507-30.

40. Liao S, Lin J, Do H, Johnson AE. Both lumenal and cytosolic gating of the aqueous ER translocon pore are regulated from inside the ribosome during membrane protein integration. Cell, 1997; 90: 31-41.

41. Lill R, Wintermeyer W. Destabilization of codon-anticodon interaction in the ribosomal exit site. J Mol Biol, 1987; 196: 137-48.

42. Lodmell JS, Dahlberg AE. A conformational switch in Escherichia coli 16S ribosomal RNA during decoding of messenger RNA. Science, 1997; 277: 1262-7.

43. Lovett PS, Rogers EJ. Ribosome regulation by the nascent peptide. Microbiol Rev, 1996; 60: 366-85.

44. Luger K, Mader AW, Richmond RK, Sargent DF, Richmond TJ. Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature, 1997; 389: 251-60.

45. Makowski I, Frolow F, Saper MA, Shoham M, Wittmann HG, Yonath A. Single crystals of large ribosomal particles from Halobacterium marismortui diffract to 6 Å. J Mol Biol, 1987; 193: 819-22.

46. Malkin LI, Rich A. Partial resistance of nascent polypeptide chains to proteolytic digestion due to ribosomal shielding. J Mol Biol, 1967; 26: 329-46.

47. Mankin AS. Pactamycin resistance mutations in functional sites of 16 S rRNA. J Mol Biol, 1997; 274: 8-15.

48. Milligan RA, Unwin PN. Location of exit channel for nascent protein in 80S ribosome. Nature, 1986; 319: 693-5.

49. Miskin R, Zamir A, Elson D. The inactivation and reactivation of ribosomal-peptidyl transferase of E. coli. Biochem Biophys Res Commun, 1968; 33: 551-7.

50. Moazed D, Noller HF. Intermediate states in the movement of transfer RNA in the ribosome. Nature, 1989; 342: 142-8.

51. Moazed D, Noller HF. Sites of interaction of the CCA end of peptidyl-tRNA with 23S rRNA. Proc Natl Acad Sci USA, 1991; 88: 3725-8.

52. Moazed D, Samaha RR, Gualerzi C, Noller HF. Specific protection of 16 S rRNA by translational initiation factors. J Mol Biol, 1995; 248: 207-10.

53. Monro RE, Celma ML, Vazquez D. Action of sparsomycin on ribosome-catalysed peptidyl transfer. Nature, 1969; 222: 356-8.

54. Monro RE, Cerna J, Marcker KA. Ribosome-catalyzed peptidyl transfer: substrate specificity at the P-site. Proc Natl Acad Sci USA, 1968; 61: 1042-9.

55. Mueller F, Sommer I, Baranov P, Matadeen R, Stoldt M, Wohnert J, Gorlach M, van Heel M, Brimacombe R. The 3D arrangement of the 23 S and 5 S rRNA in the Escherichia coli 50 S ribosomal subunit based on a cryo-electron microscopic reconstruction at 7.5 Å resolution. J Mol Biol, 2000; 298: 35-59.

56. Muessig J, Makowski I, von Bohlen K, Hansen H, Bartels KS, Wittmann HG, Yonath A. Crystals of wild-type, mutated, derivatized and complexed 50 S ribosomal subunits from Bacillus stearothermophilus suitable for X-ray analysis. J Mol Biol, 1989; 205: 619-21.

57. Nakatogawa H, Ito K. The ribosomal exit tunnel functions as a discriminating gate. Cell, 2002; 108: 629-36.

58. Nikulin A, Eliseikina I, Tishchenko S, Nevskaya N, Davydova N, Platonova O, Piendl W, Selmer M, Liljas A, Drygin D, Zimmermann R, Garber M, Nikonov S. Nat Struct Biol, 2003; 6: 6.

59. Nierhaus KH, Schulze H, Cooperman BS. Molecular mechanisms of the ribosomal peptidyl transferase center. Biochem Int, 1980; 1: 185-192.

60. Nissen P, Hansen J, Ban N, Moore PB, Steitz TA. The structural basis of ribosome activity in peptide bond synthesis. Science, 2000; 289: 920-30.

61. Noller HF, Hoffarth V, Zimniak L. Unusual resistance of peptidyl transferase to protein extraction procedures. Science, 1992; 256: 1416-9.

62. Odom OW, Kramer G, Henderson AB, Pinphanichakarn P, Hardesty B. GTP hydrolysis during methionyl-tRNAf binding to 40 S ribosomal subunits and the site of edeine inhibition. J Biol Chem, 1978; 253: 1807-16.

63. Odom OW, Picking WD, Hardesty B. Movement of tRNA but not the nascent peptide during peptide bond formation on ribosomes. Biochemistry, 1990; 29: 10734-44.

64. Ofengand J, Bakin A. Mapping to nucleotide resolution of pseudouridine residues in large subunit ribosomal RNAs from representative eukaryotes, prokaryotes, archaebacteria, mitochondria and chloroplasts. J Mol Biol, 1997; 266: 246-68.

65. Ogle JM, Brodersen DE, Clemons WM Jr, Tarry MJ, Carter AP, Ramakrishnan V. Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science, 2001; 292: 897-902.

66. Pape T, Wintermeyer W, Rodnina M. Induced fit in initial selection and proofreading of aminoacyl-tRNA on the ribosome. Embo J, 1999; 18: 3800-7.

67. Penczek P, Ban N, Grassucci RA, Agrawal RK, Frank J. Haloarcula marismortui 50S Subunit-Complementarity of Electron Microscopy and X-Ray Crystallographic Information. J Struct Biol, 1999; 128: 44-50.

68. Pestka S. Inhibitors of protein synthesis, Weissbach H. New York, 1977.

69. Pioletti M, Schluenzen F, Harms J, Zarivach R, Gluhmann M, Avila H, Bashan A, Bartels H, Auerbach T, Jacobi C, Hartsch T, Yonath A, Franceschi F. Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. Embo J, 2001; 20: 1829-39.

70. Polacek N, Gaynor M, Yassin A, Mankin AS. Ribosomal peptidyl transferase can withstand mutations at the putative catalytic nucleotide. Nature, 2001; 411: 498-501.

71. Porse BT, Garrett RA. Mapping important nucleotides in the peptidyl transferase centre of 23 S rRNA using a random mutagenesis approach. J Mol Biol, 1995; 249: 1-10.

72. Porse BT, Kirillov SV, Awayez MJ, Ottenheijm HC, Garrett RA. Direct crosslinking of the antitumor antibiotic sparsomycin, and its derivatives, to A2602 in the peptidyl transferase center of 23S-like rRNA within ribosome-tRNA complexes. Proc Natl Acad Sci USA, 1999; 96: 9003-8.

73. Rheinberger HJ, Sternbach H, Nierhaus KH. Three tRNA binding sites on Escherichia coli ribosomes. Proc Natl Acad Sci USA, 1981; 78: 5310-4.


74. Rodriguez-Fonseca C, Amils R, Garrett RA. Fine structure of the peptidyl transferase centre on 23 S-like rRNAs deduced from chemical probing of antibiotic-ribosome complexes. J Mol Biol, 1995; 247: 224-35.

75. Rodriguez-Fonseca C, Phan H, Long KS, Porse BT, Kirillov SV, Amils R, Garrett RA. Puromycin-rRNA interaction sites at the peptidyl transferase center. RNA, 2000; 6: 744-54.

76. Sabatini DD, Blobel G. Controlled proteolysis of nascent polypeptides in rat liver cell fractions. II. Location of the polypeptides in rough microsomes. J Cell Biol, 1970; 45: 146-57.

77. Samaha RR, Green R, Noller HF. A base pair between tRNA and 23S rRNA in the peptidyl transferase centre of the ribosome. Nature, 1995; 377: 309-14.

78. Sarker S, Rudd KE, Oliver D. Revised translation start site for secM defines an atypical signal peptide that regulates Escherichia coli secA expression. J Bacteriol, 2000; 182: 5592-5.

79. Schluenzen F, Tocilj A, Zarivach R, Harms J, Gluehmann M, Janell D, Bashan A, Bartels H, Agmon I, Franceschi F, Yonath A. Structure of functionally activated small ribosomal subunit at 3.3 angstroms resolution. Cell, 2000; 102: 615-23.

80. Schluenzen F, Zarivach R, Harms J, Bashan A, Tocilj A, Albrecht R, Yonath A, Franceschi F. Structural basis for the interaction of antibiotics with the peptidyl transferase centre in eubacteria. Nature, 2001; 413: 814-21.

81. Schluenzen F, Harms J, Franceschi F, Hansen HAS, Bartels H, Zarivach R, Yonath A. Structural basis for the antibiotic activity of ketolides and azalides. Structure, 2003; 11: 329.

82. Schmeing TM, Seila AC, Hansen JL, Freeborn B, Soukup JK, Scaringe SA, Strobel SA, Moore PB, Steitz TA. A pre-translocational intermediate in protein synthesis observed in crystals of enzymatically active 50S subunits. Nat Struct Biol, 2002; 9: 225-30.

83. Shevack A, Gewitz HS, Hennemann B, Yonath A, Wittmann HG. Characterization and crystallization of ribosomal particles from Halobacterium marismortui. FEBS Lett, 1985; 184: 68-71.

84. Smith JD, Traut RR, GM B, Monro RE. Action of puromycin in polyadenylic acid-directed polylysine synthesis. J Mol Biol, 1965; 13: 617-628.

85. Spahn CM, Prescott CD. Throwing a spanner in the works: antibiotics and the translation apparatus. J Mol Med, 1996; 74: 423-39.


86. Stark H, Orlova EV, Rinke-Appel J, Junke N, Mueller F, Rodnina M, Wintermeyer W, Brimacombe R, van Heel M. Arrangement of tRNAs in pre- and posttranslocational ribosomes revealed by electron cryomicroscopy. Cell, 1997; 88: 19-28.

87. Stoffler G, Stoffler-Meilicke M. Immunoelectron microscopy of ribosomes. Annu Rev Biophys Bioeng, 1984; 13: 303-30.

88. Stroud RM, Walter P. Signal sequence recognition and protein targeting. Curr Opin Struct Biol, 1999; 9: 754-9.

89. Tenson T, Ehrenberg M. Regulatory nascent peptides in the ribosomal tunnel. Cell, 2002; 108: 591-4.

90. Tenson T, Mankin AS. Short peptides conferring resistance to macrolide antibiotics. Peptides, 2001; 22: 1661-8.

91. Thompson J, Kim DF, O'Connor M, Lieberman KR, Bayfield MA, Gregory ST, Green R, Noller HF, Dahlberg AE. Analysis of mutations at residues A2451 and G2447 of 23S rRNA in the peptidyltransferase active site of the 50S ribosomal subunit. Proc Natl Acad Sci USA, 2001; 98: 9002-7.

92. Tocilj A, Schlunzen F, Janell D, Gluhmann M, Hansen HA, Harms J, Bashan A, Bartels H, Agmon I, Franceschi F, Yonath A. The small ribosomal subunit from Thermus thermophilus at 4.5 Å resolution: pattern fittings and the identification of a functional site. Proc Natl Acad Sci USA, 1999; 96: 14252-7.

93. Trakhanov SD, Yusupov MM, Agalarov SC, Garber MB, Ryazantsev SN, Tischenko SV, Shirokov VA. Crystallization of 70S Ribosomes and 30S Ribosomal Subunits from Thermus Thermophilus. FEBS Lett, 1987; 220: 319-22.

94. Traut RR, Monro RE. The puromycin reaction and its relationship to protein synthesis. J Mol Biol, 1964; 10: 63-72.

95. Unge J, Åberg A, Al-Kharadaghi S, Nikulin A, Nikonov S, Davydova N, Nevskaya N, Garber M, Liljas A. The crystal structure of ribosomal protein L22 from Thermus thermophilus: insights into the mechanism of erythromycin resistance. Structure, 1998; 6: 1577-86.

96. Vazquez D. Inhibitors of protein biosynthesis. Mol Biol Biochem Biophys, 1979; 30: 1-312.

97. Vogel Z, Vogel T, Zamir A, Elson D. Correlation between the peptidyl transferase activity of the 50 S ribosomal subunit and the ability of the subunit to interact with antibiotics. J Mol Biol, 1971; 60: 339-46.

98. Volkmann N, Hottentrager S, Hansen HA, Zaytzev-Bashan A, Sharon R, Berkovitch-Yellin Z, Yonath A, Wittmann HG. Characterization and preliminary crystallographic studies on large ribosomal subunits from Thermus thermophilus. J Mol Biol, 1990; 216: 239-41.

99. von Bohlen K, Makowski I, Hansen HA, Bartels H, Berkovitch-Yellin Z, Zaytzev-Bashan A, Meyer S, Paulke C, Franceschi F, Yonath A. Characterization and preliminary attempts for derivatization of crystals of large ribosomal subunits from Haloarcula marismortui diffracting to 3 Å resolution. J Mol Biol, 1991; 222: 11-5.

100. Walter P, Johnson AE. Signal sequence recognition and protein targeting to the endoplasmic reticulum membrane. Annu Rev Cell Biol, 1994; 10: 87-119.

101. Weisblum B. Erythromycin resistance by ribosome modification. Antimicrob Agents Chemother, 1995; 39: 577-85.

102. White O, Eisen JA, Heidelberg JF, Hickey EK, Peterson JD, Dodson RJ, Haft DH, Gwinn ML, Nelson WC, Richardson DL, Moffat KS, Qin H, Jiang L, Pamphil W, Crosby M, Shen M, Vamathevan JJ, Lam P, McDonald L, Utterback T, Zalewski C, Makarova KS, Aravind L, Daly MJ, Fraser CM, et al. Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. Science, 1999; 286: 1571-7.

103. Wilson KS, Noller HF. Molecular movement inside the translational engine. Cell, 1998; 92: 337-349.

104. Wimberly BT, Brodersen DE, Clemons WM Jr, Morgan-Warren RJ, Carter AP, Vonrhein C, Hartsch T, Ramakrishnan V. Structure of the 30S ribosomal subunit. Nature, 2000; 407: 327-39.

105. Woodcock J, Moazed D, Cannon M, Davies J, Noller HF. Interaction of antibiotics with A- and P-site-specific bases in 16S ribosomal RNA. Embo J, 1991; 10: 3099-103.

106. Yonath A. The search and its outcome: high-resolution structures of ribosomal particles from mesophilic, thermophilic, and halophilic bacteria at various functional states. Annu Rev Biophys Biomol Struct, 2002; 31: 257-73.

107. Yonath A, Bartunik HD, Bartels KS, Wittmann HG. Some X-ray diffraction patterns from single crystals of the large ribosomal subunit from Bacillus stearothermophilus. J Mol Biol, 1984; 177: 201-6.

108. Yonath A, Glotz C, Gewitz HS, Bartels KS, von Bohlen K, Makowski I, Wittmann HG. Characterization of crystals of small ribosomal subunits. J Mol Biol, 1988; 203: 831-4.


109. Yonath A, Leonard KR, Weinstein S, Wittmann HG. Approaches to the determination of the three-dimensional architecture of ribosomal particles. Cold Spring Harb Symp Quant Biol, 1987a; 52: 729-41.

110. Yonath A, Leonard KR, Wittmann HG. A tunnel in the large ribosomal subunit revealed by three-dimensional image reconstruction. Science, 1987b; 236: 813-6.

111. Yonath A, Muessig J, Tesche B, Lorenz S, Erdmann VA, Wittmann HG. Crystallization of the large ribosomal subunit from B. stearothermophilus. Biochem Int, 1980; 1: 315-428.

112. Young JC, Andrews DW. The signal recognition particle receptor alpha subunit assembles co-translationally on the endoplasmic reticulum membrane during an mRNA-encoded translation pause in vitro. Embo J, 1996; 15: 172-81.

113. Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JH, Noller HF. Crystal structure of the ribosome at 5.5 Å resolution. Science, 2001; 292: 883-96.

114. Zamir A, Miskin R, Elson D. Inactivation and reactivation of ribosomal subunits: amino acyl-transfer RNA binding activity of the 30 S subunit of Escherichia coli. J Mol Biol, 1971; 60: 347-64.

115. Zamir A, Miskin R, Vogel Z, Elson D. The inactivation and reactivation of Escherichia coli ribosomes. Methods Enzymol, 1974; 30: 406-26.

116. Zarivach R, Bashan A, Schluenzen F, Harms J, Pioletti M, Franceschi F, Yonath A. Initiation and inhibition of protein biosynthesis - studies at high resolution. Curr Protein Pept Sci, 2002; 3: 55-65.


Chapter 13

THE DYNAMICS OF THE RIBOSOME AS INFERRED BY CRYO-EM: INDUCED AND SELF-ORGANIZED MOTIONS

Joachim Frank*

Even though its structure is known to atomic resolution, the ways in which the ribosome accomplishes its tasks in synthesizing proteins are still unknown. The key to an understanding of its dynamics might be found in cryo-electron microscopy of trapped states, and an interpretation of the resultant density maps by “molding” the X-ray structures into them. First results, obtained by application of real-space refinement techniques to cryo-EM maps of complexes in different conformations, indicate a complicated internal reorganization. The question arises as to whether the observed conformational changes accompanying ribosomal function might be predictable based on the architecture of the macromolecular complex. It has indeed been possible to derive one of the principal motions (the ratchet motion) by normal mode analysis of the ribosome represented as a simplified mechanical system.

Keywords: translation, translocation, elongation factor G, elongation factor Tu, molecular machines, real space refinement, normal mode analysis.

*Howard Hughes Medical Institute, Health Research, Inc., Wadsworth Center, Empire State Plaza, Albany, New York 12201-0509. Email address: <joachim@wadsworth.org>


INTRODUCTION

To date, the ribosome is the most complex biological structure solved to atomic resolution (Ban et al., 2000; Wimberly et al., 2000; Schlunzen et al., 2000; Yusupov et al., 2001). Despite the remarkable achievement of X-ray crystallography that the atomic structures represent, an understanding of the translation process is still beyond our reach, in the absence of detailed knowledge about the ribosome’s dynamics.

Fig. 1. Cryo-EM maps of the 50S large and 30S small subunits of the E. coli 70S ribosome complexed with fMet-tRNA, reconstructed at two different levels of resolution; the 11.5-Å map (bottom) is obtained from ~73,000 particles (data from Gabashvili et al., 2000), and the 8.7-Å map (top) is obtained from ~110,000 particles using an electron microscope of high stability (from Spahn, Grassucci, Nierhaus, Frank, work in progress).


The entire system formed by the ribosome and its ligands represents a prime example of a molecular machine (see Wilson and Noller, 1998a; Frank, 2000). The technique of cryo-electron microscopy (cryo-EM) and single-particle reconstruction (see Frank, 1996) offers a way to study molecular machines in defined dynamic states, permitting fast immobilization of complexes under conditions closely resembling those found in the living system. The application of cryo-EM to the ribosome is, at the present time, still handicapped by the limited resolution (best resolution to date is 8.7 Å, see Fig. 1; Spahn, Grassucci, Nierhaus, Frank, unpublished; best published structure: 11.5 Å; Gabashvili et al., 2000), and by the fact that snapshots of the structure are only available in particular states that have been trapped by the addition of either antibiotics or nonhydrolyzable analogs of GTP. Investigation of other intermediate states must await the development of physical trapping methods, such as spray-freezing (Berriman and Unwin, 1994; White et al., 1998). Nevertheless, even data derived from the more limited studies already allow some important conclusions to be drawn, and some outstanding questions to be formulated with greater insight. On the horizon, as we will see, there is the possibility that dynamic behavior follows “naturally” from properties inherent in the ribosome’s architecture.

ELONGATION CYCLE, AND ALTERNATE BINDING OF FACTORS

The elongation cycle of translation (Fig. 2) is driven by the alternate binding, to the ribosome, of two molecular complexes with very similar shape: on the one hand, the ternary complex formed by aminoacyl-tRNA, EF-Tu and GTP, and on the other hand, EF-G and GTP. The task of EF-Tu within the ribosome-bound ternary complex is to facilitate decoding and, if a codon-anticodon match is found, subsequent accommodation of the tRNA into the A site. The task of EF-G, once the peptide chain has been elongated and transferred to the new A-site tRNA, is to catalyze the translocation of the complex formed by the two neighboring A- and P-site tRNAs and the mRNA by exactly one codon.

The binding of both key complexes to the ribosome is followed by GTP hydrolysis. The close resemblance between the X-ray structures of the ternary complex and EF-G·GDP (“molecular mimicry”; Nissen et al., 2000) has led to the suggestion that the two complexes bind to the ribosome in very similar positions. This hypothesis proved to be correct, as was borne out by cryo-electron microscopy (ternary complex: Stark et al., 1997; Agrawal et al., 2000; Valle et al., 2002; EF-G: Agrawal et al., 1998; Stark et al., 2000) and by results from hydroxyl radical probing (EF-G: Wilson and Noller, 1998b). In the cryo-EM maps, we can recognize a complex binding pattern involving several sites on both subunits. In particular, we see the long arm-like domain IV of EF-G reaching into the decoding center, in a way that is quite similar to the approach of the anticodon arm of the tRNA within the ternary complex.

Fig. 2. Elongation cycle of translation. The ribosome is viewed from the “top”; 30S subunit in yellow, 50S subunit in blue. Color scheme for tRNAs: pink, A site; green, P site; brown, E site. (i) Deacylated tRNA at the P site, polypeptide linked to tRNA at the A site. (ii) EF-G binds to ribosome, followed by GTP hydrolysis, and translocation of tRNA from A- and P- to P- and E-sites. (iii) EF-G has exited, and the ribosome is ready for the incorporation of the next tRNA at the A site, as dictated by the next codon. (iv) EF-Tu brings a new tRNA to the ribosome. In the decoding step, the anticodon is checked against the codon for match. If correct match is found, a signal is transmitted to the EF-Tu binding sites, GTP hydrolysis ensues, EF-Tu exits, and tRNA is accommodated at the A site. This event is immediately followed by polypeptide transfer from the P- to the A-site tRNA. [Note that the previous version of this diagram depicted a flipped orientation of the E-site tRNA immediately before its removal from the ribosome; the position shown follows from recent evidence (Valle et al., 2003), which shows the outer side of the elbow contacting the L1 stalk]. [Adapted from Frank et al., 2000, and reproduced with permission by ASM Press]

For both the aa-tRNA·EF-Tu·GTP ternary complex and EF-G·GDP, the equivalent sites making contact with the ribosome are known from the results of various cross-linking and footprinting experiments. A careful analysis of these sites in all available X-ray structures (ternary complex: Nissen et al., 1995; Kjeldgaard et al., 1993; Kawashima et al., 1996; Berchtold et al., 1993; Heffron and Jurnak, 2000; EF-G: al-Karadaghi et al., 1996; Ævarsson et al., 1994; Czworkowski et al., 1994) leads to the conclusion that they form different, non-congruent constellations. In other words, it is not possible to superimpose the two molecules such that the corresponding binding sites overlap. (It is true that an atomic structure of EF-G in the form in which it binds the ribosome is still not available, but solution scattering data obtained by Czworkowski and Moore (1997) suggest that it is very similar to the GDP form.) From this analysis alone it follows that the ribosome, to accommodate the two ligands in successive phases of the elongation cycle, has to alternate between two different conformations. Thus a simplified description of the sequence of events during the elongation cycle might be as follows:

1. Factor A binds to ribosome in State I
2. GTP hydrolysis
3. Ribosome goes into State II, causing release of Factor A
4. Factor B binds to ribosome in State II
5. GTP hydrolysis
6. Ribosome goes into State I, causing release of Factor B
...

This scheme would explain ribosome function during elongation as a chemically driven stochastic two-punch process. Binding of either ligand twice in a row would be automatically prohibited, since the steric requirements would no longer be met in the second attempt. Earlier observations of biochemical properties have indeed led to the concept of two principal states of the ribosome, pre- and post-translocational, which we can equate to the postulated States I and II in the above scheme. These two states are separated by a large energy barrier of 120 kJ/mol (Schilling-Bartetzko et al., 1992).
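The alternation in the six-step scheme can be sketched as a small state machine. This is only an illustrative toy, not a model used in the chapter: the labels "A" and "B", the `Ribosome` class and the `bind` method are hypothetical names, and GTP hydrolysis, the conformational switch and factor release are collapsed into a single step.

```python
from dataclasses import dataclass

# Hypothetical labels: "A" and "B" stand for the two factors of the scheme,
# "I" and "II" for the two ribosomal states.
ACCEPTS = {"I": "A", "II": "B"}      # which factor each state can bind
NEXT_STATE = {"I": "II", "II": "I"}  # state reached after GTP hydrolysis


@dataclass
class Ribosome:
    state: str = "I"
    cycles: int = 0

    def bind(self, factor: str) -> bool:
        """Try to bind a factor; return True only if the cycle advances."""
        if ACCEPTS[self.state] != factor:
            return False  # steric requirements not met: binding is rejected
        # Binding is followed by GTP hydrolysis, a switch of state,
        # and release of the factor (collapsed into one step here).
        self.state = NEXT_STATE[self.state]
        if factor == "B":
            self.cycles += 1  # one full A-then-B round completed
        return True


ribosome = Ribosome()
# Factors arrive in arbitrary order; a repeated factor is rejected.
outcomes = [ribosome.bind(f) for f in ["A", "A", "B", "B", "A"]]
print(outcomes, ribosome.state, ribosome.cycles)
```

In this toy, the second attempt to bind the same factor always fails, which is the "automatic prohibition" described in the text.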

OBSERVED CONFORMATIONAL CHANGES OF THE RIBOSOME

In the course of analyzing numerous cryo-EM maps depicting ribosomes in various ligand binding states, we have observed many changes in the position of components. For example, the top portion of helix 44 of the 16S rRNA moves by 8 Å during translocation (VanLoock et al., 2000); the double-lobed L7/L12 stalk base is found in different conformations (Rawat et al., 2002); the L1 stalk has been found in different positions related by a flexing of the RNA at its base (Gomez et al., 2000; Valle et al., 2003); and the mRNA channel in the small subunit, at the junction of head and body, opens and closes sideways (Lata et al., 1996), a movement later described as the action of a “latch” (Schlunzen et al., 2000). The most dramatic change observed was a rotation of the small subunit with respect to the large subunit, which occurs in response to the binding of EF-G in the GTP state (Agrawal et al., 1999; Frank and Agrawal, 2000). In the following, this particular motion, referred to as the “ratchet motion,” will be analyzed in some detail. This is essentially a continuation of an earlier analysis (Frank and Agrawal, 2002). Special attention will be focused on the extent to which the intersubunit bridges are involved.


The idea has been put forth that the two ribosomal states, associated with alternate factor binding, are in fact related by the ratchet motion (Frank and Agrawal, 2002). Since three of the binding sites are on the small subunit, and another three on the large subunit, a relative rotation of the subunits would indeed produce a series of unique binding constellations. However, closer analysis (Valle et al., 2003) has not confirmed this speculation: the ratchet motion as described above is only observed as part of the translocation process, and after the motion the ribosome appears to revert completely, yielding the initial relative subunit orientation, once translocation is complete. Rather, the conformational change that is instrumental for the discrimination between the factors must be a more complex reorganization, affecting the specific geometry of the factor binding region, for instance a change in the intersubunit distance (opening and closing of the intersubunit space) and/or in the structure of the L7/L12 stalk base.

In order to link the observed global changes to local molecular reorganizations, which may constitute the origin of the movement, we need a more quantitative basis for analysis. The observed density maps must be fitted with suitably deformed or reorganized versions of the X-ray structure. The approach that we have chosen is real-space refinement. In this computational approach developed by Chapman (1995), the X-ray structure is cut into components that are likely to be stable units: proteins, either whole or after division into major domains, and high-stability components of RNA such as double-helical regions. Cutting points are chosen to coincide with regions of known instability, or where there is experimental evidence of disorder in the structure. These components are then used to generate electron densities, which are individually fitted into the cryo-EM map. In the end, the underlying substructures are re-linked under observation of stereochemical constraints. In other words, the idea is to mold the X-ray structure into the cryo-EM maps depicting different states of the ribosome structure.

Fig. 3. Interpretation, in terms of the X-ray structure, of cryo-EM density maps of ribosomes in two different conformational states, related by the ratchet motion. Cryo-EM densities are shown as contour maps, displayed in Iris Explorer (Numerical Algorithms Group Inc., Downers Grove, IL). Re-modeled X-ray structures, using real-space refinement, are displayed as ribbon diagrams (Carson, 1997).

Following a least-squares approach, the residual subject to minimization is formed by two components: one relating to the difference between the observed and the target density maps, the other to the energy of the stereochemical interactions at the structural seams (Chapman, 1995). This kind of analysis was first applied to EM reconstructions of insect flight muscle, which were interpreted in terms of actin-myosin interactions in different constellations (Chen et al., 2001). The approach is not without its problems: all of the strain in the change of the structure has to be absorbed by the bonds at the sites where the structure is broken up and then mended. Nevertheless, by using a large number of pieces (over 160 in our case), we can at least approximate an even distribution of strain over the whole molecule.
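The two-component residual can be illustrated with a one-dimensional toy. Everything in this sketch is an assumption made for illustration and is not taken from the refinement program used in the chapter: atoms are Gaussian blobs on a line, each rigid piece has a single translation as its free parameter, and the seam energy is a harmonic penalty on a hypothetical ideal bond length.

```python
import math


def model_density(atoms, grid, sigma=1.0):
    """Density on a 1D grid from Gaussian blobs centred on the atoms."""
    return [sum(math.exp(-0.5 * ((x - a) / sigma) ** 2) for a in atoms)
            for x in grid]


def residual(pieces, shifts, seams, observed, grid, weight=1.0, ideal=1.5):
    """Least-squares density misfit plus a harmonic penalty at the seams.

    pieces : list of rigid pieces, each a list of atom positions
    shifts : one rigid translation per piece (the free parameters)
    seams  : (i, j) pairs: last atom of piece i is bonded to first of piece j
    ideal  : hypothetical ideal seam bond length
    """
    moved = [[a + s for a in piece] for piece, s in zip(pieces, shifts)]
    atoms = [a for piece in moved for a in piece]
    calc = model_density(atoms, grid)
    density_term = sum((o - c) ** 2 for o, c in zip(observed, calc))
    seam_term = sum((moved[j][0] - moved[i][-1] - ideal) ** 2
                    for i, j in seams)
    return density_term + weight * seam_term


grid = [0.5 * k for k in range(40)]              # 0.0 ... 19.5
pieces = [[2.0, 3.5, 5.0], [6.5, 8.0, 9.5]]      # two rigid pieces
seams = [(0, 1)]                                 # one seam between them
# "Observed" map: the second piece is displaced by +1.0.
observed = model_density(pieces[0] + [a + 1.0 for a in pieces[1]], grid)

# Crude one-parameter scan over the shift of the second piece.
candidates = [0.1 * k for k in range(21)]
best = min(candidates,
           key=lambda s: residual(pieces, [0.0, s], seams, observed, grid,
                                  weight=0.1))
strain = residual(pieces, [0.0, best], seams, observed, grid, weight=0.1)
print(best, strain)
```

At the best-fitting shift the density term vanishes and the whole remaining residual sits in the stretched seam, a small-scale version of the strain problem noted in the text.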

This analysis has been applied to several maps that depict the extreme positions of the ratchet motion (Gao et al., 2003). In the first step, the X-ray structures of both the 30S and 50S subunits were modified to account for the species-dependent variations of ribosomal RNA. Proteins were homology-modeled on the basis of sequence similarity between E. coli proteins and their counterparts from the other prokaryotic species for which the X-ray coordinates are known (Haloarcula marismortui (Ban et al., 2000), Thermus thermophilus (Wimberly et al., 2000; Yusupov et al., 2001), and Deinococcus radiodurans (Harms et al., 2001)). The remodeled X-ray structures were then separately fitted, using real-space refinement, into the 70S density map.

The quality of the fits can be characterized by three measures: correlation coefficient, real-space R-factor, and number of van der Waals contact violations. For both of the subunits, values of 0.7, 0.25, and ~3000 are typical for these measures, respectively. The quality of the fits can be seen from a display in which the cryo-EM density is shown as a contour, while the remodeled X-ray structure is represented as a ribbon diagram (Fig. 3). An animated 3D display of the structure, alternating between the two conformations, is particularly useful in the interpretation (see enclosed CD). A preliminary analysis yields the following results:

1) The apparent center of the ratchet rotation lies in helix 27 of the 16S rRNA. Known as the switch helix, this structure changes its conformation in the decoding process following a local reorganization of secondary structure (Lodmell and Dahlberg, 1997), which in turn affects the conformation of the whole ribosome (Gabashvili et al., 1999).

2) Several proteins change their positions drastically. Perhaps some of these changes may bring about changes in system behavior by “mechanical shunting” (see the discussion on normal mode analysis below).

3) The 16S rRNA of the small subunit changes from a compact to a looser configuration. One of the effects of this “breathing” movement is the opening of the entrance and exit channels known to conduct the mRNA. Indeed, an open state of the channels would be required for the mRNA to move during translocation, while a clamped-close state would be of benefit during the decoding process since this would convey maximum stability to the mRNA (Frank and Agrawal, 2000).

4) The largest structural reorganization occurs in the region of the intersubunit bridges connecting the central protuberance of the large subunit with the head of the small subunit.
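The three fit-quality measures quoted before this list (correlation coefficient, real-space R-factor, contact violations) can be sketched as follows. The exact definitions used for the ribosome fits are not given in the text, so the formulas here are assumptions: a Pearson correlation, the common real-space R of the form sum|o - c| / sum|o + c|, and a naive pair count under a hypothetical distance cutoff that ignores bonded neighbours.

```python
import math


def correlation_coefficient(observed, calculated):
    """Pearson correlation between two density maps given as flat lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c)
              for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    return cov / math.sqrt(var_o * var_c)


def real_space_r(observed, calculated):
    """Real-space R-factor, assumed here as sum|o - c| / sum|o + c|."""
    num = sum(abs(o - c) for o, c in zip(observed, calculated))
    den = sum(abs(o + c) for o, c in zip(observed, calculated))
    return num / den


def count_contact_violations(coords, min_dist=3.0):
    """Count atom pairs closer than a hypothetical cutoff (in angstroms).

    A real check would skip covalently bonded pairs and use
    element-specific van der Waals radii; this toy does neither.
    """
    count = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < min_dist:
                count += 1
    return count


observed = [0.0, 1.0, 2.0, 1.0, 0.0]
calculated = [0.0, 0.5, 2.0, 1.5, 0.0]
cc = correlation_coefficient(observed, calculated)
r_factor = real_space_r(observed, calculated)
clashes = count_contact_violations([(0, 0, 0), (1, 0, 0), (10, 0, 0)])
print(round(cc, 3), r_factor, clashes)
```

A higher correlation and a lower R-factor both indicate a closer match between map and model, while the violation count flags steric problems introduced by the remodeling.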

TOWARD AN UNDERSTANDING OF RIBOSOME DYNAMICS

In seeking metaphors to explain the mechanism of ribosomal action, the three images that come to mind are a clock, a Rube Goldberg mechanism, and a guitar. Although the clockwork mechanism has been invoked, in jest, in a review on ribosome functions (Staehelin et al., 1967; Fig. 4), it is clear that a deterministic, gear-like mechanism will not suffice as an explanation, since stochastic processes are inevitably involved at molecular dimensions. Rube Goldberg’s devices are characterized by elaborate, indirect pathways of action grafted onto devices originally designed for a much simpler purpose (see Berry, 2001). The reason why this image comes to mind is that a number of long cause-and-effect routes have by now been discovered in the ribosome, requiring an ever-increasing complexity of conformational signaling as an explanation. The ribosome might be viewed as an essentially rigid structural framework that requires a large number of additional interacting levers, as in a Rube Goldberg invention, to bring about complex motions. The guitar, finally, might be an apt metaphor for an architecture that supports certain modes of motion based on its very design. This last idea has a strong appeal and plausibility: could the various motions of the ribosome be explained on the basis of its architecture?

Fig. 4. The ribosome conceived as a clockwork mechanism; from Staehelin et al. (1967), reproduced with permission from Academic Press.

To approach this question, the ribosome structure was subjected to normal mode analysis (Tama et al., 2003). In this analysis, the atomic structure of a molecule is modeled in a simplified way as a system of masses connected with springs (Hooke’s Law) to represent their interactions. The mathematical analysis of the resulting equation system leads to an eigenvector problem that can be solved by standard matrix diagonalization techniques. However, the unusual size of the problem - originally over 100,000 atoms in the ribosome - necessitated a number of simplifications and approximations.

Normal mode analysis yields a set of eigenmodes, which are collective motions of the pseudo-atoms, characteristic for the ways in which they are connected in the structure under consideration and for the way in which the entire structure is shaped. The modes are mutually orthogonal, ranked by importance, and form a complete, orthonormalized system. Thus, any actual motion of the structure can be uniquely represented by a linear superposition of these modes.
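A minimal sketch of the idea, assuming a toy system rather than the ribosome model itself: three equal masses on a line joined by two identical Hooke springs. The Jacobi eigenvalue routine, the function names and the example displacement are illustrative choices; the simplifications actually used for the ribosome are not described in the text.

```python
import math


def chain_hessian(n, k=1.0):
    """Hessian of a free 1D chain of n unit masses joined by springs k."""
    h = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        h[i][i] += k
        h[i + 1][i + 1] += k
        h[i][i + 1] -= k
        h[i + 1][i] -= k
    return h


def jacobi_eigen(a, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors of a small symmetric matrix (Jacobi)."""
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for i in range(n):      # rotate columns p and q of a
                    aip, aiq = a[i][p], a[i][q]
                    a[i][p], a[i][q] = c * aip - s * aiq, s * aip + c * aiq
                for i in range(n):      # rotate rows p and q of a
                    api, aqi = a[p][i], a[q][i]
                    a[p][i], a[q][i] = c * api - s * aqi, s * api + c * aqi
                for i in range(n):      # accumulate the eigenvectors
                    vip, viq = v[i][p], v[i][q]
                    v[i][p], v[i][q] = c * vip - s * viq, s * vip + c * viq
    order = sorted(range(n), key=lambda i: a[i][i])
    values = [a[i][i] for i in order]
    vectors = [[v[r][i] for r in range(n)] for i in order]
    return values, vectors


# Modes sorted from softest (lowest eigenvalue) to stiffest.
values, modes = jacobi_eigen(chain_hessian(3))

# Any displacement is a unique superposition of the orthonormal modes.
displacement = [0.3, -0.1, 0.4]
coeffs = [sum(d * e for d, e in zip(displacement, m)) for m in modes]
rebuilt = [sum(c * m[i] for c, m in zip(coeffs, modes)) for i in range(3)]
print([round(x, 6) for x in values])
```

The zero eigenvalue belongs to rigid translation of the whole chain and is normally set aside; the softest remaining mode is the one expected to dominate large-scale motion. Projecting a displacement onto the modes and summing the pieces back recovers it, which is the superposition property stated above.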

This analysis was applied to the coordinates of the ribosome from Thermus thermophilus (Yusupov et al., 2001). Since only motions affecting whole domains are of interest, it is sufficient here to use the phosphate groups to model RNA and Cα atoms to model the proteins.
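The coarse-graining step can be sketched as a filter over PDB-format ATOM records that keeps one pseudo-atom per residue. This is an assumed illustration, not the procedure used in the cited work: `coarse_grain` and `atom_line` are hypothetical helpers, the coordinates are made up, and only the fixed column layout of the PDB ATOM record is relied upon.

```python
def coarse_grain(pdb_lines):
    """Keep one pseudo-atom per residue: P for RNA, C-alpha for protein."""
    selected = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                     # skip HETATM, REMARK, etc.
        name = line[12:16].strip()       # atom name, columns 13-16
        if name in ("P", "CA"):
            x = float(line[30:38])       # orthogonal coordinates,
            y = float(line[38:46])       # columns 31-54
            z = float(line[46:54])
            selected.append((name, (x, y, z)))
    return selected


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Build a minimal ATOM record (hypothetical helper for the example)."""
    return ("ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f"
            % (serial, name, res, chain, resseq, x, y, z))


# Made-up coordinates: one nucleotide and one amino acid.
lines = [
    atom_line(1, " P  ", "  G", "A", 1, 1.0, 2.0, 3.0),
    atom_line(2, " C4'", "  G", "A", 1, 1.5, 2.5, 3.5),
    atom_line(3, " N  ", "ALA", "B", 1, 4.0, 5.0, 6.0),
    atom_line(4, " CA ", "ALA", "B", 1, 4.5, 5.5, 6.5),
]
pseudo_atoms = coarse_grain(lines)
print(pseudo_atoms)
```

Reducing each residue to a single point shrinks the eigenvalue problem by roughly an order of magnitude or more while keeping the domain-scale shape of the structure.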

Remarkably, the high-ranking modes, representing a relatively large share of the total energy, are very similar to the motions experimentally observed. In them, the small subunit is seen to perform a ratchet rotation around an axis that coincides with helix 27, and the L1 stalk swivels between two positions, one in which it blocks the intersubunit space, and the other in which it leaves the space open. This particular mode is quite similar to the motion observed for the post-termination complex (Valle et al., 2003). Other modes show movements that have also been inferred from previous experimental work, such as the swinging motion of the C-terminal domain of protein L9 (Matadeen et al., 1999) and the variation in the spacing between the subunits.

We have thus a sketchy beginning of a possible summary explanation of ribosomal dynamics: energy supplied by thermal motion, by GTP hydrolysis, and by ligand binding could be channeled into discrete, unique motions that are preferred, based on how the structure is configured at the time when the energy is supplied. [It is necessary to use this general formulation “the way the structure is configured at the time . . .” instead of “the way the structure is built,” since this formulation allows for the switching of the dynamical properties, or mode preferences.] Changes in configuration, and hence in dynamic properties, could be introduced by the switching of bi-stable RNA secondary structure elements (example: helix 27) or could be due to changes in protein-protein contacts (example: the S13-L5 contacts in the bridges between the central protuberance of the large subunit and the head of the small subunit). It is this possibility of switching that may be able to combine the multifaceted patterns of ribosome dynamics observed thus far into a single, unifying picture.

CONCLUSION

Although “snapshot” evidence collected by cryo-EM of ribosomal conformations determined by antibiotic binding or inhibition of GTP hydrolysis is rather sparse and anecdotal, it has shown that the ribosome undergoes dramatic changes during its functional cycle. We have demonstrated that a method of real-space refinement can be used to obtain atomic models of the ribosome in the alternate conformations, essentially by molding the X-ray structure into the various cryo-EM maps. A careful analysis of such models is expected to yield clues to the local conformational changes underlying the global reorganizations of the ribosome. Finally, we now have evidence that at least some of the motions inferred from the observed conformational changes can be explained on the basis of the properties of the complex mechanical system as a whole. Surely, this is just the beginning of a long voyage toward the discovery of one of life’s most intriguing mysteries.

ACKNOWLEDGEMENTS

I thank Michael Watters for assistance with the illustrations, and Adriana Verschoor for a critical reading of the manuscript. This work was supported by the Howard Hughes Medical Institute and NIH grants R37 GM29169, R01 GM55440, and P41 RR01219.

REFERENCES

1. Ævarsson A, Brazhnikov E, Garber M, Zheltonosova J, Chirgadze Y, al-Karadaghi S, Svensson LA, Liljas A. Three-dimensional structure of the ribosomal translocase: elongation factor G from Thermus thermophilus. EMBO J, 1994; 13: 3669-3677.

2. Agrawal RK, Penczek P, Grassucci RA, Frank J. Visualization of elongation factor G on the Escherichia coli 70S ribosome: the mechanism of translocation. Proc Natl Acad Sci USA, 1998; 95: 6134-6138.

3. Agrawal RK, Heagle AB, Penczek P, Grassucci RA, Frank J. EF-G-dependent GTP hydrolysis induces translocation accompanied by large conformational changes in the 70S ribosome. Nature Struct Biology, 1999; 6: 643-647.

4. Agrawal RK, Heagle AB, Frank J. Studies of elongation factor G-dependent tRNA translocation by three-dimensional cryo-electron microscopy. In RA Garrett, SR Douthwaite, A Liljas, AT Matheson, PB Moore, and HF Noller (eds.), The Ribosome: Structure, Function, Antibiotics, and Cellular Interactions, ASM Press, Washington, DC, 2000, pp. 53-62.

5. al-Karadaghi S, Ævarsson A, Garber M, Zheltonosova J, Liljas A. The structure of elongation factor G in complex with GDP: conformational flexibility and nucleotide exchange. Structure, 1996; 4: 555-565.

6. Ban N, Nissen P, Hansen J, Moore PB, Steitz TA. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 2000; 289: 905-920.

7. Berchtold H, Reshetnikova L, Reiser CO, Schirmer NK, Sprinzl M, Hilgenfeld R. Crystal structure of active elongation factor Tu reveals major domain rearrangements. Nature, 1993; 365: 126-132.

8. Berry I. Chain Reaction, Tang Teaching Museum and Art Gallery at Skidmore College, Saratoga Springs, NY, 2001.

9. Berriman J, Unwin N. Analysis of transient structures by cryo-microscopy combined with rapid mixing of spray droplets. Ultramicroscopy, 1994; 56: 241-252.

10. Carson M. Ribbons. Methods Enzymol, 1997; 277: 493-505.


11. Chapman MS. Restrained real-space macromolecular atomic refinement using a new resolution-dependent electron-density function. Acta Cryst, 1995; A51: 69-80.

12. Chen LF, Blanc E, Chapman MS, Taylor K. Real space refinement of actomyosin structures from sectioned muscle. J Struct Biol, 2001; 133: 221-232.

13. Czworkowski J, Moore PB. The conformational properties of elongation factor G and the mechanism of translocation. Biochem, 1997; 36: 10327-10334.

14. Czworkowski J, Wang J, Steitz TA, Moore PB. The crystal structure of elongation factor G complexed with GDP, at 2.7 Å resolution. EMBO J, 1994; 13: 3661-3668.

15. Frank J. Three-dimensional Electron Microscopy of Macromolecular Assemblies. Academic Press, San Diego, 1996.

16. Frank J. The ribosome - a macromolecular machine par excellence. Chem & Biol, 2000; 7: R133-R141.

17. Frank J, Agrawal RK. A ratchet-like inter-subunit reorganization of the ribosome during translocation. Nature, 2000; 406: 318-322.

18. Frank J et al. Cryo-electron microscopy of the translational apparatus: Experimental evidence for the paths of mRNA, tRNA, and the polypeptide chain, in RA Garrett, SR Douthwaite, A Liljas, AT Matheson, PB Moore, and HF Noller (eds.), The Ribosome: Structure, Function, Antibiotics, and Cellular Interactions, ASM Press, Washington, DC, 2000, pp. 45-51.

19. Frank J, Agrawal RK. Ratchet-like movements between the two ribosomal subunits: their implications in elongation factor recognition and tRNA translocation, in Cold Spring Harbor Symposia on Quantitative Biology: The Ribosome, Cold Spring Harbor Press, NY, 2002, pp. 67-75.

20. Gabashvili IS, Agrawal RK, Grassucci R, Squires CL, Dahlberg AE, Frank J. Major rearrangements in the 70S ribosomal 3D structure caused by a conformational switch in 16S ribosomal RNA. EMBO J, 1999; 18: 6501-6507.

21. Gao H, Sengupta J, Valle M, Korostelev A, Eswar N, Stagg SM, Van Roey P, Agrawal RK, Harvey SC, Sali A, Chapman MS, Frank J. Study of the structural dynamics of the E. coli 70S ribosome using real space refinements. Cell, 2003; 113: 789-801.

22. Gabashvili IS, Agrawal RK, Spahn CMT, Grassucci RA, Svergun DI, Frank J, Penczek P. Solution structure of the E. coli ribosome at 11.5 Å resolution. Cell, 2000; 100: 51-63.


23. Gomez-Lorenzo MG, Spahn CMT, Agrawal RK, Grassucci RA, Penczek P, Chakraburtty K, Ballesta JPG, Lavandera JL, Garcia-Bustos JF, Frank J. Three-dimensional cryo-electron microscopy localization of EF2 in the Saccharomyces cerevisiae 80S ribosome at 17.5 Å resolution. EMBO J, 2000; 19: 1-10.

24. Harms J, Schlünzen F, Zarivach R, Bashan A, Gat S, Agmon I, Bartels H, Franceschi F, Yonath A. High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell, 2001; 107: 679-688.

25. Herron SR, Jurnak F. Structure of an EF-Tu complex with a thiazolyl peptide antibiotic determined at 2.35 Å resolution: atomic basis for GE 2270A inhibition of EF-Tu. Biochem, 2000; 39: 37-45.

26. Kawashima T, Berthet-Colominas C, Wulff M, Cusack S, Leberman R. The structure of the Escherichia coli EF-Tu-EF-Ts complex at 2.5 Å resolution. Nature, 1996; 379: 511-518.

27. Kjeldgaard M, Nissen P, Thirup S, Nyborg J. The crystal structure of elongation factor EF-Tu from Thermus aquaticus in the GTP conformation. Structure, 1993; 1: 35-50.

28. Lata KR, Agrawal RK, Penczek P, Grassucci R, Zhu J, Frank J. Three-dimensional reconstruction of Escherichia coli 30S ribosomal subunit in ice. J Mol Biol, 1996; 262: 43-52.

29. Lodmell JS, Dahlberg AE. A conformational switch in Escherichia coli 16S ribosomal RNA during decoding of messenger RNA. Science, 1997; 277: 1262-1267.

30. Matadeen R, Patwardhan A, Gowen B, Orlova EV, Mueller F, Brimacombe R, van Heel M. The Escherichia coli large ribosomal subunit at 7.5 Å resolution. Structure, 1999; 7: 1575-1583.

31. Nissen P, Kjeldgaard M, Thirup S, Polekhina G, Reshetnikova L, Clark BFC, Nyborg J. Crystal structure of the ternary complex of Phe-tRNAPhe, EF-Tu, and a GTP analog. Science, 1995; 270: 1464-1472.

32. Nissen P, Kjeldgaard M, Nyborg J. Macromolecular mimicry. EMBO J, 2000; 19: 489-495.

33. Rawat UBS, Zavialov AV, Sengupta J, Valle M, Grassucci RA, Linde J, Vestergaard B, Ehrenberg M, Frank J. A cryo-electron microscopic study of ribosome-bound termination factor RF2. Nature, submitted, 2002.

34. Schilling-Bartetzko S, Bartetzko A, Nierhaus KH. Kinetic and thermodynamic parameters for tRNA binding to the ribosome and for the translocational reaction. J Biol Chem, 1992; 267: 4703-4712.


35. Schlünzen F, Tocilj A, Zarivach R, Harms J, Glühmann M, Janell D, Bashan A, Bartels H, Agmon I, Franceschi F, Yonath A. Structure of functionally activated small ribosomal subunit at 3.3 Å resolution. Cell, 2000; 102: 615-623.

36. Staehelin T et al. in Vogel HJ, Lampen JO and Bryson V (eds.), Organizational Biosynthesis, Academic Press, San Diego, CA, 1967, pp. 443-457.

37. Stark H, Rodnina MV, Wieden HJ, van Heel M, Wintermeyer W. Large-scale movement of elongation factor G and extensive conformational change of the ribosome during translocation. Cell, 2000; 100: 301-309.

38. Stark H, Rodnina M, Rinke-Appel J, Brimacombe R, Wintermeyer W, van Heel M. Visualization of elongation factor Tu on the Escherichia coli ribosome. Nature, 1997; 389: 403-406.

39. Tama F, Valle M, Frank J, Brooks CL III. Dynamic reorganization of ribosome explored by normal mode analysis and cryo-electron microscopy. Proc Natl Acad Sci USA, 2003; 100: 9319-9323.

40. Valle M, Sengupta J, Swami NK, Grassucci RA, Burkhardt N, Nierhaus KH, Agrawal RK, Frank J. Cryo-EM reveals an active role for aminoacyl-tRNA in the accommodation process. EMBO J, 2002; 21: 3557-3567.

41. Valle M, Zavialov A, Sengupta J, Rawat U, Ehrenberg M, Frank J. Locking and unlocking of ribosomal motions. Cell, 2003; 114: 123-134.

42. VanLoock MS, Agrawal RK, Gabashvili IS, Qi L, Frank J, Harvey SC. Movement of the decoding region of the 16S ribosomal RNA accompanies tRNA translocation. J Mol Biol, 2000; 304: 507-515.

43. White HD, Walker ML, Trinick J. A computer controlled spraying-freezing apparatus for millisecond time-resolution electron cryomicroscopy. J Struct Biol, 1998; 121: 306-313.

44. Wilson KS, Noller HF. Molecular movement inside the translational engine. Cell, 1998a; 92: 337-349.

45. Wilson KS, Noller HF. Mapping the position of EF-G in the ribosome by directed hydroxyl radical probing. Cell, 1998b; 92: 131-139.

46. Wimberly BT, Brodersen DE, Clemons WM Jr, Morgan-Warren RJ, Carter AP, Vonrhein C, Hartsch T, Ramakrishnan V. Structure of the 30S ribosomal subunit. Nature, 2000; 407: 327-339.

47. Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JN, Noller HF. Crystal structure of the ribosome at 5.5 Å resolution.


Chapter 14

HOW DO TRANSLATION FACTORS CATALYZE PROTEIN SYNTHESIS

Martin Laurberg¹, Ole Kristensen², Maria Selmer³, Xiao-Dong Su⁴ and Anders Liljas⁵

¹Center for Molecular Biology of RNA, Sinsheimer Laboratories, University of California at Santa Cruz, Santa Cruz, CA 95064, USA. ²Department of Medicinal Chemistry, Royal Danish School of Pharmacy, Universitetsparken 2, 2100 København Ø, Denmark. ³MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England. ⁴Life Sciences College, Peking University, Beijing, China. ⁵Corresponding author: Molecular Biophysics, Center for Chemistry and Chemical Engineering, Lund University, Box 124, SE-221 00 Lund, Sweden. Email address: anders.liljas@mbfys.lu.se

The analysis of protein synthesis has taken a leap due to the dramatic progress in crystallographic and cryo-EM studies of ribosomes and their subunits. To be able to understand the mechanisms involved in protein synthesis one has to consider the essential roles of the translation factors that catalyze different steps of the process. Most steps of protein synthesis are catalyzed by these factors. Of specific interest are the translational GTPases that undergo significant conformational changes associated with GTP hydrolysis. Two major questions are how the GTPase activity is induced and how the conformational change brings about the desired action in translation.

Keywords: Protein synthesis, ribosomes, translation factors, GTP hydrolysis, L12.

INTRODUCTION

Protein synthesis occurs on ribosomes in all types of cells. Due to the recent work on the structure of ribosomes by X-ray crystallography (Ban et al., 2000; Wimberly et al., 2000; Schlünzen et al., 2000; Yusupov et al., 2001; Harms et al., 2001) and cryo-electron microscopy (Stark et al., 1997, 2000; Agrawal et al., 1998, 1999; Valle et al., 2002) we have obtained a much improved picture of how the process is performed. Central in this process is the ribosome with its two subunits, the small (30S) and the large (50S). Another fundamental component is tRNA, the adaptor, which is able to read a codon of the message (mRNA) with one end of the molecule (anticodon) and carry the corresponding amino acid at the other end. There are three fundamental processes involved in the incorporation of each amino acid:

1. Decoding of the mRNA and the binding of the aminoacyl-tRNA
2. Peptidyl transfer
3. Translocation

Due to the recent structural work we now know in great detail how the decoding is done on the small subunit and how the ribosome ensures a high fidelity in translation (Ogle et al., 2001, 2002). We also know elements of the peptidyl transfer site on the large subunit (Nissen et al., 2000; Schlünzen et al., 2001; Yonath, 2002). Both of these processes depend primarily on the ribosomal RNA (rRNA). It is now clear that the ribosome is a ribozyme (Noller et al., 1992; Nissen et al., 2000). The third step, the translocation, is less well characterized even though we know a fair amount of detail (Frank & Agrawal, 2000).

The tRNA molecules are central in the process of translation. Two classical tRNA binding sites on the ribosome are the aminoacyl-tRNA binding site (A-site) and the peptidyl-tRNA binding site (P-site). The site for the exiting deacylated tRNA (E-site) is also well established (Rheinberger et al., 1981; Moazed & Noller, 1989; Yusupov et al., 2001).
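The three steps listed above, and the movement of tRNAs through the A-, P- and E-sites, can be followed in a small toy model. This is only an illustrative sketch: the class names, the three-entry codon table and the `elongate` helper are hypothetical, and nothing here models the factors, fidelity or kinetics discussed in this chapter.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical three-entry codon table, for illustration only.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly"}


@dataclass
class TRNA:
    codon: str          # codon that this tRNA reads
    chain: List[str]    # attached amino acids; [] means deacylated


@dataclass
class Ribosome:
    mrna: List[str]              # codons in reading order
    a_index: int                 # index of the codon exposed in the A-site
    a: Optional[TRNA] = None     # aminoacyl-tRNA site
    p: Optional[TRNA] = None     # peptidyl-tRNA site
    e: Optional[TRNA] = None     # exit site

    def decode(self) -> None:
        """Step 1: bind the aminoacyl-tRNA matching the A-site codon."""
        codon = self.mrna[self.a_index]
        self.a = TRNA(codon, [CODON_TABLE[codon]])

    def peptidyl_transfer(self) -> None:
        """Step 2: move the nascent chain from the P-site tRNA to the A-site tRNA."""
        self.a.chain = self.p.chain + self.a.chain
        self.p.chain = []

    def translocate(self) -> None:
        """Step 3: shift tRNAs A->P and P->E and expose the next codon."""
        self.e, self.p, self.a = self.p, self.a, None
        self.a_index += 1


def elongate(mrna: List[str]) -> List[str]:
    """Run elongation cycles, starting with an initiator tRNA in the P-site."""
    first = mrna[0]
    rib = Ribosome(mrna=mrna, a_index=1, p=TRNA(first, [CODON_TABLE[first]]))
    while rib.a_index < len(mrna):
        rib.decode()
        rib.peptidyl_transfer()
        rib.translocate()
    return rib.p.chain


print(elongate(["AUG", "UUU", "GGC"]))  # ['Met', 'Phe', 'Gly']
```

After each cycle the E-site holds a deacylated tRNA and the P-site tRNA carries the whole nascent chain, which is the arrangement described for the three sites.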

Template-directed protein synthesis solely based on 70S ribosomes, mRNA and charged tRNAs has been reported, but is exceedingly slow (Gavrilova et al., 1976; Rutkevitch & Gavrilova, 1982). The translation factors are essential as catalysts of the process. They are proteins that participate specifically in the different steps of protein synthesis and can be divided into initiation, elongation, termination and recycling factors.


THE TRANSLATION FACTORS

The translation factors are divided according to the step in which they participate (Table 1). One group of translation factors binds GTP molecules and is induced by the ribosome to hydrolyze the GTP to GDP and inorganic phosphate at the appropriate moment (Kaziro, 1978). These factors are the translational GTPases (tGTPases). Their overlapping binding sites have been established both through chemical methods (Moazed et al., 1988) and through cryo-electron microscopy (Agrawal et al., 1998, 1999; Stark et al., 1997, 2000; Valle et al., 2002). A number of translation factors are also found to bind to some of the binding sites for tRNA on the ribosome (McCutcheon et al., 1999; Carter et al., 2001; Dallas & Noller, 2001; Pioletti et al., 2001). These factors could be called tRNA related factors. Some of these factors have been found to mimic tRNAs (Nissen et al., 1995; Kristensen et al., 2002). In this review we will summarize some observations concerning the translation factors, primarily discussing findings made in our laboratory.

Table 1. Eubacterial translation factors

Factor     Function                                                                GTPase

Initiation
IF-1       Assists IF-2 in initiation
IF-2       Binds initiator-tRNA to P-site                                          +
IF-3       Assists in dissociation of subunits

Elongation
EF-Tu      Binds aminoacyl-tRNA to A-site                                          +
SelB       Binds SeCys-tRNA to A-site                                              +
EF-Ts      Nucleotide exchange factor for EF-Tu
EF-G       Translocates peptidyl-tRNA from A- to P-site                            +

Termination
RF-1/2     Recognizes termination codons and releases peptide from P-site tRNA
RF-3       Releases RF-1/2 from ribosome                                           +

Recycling
RRF        Dissociates the terminated ribosomes


The tRNA Mimicking Factors

When the structure of EF-G was compared to the ternary complex of EF-Tu, GDPNP and aminoacyl-tRNA it was evident that the domains III, IV and V mimicked tRNA (Nissen et al., 1995). Subsequently a number of factors have been identified to have shapes like tRNA molecules. The first was the ribosome recycling factor (RRF; Selmer et al., 1999). Subsequently human eRF1 (Song et al., 2000) and E. coli RF2 (Vestergaard et al., 2001) have been called tRNA mimics. The implication has been that their shapes would suggest that they might bind to tRNA binding sites, primarily the A-site (Kristensen et al., 2002). The binding site of the protein that probably has the most tRNA-like shape, RRF, has been characterized by chemical methods (Lancaster et al., 2002). It is evident that RRF binds to part of the A-site, but its mode of binding is much different from that of a tRNA. In the case of RF2 the factor apparently undergoes a large conformational change in going from the crystal structure to the ribosome complex studied by cryo-EM methods (Rawat et al., 2002).

Some additional proteins have also been observed to have shapes like tRNA molecules. In these cases there is no information that they would bind to any tRNA binding site. These are the ribosomal protein TL5 (Fedorov et al., 2001), the C-terminal half of SelB (Selmer and Su, 2002) and EF-P (Benson et al., 2000). Thus the relation between tRNA shape and the binding to tRNA sites is far from certain.

The Translational GTPases

The tGTPases belong to a large family of GTP hydrolyzing enzymes called G-proteins (Bourne et al., 1991). This family of proteins is furthermore structurally and functionally related to a large family of ATP hydrolyzing enzymes (Leipe et al., 2002). The tGTPases are multi-domain proteins (Fig. 1). They all have two domains in common, domain I, or the G-domain, and the subsequent domain, most frequently called domain II (Ævarsson, 1995).


[Figure 1: schematic of the domain organization. Recoverable labels: G-domain, Domain II, double split β-α-β, winged helix; residue markers 1, 370, 890.]

Fig. 1. The domain organization of the tGTPases. The different types of domains are indicated by different colors. The G-domain (red) and domain II (light green) are the only parts common to all the tGTPases. Some eubacterial IF2 have an N-terminal extension of about 370 residues of unknown structure. Domain IV (pink) of IF2 is connected to domain III (light blue) by a long α-helix (Roll-Mecak et al., 2000). The N-terminal half of SelB is a version of EF-Tu whereas the C-terminal half has four related domains of the winged-helix type. EF-G has an insert (moss green) in the G-domain. Domains III and V (brown) have structures of the double split β-α-β type (Orengo & Thornton, 1993). The structure of RF-3 is not known, but the G-domain and domain II are recognized from the amino acid sequence (Ævarsson, 1995).

The G-domain binds GTP and has most of the essential components for GTP hydrolysis. The domain has a typical version of the Rossmann fold (Kjeldgaard & Nyborg, 1992) that is built from a sheet of parallel β-strands connected by α-helices. The loops connecting the C-terminal side of the β-strands to the α-helices have unique functions and are close to the binding site for the nucleotide. Thus, the first loop is called the P-loop and folds around the phosphate moieties of the G-nucleotide. The next loop is called Switch I, or the effector loop. It is involved in receptor binding and may switch conformation drastically between the GDP and GTP states (Walker et al., 1982). The third loop is called Switch II and is also related to the nucleotide binding and the dynamics of the G-proteins (Walker et al., 1982).


[Figure 2: diagram of the tGTPase cycle. Recoverable labels: binding to tRNA (IF2, EF-Tu and SelB); binding to ribosome; GAP, GTPase; dissociation from the ribosome; G nucleotide exchange factor, GEF.]

Fig. 2. The functional cycle of tGTPases. The GTP-binding state, on the left side, is able to bind aminoacyl-tRNA (IF2, EF-Tu and SelB). These ternary complexes or the binary complex EF-G:GTP can bind to the ribosome, where GTP hydrolysis is activated by the ribosomal GAP. The hydrolysis of GTP leads to a conformational change and the dissociation of the tGTPase from the ribosome. The GDP will now be exchanged for GTP, in the case of EF-Tu with the aid of EF-Ts. RF3 has a somewhat different functional cycle in that it binds in complex with GDP to ribosomes with a bound RF1/2. The GDP is exchanged for GTP on the ribosome. RF1/2 and RF3 are released from the ribosome in conjunction with GTP hydrolysis (Zavialov et al., 2001).

The normal functional cycle for tGTPases is shown in Fig. 2. RF3 deviates from this pattern (Zavialov et al., 2001). The central event in the cycle, indicated with a red arrow, is the GTP hydrolysis on the ribosome. All G-proteins undergo conformational changes associated with GTP hydrolysis. The conformational changes of the tGTPases are important for the catalysis of the different steps of protein synthesis on ribosomes. They primarily affect the binding of the factors to tRNA and to ribosomes. GTP hydrolysis by the different factors constitutes essentially irreversible steps in the process. Common for all G-proteins is the fact that the GTP hydrolyzing activity is induced by some other component of the cell at the appropriate moment. The general term for the GTPase activators, which normally are proteins, is GAP, the GTPase activating protein (Bourne et al., 1991).
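The normal cycle of Fig. 2 can be written down as a small state machine, which makes explicit that GTP hydrolysis only happens on the ribosome and that the factor must pass through nucleotide exchange before it can bind again. The sketch below is a simplification under stated assumptions: the state and event names are hypothetical, tRNA binding by IF2, EF-Tu and SelB is left out, and the deviating cycle of RF3 is not modeled.

```python
from enum import Enum, auto


class State(Enum):
    FREE_GDP = auto()          # factor off the ribosome, GDP bound
    FREE_GTP = auto()          # factor off the ribosome, GTP bound
    ON_RIBOSOME_GTP = auto()   # factor bound to the ribosome, GTP bound
    ON_RIBOSOME_GDP = auto()   # factor on the ribosome after hydrolysis


# event name -> (required state, resulting state); names are illustrative.
TRANSITIONS = {
    "nucleotide_exchange": (State.FREE_GDP, State.FREE_GTP),
    "bind_ribosome": (State.FREE_GTP, State.ON_RIBOSOME_GTP),
    "gtp_hydrolysis": (State.ON_RIBOSOME_GTP, State.ON_RIBOSOME_GDP),
    "dissociate": (State.ON_RIBOSOME_GDP, State.FREE_GDP),
}

CYCLE = ("nucleotide_exchange", "bind_ribosome", "gtp_hydrolysis", "dissociate")


def step(state: State, event: str) -> State:
    """Apply one event, refusing transitions the cycle does not allow."""
    required, result = TRANSITIONS[event]
    if state is not required:
        raise ValueError(f"{event} is not possible from {state.name}")
    return result


def run_cycle(start: State = State.FREE_GDP) -> list:
    """Walk once around the cycle and return the states visited."""
    visited = [start]
    for event in CYCLE:
        visited.append(step(visited[-1], event))
    return visited


for s in run_cycle():
    print(s.name)
```

Because each event is only allowed from one state, an attempt to hydrolyze GTP off the ribosome raises an error, which mirrors the requirement that the activity be induced at the appropriate moment.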

After GTP hydrolysis the factors are normally released from the ribosome and need to exchange their GDP for GTP. For G-proteins in general, this is assisted by a protein called GEF, the G-nucleotide exchange factor (Bourne et al., 1991). Only EF-Tu of the tGTPases needs a GEF. This protein is called EF-Ts (Kaziro, 1978). In the case of RF3 the ribosome functions as a GEF (Zavialov et al., 2001). The need for a GEF naturally depends on the affinity of the tGTPases for GTP and GDP respectively. Table 2 illustrates why GEFs appear not to be needed by several of the tGTPases.

Table 2. Functional properties of the tGTPasesᵃ

                             IF-2ᵇ               EF-Tuᶜ              EF-Gᶜ                       SelBᵈ               RF-3ᵉ

Kd GTP                       143 µM              0.36 µM             14 µM                       0.74 µM             2.5 µM
Kd GDP                       12.5 µM             4.9 nM              11 µM                       13.4 µM             5.5 nM
Affinity GTP/GDP ratio       0.091               0.014               0.77                        18                  0.0022
GEF                          -                   EF-Ts               -                           -                   Ribosome
GTPase trigger               50S binding         Codon recognition   Ribosome binding            Codon recognition   Ribosome binding
Function of GTP hydrolysis   Factor recycling?   tRNA release        Accelerated translocation   tRNA release        Factor recycling

ᵃAdapted from the doctoral thesis of Maria Selmer (2002). ᵇPon et al., 1985. ᶜKaziro, 1978. ᵈThanbichler et al., 2000. ᵉZavialov et al., 2001.
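The affinity ratio row of Table 2 can be checked from the two dissociation constants. The short sketch below assumes that the ratio is Kd(GDP)/Kd(GTP), that is, the affinity for GTP relative to that for GDP, and that the Kd values are in µM except where nM is given; both are readings of the table rather than statements from the text, and the function and variable names are hypothetical.

```python
# Dissociation constants in mol/L as read from Table 2 (units are an assumption:
# micromolar unless the table gives nanomolar), plus the tabulated ratio.
TABLE_2 = {
    "IF-2":  {"kd_gtp": 143e-6,  "kd_gdp": 12.5e-6, "tabulated": 0.091},
    "EF-Tu": {"kd_gtp": 0.36e-6, "kd_gdp": 4.9e-9,  "tabulated": 0.014},
    "EF-G":  {"kd_gtp": 14e-6,   "kd_gdp": 11e-6,   "tabulated": 0.77},
    "SelB":  {"kd_gtp": 0.74e-6, "kd_gdp": 13.4e-6, "tabulated": 18.0},
    "RF-3":  {"kd_gtp": 2.5e-6,  "kd_gdp": 5.5e-9,  "tabulated": 0.0022},
}


def affinity_ratio(kd_gtp: float, kd_gdp: float) -> float:
    """Affinity for GTP relative to GDP; a lower Kd means tighter binding."""
    return kd_gdp / kd_gtp


def compare_with_table(table: dict) -> dict:
    """Return {factor: (computed ratio, tabulated ratio)}."""
    return {
        name: (affinity_ratio(row["kd_gtp"], row["kd_gdp"]), row["tabulated"])
        for name, row in table.items()
    }


for name, (computed, tabulated) in compare_with_table(TABLE_2).items():
    print(f"{name:6s} computed {computed:.4g}  tabulated {tabulated:.4g}")
```

Under these assumptions the computed ratios agree with the tabulated ones to within a few percent. The two factors that use a GEF (EF-Tu with EF-Ts, RF-3 with the ribosome) are the ones whose ratios lie far below 1, meaning that they hold GDP much more tightly than GTP.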


Fig. 3. The structure of the C-terminal domains of SelB (reproduced with the kind permission of EMBO J.). This part of the protein is built up of four domains that all are of the winged helix type (Selmer & Su, 2002). The EF-Tu related part that binds GTP and SeCys-tRNA connects to the red domain. The blue domain in the opposite end of the molecule is most likely the one that contacts the stem-loop structure of the mRNA. Various mutational studies suggest this.

EF-Tu

Elongation factor Tu (EF-Tu) is the most extensively studied translational GTPase. EF-Tu catalyzes the binding of aminoacyl-tRNA to the ribosome. It has been structurally characterized in various states, among others with GDP, with a GTP analogue, GDPNP, and as ternary complexes with GDPNP and aminoacyl-tRNAs (Nissen et al., 1995, 1999). In the ternary complex the acceptor arm and the amino acid are held tightly between the domains of EF-Tu whereas the anticodon part of the tRNA stretches far out from the protein.

The conformational changes between the GDP and GTP states are dramatic. The Switch I and II loops undergo large structural changes in relation to losing the γ-phosphate (Kjeldgaard et al., 1993). As a consequence domains II and III move as one block with regard to the G-domain and EF-Tu dissociates from the tRNA and the ribosome.


The binding of the ternary complex to the ribosome has been studied with the aid of cryo-EM (Stark et al., 1997; Valle et al., 2002). Here it is evident that the anticodon of the tRNA interacts with the mRNA on the 30S subunit whereas the G-domain of EF-Tu interacts with specific regions of the 50S subunit. In this state the acceptor end of the tRNA and the aminoacyl moiety are far from the peptidyl transferase site where the amino acid will be incorporated into the nascent peptide. Evidently the GTP hydrolysis and the dissociation of EF-Tu will allow the acceptor end to accommodate into the peptidyl transfer site.

SelB

SelB is a protein that is essential in incorporating the 21st amino acid, selenocysteine, into a number of proteins (Forchhammer et al., 1989). The protein functions as an EF-Tu for aminoacylated SeCys-tRNA (Baron et al., 1993). The codon that is used for incorporation of this rare amino acid is a stop codon, UGA. If this stop codon occurs at a specific distance from a certain stem-loop structure of the mRNA, SeCys may become incorporated. SelB can recognize this mRNA structure and bind to it. When the ribosome reaches the associated UGA codon the likelihood that RF2 will bind is decreased and instead SeCys is incorporated into the nascent peptide due to the mRNA bound ternary complex of SeCys-tRNA, SelB and GTP (Suppmann et al., 1999).

The N-terminal half of SelB corresponds to the three domains of EF-Tu (Kromayer et al., 1996). The structure of the mRNA binding part of SelB was recently determined (Selmer & Su, 2002). It was found to be composed of four closely similar domains arranged in the form of an ‘L’ (Fig. 3). From the location of the conserved residues and functional mutants it was evident that the main interactions with the stem-loop of the mRNA are due to the seventh or C-terminal domain (Kromayer et al., 1999; Li et al., 2000). The ternary complex of SeCys-tRNA, GTP and SelB obviously can interact with two points of the mRNA, one between the codon and the anticodon of SeCys-tRNA and the other between the stem-loop structure of the mRNA and the C-terminal domain of SelB.


Fig. 4. a) The outline of the bacterial ribosome is shown with the tRNA and mRNA binding sites and the binding site for EF-G and the tunnel through which the mRNA is threaded (Yusupov et al., 2001; Yusupova et al., 2001). b) The EF-Tu like parts of SelB and the SeCys-tRNA bind to the ribosome in the manner that EF-Tu and EF-G are known to interact (Stark et al., 1997, 2000; Agrawal et al., 1998, 1999; Valle et al., 2002). When the stop codon UGA is exposed in the A-site the anticodon of the SeCys-tRNA competes with RF2 to interact with it. An interaction of the C-terminal domains of SelB with the stem-loop structure of the mRNA favors the incorporation of SeCys. The stem-loop structure is around 100 Å away from the point of contact between the EF-Tu like part of SelB and the first winged helix domain. Thus the L-shaped C-terminal part of SelB may need to form a more open angle to be able to connect the EF-Tu part with the mRNA (Selmer & Su, 2002). Reproduced from Selmer (2002).

Since the EF-Tu related parts of SelB are held against the 50S subunit in analogy with EF-Tu, and since the anticodon of the tRNA extends far from the factor protein, it is evident that an elongated structure of the C-terminal domains of SelB is needed (Fig. 4) to reach the stem-loop structure of the mRNA (Selmer & Su, 2002). The approximate position of the stem-loop can be estimated from structural examination of the path of the mRNA on the 30S subunit (Yusupova et al., 2001).

EF-G

Elongation factor G (EF-G) catalyzes the translocation of peptidyl-tRNA from the A-site to the P-site. In this process the deacylated tRNA in the P-site is translocated to the E-site and the mRNA is moved to expose a new codon in the decoding part of the A-site. EF-G is a protein with five domains and an inserted subdomain in the G-domain called G' (Ævarsson et al., 1994; Czworkowski et al., 1994). As mentioned above, EF-G mimics the ternary complex of EF-Tu with GDPNP and aminoacyl-tRNA (Nissen et al., 1995).

Fig. 5. The structure of EF-G in complex with GDP (Laurberg et al., 2000). The six domains are indicated. The G' domain, which is an insert in the G-domain, is variable between the bacterial, archaeal and eucaryal versions of the protein (Ævarsson et al., 1994). Domain III is highly flexible and has been difficult to characterize. In some conformations of EF-G it is more stably anchored and possible to outline (Laurberg et al., 2000). The domain has an extended loop that in this conformation gets into contact with domain IV. It could be called the support loop of domain III.

Two main states of EF-G have been studied by crystallography (Table 3), one with GDP and one without nucleotide (Ævarsson et al., 1994; Czworkowski et al., 1994; Al-Karadaghi et al., 1996). Two different conformations with bound GDP have also been observed (Laurberg et al., 2000). A significant difference is observed, probably due to the binding of a magnesium ion at the β-phosphate. The structures of the empty EF-G and of EF-G with GDP but without magnesium are extended and relatively similar, whereas the structure of a mutant EF-G with GDP and magnesium is bent (Fig. 5). The G-domain and domain II form a block against which the tRNA mimicking domains III, IV and V are rotated by about 10°, leading to a displacement of the tip of domain IV by about 10 Å (Laurberg et al., 2000). This bending of EF-G also leads to full visibility of domain III in the electron density maps. It is interesting to note that the homologous domains in EF-Tu, eIF2 and EF-G behave differently. In the conformational changes of the factors, domains G and II in EF-G retain their interaction (Laurberg et al., 2000) whereas in EF-Tu (Kjeldgaard et al., 1993) and eIF2 (Roll-Mecak et al., 2000) domains II and III move in relation to the G-domain.
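The two numbers quoted for the bending of EF-G, a rotation of about 10° and a displacement of about 10 Å at the tip of domain IV, can be related by simple geometry. The sketch below assumes a rigid rotation about a single fixed axis, which is an idealization introduced here and not a statement from the structural work; the function names are hypothetical. For a point at distance r from the axis, the chord traveled is d = 2 r sin(θ/2).

```python
import math


def tip_displacement(radius_angstrom: float, angle_deg: float) -> float:
    """Chord length moved by a point at the given distance from the rotation axis."""
    return 2.0 * radius_angstrom * math.sin(math.radians(angle_deg) / 2.0)


def implied_lever_arm(displacement_angstrom: float, angle_deg: float) -> float:
    """Distance from the axis that gives the stated displacement for the stated angle."""
    return displacement_angstrom / (2.0 * math.sin(math.radians(angle_deg) / 2.0))


# Values quoted in the text: about 10 degrees and about 10 angstrom.
arm = implied_lever_arm(10.0, 10.0)
print(f"implied distance of the tip from the rotation axis: {arm:.1f} angstrom")
```

Under this idealization the tip of domain IV would lie a little under 60 Å from the rotation axis. Since both input values are given only as "about", the result is an order-of-magnitude consistency check and nothing more.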

Three different types of crystals can be obtained of EF-G without nucleotide or with GDP (Laurberg, 2002). The classical form that has been examined several times is the rhombic form of crystals (Chirgadze et al., 1983). In addition plate-like and needle-formed crystals can be obtained. The plate crystals have two molecules per asymmetric unit (Laurberg, 2002). However, the main crystal contact, as in all crystals of EF-G, is the same (Fig. 6). The β-sheet of domain IV connects with the β-sheet of the G'-domain (Laurberg, 2002). Despite the fact that there is no nucleotide or magnesium bound, the EF-G molecule bends to the extent that a close contact is generated between the molecules that make the extended β-sheet. Thus, further bending of EF-G can hardly occur in this crystal form (Laurberg, 2002).

Fig. 6. Part of the crystal packing for two types of crystals of EF-G (Laurberg, 2002). A contact between the edges of β-sheets of domains G' and IV is a constant and dominating interaction. This interaction limits the conformational flexibility of EF-G (Laurberg et al., 2000; Laurberg, 2002). The left figure shows a wild type structure (PDB-code 1KTV) of EF-G in a new crystal form with two molecules per asymmetric unit analyzed at 3.8 Å resolution (Laurberg, 2002). The new conformation and packing are not much different from the ones in crystals of EF-G (H573A, right) with GDP (Laurberg et al., 2000).


Table 3. Crystal forms of EF-G.ᵃ All have the space group P2₁2₁2₁.

Protein    Precipitant   Crystallization conditions, additives (mM)   Ligands observed in the structure   Cell dimensions a × b × c (Å)   PDB code

wt -       MPD           7.8   2 - 3      GDP -    76.3 × 106.4 × 115.6    1DARᵇ
wt -       MPD           7.8   - -                 75.6 × 106.0 × 116.4    1ELOᶜ
wt -       PEG 8000      7.8   1 - -      GDP -    77.3 × 106.2 × 115.4    2EFGᵈ
G16Vᵉ      PEG 8000      7.6   - 5 10     - -      74.9 × 102.0 × 116.0
H573A -    PEG 8000      7.4   0.5 - 3    GDP +    76.7 × 86.0 × 113.4     1FNMᶠ
wt -       PEG 8000      7.6   - -                 87.0 × 103.5 × 176.4    1KTVᵍ

ᵃAdapted from the doctoral thesis of M. Laurberg, 2002. ᵇAl-Karadaghi, 1996. ᶜÆvarsson, 1994. ᵈÆvarsson, 1994. ᵉLaurberg, 2002. ᶠLaurberg, 2000. ᵍLaurberg, 2002.

So far, attempts to obtain a crystal structure of EF-G in the missing GTP conformation have not succeeded, even though the complex readily forms in solution (Kaziro, 1978). Addition of the GTP analogue GDPNP inhibits crystal formation (Laurberg, 2002). A mutant (G16V) with greater affinity for GTP was studied (Laurberg et al., 2003). High concentrations of GDPNP and magnesium were added, as well as alkaline phosphatase to remove any trace of GDP. The resulting crystals showed no trace of nucleotide. Thus, the standard crystal packing does not permit the GTP conformation. This could mean that the extended β-sheet prevents the GTP conformation, which may be an even more bent form of the factor.

EF-G bound to the ribosome in complex with an antibiotic, fusidic acid, has been studied by cryo-EM (Agrawal et al., 1998, 1999; Stark et al., 2000) and by chemical methods (Wilson & Noller, 1998). EF-G binds to the ribosome in much the same way as EF-Tu. When the crystal structure of EF-G with GDP is superimposed on the density originating from the bound EF-G, it is evident that the factor has undergone some conformational change and appears straighter than in the crystal structures. The complex of EF-G with fusidic acid locks EF-G firmly onto the ribosome after translocation has occurred. Here EF-G has hydrolyzed its GTP, but at present it is not fully understood which natural state and conformation this corresponds to. It certainly differs much from a highly bent conformation that might be the one with bound GTP, as discussed above.

Activation of the tGTPases

The tGTPases bind to overlapping binding sites on the ribosome (Heimark et al., 1976). This is best characterized for EF-Tu and EF-G (Stark et al., 1997, 2000; Agrawal et al., 1998, 1999; Valle et al., 2002). In the interaction with the G-domains of the tGTPases, the sarcin-ricin loop of the 23S RNA, around residue 2660 (Moazed et al., 1988; La Teana et al., 2001; Munishkin & Wool, 1997), is identified as important. The interaction is with the Switch I loop of the tGTPases. No detailed structural information is available (Wriggers et al., 2000). Another important region is the thiostrepton binding region of the 23S RNA, around residues 1050-1110 (Gale et al., 1981). Of the ribosomal proteins, L11, L10 and L12 are reported to bind to the same part of the 23S RNA as thiostrepton (Dijk et al., 1979; Beauclerk et al., 1984; Wimberly et al., 1999). Protein L10 and the four copies of L12 form the so-called stalk of the large subunit (Strycharz et al., 1978). L12 is of particular importance since its removal leads to much reduced GTPase activity (Kischa et al., 1971; Fakunding et al., 1973; Wahl & Moller, 2002; Mohr et al., 2002). In addition, there are observations that the isolated protein can activate a low level of GTPase activity in elongation factors (Donner et al., 1977; Savelsbergh et al., 2000). Thus L12 may be the ribosomal GAP. The ribosome contains two strongly coupled dimers of L12 (Osterberg et al., 1976, 1977). The protein is highly flexible, as has been shown by mild proteolysis and NMR. This flexibility of L12 varies depending on whether the ribosomes are free from factors or whether EF-Tu or EF-G is bound to the ribosome with GDPNP or with kirromycin or fusidic acid, respectively (Gudkov et al., 1982; Gudkov & Bubunenko, 1989; Gudkov, 1997). The variation in stalk structure is also observable in cryo-EM studies of ribosomes in various states (Stark et al., 1997; Agrawal et al., 1998, 1999).

The organization of L10 and L12 has been studied for a long time. However, not even the recent crystallographic investigations of large subunits (Ban et al., 2000; Harms et al., 2001) or 70S ribosomes (Yusupov et al., 2001) have given clear answers. The crystal structure of isolated L12 is highly interesting since it provides two models for the dimer organization (Wahl et al., 2000). As discussed by Sanyal & Liljas (2000), the model where a complete molecule interacts with an N-terminal fragment in an antiparallel arrangement may be the correct one, since the other possible dimer is held together through the flexible hinge region. So far no firm insight is available into how L12 interacts with the factors.

CONCLUSIONS

Despite our rapidly advancing understanding of the ribosomal structure and of how translation factors catalyze bacterial protein synthesis, a number of essential structural problems remain. For some translation factors there is as yet no structural information. This is the case for release factor 3 (RF3). However, the primary concern is the shortage of structures of factors bound to the ribosome, whether seen by cryo-EM or X-ray diffraction techniques. This is the case for both the initiating and terminating ribosomes. For systems where we have a fair amount of information there are still states that have not been characterized or where the resolution remains too low. In the case of the ternary complex of EF-Tu with GTP and aminoacyl-tRNA it is of significant interest to establish how the information of a correct decoding between the mRNA and the anticodon of the tRNA on the small subunit is transmitted to the interaction between the G-domain of EF-Tu and the large subunit, where the GTP hydrolysis is induced. Likewise, how does the conformational change of EF-G induced by GTP hydrolysis lead to translocation? Furthermore, there is no conclusive molecular insight into what transforms the tGTPases into active GTP-hydrolyzing enzymes. We thus have an extensive and detailed basis for our understanding of translation. However, much work by a multitude of approaches is still needed, and there is no end of challenging tasks.

ACKNOWLEDGMENTS

We are grateful for valuable collaboration with Prof. AT Gudkov and Dr. CS Sanyal. This work was supported by the Swedish Research Council (VR) and The Swedish Foundation for International Cooperation in Research and Higher Education (STINT).

REFERENCES

1. Agrawal RK, Heagle AB, Penczek P, Grassucci RA, Frank J. EF-G-dependent GTP hydrolysis induces translocation accompanied by large conformational changes in the 70S ribosome. Nature Struct. Biol. 1999; 6:643-647.

2. Agrawal RK, Penczek P, Grassucci RA, Frank J. Visualization of elongation factor G on the Escherichia coli 70S ribosome: The mechanism of translocation. Proc. Natl. Acad. Sci. USA 1998; 95:6134-6138.

3. Al-Karadaghi S, Ævarsson A, Garber M, Zheltonosova J, Liljas A. The structure of elongation factor G in complex with GDP: conformational flexibility and nucleotide exchange. Structure 1996; 4:555-565.

4. Ævarsson A. Structure-based sequence alignment of elongation factors Tu and G with related GTPases involved in translation. J. Mol. Evol. 1995; 41:1096-1104.

5. Ævarsson A, Brazhnikov E, Garber M, Zheltonosova J, Chirgadze Y, Al-Karadaghi S, Svensson LA, Liljas A. Three-dimensional structure of the ribosomal translocase: Elongation factor G from Thermus thermophilus. EMBO J. 1994; 13:3669-3677.


6. Ban N, Nissen P, Hansen J, Moore PB, Steitz TA. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 2000; 289:905-920.

7. Baron C, Heider J, Bock A. Interaction of translation factor SELB with the formate dehydrogenase H selenopolypeptide mRNA. Proc. Natl. Acad. Sci. USA 1993; 90:4181-4185.

8. Beauclerk AAD, Cundliffe E, Dijk J. The binding site for ribosomal protein complex L8 within 23S RNA of Escherichia coli. J. Biol. Chem. 1984; 259:6559-6563.

9. Benson TE, McCroskey MC, Cialdella JI, Choi G, Pearson JD. Structure of S. aureus Elongation Factor P: Another tRNA mimic in protein translation? Oral and Poster Presentation at Structural Aspects of Protein Synthesis 2. Rensselaerville, NY, September 2000.

10. Bourne HR, Sanders DA, McCormick F. The GTPase superfamily: conserved structure and molecular mechanism. Nature 1991; 349:117-127.

11. Carter AP, Clemons WM Jr, Brodersen DE, Morgan-Warren RJ, Hartsch T, Wimberly BT, Ramakrishnan V. Crystal structure of an initiation factor bound to the 30S ribosomal subunit. Science 2001; 291:498-501.

12. Chirgadze Yu N, Nikonov SV, Brazhnikov EV, Garber MB, Reshetnikova LS. Crystallographic study of elongation factor G from Thermus thermophilus HB8. J. Mol. Biol. 1983; 168:449-450.

13. Czworkowski J, Wang J, Steitz TA, Moore PB. The crystal structure of elongation factor G complexed with GDP, at 2.7 angstrom resolution. EMBO J. 1994; 13:3661-3668.

14. Dallas A, Noller HF. Interaction of translation initiation factor 3 with the 30S ribosomal subunit. Mol. Cell 2001; 4:855-864.

15. Dijk J, Garrett RA, Muller R. Studies of the binding of the ribosomal protein complex L7/12-L10 and protein L11 to the 5'-one third of the 23S RNA: a functional center of the 50S subunit. Nucleic Acids Res. 1979; 6:2717-2730.

16. Donner D, Villems R, Liljas A, Kurland CG. Guanosinetriphosphatase activity dependent on elongation factor Tu and ribosomal protein L7/L12. Proc. Natl. Acad. Sci. USA 1978; 75:3192-3195.

17. Fakunding JL, Traut RR, Hershey JW. Dependence of initiation factor IF-2 activity on proteins L7 and L12 from Escherichia coli 50S ribosomes. J. Biol. Chem. 1973; 248:8555-8559.

18. Fedorov R, Meshcheryakov V, Gongadze G, Fomenkova N, Nevskaya N, Selmer M, Laurberg M, Kristensen O, Al-Karadaghi S, Liljas A, Garber M, Nikonov S. Structure of ribosomal protein TL5 complexed with RNA provides new insights into the CTC family of stress proteins. Acta Cryst. D 2001; 57:968-976.

19. Forchhammer K, Leinfelder W, Bock A. Identification of a novel translation factor necessary for the incorporation of selenocysteine into protein. Nature 1989; 342:453-456.

20. Frank J, Agrawal RK. A ratchet-like inter-subunit reorganization of the ribosome during translocation. Nature 2000; 406:318-322.

21. Gale EF, Cundliffe E, Reynolds PE, Richmond MH, Waring MJ. The molecular basis of antibiotic action. John Wiley & Sons, London, 1981.

22. Gavrilova LP, Kostiashkina OE, Koteliansky VE, Ruthkevitch NM, Spirin AS. Factor-free ("non-enzymic") and factor-dependent systems of translation of polyuridylic acid by Escherichia coli ribosomes. J. Mol. Biol. 1976; 101:537-552.

23. Gudkov AT, Gongadze GM, Bushuev VN, Okon MS. Proton nuclear magnetic resonance study of the ribosomal protein L7/L12 in situ. FEBS Lett. 1982; 138:229-232.

24. Gudkov AT, Bubunenko MG. Conformational changes in ribosomes upon interaction with elongation factors. Biochimie 1989; 71:779-785.

25. Gudkov AT. The L7/L12 ribosomal domain of the ribosome: structural and functional studies. FEBS Lett. 1997; 407:253-256.

26. Harms J, Schluenzen F, Zarivach R, Bashan A, Gat S, Agmon I, Bartels H, Franceschi F, Yonath A. High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell 2001; 107:679-688.

27. Heimark RL, Hershey JW, Traut RR. Cross-linking of initiation factor IF2 to proteins L7/L12 in 70S ribosomes of Escherichia coli. J. Biol. Chem. 1976; 251:779-784.

28. Kaziro Y. The role of guanosine 5'-triphosphate in polypeptide chain elongation. Biochim. Biophys. Acta 1978; 505:95-127.

29. Kischa K, Moller W, Stoffler G. Reconstitution of a GTPase activity by a 50S ribosomal protein from E. coli. Nature New Biol. 1971; 233:62-63.

30. Kjeldgaard M, Nyborg J. Refined structure of elongation factor EF-Tu from Escherichia coli. J. Mol. Biol. 1992; 223:721-42.

31. Kjeldgaard M, Nissen P, Thirup S, Nyborg J. The crystal structure of elongation factor EF-Tu from Thermus aquaticus in the GTP conformation. Structure 1993; 1:35-50.


32. Kristensen O, Laurberg M, Liljas A, Selmer M. Is tRNA binding or tRNA mimicry mandatory for translation factors? Curr. Prot. Pept. Sci. 2002; 3:133-141.

33. Kromayer M, Neuhierl B, Friebel A, Bock A. Genetic probing of the interaction between the translation factor SelB and its mRNA binding element in Escherichia coli. Mol. Gen. Genet. 1999; 262:800-806.

34. Kromayer M, Wilting R, Tormay P, Bock A. Domain structure of the prokaryotic selenocysteine-specific elongation factor SelB. J. Mol. Biol. 1996; 262:413-420.

35. La Teana A, Gualerzi CO, Dahlberg AE. Initiation factor IF 2 binds to the alpha-sarcin loop and helix 89 of Escherichia coli 23S ribosomal RNA. RNA 2001; 7:1173-1179.

36. Lancaster L, Kiel M, Kaji A, Noller H. Orientation of ribosome recycling factor in the ribosome from directed hydroxyl radical probing. Cell 2002; 111:129-140.

37. Laurberg M. Dynamics in protein synthesis. Structural studies of translation factors. Doctoral thesis from Lund University. 2002; ISBN 91-628-5088-1.

38. Laurberg M, Kristensen O, Martemyanov K, Gudkov AT, Liljas A. Crystal structure of a fusidic acid hypersensitive mutant of elongation factor G, G16V. Manuscript in preparation, 2003.

39. Laurberg M, Kristensen O, Martemyanov K, Gudkov AT, Nagaev I, Hughes D, Liljas A. Structure of a mutant EF-G reveals domain III and possibly the fusidic acid binding site. J. Mol. Biol. 2000; 303:593-603.

40. Leipe DD, Wolf YI, Koonin EV, Aravind L. Classification and evolution of P-loop GTPases and related ATPases. J. Mol. Biol. 2002; 317:41-72.

41. Li C, Reches M, Engelberg-Kulka H. The bulged nucleotide in the Escherichia coli minimal selenocysteine insertion sequence participates in interaction with SelB: a genetic approach. J. Bacteriol. 2000; 182:6302-6307.

42. McCutcheon JP, Agrawal RK, Philips SM, Grassucci RA, Gerchman SE, Clemons WM Jr, Ramakrishnan V, Frank J. Location of translational initiation factor IF3 on the small ribosomal subunit. Proc. Natl. Acad. Sci. USA 1999; 96:4301-6.

43. Moazed D, Noller H. Interaction of tRNA with 23S RNA in the ribosomal A, P, and E sites. Cell 1989; 57:586-597.

44. Moazed D, Robertson JM, Noller HF. Interaction of elongation factors EF-G and EF-Tu with a conserved loop in 23S RNA. Nature 1988; 334:362-364.


45. Munishkin A, Wool IG. The ribosome-in-pieces: binding of elongation factor EF-G to oligoribonucleotides that mimic the sarcin/ricin and thiostrepton domains of 23S ribosomal RNA. Proc. Natl. Acad. Sci. USA 1997; 94:12280-12284.

46. Nissen P, Kjeldgaard M, Thirup S, Polekhina G, Reshetnikova L, Clark BFC, Nyborg J. Crystal structure of the ternary complex of Phe-tRNA-Phe, elongation factor Tu, and a GTP analogue. Science 1995; 270:1464-1472.

47. Nissen P, Thirup S, Kjeldgaard M, Nyborg J. The crystal structure of Cys-tRNA(Cys)-EF-Tu-GDPNP reveals general and specific features in the ternary complex and in tRNA. Structure Fold. Des. 1999; 7:143-156.

48. Nissen P, Hansen J, Ban N, Moore PB, Steitz TA. The structural basis of ribosome activity in peptide bond synthesis. Science 2000; 289:920-930.

49. Noller HF, Hoffarth V, Zimniak L. Unusual resistance of peptidyl transferase to protein extraction procedures. Science 1992; 256:1416-1419.

50. Ogle JM, Brodersen DE, Clemons WM Jr, Tarry MJ, Carter AP, Ramakrishnan V. Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science 2001; 292:897-902.

51. Ogle JM, Murphy FV, Tarry MJ, Ramakrishnan V. Selection of tRNA by the ribosome requires a transition from an open to a closed form. Cell 2002; 111:721-732.

52. Orengo CA, Thornton JM. Alpha plus beta folds revisited: some favoured motifs. Structure 1993; 1:105-120.

53. Osterberg R, Sjoberg B, Liljas A, Pettersson I. Small angle X-ray scattering and cross-linking study of the proteins from Escherichia coli ribosomes. FEBS Lett. 1976; 66:48-51.

54. Osterberg R, Sjoberg B, Pettersson I, Liljas A, Kurland CG. Small-angle scattering study of the protein complex of L7/L12 and L10 from Escherichia coli ribosomes. FEBS Lett. 1977; 73:22-24.

55. Pioletti M, Schlunzen F, Harms J, Zarivach R, Gluhmann M, Avila H, Bashan A, Bartels H, Auerbach T, Jacobi C, Hartsch T, Yonath A, Franceschi F. Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J. 2001; 20:1829-1839.

56. Pon CL, Paci M, Pawlik RT, Gualerzi CO. Structure-function relationship in Escherichia coli initiation factors. Biochemical and biophysical characterization of the interaction between IF-2 and guanosine nucleotides. J. Biol. Chem. 1985; 260:8918-8924.


57. Rawat UBS, Zavialov AV, Sengupta J, Valle M, Grassucci RA, Linde J, Vestergaard B, Ehrenberg M, Frank J. A cryo-electron microscopic study of ribosome-bound termination factor RF2. Nature 2003; 421:87-90.

58. Rheinberger HJ, Sternbach H, Nierhaus KH. Three tRNA binding sites on Escherichia coli ribosomes. Proc. Natl. Acad. Sci. USA 1981; 78:5310-5314.

59. Roll-Mecak A, Cao C, Dever TE, Burley SK. X-Ray structures of the universal translation initiation factor IF2/eIF5B: conformational changes on GDP and GTP binding. Cell 2000; 103:781-92.

60. Rutkevitch NM, Gavrilova LP. Factor-free and one-factor-promoted poly(U,C)-dependent synthesis of polypeptides in cell-free systems from Escherichia coli. FEBS Lett. 1982; 143:115-118.

61. Sanyal CS, Liljas A. The end of the beginning: structural studies of ribosomal proteins. Curr. Op. Struct. Biol. 2000; 10:633-636.

62. Savelsbergh A, Mohr D, Wilden B, Wintermeyer W, Rodnina MV. Stimulation of the GTPase activity of translation elongation factor G by ribosomal protein L7/12. J. Biol. Chem. 2000; 275:890-894.

63. Schlunzen F, Zarivach R, Harms J, Bashan A, Tocilj A, Albrecht R, Yonath A, Franceschi F. Structural basis for the interaction of antibiotics with the peptidyl transferase centre in eubacteria. Nature 2001; 413:814-821.

64. Schlünzen F, Tocilj A, Zarivach R, Harms J, Gluehmann M, Janell D, Bashan A, Bartels H, Agmon I, Franceschi F, Yonath A. Structure of functionally activated small ribosomal subunit at 3.3 angstroms resolution. Cell 2000; 102:615-623.

65. Selmer M. Protein-RNA interplay in translation. Structural studies of RRF, SelB and L1. Doctoral thesis from Lund University. 2002; ISBN 91-7874-176-9.

66. Selmer M, Al-Karadaghi S, Hirokawa G, Kaji A, Liljas A. Crystal structure of Thermotoga maritima ribosome recycling factor at 2.55 Å resolution: a tRNA mimic. Science 1999; 286:2349-2352.

67. Selmer M, Su X-D. Crystal structure of an mRNA binding fragment of Moorella thermoacetica elongation factor SelB. EMBO J. 2002; 21:4145-4153.

68. Song H, Mugnier P, Das AK, Webb HM, Evans DR, Tuite MF, Hemmings BA, Barford D. The crystal structure of human eukaryotic release factor eRF1--mechanism of stop codon recognition and peptidyl-tRNA hydrolysis. Cell 2000; 100:311-321.


69. Stark H, Rodnina MV, Rinke-Appel J, Brimacombe R, Wintermeyer W, van Heel M. Visualization of elongation factor Tu on the Escherichia coli ribosome. Nature 1997; 389:403-406.

70. Stark H, Rodnina MV, Wieden HJ, van Heel M, Wintermeyer W. Large-scale movement of elongation factor G and extensive conformational change of the ribosome during translocation. Cell 2000; 100:301-309.

71. Strycharz WA, Nomura M, Lake JA. Ribosomal protein L7/L12 localized at a single region of the large subunit by immune microscopy. J. Mol. Biol. 1978; 126:123-140.

72. Suppmann S, Persson BC, Bock A. Dynamics and efficiency in vivo of UGA-directed selenocysteine insertion at the ribosome. EMBO J. 1999; 18:2284-2293.

73. Thanbichler M, Bock A, Goody RS. Kinetics of the interaction of translation factor SelB from Escherichia coli with guanosine nucleotides and selenocysteine insertion sequence RNA. J. Biol. Chem. 2000; 275:20458-20466.

74. Valle M, Sengupta J, Swami NK, Grassucci RA, Burkhardt N, Nierhaus KH, Agrawal RK, Frank J. Cryo-EM reveals an active role for aminoacyl-tRNA in the accommodation process. EMBO J. 2002; 21:3557-3567.

75. Vestergaard B, Van LB, Andersen GR, Nyborg J, Buckingham RH, Kjeldgaard M. Bacterial polypeptide release factor RF2 is structurally distinct from eukaryotic eRF1. Mol. Cell 2001; 8:1375-1382.

76. Wahl MC, Moller W. Structure and function of the acidic ribosomal stalk proteins. Curr. Prot. Pept. Sci. 2002; 3:93-106.

77. Wahl MC, Bourenkov GP, Bartunik HD, Huber R. Flexibility, conformational diversity and two dimerization modes in complexes of ribosomal protein L12. EMBO J. 2000; 19:174-86.

78. Walker JE, Saraste M, Runswick MJ, Gay NJ. Distantly related sequences in the alpha- and beta-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and a common nucleotide binding fold. EMBO J. 1982; 1:945-951.

79. Wilson KS, Noller HF. Mapping the position of translational elongation factor EF-G in the ribosome by directed hydroxyl radical probing. Cell 1998; 92:131-139.

80. Wimberly BT, Guymon R, McCutcheon JP, White SW, Ramakrishnan V. A detailed view of a ribosomal active site: the structure of the L11-RNA complex. Cell 1999; 97:491-502.


81. Wimberly BT, Brodersen DE, Clemons WM Jr, Morgan-Warren RJ, Carter AP, Vonrhein C, Hartsch T, Ramakrishnan V. Structure of the 30S ribosomal subunit. Nature 2000; 407:327-339.

82. Wriggers W, Agrawal RK, Drew DL, McCammon A, Frank J. Domain motions of EF-G bound to the 70S ribosome: insights from a hand-shaking between multi-resolution structures. Biophys. J. 2000; 79:1670-1678.

83. Yonath A. Ribosome crystallography: dynamics, flexibility and peptide bond formation. 2002. This volume.

84. Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JH, Noller HF. Crystal structure of the ribosome at 5.5 Å resolution. Science 2001; 292:883-896.

85. Yusupova GZ, Yusupov MM, Cate JH, Noller HF. The path of messenger RNA through the ribosome. Cell 2001; 106:233-241.

86. Zavialov AV, Buckingham RH, Ehrenberg M. A post termination ribosomal complex is the guanine nucleotide exchange factor for peptide release factor RF3. Cell 2001; 107:115-124.


PART VI

MOTION ENGINES


Chapter 15

DYNAMIC ASPECTS OF THE BACTERIAL FLAGELLUM

Keiichi Namba*

The bacterial flagellum is a dynamic molecular system made of a rotary motor, a universal joint, and a long helical propeller, by means of which bacteria swim. The helical propeller, for example, is made of a single protein flagellin, and yet its curved and twisted tubular structure can switch between left- and right-handed helical forms in response to the twisting force produced by quick reversal of the motor rotation, allowing bacteria to alternate their swimming pattern between run and tumble. Other parts also exert mechanical functions by their dynamic behaviors, and all these structures are constructed by a self-assembly process. Some of these dynamic aspects have been revealed by structural studies.

Keywords: bacterial flagellum, polymorphic supercoiling, mechanical switching, self-assembly.

*Graduate School of Frontier Biosciences, Osaka University & Dynamic NanoMachine Project, ICORP, JST (2002 - 2007), 3-4 Hikaridai, Seika, Kyoto 619-0237 Japan. Email address: [email protected]

INTRODUCTION

The bacterial flagellum, an organelle of locomotion for bacterial cells (Fig. 1), is a large macromolecular assembly consisting of over 20 kinds of proteins, as shown in Fig. 2 (Namba and Vonderviszt, 1997). Most of them are present as a few to several tens of molecules, while flagellin, which makes up the long helical filament, is present in a few tens of thousands of copies. These proteins combine to form a long thin helical propeller in the cell exterior and a rotary motor at its base. The motor, consisting of a rotor and stators, is approximately 30 nm in diameter and is embedded within the cell membrane (Fig. 3). It drives the rotation of the helical filament, which has a diameter of 20 nm and a length of 10 to 15 µm, at a speed of ~20,000 rpm. When bacteria move in a straight line, several filaments are bundled together, and their synchronized rotation provides propulsion (Fig. 1). The energy source for the torque generation is the flow of protons through the stators of the motor (Fig. 2). The motor operates smoothly even though the proton current is quite low, approximating the level of thermal noise. The bushing supports high-speed rotation, and the universal joint, approximately 55 nm long, helps the helical propellers bundle behind the moving cell body. When the motor reverses its rotation suddenly, which occurs every few seconds, the helical propeller, which is normally a left-handed helix, temporarily changes to a right-handed helix. This allows the bundle to fall apart quickly and makes the cell tumble, changing its orientation and hence its swimming direction.

[Fig. 1: ... flagella and its swimming pattern consisting of runs and tumbles.]

The construction of the flagellum starts from its basal body, which spans the cell and outer membranes, and proceeds outward according to a specific structural sequence. The flagellar proteins are selected by the flagellar type III protein export system attached to the cytoplasmic surface of the motor, and transported through a long channel, 2 - 3 nm in diameter, within the flagellum (see Fig. 2) (Macnab, 1999). Upon reaching the tip of the flagellum, these proteins are incorporated into the flagellar structure. Each protein has the ability to recognize and bind to the target template structure already built at the tip of the growing structure, but the protein concentration must be above a critical value for the constitutive assembly process to proceed. That is partly why specific cap complexes are always necessary at the tip: they keep the local protein concentration high enough for efficient assembly and growth.

We have been studying the three-dimensional structures of these macromolecular assemblies and the mechanisms of their functions and self-assembly. Our convergent approach involves x-ray crystallography for individual component proteins, x-ray fiber diffraction for fiber structures, and electron cryomicroscopy for large molecular complexes. This strategy is the only feasible option for dealing with these macromolecular assemblies, some of which contain billions of atoms.

Fig. 2. Schematic diagram of the bacterial flagellum. Each part is labeled with its name, the name of the protein that constitutes the part, and its function. [Legible labels: FliD (HAP2); hook-filament junction; hook (universal joint); type III export apparatus.]


Fig. 3. Computer graphic representation of the flagellar basal body based on three-dimensional image reconstruction by electron cryomicroscopy (Francis et al., 1994). The three layers of membranes are the outer membrane (top), peptidoglycan layer (middle), and cytoplasmic membrane (bottom). The stators are the MotA/B complexes, which are anchored to the solid peptidoglycan layer and form the proton pathway across the cytoplasmic membrane.

POLYMORPHIC SWITCHING MECHANISM

In order to understand the mechanism of polymorphic supercoiling of the flagellar filament and its dynamic switching, we have been studying the structures of two types of straight filaments, called L- and R-type. The 11 protofilaments that form the tube of each straight filament are thought to be all in one of two conformations, whereas in supercoiled filaments the two conformations are mixed in certain number ratios to produce curved and twisted tubes (Asakura, 1970; Calladine, 1975, 1978).

Electron cryomicroscopy allowed the structures of the two straight flagellar filaments to be visualized at around 10 Å resolution (Mimori et al., 1995; Morgan et al., 1995). The two structures showed distinct subunit packing modes with apparently no change in subunit conformation (Fig. 4). X-ray fiber diffraction from well-oriented liquid crystalline sols of the filaments allowed us to determine the repeat distance along the protofilament and its lateral packing lattice parameters in the two conformations (Yamashita et al., 1998). The repeat distance was 52.7 Å in the L-type and 51.9 Å in the R-type, the difference being only 0.8 Å (Fig. 5). This demonstrated that protein molecules can realize a very precise mechanical switch function in spite of their structural flexibility.
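The scale of this switch can be made concrete with a little arithmetic on the numbers quoted in this chapter. The sketch below computes the relative change in repeat distance and, as a consistency check, estimates how many flagellin subunits a filament of 10 to 15 µm contains if each of the 11 protofilaments adds one subunit per repeat. The function and variable names are illustrative, and the estimate ignores the supercoiled path of the protofilaments.

```python
L_REPEAT_A = 52.7        # repeat distance of the L-type protofilament, in angstroms
R_REPEAT_A = 51.9        # repeat distance of the R-type protofilament, in angstroms
N_PROTOFILAMENTS = 11    # protofilaments forming the filament tube

repeat_difference = L_REPEAT_A - R_REPEAT_A        # about 0.8 angstrom
relative_change = repeat_difference / L_REPEAT_A   # about 1.5 percent


def flagellin_copies(filament_length_um, repeat_a=L_REPEAT_A):
    """Rough subunit count: one subunit per repeat in each protofilament."""
    length_a = filament_length_um * 1e4  # 1 micrometre = 10,000 angstroms
    return N_PROTOFILAMENTS * length_a / repeat_a


print(f"difference: {repeat_difference:.1f} A ({relative_change:.1%} of the repeat)")
for length_um in (10.0, 15.0):
    print(f"{length_um:.0f} um filament: ~{flagellin_copies(length_um):,.0f} subunits")
```

The resulting counts fall in the range of a few tens of thousands, in line with the copy number of flagellin stated in the Introduction.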


Fig. 4. Three-dimensional density maps of the L- and R-type straight filaments in solid surface representation (panels labeled L-type and R-type). The density maps were obtained by helical image reconstruction from electron cryomicrographs at around 10 Å resolution. Scale bar, 100 Å.

Fig. 5. Comparison of the repeat distance along the protofilament and subunit packing arrangements in the L- and R-type straight filaments. [Legible labels: 52.7 Å, L-type (SJW1660); 51.9 Å.]


Fig. 6. Crystal structure of the F41 fragment of flagellin (a) and its docking into a density map of the R-type filament (b). The density map was obtained by electron cryomicroscopy and helical image analysis at 20 Å resolution.


Fig. 7. Conformational switch of flagellin observed by simulated extension of the protofilament model. (a) Three-subunit protofilament model. (b) Process of simulated extension, in which the Cα backbones obtained from every 0.5 Å step of the simulated extension are superimposed. The β-hairpin in domain D1 shows a conformational jump at some point. (c) A stereo pair showing magnified images around the β-hairpin at extensions of 4.5 Å and 4.7 Å, superimposed.

We crystallized a core fragment of flagellin made by truncating both of its terminal regions, which are disordered in the monomeric form but form the inner core that stabilizes the filament structure. What we found in the crystal structure of this flagellin fragment at 2.0 Å resolution was not only the three-domain structure composed of D1, D2 and D3, but also the straightened protofilament structure of the R-type with the repeat of 51.9 Å (Fig. 6) (Samatey et al., 2001). By simulated extension of this atomic model of the R-type protofilament structure, we were able to identify the mechanical switch as the β-hairpin in domain D1 (Fig. 7).

SELF-ASSEMBLY MECHANISM

For the growth of the flagellar filament to proceed, a cap complex made of HAP2 (hook associated protein 2, or FliD) has to stay attached at the distal end of the flagellum at all times (Homma et al., 1984; Ikeda et al., 1985; Ikeda et al., 1987). In its absence, no growth occurs. The HAP2 cap is stably attached at the distal end, while allowing exported flagellin molecules to bind to the tip of the filament just underneath it. We used electron cryomicroscopy to visualize the structure of the cap-filament complex and the cap dimer complex formed in solution (Yonekura et al., 2000). The results are shown in Fig. 8. The cap is a pentameric complex of HAP2 and is made of a pentagonal plate and five leg domains. The pentagonal plate is tightly attached to the tip of the filament through the five leg domains, which fill the indentations formed by domain D1 of flagellin at the distal end of the filament; these indentations arise from the axially staggered arrangement of the 11 protofilaments that form the filament tube. However, one of the five indentations, the deepest one with a double step, is left open, obviously ready for a newly exported flagellin subunit to bind. The cap-filament complex is formed over a symmetry mismatch: the cap has 5-fold symmetry, whereas the filament has a helical symmetry with 11 protofilaments, which can be viewed approximately as 5.5-fold symmetry. Therefore, the five leg domains have to be flexible enough to bind to the indentations, each in a slightly different position relative to that of the legs in the 5-fold symmetry positions.

Based on these structures, we proposed a rotary cap mechanism by which the self-assembly of flagellin is efficiently promoted. In this model, every insertion of flagellin forces four leg domains of the cap to change their orientation and the remaining one to step over to prepare the next flagellin binding site (Fig. 9) (Yonekura et al., 2000). As a natural consequence of this function, the cap plate rotates. Flagellin assembly proceeds along the 1-start right-handed helix, so the assembly site advances counterclockwise (as viewed from the distal end) by 65.5° per subunit, whereas the cap rotates by only 6.5° in the clockwise direction because, with its 5-fold symmetry, the cap presents an equivalent structure every 72°. These results again demonstrate how flexibility and precision of the protein structure play essential roles in dynamic function.
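The angles quoted above follow from simple arithmetic on the two symmetries. The following is a minimal sketch in Python, assuming the approximate 5.5-fold description of the filament given in the text; the function and constant names are illustrative and do not come from the original work.

```python
# Angular bookkeeping of the rotary cap model (numbers taken from the text).
# Assumption: the filament is treated as having exactly 5.5 subunits per turn.
SUBUNITS_PER_TURN = 5.5   # 11 protofilaments viewed as ~5.5-fold symmetry
CAP_FOLD = 5              # pentameric HAP2 cap


def subunit_step_deg(subunits_per_turn: float = SUBUNITS_PER_TURN) -> float:
    """Azimuthal advance of the binding site per inserted flagellin subunit."""
    return 360.0 / subunits_per_turn


def cap_step_deg(cap_fold: int = CAP_FOLD) -> float:
    """Angle after which the 5-fold cap presents an equivalent structure."""
    return 360.0 / cap_fold


def cap_rotation_per_subunit_deg() -> float:
    """Rotation of the cap per insertion, opposite to the assembly direction."""
    return cap_step_deg() - subunit_step_deg()


if __name__ == "__main__":
    print(f"site advance per subunit: {subunit_step_deg():.1f} deg")
    print(f"cap rotation per subunit: {cap_rotation_per_subunit_deg():.1f} deg")
    print(f"cap rotation after 11 subunits: {11 * cap_rotation_per_subunit_deg():.1f} deg")
```

Under this idealization, eleven insertions (two turns of the 1-start helix) rotate the cap by one full 72° symmetry step, so the cap returns to an equivalent orientation.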


Fig. 8. Solid surface representation of the cap-filament complex structure obtained by electron cryomicroscopy. (a) End-on view. (b) Side view. (c) Half-cut model of (b). (d) Five different side views showing the five gaps between the cap and filament tip. The directions of the views are indicated by number labels from 1 to 5 in (a) and (d). (e) Density map of the cap dimer formed in solution. Scale bar, 100 Å.

Fig. 9. Rotary cap model for the efficient self-assembly of flagellin at the distal end of the growing flagellum.


We summarized the structural architecture, self-assembly process, and dynamic movement of the flagellum in animations that are included on the CD accompanying the book. The self-assembly process is shown as a series of still pictures in Fig. 10.

Fig. 10. Self-assembly process of the bacterial flagellum

ACKNOWLEDGMENTS

I thank all the members of our group and collaborators for their contributions to the work described here.


REFERENCES

1. Asakura, S. Polymerization of flagellin and polymorphism of flagella. Adv. Biophys. 1970; 1, 99-155.

2. Calladine, CR. Construction of bacterial flagella. Nature 1975; 225, 121-124.

3. Calladine, CR. Change of waveform in bacterial flagella: The role of mechanics at the molecular level. J. Mol. Biol. 1978; 118, 457-479.

4. Francis, NR, Sosinsky, GE, Thomas, D, and DeRosier, DJ. Isolation, characterization and structure of bacterial flagellar motors containing the switch complex. J. Mol. Biol. 1994; 235, 1261-1270.

5. Homma, M, Fujita, H, Yamaguchi, S, and Iino, T. Excretion of unassembled flagellin by Salmonella typhimurium mutants deficient in hook-associated proteins. J. Bacteriol. 1984; 159, 1056-1059.

6. Ikeda, T, Asakura, S, and Kamiya, R. Cap on the tip of Salmonella flagella. J. Mol. Biol. 1985; 184, 735-737.

7. Ikeda, T, Homma, M, Iino, T, Asakura, S, and Kamiya, R. Localization and stoichiometry of hook-associated proteins within Salmonella typhimurium flagella. J. Bacteriol. 1987; 169, 1168-1173.

8. Macnab, RM. The bacterial flagellum: reversible rotary propeller and type III export apparatus. J. Bacteriol. 1999; 181, 7149-7153.

9. Mimori, Y, Yamashita, I, Murata, K, Fujiyoshi, Y, Yonekura, K, Toyoshima, C, and Namba, K. The structure of the R-type straight flagellar filament of Salmonella at 9 Å resolution by electron cryomicroscopy. J. Mol. Biol. 1995; 249, 69-87.

10. Morgan, DG, Owen, C, Melanson, LA, and DeRosier, DJ. Structure of bacterial flagellar filaments at 11 Å resolution: Packing of the α-helices. J. Mol. Biol. 1995; 249, 88-110.

11. Namba, K, and Vonderviszt, F. Molecular architecture of bacterial flagellum. Quart. Rev. Biophys. 1997; 30, 1-65.

12. Samatey, FA, Imada, K, Nagashima, S, Vonderviszt, F, Kumasaka, T, Yamamoto, M, and Namba, K. Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 2001; 410, 331-337.

13. Yamashita, I, Hasegawa, K, Suzuki, H, Vonderviszt, F, Mimori-Kiyosue, Y, and Namba, K. Structure and switching of bacterial flagellar filament studied by X-ray fiber diffraction. Nature Struct. Biol. 1998; 5, 125-132.

14. Yonekura, K, Maki, S, Morgan, DG, DeRosier, DJ, Vonderviszt, F, Imada, K, and Namba, K. The bacterial flagellar cap as the rotary promoter of flagellin self-assembly. Science 2000; 290, 2148-2152.


Chapter 16

MYOSIN POLYMORPHISM AND MUSCLE CONTRACTION*

Kenneth C. Holmes† and Rasmus R. Schroeder‡

The cyclical interaction between myosin and the actin filament is responsible for muscle contraction. The myosin cross-bridge, which is the ATPase, binds to actin and then undergoes a conformational change (the power stroke) that “rows” the actin filament along. Protein crystallography of myosin has yielded high-resolution models of the beginning and end of the power stroke, which is driven by ATP hydrolysis. ATP also controls the cross-bridge affinity for the actin filament, which is low in the presence of ATP and much higher without nucleotide. Recent high-resolution electron microscopy of the actomyosin complex has yielded atomic models of the actin-myosin interaction that show two new myosin conformations. These explain the reciprocal link between actin affinity and ATP affinity. Thus there are four states of the myosin cross-bridge. The function of the myosin cross-bridge is carried out by regulated interactions between these four states.

Keywords: Actin, myosin cross-bridge, ATP, 50K upper domain, 50K lower domain, strong binding, power stroke, relay helix, switch 1, switch 2.

GENERAL

Crystallographic studies have demonstrated that there are two major conformational states of the myosin cross bridge. These states differ in particular in the orientation of the long lever arm that is distal to actin in

*From the Max Planck Institute for Medical Research, 69120 Heidelberg, Germany. †Corresponding author. Email address: [email protected] ‡Email address: [email protected]


the actin-myosin complex. The rotation of this distal lever arm is responsible for the transport of the actin filament past the myosin cross bridges by a cyclical interaction rather like rowing. The cross bridge contains the ATPase. Myosin is a “P-loop protein” with switch 1 and switch 2 elements similar to those of the G-proteins. The ATP is bound by the P-loop between the switch 1 and switch 2 elements. The hydrolysis of ATP drives the cross bridge cycle, and the affinity of the myosin cross bridge for actin is controlled by the binding of ATP: in the presence of ATP the binding is weak, in its absence it is strong. The two binding sites are strongly coupled although they lie some 50 Å apart. The myosin ATPase is limited by slow product release, an inhibition that is relieved by binding to actin.

The major domain of the myosin cross bridge (50K domain) is split into two subdomains (upper and lower) by a cleft that connects the actin binding site with the nucleotide binding site. Evidence now points to this cleft closing on strong binding of the cross bridge to actin (Yengo et al.,

[Figure 1 labels: Upper 50K Domain; Regulatory Light Chain; Converter Domain; Essential Light Chain]

Figure 1. Myosin Cross Bridge (Rayment et al., 1993b). The link to the thick filament is via the C terminus. The ATP binding site is shown. The actin-binding site (left) is split by a “deep cleft”.


2002). Furthermore, recent high-resolution cryo-EFTEM images (Holmes et al., 2003) show that most of the 50K upper domain is involved in this movement. The closing of the cleft doubles the surface area of the actin-myosin interface by bringing the so-called cardiomyopathy loop into contact with actin. This movement appears to be an essential ingredient of strong binding. However, the switch 1 element of the nucleotide-binding site is firmly anchored in the 50K upper domain. As a result of strong binding to actin, therefore, switch 1 moves away from the nucleotide-binding site. Thus strong binding to actin moves switch 1 and opens up the ATP binding site. This movement is quite distinct from the opening of switch 2, which is caused by a 5° rotation of the 50K lower domain with respect to the rest of the motor domain, which in turn leads to a 60° rotation of the lever arm. Probably the movement of both switch 1 and switch 2 is necessary for product release. While the opening/closing of switch 2 has been observed a number of times by X-ray crystallography, the closing of the cleft has not yet been seen in isolated cross bridges. The movement of switch 1 breaks three hydrogen bonds to the polyphosphate moiety and thereby weakens the nucleotide binding. Conversely, the binding of ATP to the active site forces switch 1 to move in and closes the ATP binding site. This in turn leads to the opening of the cleft, which halves the area of the actin binding site and results in a major reduction of the binding affinity to actin. Thus the movement of the 50K upper domain provides a geometrical basis for the reciprocal relationship between actin binding affinity and nucleotide binding affinity.

Description of the Cross-Bridge

The myosin cross bridge, also called motor domain or S1 fragment, consists of a 7-stranded β-sheet surrounded by numerous α-helices (Rayment et al., 1993b; Rayment et al., 1993a) (Fig. 1). Tryptic treatment of S1 yields three fragments that are referred to by their molecular weights as the 25K (N-terminal), 50K (middle section) and 20K (C-terminal) fragments (Balint et al., 1975). These fragments correspond roughly with subdomains of the cross bridge. However, the 50K domain actually consists of two subdomains now referred to as “upper” and


Figure 2. The crystal structure of truncated Dictyostelium myosin 2 showing the rotation of the converter domain (light blue) between the OPEN (left) and CLOSED (right) conformers.

“lower”. Some of the α-helices form the actin-binding domain (50K lower domain). Some protrude to form the 50K upper domain (which is also involved in strong actin binding). There is a deep cleft between the 50K upper domain and the 50K lower domain. The cleft extends from the nucleotide-binding site to the actin-binding site. It has been speculated (Rayment et al., 1993a; Schroder et al., 1993) that this cleft

Figure 3. Models of the post-power stroke (left) and pre-power stroke (right) states based on the crystal structures of truncated Dictyostelium myosin 2 in the OPEN and CLOSED conformations. The actin filament is shown on the right of each diagram.


might be the means of communication between the actin binding site and the nucleotide binding site. In the C-terminal part of the molecule the small compact subdomain called the “converter” serves as a socket for the C-terminal extended α-helix. This binds two calmodulin-like “light chains” and forms a “lever arm” joining onto the thick filament. Thus the myosin cross bridge consists of four subdomains. During cross-bridge activity these subdomains retain their tertiary structure and move approximately as solid bodies.

Crystallography Shows “Open” (Post-Power Stroke) and “Closed” (Pre-Power Stroke) States of the Myosin Cross Bridge

Crystallographic studies, particularly of myosin from Dictyostelium that has been truncated to remove the lever arm (see the review by Geeves & Holmes, 1999), have revealed two conformers of the myosin cross bridge, active site OPEN and active site CLOSED. A hinged movement of the switch 2 element opens and closes the active site around the γ-phosphate. The switch 2 movement is actually an angular movement of about 5° between the whole of the 50K lower domain and the motor domain; the helix 629-647 (Dictyostelium numbers) is the hinge for this movement. This movement is accompanied by a dramatic 60° rotation of the converter domain (Fig. 2).

By building the Dictyostelium structures into the structures of the rigor complex obtained by cryo-electron microscopy of decorated actin (Holmes et al., 2002; Rayment et al., 1993a) and replacing the truncated lever arm, it is possible to model the post- (OPEN) and pre- (CLOSED) power stroke states (Fig. 3). The movement of the switch 2 region between CLOSED and OPEN is about 5 Å. During the cross-bridge cycle (Lymn & Taylor, 1971) the cross bridge binds to actin in the CLOSED state. By a mechanism we discuss at the end of this article, the cross bridge then goes from active site CLOSED to OPEN (see also Geeves & Holmes, 1999). Through a linkage described in the next section, this small movement is translated into a 60° rotation of the converter domain. This in turn leads to a 100 Å translation of the distal end of the lever arm.
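The amplification from a 60° rotation to a 100 Å translation can be checked with elementary geometry. The sketch below is an illustration only: it assumes the lever arm behaves as a rigid rod pivoting about a fixed point at the converter and that the translation is the straight-line (chord) displacement of its distal end. These simplifications, and the function names, are not taken from the chapter.

```python
import math


def chord_length(radius: float, angle_deg: float) -> float:
    """Straight-line displacement of the tip of a rigid arm of given length
    that swings through angle_deg about a fixed pivot."""
    return 2.0 * radius * math.sin(math.radians(angle_deg) / 2.0)


def arm_length_for_chord(chord: float, angle_deg: float) -> float:
    """Effective arm length that yields the given tip displacement."""
    return chord / (2.0 * math.sin(math.radians(angle_deg) / 2.0))


if __name__ == "__main__":
    rotation_deg = 60.0      # converter rotation quoted in the text
    translation_a = 100.0    # tip translation quoted in the text, in angstroms
    arm = arm_length_for_chord(translation_a, rotation_deg)
    print(f"implied effective lever length: {arm:.0f} A")
    print(f"tip displacement for that lever: {chord_length(arm, rotation_deg):.0f} A")
```

For a 60° swing the chord equals the radius, so under these assumptions the two quoted figures together imply an effective lever length of roughly 100 Å.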


The Coupling Between Switch 2 Movement and the Rotation of the Converter Domain

The coupling between the movement of switch 2 (accomplished by a rotation of the 50K lower domain by 5° with respect to the motor domain) and the rotation of the converter domain is provided by the relay helix and relay loop. The relay helix is also referred to as the “switch 2 helix” by analogy to a similar helix in trimeric G-proteins. Figure 4 shows the disposition of the relay helix and relay loop in the post- and pre-power stroke states. This part of the cross-bridge is the quintessence of muscle contraction. The binding of ATP favors the moving in of switch 2 (CLOSED conformation) to form an H-bond between the amide group of Gly-466 and a γ-phosphate oxygen. The moving in of switch 2 (and the 50K lower domain) towards the β-pleated sheet of the motor domain (green) strains the relay helix, with the result that it unwinds one H-bond (i.e. the distal end as seen from actin rotates through 100°). The converter

Figure 4. The relay helix (center) and relay loop (blue) in the pre-power stroke and post-power stroke states. In the pre-power stroke state the relay helix is kinked. The kink (2 turns to the left of the start of the blue coloring) leads to a 100° rotation of the outer end of the relay loop compared with the post-power stroke state. The kink is resolved during the transition to the post-power stroke state. The removal of the kink in turn rotates the converter domain and the attached lever arm (left) by 60°. The actin binding domain (lower 50K domain, shown in part, white) and the actin helix are to the right.


domain is anchored in the outer end of the relay helix by hydrophobic interactions. The resulting kink leads to a 60° rotation of the converter domain. After ATP hydrolysis and the rebinding to actin, switch 2 moves out again, allowing the relay helix to straighten and thereby rotating the converter domain. A second stabilizing connection between the converter domain and the motor domain is provided by the “SH1 helix”, which is bonded to the distal end of the relay helix by hydrophobic interactions. During the rotation of the converter domain the SH1 helix and the outer end of the relay helix rotate round each other (Windshugel et al., 2003). The integrity of this interaction appears to be important for the mechanical stability of the lever arm.

WEAK AND STRONG BINDING

The myosin cross bridge binds to the actin filament in two distinct ways, weak and strong (or A and R (Geeves & Conibear, 1995)), that cannot readily be explained by differences between the myosin conformers OPEN and CLOSED. Since the discovery of the deep cleft it has seemed reasonable to postulate that it should shut on strong binding to actin (Rayment et al., 1993a; Schroder et al., 1993). Crystal structures of myosin (all without actin) have never shown the deep cleft closed. However, recent findings from spectroscopy (Yengo et al., 2002) and electron microscopy (Holmes et al., 2003) show that the 50K upper domain moves so as to close the cleft on strong binding to actin.

Strong Binding Moves Switch 1 and Opens the Nucleotide Binding Pocket

The 50K upper domain is bounded by the disordered trypsin-sensitive proteolytic loops 1 and 2. It is joined onto the motor domain via hinge points (270 and 246) at the bottom of the central 7-stranded β-sheet and some flexible residues near 449 (all numbers in this section refer to the chicken skeletal sequence) that are disordered in smooth muscle (Dominguez et al., 1998). Furthermore there is a connection via the “strut” sequence (Asp601-Pro602) that has been mutated by Sutoh (Sasaki et al., 2000) to produce constitutively weak actin binding. Energy filtered


[Figure 5 labels: Open cleft (weak binding); Actin; 50K upper domain; Closed cleft (strong binding)]

Figure 5. A view looking along the actin filament. The open actin binding cleft found in crystalline structures of the cross bridge closes on strong binding to actin (rigor complex). The 50K upper domain rotates about 10° towards the actin filament (model based on chicken skeletal muscle coordinates and cryo-electron microscopic studies given in Holmes et al. (2003)).

cryo-electron microscopy of decorated actin at 14 Å resolution (Holmes et al., 2003) shows that the deep cleft in the myosin cross bridge closes on strong binding to actin. This is accomplished by a substantial swinging movement of the 50K upper domain. The switch 1 sequence, which forms part of the nucleotide binding site, is an integral part of a small 4-stranded β-sheet that is embedded in the 50K upper domain.


Switch 1 moves with the 50K upper domain. When the 50K upper domain swings round on strong binding, switch 1 is pulled away so as to open the nucleotide-binding pocket, thereby breaking 3 H-bonds between switch 1 and the γ-phosphate (Fig. 6).

Figure 6. A view at 90° to Fig. 5 (actin filament on the left) showing the effect of strong binding. The 50K upper domain swings towards the actin filament and moves out switch 1, which is part of a small 4-stranded β-sheet. This breaks hydrogen bonds made to the β- and γ-phosphates and turns the nucleotide-binding tunnel into a nucleotide-binding groove.


The Reciprocal Relationship Between ATP Binding and Actin Binding

Because loop 1 is particularly involved in binding to the γ-phosphate, the binding of ATP (rather than ADP) is more likely to pull in loop 1, which in turn opens the deep cleft and reduces the area of the cross-bridge in contact with actin by 50%. Conversely, the strong binding of actin opens the nucleotide-binding site and reduces the coordination of the β- and γ-phosphates. Thus the seesaw-like movement of the 50K upper domain provides a structural basis for the reciprocal relationship between actin affinity and ATP affinity.

THE STRUCTURAL BASIS OF THE CROSS BRIDGE CYCLE

The structural investigations summarized above have shown that the myosin cross-bridge can take up four states: switch-2-open (OPEN) and switch-2-closed (CLOSED); and switch-1-open (STRONG-actin-binding) and switch-1-closed (WEAK-actin-binding). Many aspects of the Lymn-Taylor cross-bridge cycle (Lymn & Taylor, 1971) can be explained by these structural states. The power stroke is excellently correlated with the rotation of the converter domain that accompanies the movement of switch-2 (the transition CLOSED to OPEN). Thus the moving out of switch-2 (CLOSED to OPEN) drives the power stroke. Apparently the moving out of switch 1 (WEAK to STRONG) controls the actin affinity. At the beginning of the cycle both switch 1 and switch 2 are “in” and close to the phosphates. This state is thought to be the ATPase. On binding to actin switch 1 moves out. Switch 2 moves out as the power stroke progresses. At the bottom of the power stroke both switch 1 and switch 2 are “out”, which seems to be necessary for release of ADP (the inorganic phosphate probably gets out earlier). It is possible that the switch 1 affinity for ATP could be higher than for ADP + inorganic phosphate. This would then lead to a higher affinity of the cross-bridge for actin when it carries ADP and phosphate rather than ATP and might explain the higher probability of rebinding of the cross-bridge to actin after ATP hydrolysis.
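The four states can be summarized as the combinations of two binary switches. The following Python sketch is only a bookkeeping aid: the class and attribute names are hypothetical, and the ordering of the two return steps (switch 1 moving in on ATP binding before switch 2 closes again) is an assumption made here for illustration rather than a statement from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossBridgeState:
    switch1_in: bool  # True: switch 1 closed onto the nucleotide
    switch2_in: bool  # True: switch 2 closed onto the gamma-phosphate

    @property
    def actin_binding(self) -> str:
        # Switch 1 "in" corresponds to WEAK binding, "out" to STRONG binding.
        return "WEAK" if self.switch1_in else "STRONG"

    @property
    def active_site(self) -> str:
        # Switch 2 "in" is the CLOSED conformer, "out" the OPEN conformer.
        return "CLOSED" if self.switch2_in else "OPEN"

    @property
    def lever_arm(self) -> str:
        return "pre-power stroke" if self.switch2_in else "post-power stroke"


# One pass around the cycle; the last two steps are an assumed ordering.
CYCLE = [
    CrossBridgeState(True, True),    # both switches in: ATPase state
    CrossBridgeState(False, True),   # actin binding moves switch 1 out
    CrossBridgeState(False, False),  # switch 2 moves out: power stroke done
    CrossBridgeState(True, False),   # ATP binding pulls switch 1 in (assumed)
]


def changed_switches(a: CrossBridgeState, b: CrossBridgeState) -> int:
    """Number of switches that differ between two states."""
    return int(a.switch1_in != b.switch1_in) + int(a.switch2_in != b.switch2_in)


if __name__ == "__main__":
    for state in CYCLE:
        print(state.actin_binding, state.active_site, state.lever_arm)
```

In this representation each step of the cycle flips exactly one switch, which mirrors the description of switch 1 controlling actin affinity and switch 2 controlling the lever arm.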


A Model of Strongly Bound Pre-Power Stroke State

The pre-power stroke bound state is experimentally unavailable (at least in non-processive myosins); it is therefore instructive to model this state from the available structural data. The following studies have been carried out with Dictyostelium data since data on the pre-power stroke state are not available for chicken skeletal myosin. Fortunately, the sequences are very similar, so that a superposition of the two myosins can be carried out unambiguously. Pairs of structures (same construct, same ligands) are available for Dictyostelium myosin in both the pre- and post-power stroke states (Smith & Rayment, 1996a; Fisher et al., 1995; Kull et al., 2004). If we assume that there is a constant geometry for the actin-myosin interaction, then we may model the pre-power stroke state by using the 50K lower domain position found in the post-power stroke state when strongly bound to actin (rigor complex) (Holmes et al., 2003) to determine the orientation of the motor domain in the pre-power stroke state. Further, we assume that the 50K upper domain takes the same position in the pre- and post-power stroke states, since this is apparently determined by the strong interaction with actin. The position of these two domains in the strongly bound geometry is shown in Fig. 7a. Using the crystallographically determined structures of Dictyostelium myosin 2 in the pre-power stroke state we arrive at the model of the start of the power stroke shown in Fig. 7b. No major clashes are generated in this model.

Triggering the Power Stroke

Mutational analysis has shown the importance of an invariant buried salt bridge that links switch 2 and switch 1. Dictyostelium myosin motor domains with mutations E459R or R238E that block salt-bridge formation show defects in nucleotide binding and reduced rates of ATP hydrolysis. Inversion of the salt bridge in the double mutant M765-IS eliminates most of the defects observed for the single mutants. The salt bridge exists in the closed form not bound to actin (pre-power stroke state in crystals) but is broken in the open form (post-power stroke state). One of the consequences of the model described in 3.1 is to predict that the swinging in of the 50K upper domain to effect strong actin binding in the pre-power stroke state would break this salt bridge. This in turn suggests


Figure 7. Structure a shows the positions of the 50K upper and lower domains for Dictyostelium myosin, arrived at by superimposing the crystallographic coordinates of Dictyostelium myosin 2 on the coordinates of the chicken myosin in the actin-myosin complex described above. Structure b shows the motor domain of Dictyostelium myosin oriented from the 50K lower domain. Since this was a truncated myosin motor (lacking the lever arm and light chain binding regions), the missing lever arm has been inserted from the position of the converter domain using the coordinates of the chicken myosin lever arm.

how the strong binding to actin could initiate the power stroke by breaking the salt bridge: switch 2 would no longer be tethered to switch 1 and could start to move out as the lever arm moves down.

ACKNOWLEDGEMENTS

The following computer programs have been used in constructing the figures: GRASP (Nicholls et al., 1991); Bobscript (Esnouf, 1997); Molscript (Kraulis, 1991); and Raster3D (Merritt & Bacon, 1997).

REFERENCES

1. Balint, M., Sreter, F. A., Wolf, I., Nagy, B. & Gergely, J. The substructure of heavy meromyosin. The effect of Ca2+ and Mg2+ on the tryptic fragmentation of heavy meromyosin. Journal of Biological Chemistry. 1975; 250: 6168-77.

2. Dominguez, R., Freyzon, Y., Trybus, K. M. & Cohen, C. Crystal structure of a vertebrate smooth muscle myosin motor domain and its complex with the essential light chain: visualization of the pre-power stroke state. Cell. 1998; 94: 559-71.

3. Esnouf, R. M. An extensively modified version of MolScript that includes greatly enhanced coloring capabilities. J Mol Graph Model. 1997; 15: 132-4, 112-3.

4. Fisher, A. J., Smith, C. A., Thoden, J., Smith, R., Sutoh, K., Holden, H. M. & Rayment, I. Structural studies of myosin:nucleotide complexes: a revised model for the molecular basis of muscle contraction. Biophysical Journal. 1995; 68: 27s-28s.

5. Geeves, M. A. & Conibear, P. B. The role of three-state docking of myosin S1 with actin in force generation. Biophysical Journal. 1995; 68: 194s-201s.

6. Geeves, M. A. & Holmes, K. C. Structural mechanism of muscle contraction. Ann. Rev. Biochemistry. 1999; 68: 687-727.

7. Holmes, K. C., Kull, F. J., Jahn, W., Angert, I. & Schroeder, R. R. High resolution cryo-electron microscopy of decorated actin shows that the actin binding cleft is closed in the rigor state and that this opens switch 1. In press 2003.

8. Lymn, R. W. & Taylor, E. W. Mechanism of adenosine triphosphate hydrolysis by actomyosin. Biochemistry. 1971; 10: 4617-24.

9. Kraulis, P. J. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. Journal of Applied Crystallography. 1991; 24: 946-50.

10. Kull, J., Schlichting, I., Becker, A., Manstein, D. & Holmes, K. C. The structure of Dictyostelium myosin truncated at position 754 with bound ADP.BeF3. (In preparation). 2004.

11. Merritt, E. & Bacon, D. Raster 3D: Photorealistic Molecular Graphics. Methods in Enzymology. 1997; 277: 505-524.

12. Nicholls, A., Sharp, K. A. & Honig, B. GRASP: Graphical representation and analysis of structural properties. Proteins. 1991; 11: 281-296.

13. Rayment, I., Rypniewski, W. R., Schmidt-Base, K., Smith, R., Tomchick, D. R., Benning, M. M., Winkelmann, D. A., Wesenberg, G. & Holden, H. M. Three-dimensional structure of myosin subfragment-1: a molecular motor. Science. 1993b; 261: 50-8.

14. Rayment, I., Holden, H. M., Whittaker, M., Yohn, C. B., Lorenz, M., Holmes, K. C. & Milligan, R. A. Structure of the actin-myosin complex and its implications for muscle contraction. Science. 1993a; 261: 58-65.

15. Sasaki, N., Ohkura, R. & Sutoh, K. Insertion or deletion of a single residue in the strut sequence of Dictyostelium myosin II abolishes strong binding to actin. Journal of Biological Chemistry. 2000; 275: 38705-38709.

16. Schroder, R. R., Manstein, D. J., Jahn, W., Holden, H., Rayment, I., Holmes, K. C. & Spudich, J. A. Three-dimensional atomic model of F-actin decorated with Dictyostelium myosin S1. Nature. 1993; 364: 171-4.

17. Smith, C. A. & Rayment, I. X-ray structure of the magnesium(II)·ADP·vanadate complex of the Dictyostelium discoideum myosin motor domain to 1.9 Å resolution. Biochemistry. 1996a; 35: 5404-5417.

18. Windshugel, B., Holmes, K. C., Smith, J. C. & Fischer, S. The return stroke of myosin, computed by conjugate peak refinement. (In preparation). 2002.

19. Yengo, C. M., De la Cruz, E. M., Chrin, L. R., Gaffney, D. P. & Berger, C. L. Actin-induced closure of the actin-binding cleft of smooth muscle myosin. Journal of Biological Chemistry. 2002; 277: 24114-24119.


PART VII AROUND THE BENCH PROTEOMICS


Chapter 17

IS CRYSTALLIZATION A “BOTTLENECK” OF MODERN STRUCTURAL CRYSTALLOMIC?*

Jan Sedzik†

Crystallization is recognized among structural biologists as a necessary step before a three-dimensional structure can be solved at the atomic level. Crystallization carries a dose of mysticism among protein chemists. Some treat it as an “art” and others as “black magic”. These notions arose from limited knowledge of the physical chemistry of proteins in solution. Crystallization occurs only in a metastable state. To define crystallization conditions, experiments are guided either by a chance search or by a dedicated factorial design. Here we briefly describe a factorial design method for approaching the metastable state rationally. In summary, there is nothing mysterious in the crystallization of biological macromolecules, and success can often be achieved within a limited number of experiments.

Keywords: crystallization, membrane proteins, virus, structural biology

INTRODUCTION

Crystallization has been known for decades as a method of purification, or separation, of substances that are solubilized in a liquid³¹,⁷³. Crystallization itself is a complex physicochemical process in which dissolved components form a solid phase, in a test tube or inside living organisms under normal or pathological conditions. The phenomenon of crystallization

*Margolin and Navia⁶⁶ introduced the similar term “crystalomics” for techniques of crystal preparation in practical industrial processes. We now opt for a general term “crystallomic”, to denote the know-how of finding conditions of nucleation and maintaining a steady crystal growth of biological macromolecules in solution. †Department of Biosciences, Karolinska Institutet, Sweden. Email address: [email protected].

Page 375: Conformational Proteomics of Macromolecular Architecture

362 Jan Sedzik

Water, 3 A, I 8 Da

Figure 1 . Relative dimensions in diameter (A) and molecular mass (Da) of water, globular rnyelin P2 protein, and the spherical Semliki Forest virus, SFW. Crystals of myelin P2 protein are about 1 mm long, while those of SFV are about 0.1 mm. Ice crystals develope in different shapes and sizes. Those shown are from Walter Tape, Atmospheric Halos, American Geophysical Union, Washington DC, 1994. (Find more at http://www.its.caltech.edu/-atomic/snowcrystals/)

follows, virtually, the same law6' everywhere on Earth or in far away places of the solar system". The process of crystal formation and growth appears in an environment rich in different chemical or biological com- ponents including small molecular weight organic23 or non-organic sub- s t a n c e ~ ~ ~ , amino acids and pep tide^'^, proteins", DNA73, RNA33, com- plexes proteins-DNAI6, ribosomes75 ribozymes13, whole viruses2j, virus caps id^^^, virus-antibodies complexes2*, protein-antibody complexes35, lipid assemblies54363, prote~glycans~~, polysa~charides~~ and even in some cases membrane^^^. The range of dimensions and molecular weights of such molecules are exemplified in the Fig. 1. Three-dimensional crystals of proteins or viruses provide the possibility to reveal atomic details.

METHODS UTILIZING CRYSTALS OF BIOLOGICAL MOLECULES

Crystals are usually beautiful and visually pleasing objects in which molecules are arranged in a periodic repeating pattern that extends in two or three dimensions. At present about 12,000 water-soluble proteins are available as three-dimensional crystals. So far 34 three-dimensional crystals of membrane proteins have been produced, but only 2 two-dimensional ones suitable for structure determination by electron microscopy [56]. The success rate for viruses is higher; in December 2001, there were 52 crystals of viruses deposited in the Biological Macromolecule Crystallization Database, BMCD, and the NASA Archive for Protein Crystal Growth Data [27]. The human genome contains approximately 26,000 protein-encoding genes [5]. More than a quarter of them encode membrane proteins [7]. As of today, there are 118 human (Homo sapiens) proteins in 179 crystal forms suitable for structural determination. Indeed, this is a very small fraction of the total number of proteins of the human body. However, on May 15th 2001, there were 12,514 deposited structures that had been determined by X-ray crystallography, and among them almost 7,000 structures of proteins [29]. It is obvious that all these molecules at some point had been crystallized.

Crystals of proteins have so far been exploited mainly in the effort to determine their atomic structure. Crystals of biological molecules, either two- or three-dimensional, can be utilized by the major structural techniques: X-ray crystallography, neutron crystallography [38,77], and, for 2-D crystals, electron diffraction and electron microscopy [3]. When the macromolecules are small, <10⁶ Da, X-ray crystallography is the most suitable technique for determining the atomic arrangement within the molecule. It uncovers the distribution of electron density (e/Å³) within the studied molecules, which is interpreted as the distribution of mass in the tertiary model of the molecule. To be fully successful this technique requires the primary amino acid sequence of the studied protein and the practical know-how to reproducibly grow large crystals characterized by low mosaicity and high diffracting power [47,49]. In many cases cryoEM can be a complementary technique to X-ray crystallography [14,15].

Vapor Diffusion “HANGING DROP” - The Most Popular Crystallization Method

Among the crystallization protocols described in the BMCD, the hanging drop method (Fig. 2) was utilized in a total of 1391 cases, followed by vapor diffusion on plates or slides in 541 cases, and by the batch method in 427 cases. The exceptional popularity of the technique could be because it requires a relatively small volume of sample. The principle is very simple. One volume (2-10 µl) of protein/virus, at a concentration around 10 mg/ml in a non-buffered solution, is mixed on a cover slip with an equal volume of the reservoir solution, and the slip is placed over the reservoir. An airtight environment is maintained while the drop equilibrates with the reservoir solution. Since the concentration of the precipitants in the drop is lower than in the reservoir, water evaporates from the drop and concentrates all its components. If there is a sufficient concentration of protein, or precipitant, the protein may reach saturation, followed by supersaturation, and precipitation or nucleation. At some point stable nuclei may form, if the process is not too rapid, and growth of crystals can follow. For successful crystallization it is most desirable to stop the equilibration process at the metastable state. In hanging drop experiments, which usually require a monitoring period of 1-3 months before crystals appear, several conditions are set up simultaneously for testing.

Figure 2. Schematic drawing of the hanging drop method, the most popular method for crystallization of proteins and viruses. The “hanging drop” contains the component to be crystallized. Vapor diffusion drives the system towards equilibrium with the reservoir. During this process the system may pass into a metastable state where nucleation can appear and crystals start to grow. (Labels in the drawing: 1 vol reservoir, 1 vol virus; water; reservoir.)
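The bookkeeping behind the hanging drop can be sketched in a few lines of Python. This is only an idealized illustration, under assumptions the text does not spell out: the protein stock contains no precipitant, the reservoir is large enough that its concentration stays constant, only water is exchanged through the vapor phase, and equilibration runs to completion. The function and key names are illustrative.

```python
def hanging_drop(protein_stock_mg_ml, reservoir_precipitant, reservoir_volumes=1.0):
    """Idealized hanging-drop bookkeeping (see the assumptions in the lead-in).

    protein_stock_mg_ml   -- protein concentration before mixing (mg/ml)
    reservoir_precipitant -- precipitant concentration in the reservoir (any unit)
    reservoir_volumes     -- volumes of reservoir solution mixed with one volume of protein
    """
    total = 1.0 + reservoir_volumes
    # Mixing dilutes both the protein and the precipitant.
    protein_start = protein_stock_mg_ml / total
    precipitant_start = reservoir_precipitant * reservoir_volumes / total
    # Water leaves the drop until its precipitant concentration matches the
    # reservoir, so the drop shrinks to this fraction of its initial volume.
    volume_fraction = precipitant_start / reservoir_precipitant
    return {
        "protein_start": protein_start,
        "precipitant_start": precipitant_start,
        "volume_fraction": volume_fraction,
        "protein_end": protein_start / volume_fraction,
        "precipitant_end": reservoir_precipitant,
    }


# Example: 10 mg/ml protein mixed 1:1 with a reservoir at 2.0 (arbitrary units).
drop = hanging_drop(protein_stock_mg_ml=10.0, reservoir_precipitant=2.0)
print(drop)
```

With equal volumes the drop starts at half the stock protein concentration and, in this idealized limit, shrinks to half its initial volume. In a real experiment the aim is to pass through the metastable zone on the way, not necessarily to reach the end point.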

CRYSTALLOMIC’S PARADOX - A KEY TO STRUCTURAL PROTEOMICS

X-ray crystallography is the most appropriate method for determining the atomic structure of macromolecules. Good quality crystals may appear in the first experiment [66], but could also require several years of experimentation before success [79]. The theoretical basis for the appearance of a nucleus, and its further growth into a crystal, is not known in detail. Therefore, the simplest approach is to systematically search combinations of possible variables and find, just by chance, conditions that will yield large, single, good quality crystals for structural determination. A more sophisticated strategy is to adopt rational approaches. Among them are statistical methods utilizing regression analysis [64] as a guide to crystallization trials [65]. The objective design of screening procedures [30] includes rational, incomplete, or efficient factorial designs and protocols [8,9,10,12], response surface methods for optimizing and improving the reproducibility of crystal growth [11], or even modeling of crystallization [51].

Mining the Biological Macromolecules Crystallization Database

After reviewing the data deposited in the BMCD [36], it is easy to postulate variable factors that should be included in potential crystallization experiments. A selection of salts at different concentrations should be tested [49]. Other precipitants would be polyethylene glycols or low molecular weight organic solutes. It is easy to identify at least 12 such variables. The chemical additives can be searched extensively; at least 60 such components are commonly used, with at least 5 different concentrations. Temperature may be set at three levels (4°C, 20°C and 35°C) and the pH of the buffer at 10 levels. The protein concentration has to be considered as a separate variable, with the total number of levels of this factor set at 10.

With membrane proteins there is an additional complication in the choice of detergent: the heterogeneous natural lipids of the studied membrane protein may have to be replaced by a homogeneous detergent [26]. There are several detergents to test, adding, say, 24 levels as a minimum. There are also other variables that may be taken into consideration, like the method of crystallization: hanging drop, sitting drop, dialysis, etc. In the BMCD there are 17 different methods described [27] that were applied in more than 10 cases. Another problem is how the proteins are purified (particularly important with membrane proteins): lipids may still be bound to the purified proteins, or be exchanged for another detergent [67,68,69]. In summary, by reviewing the catalogue of Hampton Research (Laguna Niguel, CA, USA), the total number of experiments required to obtain crystals of a macromolecule that has never been attempted before can be calculated. Assuming a selection of 30 precipitants at 10 concentrations, 10 different pH levels, 60 additives at 5 concentrations, 3 temperature levels, 10 initial concentrations of the macromolecule to be crystallized, 24 detergents, and 17 crystallization and purification methods, the total number of individual experiments would be 11,016,000,000! Performing 1,000 trials daily, that would require about 30,000 years. Indeed, an unmanageable and very costly approach.
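The size of the full factorial search can be checked with a short Python calculation. The factor counts are those listed in the text; the dictionary keys and the assumption of 365 working days per year are ours.

```python
from math import prod

# Number of levels per factor, as listed in the text.
levels = {
    "precipitant x concentration": 30 * 10,
    "pH": 10,
    "additive x concentration": 60 * 5,
    "temperature": 3,
    "macromolecule concentration": 10,
    "detergent": 24,
    "crystallization/purification method": 17,
}

# A full factorial design tests every combination of levels once.
total_experiments = prod(levels.values())

trials_per_day = 1_000
years = total_experiments / trials_per_day / 365  # assumes trials every day of the year

print(f"{total_experiments:,} experiments, roughly {years:,.0f} years")
```

The product grows multiplicatively with every added factor, which is why the rational designs discussed in this chapter sample only a small, balanced subset of the combinations.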

General Theories on Crystallization

Crystallization of biologically important molecules is usually performed from solution; this approach is nowadays standard practice. The limiting factor, named by some researchers a “bottleneck”, is to find the physicochemical conditions of the solute [49], or even to modify the protein [42], to such an extent that dispersed molecules of protein (or virus) will start to aggregate (nucleate) and follow a steady growth into large single crystals. Some proteins crystallize easily and quickly (for example lysozyme), while for others success in crystallization was achieved only after 20 years of trials.

A comprehensive study of the phenomenon of crystallization indicates that crystallization appears only in a so-called metastable state [59]. For simplicity, when proteins/viruses (called solutes) are added to the solvent they may dissolve and stay in solution in the form of monomers, or as a mixture of monomers and aggregates. The maximal amount of solute that can be dissolved in a solution determines the maximum solubility. Maximum solubility depends on the physicochemical characteristics of the solvent, temperature and pressure. It is well established that at protein concentrations below maximum solubility, crystals will never appear.

When the concentration increases, molecules come closer. One might ask to what extent the monomer-to-monomer distance affects crystallization. Let us consider two cases: (a) P2 protein of myelin, with a diameter of 35 Å, and (b) Semliki Forest virus, SFV, a spherical particle 680 Å in diameter. These molecules occupy volumes of about 2 × 10⁴ Å³ and 2 × 10⁸ Å³, respectively. If concentrated (as monomers) from 1 to 100 mg/ml, the average center-to-center distance between them decreases from about 300 Å to 65 Å in the case of P2, and from 4450 Å to about 1000 Å for SFV. That is, the average distance between the studied molecules varies with the cube root of the available volume, i.e. on 100-fold concentration the distance decreases only about 5-fold. During the equilibration of the crystallization drop with the reservoir, the volume may decrease to half and hence the concentration of protein in the drop doubles. During this process the average center-to-center distance between protein molecules decreases only to 0.8 of its initial value, i.e. to a negligible extent [24].
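The cube-root dependence can be illustrated with a small Python sketch that gives each molecule an equal share of the solution volume and takes the cube root of that share as the mean spacing. The molar masses used here (about 15 kDa for P2 and about 50 MDa for SFV) are assumed round values and are not taken from the text, so the printed distances only approximate the figures quoted above.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
ANGSTROM3_PER_CM3 = 1e24      # 1 cm = 1e8 Å, so 1 cm³ = 1e24 Å³


def mean_spacing_angstrom(conc_mg_ml, molar_mass_da):
    """Average center-to-center distance (Å) for evenly dispersed monomers."""
    # 1 mg/ml equals 1e-3 g/cm³; dividing by the molar mass gives mol/cm³.
    molecules_per_cm3 = (conc_mg_ml * 1e-3 / molar_mass_da) * AVOGADRO
    volume_per_molecule = ANGSTROM3_PER_CM3 / molecules_per_cm3
    return volume_per_molecule ** (1.0 / 3.0)


# Assumed molar masses (illustrative, not from the text).
for name, mass in [("P2", 1.5e4), ("SFV", 5.0e7)]:
    d1 = mean_spacing_angstrom(1.0, mass)
    d100 = mean_spacing_angstrom(100.0, mass)
    print(f"{name}: {d1:.0f} Å at 1 mg/ml, {d100:.0f} Å at 100 mg/ml")

# Doubling the concentration (drop volume halved) changes the spacing by 2^(-1/3).
ratio = mean_spacing_angstrom(2.0, 1.5e4) / mean_spacing_angstrom(1.0, 1.5e4)
print(f"spacing ratio on doubling the concentration: {ratio:.2f}")
```

Whatever molar mass is assumed, the ratios depend only on the concentration change: a 100-fold concentration shortens the spacing by a factor of 100^(1/3), and a 2-fold concentration by a factor of 2^(1/3).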

The phenomenon of crystallization can be described from the perspective of osmotic pressure, in analogy with the relation between solutions and gases [46]. For gases the common law states:

$$p \cdot V = n \cdot R \cdot T \qquad (1)$$

where $p$ denotes pressure, $V$ volume, $n$ the number of molecules, $R$ the gas constant, and $T$ the absolute temperature.

This equation may be applied to the solution of a hanging drop, and the partial osmotic pressure of the dissolved molecules considered:

$$p_{prot} = C_{prot} \cdot R \cdot T \quad \text{(for protein)} \qquad (2)$$

$$p_{prec} = C_{prec} \cdot R \cdot T \quad \text{(for precipitant)} \qquad (3)$$

The protein reaches saturation at a concentration $\hat{C}_{prot}$ that depends on the precipitant concentration $C_{prec}$. The saturating concentration of protein, $\hat{C}_{prot}$, describes the border between the precipitated and the soluble form of the protein. Combining Eqs. (2) and (3) gives the equation:

$$\hat{C}_{prot} \cdot C_{prec} = \text{constant} \qquad (4)$$

This links the protein concentration at saturation with the concentration of precipitant. An example of such an interface curve, or borderline, in a protein phase diagram is shown in Fig. 3, which also indicates a supersaturation region in which nucleation of solid-phase protein would appear. The whole process of screening for crystallization conditions can be significantly shortened when crystallization attempts are based on this principle. In general, when working out the phase boundary, “precipitation vs. no precipitation” (or even crystallization conditions), at high protein concentration the precipitant concentration should be low, and vice versa: at low protein concentration the concentration of precipitant should be high. This relationship was considered earlier when screening for crystallization conditions for Human Rhinovirus 14 [22].
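The inverse relation between protein and precipitant concentration at the boundary can be turned into a simple screening aid. The Python sketch below takes the product rule of Eq. (4) at face value and assumes that the constant is estimated from a single observed boundary point; the function names, the units and the example numbers are hypothetical.

```python
def boundary_constant(protein_sat, precipitant):
    """Constant of Eq. (4), estimated from one observed point on the boundary."""
    return protein_sat * precipitant


def saturating_protein(precipitant, constant):
    """Protein concentration on the boundary for a given precipitant concentration."""
    return constant / precipitant


def is_supersaturated(protein, precipitant, constant):
    """True if the condition lies above the boundary curve."""
    return protein * precipitant > constant


# Hypothetical observation: precipitation first seen at 10 mg/ml protein
# with 1.0 M precipitant.
k = boundary_constant(10.0, 1.0)

for prec in (0.5, 1.0, 2.0, 4.0):
    print(f"precipitant {prec} M -> boundary at {saturating_protein(prec, k)} mg/ml protein")
```

A screen guided this way would place trials slightly above the predicted curve, where supersaturation is expected, instead of spreading them evenly over the whole concentration plane.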

Another approach to crystallization can be developed from the perspective of probability, assuming that spontaneous fluctuations in the local concentration of the solute will result in the formation of nuclei, but unless their size exceeds a critical value, they will re-dissolve rather than grow spontaneously into a crystal. In this theory three-dimensional crystal growth appears only if the system is in a metastable state, where nucleation appears. This process depends on random fluctuations of solute concentration but not on the nature of its solubility. The probability of forming a critical-size nucleus, $P_{crit}$, depends on the free energy, $\Delta G_{crit}$, required to form such a nucleus and can be described by the formula:

$$P_{crit} = \exp\left(-\frac{\Delta G_{crit}}{k_B T}\right), \quad \text{where} \quad \Delta G_{crit} = \frac{16\pi\gamma^3}{3(\rho_s \Delta\mu)^2} \qquad (5)$$

Here $\Delta G_{crit}$ is the energy barrier for forming stable protein nuclei, $k_B$ is Boltzmann's constant, and $T$ the absolute temperature; $\gamma$ is the free energy density per unit area of the crystal-fluid interface, $\rho_s$ is the number density of the crystal phase, and $\Delta\mu$ is the difference in chemical potential between the fluid and the crystal. This approach is mathematically more complicated and will not be described here.
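To get a feeling for how strongly the nucleation probability depends on supersaturation, the formula can be evaluated numerically. The Python sketch below assumes the classical nucleation theory form of the barrier, ΔG_crit = 16πγ³ / (3(ρ_s Δμ)²), and works in reduced units chosen for illustration: energies in units of k_B T and lengths in units of a molecular diameter σ. The numerical values are not measurements.

```python
import math


def barrier_in_kT(gamma, rho_s, delta_mu):
    """Nucleation barrier divided by k_B T, in reduced units.

    gamma    -- interfacial free energy, in k_B T per sigma^2
    rho_s    -- number density of the crystal phase, in 1/sigma^3
    delta_mu -- chemical potential difference fluid vs. crystal, in k_B T
    """
    return 16.0 * math.pi * gamma ** 3 / (3.0 * (rho_s * delta_mu) ** 2)


def nucleation_probability(gamma, rho_s, delta_mu):
    """Boltzmann factor exp(-barrier / k_B T) for forming a critical nucleus."""
    return math.exp(-barrier_in_kT(gamma, rho_s, delta_mu))


# Illustrative values only: the barrier falls as 1 / delta_mu^2.
for delta_mu in (0.5, 1.0, 2.0):
    b = barrier_in_kT(gamma=1.0, rho_s=1.0, delta_mu=delta_mu)
    p = nucleation_probability(1.0, 1.0, delta_mu)
    print(f"delta_mu = {delta_mu}: barrier = {b:.1f} kT, P_crit = {p:.2e}")
```

Because the barrier scales with the inverse square of Δμ and enters through an exponential, a modest increase in supersaturation changes the probability by orders of magnitude, which is consistent with the narrow window in which useful nucleation is observed.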

Figure 3. Precipitation diagram for a protein, or a virus, showing its concentration vs. concentration of a precipitant. Arrows indicate the variation of concentrations from the beginning to the end of four crystallization experiments using the “hanging drop” method shown in Fig. 2.

Both theories were developed with the aim of getting better insight into the nucleation phenomenon, so as to produce crystals suitable for structural determination more rationally. It is evident from both approaches that to get nucleation of protein molecules, followed by steady growth of a crystal, it is necessary to maintain supersaturation and a metastable state. By its nature, the metastable state easily passes into stable states, represented by solids such as insoluble crystals or amorphous precipitates, or by liquid phase separation. This is a very simplistic view; in real experiments supersaturation only sporadically induces the formation of crystals. The formation of single, stable nuclei, which will grow further into crystals, depends on a variety of parameters that neither theory is yet able to cover. For example, the protein lysozyme will not crystallize in the presence of ammonium sulfate. Lysozyme crystals form easily with any other salt, as theory may predict; ammonium sulfate is obviously an exception [48]. Neither theory can give a clear answer as to why ammonium sulfate, the most widely used precipitant for protein crystallization, fails to crystallize lysozyme. Thus, neither theory is able to predict which precipitant is most suitable for a particular macromolecule, but a wrongly chosen one will surely hinder success in crystallization.

Making Crystallization a Statistical Affair

The metastable state is an important prerequisite for the nucleation of molecules before growth into a crystal. However, it must be emphasized that supersaturation or metastable states will not always lead to nucleation and crystallization. Most often the outcome is an amorphous precipitate. The classical way to overcome this problem is to test a multitude of conditions and search, by chance, for the winning combination.

Carter and Carter [12] studied the problem of finding a rational way of defining crystallization conditions. They showed that the phenomenon of crystallization could be an appealing system on which to exercise the power of statistics in experimental design. Besides statistical knowledge, this requires an initial crude screening for conditions where crystallization appears, followed by optimization of the conditions [12]. Initial screening and fine adjustment are two different aspects. To perform the initial screening, one has to guess what kind of precipitant (salt, organic polymer or organic solution) and what pH of the buffer to use. Since there are no coherent rules for how to get crystals, all these parameters are most often evaluated simultaneously. As judged by the Crystallization Database, a suitable range of pH is 5-8. The total number of possible experiments is unmanageably large, and therefore one must find some rational way to choose relatively few experiments to explore the relationship between the crystallization response and the applied variables. It is often unknown whether such a relationship exists, or can be detected. In the search for crystallization conditions, the correlation coefficients for the fit of some simple linear models to reasonably sized samples of crystallization outcomes were calculated. This showed that crystal formation might not follow a simple multiple linear model. Nevertheless, the general experience is that multiple linear regression analysis makes it possible to eliminate some variables as not relevant, and thereby restrict the search to a well-defined boundary of variables [8,67,69,71]. In addition, rationally designed experiments allow the plotting of precipitation diagrams for two variables, concentration of protein vs. concentration of precipitant. If crystallization fails during the screening, this is also information that helps to form the diagram.

The question may arise of how to begin screening for crystallization when we have limited knowledge of the nature of the protein. First it is important to establish a level of confidence for accepting or rejecting the information, that is, to decide on the size of the sampling experiment that we can “afford”, and then to establish what kind of data are to be collected and analyzed. In the design of the experiments it would be difficult to tabulate all the experimental combinations manually; therefore a computer is a valuable tool. Historically, Carter wrote the first software, INFAC, for this purpose. In the present review we will focus on the DESIGN software from our hand [65]. The recent version can handle up to 15 variables, each with 2-15 levels. As introduced by Carter and Carter [12], the algorithm is designed such that any combination of two levels for two different factors is repeated at least once.
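The requirement that every pair of levels from two different factors occurs at least once can be met with far fewer trials than a full factorial design. The Python sketch below is a simple greedy, randomized generator written for illustration; it is not the algorithm used by INFAC or DESIGN, and the factor names and levels are hypothetical examples.

```python
import itertools
import random


def run_pairs(run, names):
    """All (factor, level) pairs that a single experiment covers."""
    return [((a, run[a]), (b, run[b])) for a, b in itertools.combinations(names, 2)]


def pairwise_design(factors, candidates=50, seed=0):
    """Greedy design in which every two-factor level combination appears at least once."""
    rng = random.Random(seed)
    names = list(factors)
    # Ordered collection of the level pairs that still need to be covered.
    uncovered = dict.fromkeys(
        ((a, la), (b, lb))
        for a, b in itertools.combinations(names, 2)
        for la in factors[a]
        for lb in factors[b]
    )
    design = []
    while uncovered:
        # Force one missing pair into every candidate so each run makes progress.
        (a, la), (b, lb) = next(iter(uncovered))
        best, best_gain = None, 0
        for _ in range(candidates):
            trial = {n: rng.choice(factors[n]) for n in names}
            trial[a], trial[b] = la, lb
            gain = sum(p in uncovered for p in run_pairs(trial, names))
            if gain > best_gain:
                best, best_gain = trial, gain
        design.append(best)
        for p in run_pairs(best, names):
            uncovered.pop(p, None)
    return design


# Hypothetical screen with four factors.
factors = {
    "precipitant": ["salt", "PEG", "organic solvent"],
    "pH": [5.0, 6.5, 8.0],
    "temperature_C": [4, 20, 35],
    "protein_mg_ml": [5, 10],
}
design = pairwise_design(factors)
print(f"{len(design)} trials instead of {3 * 3 * 3 * 2} for the full factorial")
```

The fixed seed makes the design reproducible, so the same table of conditions can be regenerated when the scored outcomes are analyzed later.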

DESIGN uses its own scoring system, set between 0 and 10. For example, if a hanging or sitting drop is clear and no precipitate develops, the score is 0; if a few relatively large and well diffracting crystals develop, the score is 10. Scores from 0.5 to 3 are reserved for amorphous precipitate: a strong amorphous precipitate scores 3, and a weak amorphous precipitate can score 0.5. Phase separation scores 4. Scores 5, 6, 7, 8 and 9 are assigned as follows: long thin needles score 5; thin plates score 6; showers of microcrystals score 7; crystals of ugly, unpleasant morphology score 8; and the score 9 is given when single, large crystals of optically pleasing morphology have developed. The scoring system allows us to describe the crystallization outcome quantitatively and to work out precipitation diagrams, i.e. concentration of protein vs. concentration of precipitant, for the relevant variables [70,71].

Thus, all variables included in the design can be evaluated for their contribution to crystallization, and a small subset of potential experiments can be established for optimization.
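The scoring scale lends itself to a simple lookup table, after which the contribution of each variable can be inspected by averaging the scores per level. The Python sketch below encodes the scores as described in the text; the outcome labels, the helper function and the example trials are hypothetical and do not reproduce the DESIGN program.

```python
from collections import defaultdict

# Scores as described in the text; the label strings are illustrative.
SCORES = {
    "clear drop": 0,
    "weak amorphous precipitate": 0.5,
    "strong amorphous precipitate": 3,
    "phase separation": 4,
    "long thin needles": 5,
    "thin plates": 6,
    "shower of microcrystals": 7,
    "crystals of poor morphology": 8,
    "single large crystals": 9,
    "large well-diffracting crystals": 10,
}


def mean_score_by_level(trials, factor):
    """Average score for each level of one factor across all trials."""
    by_level = defaultdict(list)
    for conditions, outcome in trials:
        by_level[conditions[factor]].append(SCORES[outcome])
    return {level: sum(s) / len(s) for level, s in by_level.items()}


# Hypothetical outcomes of four trials.
trials = [
    ({"pH": 5.0, "temperature_C": 4}, "clear drop"),
    ({"pH": 5.0, "temperature_C": 20}, "strong amorphous precipitate"),
    ({"pH": 8.0, "temperature_C": 4}, "long thin needles"),
    ({"pH": 8.0, "temperature_C": 20}, "single large crystals"),
]

print(mean_score_by_level(trials, "pH"))
print(mean_score_by_level(trials, "temperature_C"))
```

Per-level averages of this kind are only a first look; the regression analysis mentioned in the text would use the same numeric scores as the response variable.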

NON-CRYSTAL METHODS FOR STRUCTURAL DETERMINATION OF BIOLOGICAL MOLECULES

Most of what is understood about protein or virus function comes from studies of crystals that contain about 10¹² macromolecules [4]. However, in spite of its tremendous impact on structural biology, protein and virus crystallography has inevitable limitations and restrictions. The most widely known is the difficulty of obtaining diffraction-quality crystals. Spherical viruses, particularly if the diameter is around 700 Å, are on the border of what X-ray analysis can handle and might call for other techniques that do not require crystals. Over the years, the instrumental development of electron cryomicroscopy (cryoEM) has made it possible to study structures of increasing complexity, allowing an understanding of their assembly and intrinsic dynamics. The quality of cryoEM data and their further processing have been effectively developed to suppress noise appearing in image data. Further advancement of cryoEM technology will enhance the understanding of how fine details of protein folding can be revealed in a virus particle as a whole. This is especially valuable since the method requires no heavy-metal derivatives or crystal-constrained packing. The question “who needs crystal anyway” [18] for large biological structures is not trivial, since with cryoEM the “bottleneck” (the crystal) can be avoided, and large structures can be approached down to atomic resolution.

IF NOT CRYSTALLOGRAPHY - ARE NMR OR CRYOEM ALTERNATIVE OPTIONS?

Once a protein has been purified for structural determination, there is a potential hurdle in deciding what to use: NMR, X-ray crystallography or cryoEM? At present, only X-ray crystallography is mature enough to solve structures in high atomic detail and to uncover small but significant structural changes that provide clues to the function of a biological macromolecule as a whole [6]. However, the regrettable inability to predict and control protein crystallization may seriously cripple this effort. There have even been attempts to get rid of “crystals in protein crystallography” altogether by applying an X-ray free-electron laser beam to studies of as few as ten molecules of a protein.

NMR depends heavily on costly instrumentation. At present there is an upper size limit for proteins of ~20 kDa, and the data obtained need to a large extent to be interpreted manually, with difficulty in assessing the statistical correctness of the structure. In this regard, the advantages of NMR hold for small protein molecules, and the advantage of cryoEM mainly for very large structures. Both techniques require high protein purity, but avoid the requirement to grow crystals. If further developed they may become serious threats to X-ray protein crystallography in years to come.

The overlooked “bottleneck” of crystallization discussed here is not the only problem in the determination of protein structures. For example, in molecular biology the expression of membrane proteins is another “bottleneck”. The current expression systems, as optimized for soluble proteins, are not suitable for membrane proteins, so alternatives have to be developed for different classes of membrane-anchored or membrane-associated proteins and domains. A globular protein with a single peptide chain can be expressed in bacteria, whereas heteromeric proteins may not. We may ask if this is another “bottleneck” for acquiring their structures [21].

IS CRYSTALLIZATION A BOTTLENECK? - FINAL THOUGHT

One possible answer to the title question is: no, crystallization is not a bottleneck; it is rather a psychological or mystical barrier of intellectual inability resulting, historically, from the lack of knowledge about the principles of protein crystal formation. The financial investment (grants) in protein crystallography went mainly to the development of sophisticated software for solving the structure of proteins, or to the construction of synchrotron facilities, while studies of the mechanism of crystallization lacked such support [40,46]. In the human genome there is apparent information for 26,000 proteins [5], while so far the structures of only 118 proteins are known! Thus, there is plenty of work for protein crystallographers, NMR spectroscopists and electron microscopists; crystallomers in particular should be quite busy in years to come.

ACKNOWLEDGMENT

Supported by grants from the Medical Research Council (MFR-12175), the Natural Science Research Council (NFR-11691), STINT Foundation, Swedish Structural Biology Network, JS-CrystalResearch and KI Board of Research.

REFERENCES

1. Addadi L, Weiner S, Geva M. On how proteins interact with crystals and their effect on crystal formation. Z Kardiol, 2001;90S:92-98.

2. Auer S, Frenkel D. Prediction of absolute crystal-nucleation rate in hard-sphere colloids. Nature, 2001;409:1020-1023.

3. Amos LA, Henderson R, Unwin PN. Three-dimensional structure determination by electron microscopy of two-dimensional crystals. Prog Biophys Mol Biol, 1982;39:183-231.

4. Baker TS, Johnson JE. Principle of virus structure determination. In Chiu W, Burnett RM, Garcea RL (eds.) Structural Biology of Viruses. Oxford University Press, New York, Oxford, 1997, pp. 38-79.

5. Baltimore D. Human genome unveiled. Nature, 2001;409:814-816.

6. Berisio R, Lamzin VS, Sica F, Wilson KS, Zagari A, Mazzarella L. Protein titration in the crystal state. J Mol Biol, 1999;292:845-854.

7. Boyd D, Schierle C, Beckwith J. How many membrane proteins are there? Protein Sci, 1998;7:201-205.

8. Carter CW Jr. Efficient factorial designs and the analysis of macromolecular crystal growth conditions. METHODS: A Companion to Methods in Enzymology, 1990;1:12-24.

9. Carter CW. Design of crystallization experiments and protocols. In Ducruix A, Giege R (eds.) Crystallization of Proteins and Nucleic Acids: A Practical Approach. IRL Press, Oxford, 1992, pp. 47-71.

10. Carter CW Jr. A local approximation to supersaturation affords a useful coordinate transformation for the study of crystal growth. Acta Cryst, 1996;D52:647-654.

11. Carter CW. Response surface methods for optimizing and improving reproducibility of crystal growth. Meth Enzymol, 1997;276:74-99.

12. Carter CW Jr, Carter CW. Protein crystallization using incomplete factorial experiments. J Biol Chem, 1979;254:12219-12223.

13. Cate JH, Gooding AR, Podell E, Zhou K, Golden BL, Kundrot CE, Cech TR, Doudna JA. Crystal structure of a group I ribozyme domain: principles of RNA packing. Science, 1996;273:1678-1685.

14. Cheng RH. Visualization on the grid of virus-host interactions. Lect Notes Comp Sci, 2000;13:141-154.

15. Cheng RH, Reddy VS, Olson NH, Fisher AJ, Baker TS, Johnson JE. Functional implications of quasi-equivalence in a T=3 icosahedral animal virus established by cryo-electron microscopy and X-ray crystallography. Structure, 1994;2:271-282.

16. Conlin RM, Brown RS. Reconstitution of protein-DNA complexes for crystallization. Methods Mol Biol, 2001;148:547-556.

17. Corey RB, Marsh RE. X-ray diffraction studies of crystalline amino acids, peptides and proteins. Fortschr Chem Org Naturst, 1968;26:1-47.

18. DeRosier DJ. Who needs crystal anyway? Nature, 1997;386:26.

19. Durbin SD, Feher G. Protein crystallization. Annu Rev Phys Chem, 1996;47:171-204.

20. Durbin SD, Feher G. Studies of crystal growth mechanisms of proteins by electron microscopy. J Mol Biol, 1990;212:763-774.

21. Edwards AM, Arrowsmith CH, Christendat D, Dharamsi A, Friesen JD, Greenblatt JF, Vedadi M. Protein production: feeding the crystallographers and NMR spectroscopists. Nat Struct Biol, 2000;7:970-972.

22. Erickson JW, Frankenberger EA, Rossmann MG, Fout GS, Medappa KC, Rueckert RR. Crystallization of a common cold virus, human rhinovirus 14: "isomorphism" with poliovirus crystals. Proc Natl Acad Sci USA, 1983;80:931-934.

23. Fitzgerald LJ, Gallucci JC, Gerkin RE. Structure of tetralithium 1,4,5,8-naphthalenetetracarboxylate dodecahydrate. Acta Crystallogr, 1992;48C:1430-1434.

24. Fredericks WJ, Hammonds MC, Howard SB, Rosenberger F. Density, thermal expansivity, viscosity and refractive index of lysozyme solutions at crystal growth concentrations. J Cryst Growth, 1994;141:183-192.

25. Fry EE, Grimes J, Stuart DI. Virus crystallography. Mol Biotechnol, 1999;12:13-23.

26. Garavito RM, Picot D, Loll PJ. Strategies for crystallizing membrane proteins. J Bioenerg Biomembr, 1996;28:13-27.

27. Gilliland GL, Tung M, Blakeslee DM, Ladner J. The Biological Macromolecule Crystallization Database, Version 3.0: New Features, Data, and the NASA Archive for Protein Crystal Growth Data. Acta Crystallogr, 1994;D50:408-413.

28. Gigant B, Barbey-Martin C, Bizebard T, Fleury D, Daniels R, Skehel JJ, Knossow M. A neutralizing antibody Fab-influenza hemagglutinin complex with an unprecedented 2:1 stoichiometry: characterization and crystallization. Acta Crystallogr, 2000;D56:1067-1069.

29. Guex N, Diemand A, Peitsch MC. Protein modeling for all. TIBS, 1999;24:364-367.

30. Hennessy D, Buchanan B, Subramanian D, Wilkosz PA, Rosenberg JM. Statistical methods for the objective design of screening procedures for macromolecular crystallization. Acta Crystallogr, 2000;D56:817-827.

31. Hinson JA, McMeekin TL. A rapid method for preparing crystalline human hemoglobin and the separation of crystalline hemoglobin A in quantity. Biochem Biophys Res Commun, 1969;35:94-101.

32. van't Hoff JH. The role of osmotic pressure in the analogy between solutions and gases. Z Phys Chem, 1887;1:461-508.

33. Holbrook SR, Holbrook EL, Walukiewicz HE. Crystallization of RNA. Cell Mol Life Sci, 2001;58:234-243.

34. Hoshi K, Ejiri S, Ozawa H. Localizational alterations of calcium, phosphorus, and calcification-related organics such as proteoglycans and alkaline phosphatase during bone calcification. J Bone Miner Res, 2001;16:289-298.

35. Hunte C. Insights from the structure of the yeast cytochrome bc1 complex: crystallization of membrane proteins with antibody fragments. FEBS Lett, 2001;504:126-132.

36. Izmailov AF, Myerson AS, Arnold S. A statistical understanding of nucleation. J Cryst Growth, 1999;196:234-242.

37. Jenniskens P, Blake DF. Crystallization of amorphous water ice in the solar system. Astrophys J, 1996;473:1104-1113.

38. Korszun Z. Neutron macromolecular crystallography. Methods Enzymol, 1997;276:218-232.

39. Kunji ER, Spudich EN, Grisshammer R, Henderson R, Spudich JL. Electron crystallographic analysis of two-dimensional crystals of sensory rhodopsin II: a 6.9 Å projection structure. J Mol Biol, 2001;308:279-293.

40. Lattman EE. No crystals no grant. Proteins, 1996;26:i-ii.

41. Liuzzi GM, Rizzo T, Ventola A, Riccio P, Quagliariello E. Purification of lipid-associated basic protein from guinea pig spinal-cord myelin. Acta Neurol (Napoli), 1991;13:113-120.

42. Longenecker KL, Garrard SM, Sheffield PJ, Derewenda ZS. Protein crystallization by rational mutagenesis of surface residues: Lys to Ala mutations promote crystallization of RhoGDI. Acta Crystallogr, 2001;D57:679-688.

43. Lundager Madsen HE, Christensson F, Chernov AA, Polyak LE, Suvorova EI. Crystallization of calcium phosphate in microgravity. Adv Space Res, 1995;16:65-68.

44. Margolin A. Novel crystalline catalysts. TIBTECH, 1996;14:223-230.

45. Margolin AL, Navia MA. Protein crystals as novel catalytic materials.

46. Martonosi AN. No crystals - no grant. FASEB J, 1996;10:529.

47. McPherson A. Crystallization of biological macromolecules. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1999.

48. McPherson A. A comparison of salts for the crystallization of macromolecules. Protein Sci, 2001;10:418-422.

49. McPherson A. Current approaches to macromolecular crystallization. Eur J Biochem, 1990;189:1-23.

50. Mozzarelli A, Rossi GL. Protein function in the crystal. Annu Rev Biophys Biomol Struct, 1996;25:343-65.

51. Myerson A. (ed.) Molecular modeling applications in crystallization. Cam-

bridge University Press, 1999. 52. Navroz P. Shorter, brighter, better. Nature, 2002;415:110-I 11. 53. Pande A, Pande J, Asherie N, Lomakin A, Ogun 0, King J, Benedek GB.

Crystal cataracts: human genetic cataract caused by protein crystallization. Proc Natl Acad Sci USA, 2001;98: 6116-6120.

54. Pascher I, Lundmark M, Nyholm PG. Sundell S. Crystal structures of mem- brane lipids. Biochim Biophys Acta, 1992;1113:339-373.

55. Persike N, Pfeiffer M, Guckenberger R, Radmacher M, Fritz M. Direct ob- servation of different surface structures on high-resolution images of native halorhodopsin. J Mol Biol, 2001; 310:773-780.

56. Popot JL, Engelman DM. Helical membrane protein folding, stability, and evolution. Annu Rev Biochenz, 2000;69:881-922.

57. Ravishankar R, Thomas CJ, Suguna K, Surolia A, Vijayan M. Crystal struc- tures of the peanut lectin-lactose complex at acidic pH: retention of unusual quaternary structure, empty and carbohydrate bound combining sites, mo- lecular mimicry and crystal packing directed by interactions at the com- bining site. Proteins, 2000;4:260-270.

58. Ries-Kautt M, Ducruix A. Relative effectiveness of various ions on the solubility and crystal growth of lysozyme. J Biol Chem, 1989; 264:745-748.

Angew Chem Int Ed Engl, 2001;40:2204-2222.

Page 390: Conformational Proteomics of Macromolecular Architecture

Is CIystallization a “Bottleneck ’’ of Modern Structural Crystallomic? 377

59. Ries-Kautt M, Ducruix A. Inferences drawn from physicochemical studies of crystallogenesis and precrystalline state. Meth Enzymol, 1997;276:23-59.

60. Riccio P, Rosenbusch JP, Quagliariello E. A new procedure for the isolation of the brain myelin basic protein in a lipid-bound form. FEBS Lett,

61. Rosenberg F. Inorganic and protein crystal growth - similarities and diffe- rences. J Cryst Growth, 1986;76:618-636.

62. Sachs L. Applied statistics. A handbook of techniques, Springer-Verlag, New York, Berlin, Heidelberg, Tokyo (second edition), 1984.

63. Schuster B, Sleytr UB. S-layer-supported lipid membranes. J Biotechnol, 2000;74:233-254.

64. Sedzik J. Regression analysis of factorially designed trials - a logical ap- proach to protein crystallization. Biochim Biophys Actu, 1995; 1251: 177-85.

65. Sedzik J. DESIGN: a guide to protein crystallization experiments. Arch Biochem Biophys, 1994;308:342-348.

66. Sedzik J, Bergfors T, Jones AT. Weise M. Bovine P2 myelin basic protein crystallizes in three different forms. J Neurochem, 1988; 56: 1908-1913.

67. Sedzik J, Kotake Y, Uyemura K. Purification of PASWPMP22 - an ex- tremely hydrophobic glycoprotein of PNS myelin membrane. Neurore- port, 1998;9: 1595-1600.

68. Sedzik J, Kotake Y, Uyemura K. Purification of PO myelin glycoprotein by a Cu*+-immobilized metal affinity chromatography. Neurochem Res, 1999;

69. Sedzik J, Uyemura K, Tsukihara T. Sodium dodecyl sulfate bound to the hydrophobic myelin glycoproteins (PO and PASIIPMP22) can be ex- changed for neutral detergents using ceramic hydroxy apatite column - a link to crystallization. J Neurochem, 2001 ;78S: 46.

70. Sedzik J, Hammar L, Haag L, Skoging-Nyberg U, Tars K, Marco M, Cheng HR. Structural proteomics of enveloped viruses: crystallization, crystallo- graphy, mutagenesis and cryo-electron microscopy. Recent Res Devel Virol,

71. Sedzik J, Kotake Y, Uyemura K, Ataka M. Factorially designed crystalli- zation trials of the full-length PO myelin membrane glycoprotein. 1. Pre- cipitation diagram, J Cryst Growth, 2003;247:483-496.

72. ten Wolde PR, Frenkel D. Enhancement of protein crystal nucleation by critical density fluctuations. Science, 1997; 277: 1975-1 978.

73. Timsit Y, Moras D. Crystallization of DNA. Methods Enzyrnol, 1992; 211: 409-429.

74. Torgesen JL. Purification by single-crystal growth. Ann NY Acad Sci, 1966;137:30-43.

75. Wittmann HG, Mussig J, Piefke J, Gewitz HS, Rheinberger HJ, Yonath A. Crystallization of Escherichia coli ribosomes. FEBS Lett, 1982;146:217-220.

1984;177:236-240.

241723-732.

2001 ;3:53-70.

Page 391: Conformational Proteomics of Macromolecular Architecture

Jan Sedzik 378

76. Wynne SA, Crowther RA, Leslie AG. The crystal structure of the human hepatitis B virus capsid. Mol Cell, 1999;3:771-780.

77. Wlodawer A. Neutron diffraction of crystalline proteins. Prog Biophys Mol Biol, 1982;40:115-159.

78. Wu B, Hammar L, Xing L, Markarian S, Yan J, Iwasaki K, Fujiyoshi Y, Omura T, Cheng RH. Phytoreovirus T=l core plays critical roles in organiz- ing the outer capsid of T=13 quasi-equivalence. Virology, 2000; 271: 18-25.

79. Yoshikawa S, Shinzawa-Itoh K, Tsukihara T. Crystal structure and reaction mechanism of bovine heart cytochrome c oxidase. Keio Univ Life Sci Med, 1996;l: 13-23.

Page 392: Conformational Proteomics of Macromolecular Architecture

Chapter 18

SENSOR SURFACE INTERACTIONS IN THE STUDY OF MACROMOLECULAR ASSEMBLIES

José M. Casasnovas*, Sevak Markarian and Lena Hammar†

*Centro Nacional de Biotecnologia-CSIC, Madrid, Spain. Email address: [email protected]
†Department of Biosciences, Karolinska Institute at Novum, SE-19144 Huddinge, Sweden. Email address: [email protected]; www.biosci.ki.se/kisv

Macromolecular assemblies, like viruses, are often built from multiple copies of a few components. These may have similar or diverse functions. The multivalency of the assembly allows ligand recognition with high avidity. Nevertheless, affinity is linked to the monovalent ligand interaction and depends on the nature of the interacting surface. Such interactions can be followed in real time with the aid of surface plasmon resonance. A sensor surface may be prepared with either the assembly or the ligand immobilized, and their interaction studied. Kinetic and thermodynamic properties of ligand binding to the macromolecular assembly can then be determined. Variations in the structure of the assembly, like those occurring during virus infection, may also be revealed by this technique.

Keywords: antibody affinity, alphavirus, epitopes, lectins, picornavirus, glycoconjugates, receptor interaction, biosensors, surface plasmon resonance, virus structure

SENSOR SURFACE INTERACTIONS

Surface plasmon resonance (SPR) is considered a reliable technique for exploring macromolecular interactions.22 This technology has been applied in the Biacore and other instruments, where an analyte is immobilized on a sensor surface in a chip and placed in a micro-chamber through which the free ligand is injected. Ligand binding to the analyte changes the mass attached at the sensor surface. This changes the refractive index close to the surface and thereby shifts the plasmon resonance, which is monitored by the instrument and reported in resonance units (RU). Variations in bound mass result, within limits, in a proportional RU response.24 On this basis the number of ligand binding sites on a macromolecular analyte can be assessed, and competitive interactions explored for different ligands. The technique records macromolecular interactions in real time, so that association and dissociation phases can be analyzed to determine kinetic rates. Moreover, thermodynamic parameters can be extracted from affinity and kinetic constants measured at different temperatures and in different environments. Additionally, the technique can be used to measure the concentration of biologically active molecules and for selection of diagnostic probes.26
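The analysis of association and dissociation phases mentioned above is often based on a simple 1:1 interaction model. The Python sketch below is illustrative only: the choice of model, the function names and all numerical values (rate constants, concentration, maximal response) are assumptions made here and are not taken from an experiment in this chapter.

```python
import math

def association_response(t, conc, k_ass, k_diss, r_max):
    """Response (RU) t seconds after the start of injection, 1:1 model.

    conc is the molar concentration of the injected binding partner,
    k_ass is in M^-1 s^-1, k_diss in s^-1, r_max is the response at saturation.
    """
    k_obs = k_ass * conc + k_diss            # observed rate constant (s^-1)
    r_eq = r_max * k_ass * conc / k_obs      # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))

def dissociation_response(t, r_start, k_diss):
    """Response (RU) t seconds after the injection ends (buffer flow only)."""
    return r_start * math.exp(-k_diss * t)

if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the shape of the curve.
    k_ass, k_diss, r_max, conc = 5.0e3, 2.0e-3, 100.0, 1.0e-6
    r_end = association_response(300.0, conc, k_ass, k_diss, r_max)
    for t in (0, 60, 120, 300):
        print(t, round(association_response(t, conc, k_ass, k_diss, r_max), 1))
    for t in (60, 300):
        print(300 + t, round(dissociation_response(t, r_end, k_diss), 1))
```

Fitting these two expressions to the recorded phases gives estimates of the two rate constants, and their ratio k_diss/k_ass gives the equilibrium dissociation constant.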

APPLICABILITY TO LARGE MOLECULAR ASSEMBLIES

SPR has been extensively used for analysis of intermolecular interactions during the last decade. The growing number of commercial instruments indicates the success of this technology, although the Biacore instrument appears to be the most widespread. The broader use of the technique is related to its automation and to improvements in sensitivity and in the properties of the available sensor chip surfaces, so that a large variety of samples can now be studied.22 The technique has been applied to the study of ligand binding to macromolecular assemblies, particularly to intact virus particles. Kinetic and thermodynamic parameters have been determined for binding of antibodies and cellular receptors to viruses.3-5;17;20;27 Viruses are built as assemblies of multiple copies of subunits with identical function, so that determination of monomeric ligand-binding affinity requires immobilization of the multivalent virus particle at the sensor chip. In our studies on small naked viruses, most of the ligand-binding sites appear to be preserved after the covalent immobilization of the virus particles to the sensor surface (Mode A, Fig. 1). This has allowed determination of binding kinetics and thermodynamics for antibodies and receptor molecules.
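Because the response is, within limits, proportional to bound mass, the share of binding sites that remain available after immobilization can be estimated from two responses. The following is a minimal sketch under stated assumptions: the function names, the molecular masses and the RU values are hypothetical, and equal response per unit mass is assumed for particle and ligand.

```python
def ligands_per_particle(r_ligand, r_particle, m_ligand, m_particle):
    """Average number of ligand molecules bound per immobilized particle.

    r_ligand:   response (RU) gained on ligand binding at saturation
    r_particle: response (RU) gained when the particles were immobilized
    m_ligand, m_particle: molecular masses in Da
    Assumes the response is proportional to bound mass for both species.
    """
    return (r_ligand / r_particle) * (m_particle / m_ligand)

def fraction_of_sites(n_bound, n_sites=60):
    """Fraction of the binding sites on one particle that carry a ligand."""
    return n_bound / n_sites

if __name__ == "__main__":
    # Hypothetical numbers: 3000 RU of virus immobilized, 400 RU of receptor
    # bound at saturation, particle mass 8.5 MDa, receptor fragment 20 kDa.
    n = ligands_per_particle(400.0, 3000.0, 2.0e4, 8.5e6)
    print(round(n, 1), round(fraction_of_sites(n), 2))
```

The default of 60 sites corresponds to the number of receptor binding sites on a picornavirus particle given later in this chapter; for other assemblies the value would have to be replaced.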


Fig. 1. Experimental design for different virus applications. In A, the virus is covalently attached to the sensor surface and the receptor or other probe is introduced in the flow. In B, the probe is immobilized and the free virus is introduced in the flow, while in C the virus is reversibly immobilized, as in B, to allow the study of interactions with free ligands. The latter method makes it possible to study monovalent ligand binding kinetics at enveloped or otherwise labile viruses, but it requires reloading of the virus between binding experiments.

Ligand-induced conformational changes in a virus particle have also been explored using this mode.3 The regeneration procedure needed to remove the bound ligand between cycles is more critical for the integrity of the analyte than the immobilization itself. Therefore, in many cases, and in particular with enveloped viruses, the virus has to be reversibly attached to the surface so that it can be replenished between the ligand binding cycles (see Fig. 1, Mode C).

The Mode C design has been explored with Semliki Forest virus, an alphavirus serving as a model for the type 2 virus fusion mechanism. Attachment by lectin or antibody interaction provides a non-invasive, reversible immobilization that allows the virus-antibody kinetics to be studied over an adequate range of pH. A lectin suitable for such experiments is the small, stable and well-characterized snowdrop (Galanthus nivalis) lectin, GNA, which is also highly selective for terminal α1-3 linked mannose in mannans or protein glycoconjugates. It does not cross-react with glucose, as many other mannose-binding lectins do. This makes it suitable for use in combination with antibodies in applications like this or in capture ELISA, since most antibodies do not carry this type of mannose residue. It binds to HIV,8 and similar viruses, and as we found also to SIT.

An alternative mode of operating the binding studies is to inject the macromolecular assembly through the flow cell with immobilized ligands at the sensor surface (Mode B, Fig. 1). This approach is particularly helpful for revealing dynamic changes on the virus surface in free solution. The virus can then also be run over a series of sensor surfaces coated with different probes, such as antibodies against different configurations. This mode has been used to identify stages in prefusion rearrangements of SFV under conditions mimicking the infection situation in the endosome.

We will demonstrate these main modes of experimental setup with applications from the picornavirus and alphavirus fields. In the naked capsid viruses we focus on how variation in receptor interaction among different picornaviruses relates to their infection strategies, while in the enveloped virus the glycosylation of the envelope glycoproteins is considered. We thereby point out that not only antibody epitopes, but also components in the carbohydrate coat provided by the envelope glycoproteins, may vary in surface accessibility and affect the external structure of the virus in a manner reflecting stages in the life cycle of the virus.

Studies Applied to Non-Enveloped Viruses

Non-enveloped virus particles, such as picornaviruses, are built as a “naked” protein capsid with a nucleic acid closely packed inside.23 The capsid is formed by assembly of multiple identical subunits or protomers, each containing a single or several viral proteins. Particular epitopes on the outer part of the capsid shell are targeted by the host’s immune defense on infection, or used for attachment of the virus particle to a receptor molecule on the target cell.

SPR was initially used for the study of antibody binding to viruses. The virus particles retained the native conformation after covalent immobilization in the sensor chip, providing an advantage over the conventional solid phase immunoassay, where the attached proteins become partly denatured. Later, the technology was used to study receptor binding to viruses.3-5;17;20;27

Most of the studies on receptor binding to virus particles have been done for members of the picornavirus family (Table 1). The particles of these viruses have icosahedral symmetry and 60 receptor binding sites. SPR analysis of receptor binding has been reported for several members of the picornavirus family, including rhinoviruses, poliovirus20;27 and echoviruses. There is a certain diversity in the receptor binding modes among picornaviruses, which correlates with differences in the conformation of the receptor binding sites visualized by structural studies.2;13;14;16;27 Receptor binding to human rhinoviruses had the slowest kinetic association rate (Table 1), consistent with the lowest accessibility of the receptor binding site, located at the bottom of a surface depression or canyon on the virus capsid.16 By contrast, the CD55 receptor bound to echoviruses with a fast association rate, resulting from receptor binding to a highly exposed site.13 Therefore, a correlation between the magnitude of the kinetic association rate and the conformation of the receptor binding site can be established based on the kinetic data and structural studies reported to date (Table 1). Moreover, the most accessible sites are expected to be more hydrophilic and consequently to provide the weakest virus-receptor interactions, resulting in a high kinetic dissociation rate. We can then conclude that structural information on receptor and other ligand binding sites in viruses and other macromolecular assemblies can be inferred from the kinetic rates.

Table 1. Receptor binding kinetics for some picornaviruses, and the accessibility of their receptor binding sites.

Virus-Receptor   k_ass (M^-1 s^-1)   k_diss x 10^-3 (s^-1)   K_D (nM)   Accessibility
HRV3-IC1         5300                1.8                     340        Low
HRV16-IC1        5700                1.4                     230        Low
PV1-PVR          22000               4.3                     200        Medium
EV11-CD55        150000              0.3                     2000       High

The table includes average kinetic and affinity constants determined for binding of the monomeric receptors intercellular adhesion molecule-1 (IC1),27 poliovirus receptor (PVR) and the CD55 receptor, respectively, to human rhinovirus (HRV) serotypes 3 and 16, poliovirus 1 (PV1) and echovirus serotype 11 (EV11). Accessibility was inferred from structural data.2;13;14;16;27
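The affinity and kinetic constants in Table 1 are linked by K_D = k_diss / k_ass. The short Python check below is a sketch, assuming that the k_diss column is given in units of 10^-3 s^-1; the function and variable names are introduced here for illustration. The EV11-CD55 row is left out because its tabulated K_D of 2000 nM corresponds to this ratio only if k_diss is about 0.3 s^-1, so that entry would need to be checked against reference 17.

```python
def kd_nanomolar(k_ass, k_diss):
    """Equilibrium dissociation constant K_D = k_diss / k_ass, in nM."""
    return k_diss / k_ass * 1e9

# (k_ass in M^-1 s^-1, k_diss in s^-1, tabulated K_D in nM), read from Table 1
# under the assumption that k_diss is listed in units of 10^-3 s^-1.
TABLE_1 = {
    "HRV3-IC1": (5300.0, 1.8e-3, 340.0),
    "HRV16-IC1": (5700.0, 1.4e-3, 230.0),
    "PV1-PVR": (22000.0, 4.3e-3, 200.0),
}

if __name__ == "__main__":
    for pair, (k_ass, k_diss, kd_table) in TABLE_1.items():
        kd_calc = kd_nanomolar(k_ass, k_diss)
        print(pair, "calculated", round(kd_calc), "nM; tabulated", kd_table, "nM")
```

Since the table lists averaged constants, the calculated ratios are expected to agree with the tabulated K_D values only approximately.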

Studies of Enveloped Viruses

Many viruses become enclosed in a membrane by budding from one or another of the cell's membranes. The lipids in this envelope membrane are cell-derived while the glycoproteins are virus-encoded. Well-known examples of viruses that usually bud at the plasma membrane of the infected cell are the alphaviruses, influenza virus and human immunodeficiency virus (HIV). In these viruses the glycoproteins form trimeric structures, usually referred to as spikes, that are anchored in the envelope membrane. Their functions include both receptor binding and fusion between the virus and target membranes, so as to allow the genome to enter the cell. Earlier in this book we discuss the fusion mechanisms of Semliki Forest virus (SFV), a member of the alphavirus group, and how prefusion reorganizations in the external structures can be followed by antibody interaction at a sensor surface (Chapter 4). Here, instead, their carbohydrate coat will be considered. In fact, the glycoconjugates may sometimes be the dominant feature of the virus external domain. Examples such as HIV illustrate how difficult it can be for antibodies to access their epitopes because of heavy glycosylation. While the more than 20 N-linked glycoconjugates in HIV gp120 cover the surface like a sugar dome,12 the SFV envelope proteins come with a more modest glycosylation: there is one such sugar structure at the fusion protein E1, and two at the assumed receptor-binding protein E2 (Fig. 2). Still, these conjugates would contribute to the properties of the envelope. The non-mature form of this virus carries a precursor form of E2 (pE2), from which a glycopeptide of about 9 kDa, E3, is cleaved off at a late stage in virus maturation. A small number of the precursor molecules still remain in the final particle. The schematic drawing in Fig. 2, left panel, summarizes the glycosylation of these glycopeptides. The E3 domain carries at least one complex type sugar and, as part of pE2, would provide a prominent shield over a large portion of the SFV surface. This should also be the case in the non-infectious, cleavage-deficient SFV mutant sql,25 where the precursor form of the glycoprotein E2 is retained due to mutations in the cleavage site (Fig. 2). This view is supported by the observation that proteolytic treatment of the mutant restores infectivity.

Fig. 2. Left panel: N-linked glycosylation of the envelope proteins E1, E2 and E3 of Semliki Forest virus. Each of the peptides carries a complex type glycoconjugate (ct). Only E2 also has a high mannose type structure (hm). E1 and E2 are present as three dimers in the trimeric protrusions on the SFV surface. The peptides traverse the envelope membrane close to their C-termini. E3 is cleaved, as indicated by the arrow, from the precursor E2 at a late stage in the maturation of the virus. In the cleavage-deficient SFV sql mutant the virions bud from the cells with the non-mature constellation. Right panel: relative binding of free wildtype SFV, or mutant sql, to three lectins in a sensor binding experiment. The virus particles were included in the flow over lectin coated sensor surfaces in three successive channels, and the binding relative to that in the GNA channel was recorded. Bars refer to the standard error of the mean in 3 experiments. GNA is the mannose-binding lectin from Galanthus nivalis, with preference for terminal (α-1,3)Man; DSL is the Datura stramonium lectin, which binds terminal (β-1,4)GlcNAc, or LacNAc, structures characteristic of complex type glycoconjugates; and NPL is the lectin from Narcissus pseudonarcissus, selective for (α-1,6)Man in polymannose structures.
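The quantity shown in the right panel of Fig. 2, binding relative to the GNA channel with the standard error of the mean over repeated experiments, can be computed along the following lines. This is a sketch only: the function name and the response values are hypothetical, and the ratio is taken per experiment before averaging, which is one of several reasonable choices.

```python
import math
import statistics

def relative_binding(responses, reference="GNA"):
    """Mean and SEM of binding relative to the reference channel, in percent.

    responses maps a lectin name to a list of responses (RU), one value per
    experiment, with the experiments in the same order for every lectin.
    At least two experiments are needed for the standard deviation.
    """
    ref = responses[reference]
    result = {}
    for lectin, values in responses.items():
        ratios = [100.0 * v / r for v, r in zip(values, ref)]
        sem = statistics.stdev(ratios) / math.sqrt(len(ratios))
        result[lectin] = (statistics.mean(ratios), sem)
    return result

if __name__ == "__main__":
    # Made-up responses from three experiments, for illustration only.
    example = {
        "GNA": [210.0, 190.0, 200.0],
        "DSL": [160.0, 150.0, 170.0],
        "NPL": [20.0, 25.0, 15.0],
    }
    for lectin, (mean, sem) in relative_binding(example).items():
        print(lectin, round(mean, 1), round(sem, 1))
```

Normalizing within each experiment removes the dependence on how much virus was injected in that run, so that only the binding pattern across lectins is compared.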

Virus Interactions at Lectin Coated Sensor Surfaces

It is not surprising that the enveloped virus surface with its glycoproteins can be targeted by lectins, i.e. carbohydrate binding proteins. Lectins were used early on for blood group determination and glycoprotein purification. Today a rich panel of commercially available lectins provides carbohydrate chemists with valuable tools for advanced characterization of various carbohydrate structures. Quite a few lectins have a high selectivity for particular carbohydrate structures. This makes it easy to check the type of sugar structures present in a glycoprotein by a lectin blot.6;8 Furthermore, biotinylated lectins attached to a streptavidin coated sensor chip provide a convenient system for testing the availability of different types of glycoconjugates on the virus surface. The streptavidin-biotin interaction is strong enough to allow extensive regeneration of the system, so that the lectin does not need to be replenished between the runs.

An example of this type of test, showing its applicability for determining the concentration of a virus in suspension, is given in Fig. 3. Here the sensor surface is coated with the mannose binding Galanthus nivalis lectin, GNA, and the virus is introduced in the flow over the sensor (Mode B of Fig. 1). It also demonstrates that free lectin competes for the interaction. For regeneration, pulses of methylmannoside are introduced in the flow. The background for this interaction is that SFV contains a high mannose glycoconjugate in glycoprotein E2. As judged from the binding studies in Fig. 3 and Fig. 2, right panel, this sugar is available for external interaction with lectin coated sensor surfaces. From the lectin-binding pattern, it is concluded that the glycoconjugate would be a trimmed, five-mannose (M5) structure, rather than a large high mannose type (M9), which is preferred by the narcissus lectin (Fig. 2, right panel, GNA and NPL). Considering that this M5 structure has escaped decoration to the complex type on its passage through the Golgi system, one would assume that it has been shielded by protein oligomerization: the trimers of pE2-E1 heterodimers that build up the external spikes of the virion are formed at this stage. Therefore, to be accessible for external lectin interaction, as we see in Figs. 2 and 3, the maturation process must have released the shield. The cleavage-deficient mutant SFV-sql retains the non-mature form and, consequently, is less prone to bind GNA than DSL, as compared to the wildtype virus (Fig. 2, right panel). This would also partly explain the low infectivity of the mutant virus, since it does not bind the target cell efficiently and does not fuse at the pH optimal for the wildtype virus. These shortcomings of the mutant particles are reversed by proteolytic treatment, which should release the E3 domain.

Fig. 3. Concentration-dependent binding of free SFV to GNA-coated sensor surfaces and its competition by free GNA.
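The concentration determination illustrated by Fig. 3 amounts to reading an unknown sample against a calibration of response versus known virus concentration. A minimal sketch follows, assuming a linear relation in the subsaturating range; the function names and all numbers are hypothetical and do not reproduce the data of Fig. 3.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_response(response, slope, intercept):
    """Invert the calibration line to estimate the concentration."""
    return (response - intercept) / slope

if __name__ == "__main__":
    # Made-up standards: virus concentration (arbitrary units) and response (RU).
    standards_conc = [1.0, 2.0, 4.0, 8.0]
    standards_ru = [52.0, 101.0, 198.0, 405.0]
    slope, intercept = fit_line(standards_conc, standards_ru)
    print(round(concentration_from_response(150.0, slope, intercept), 2))
```

Responses outside the range covered by the standards, or close to saturation of the lectin surface, should not be read from such a line.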


Fig. 4. SFV binding to lectin coated sensor surfaces vs. pH. A subsaturating amount of purified virus was injected over sensor surfaces coated with the lectins GNA, DSL, RCA, and VVL, at different pH. This revealed different profiles of virus binding to the surfaces. RCA is the Ricinus communis 120-agglutinin, with specificity for terminal β-Gal or β-GalNAc, and VVL is a Vicia villosa lectin, selective for GalNAc-O-Ser/Thr.

THE DYNAMIC VIRUS SURFACE

SPR has been used for analysis of receptor-mediated conformational changes in human rhinovirus (HRV) particles. It is known that the receptor for the major group of human rhinoviruses (intercellular adhesion molecule-1, ICAM-1) triggers release of the internal polypeptides and RNA from the capsid. The release of these capsid components was monitored in real time using SPR, showing how the methodology can be used for detection of subunit association states within macromolecular assemblies.

Another example of rearrangement in the exterior domains of a virus is the set of prefusion rearrangements occurring in SFV on acidification. A set of monoclonal antibodies was used to probe the domains exposed as the virus surface is reshuffled during the prefusion stages. As with protein epitopes, variation in the exposure of sugar structures may also result from the dynamics of the surface and be recognized by selective lectins. We found only moderate variation with pH in virus binding to sensor surfaces coated with the first three lectins mentioned (Fig. 4). However, the virus showed a binding optimum at around pH 6.2 to the VVL surface and bound poorly at neutral pH (Fig. 4). Considering that all the tested lectins bound the denatured SFV well at neutral pH, the profile of VVL would not reflect a pH dependent affinity of the lectin-sugar interaction, but rather the appearance of a small structure that is hidden in the neutral virion. This slightly precedes the emergence of the fusion peptide at the exterior of the virion in response to a lowered pH.11

Significance of Binding Studies for Structural Dynamics

We have presented here some applications of SPR to binding studies of different types of viruses and ligands. Kinetic data determined from such binding studies not only describe the interaction as such, but also provide information that can be used for modeling of ligand recognition domains. Additionally, real time recording of interactions occurring under relevant conditions allows description of conformational rearrangements related to virus infection processes. Experimental strategies similar to those presented here can be used both for direct analysis of ligand interaction kinetics and for following rearrangements in macromolecular assemblies.

ACKNOWLEDGEMENTS

This work was supported by grants from the Swedish Board for Medical and Natural Sciences (VR) and from the Karolinska Institute, Stockholm, Sweden, which is gratefully acknowledged. We would also like to thank the Biacore Corporation, Uppsala, for kind help and support.

REFERENCES

1. Babino A, Tello D, Rojas A, Bay S, Osinaga E, and Alzari PM. The crystal structure of a plant lectin in complex with the Tn antigen. FEBS Lett, 2003; 536:106-10.

2. Belnap D, McDermott B, Filman D, Cheng N, Trus B, Zuccola H, Racaniello V, Hogle J, and Steven A. Three-dimensional structure of poliovirus receptor bound to poliovirus. Proc Natl Acad Sci USA, 2000; 97:73-78.

3. Casasnovas J, Reed R, and Springer T. Kinetics of receptor and virus interaction and receptor-induced virus disruption: Methods for study with surface plasmon resonance. Methods: A Companion to Methods in Enzymology, 1994; 6:157-167.

4. Casasnovas JM, and Springer TA. Kinetics and thermodynamics of virus binding to receptor. Studies with rhinovirus, intercellular adhesion molecule-1 (ICAM-1), and surface plasmon resonance. J Biol Chem, 1995; 270:13216-24.

5. Dubs MC, Altschuh D, and Van Regenmortel MH. Interaction between viruses and monoclonal antibodies studied by surface plasmon resonance. Immunol Letters, 1992; 31:59-64.

6. Eriksson S, Bhikhabhai R, and Hammar L. Lectin binding properties of HIV glycoproteins. Pharmacia-LKB Biotechnology Technical Notes, 1989; Application File Nr 301.

7. Gilljam G, Siridewa K, and Hammar L. Purification of simian immunodeficiency virus, SIVMAC251, and of its external envelope glycoprotein, gp148. J Chromatogr A, 1994; 675:89-100.

8. Hammar L, Eriksson S, and Morein B. Human immunodeficiency virus glycoproteins: lectin binding properties. AIDS Res Hum Retroviruses, 1989; 5:495-506.

9. Hammar L, Hirsch I, Machado AA, De Mareuil J, Baillon JG, Bolmont C, and Chermann JC. Lectin-mediated effects on HIV type 1 infection in vitro. AIDS Res Hum Retroviruses, 1995; 11:87-95.

10. Hammar L, Markarian S, and Cheng RH. Exploring virus surface structure. Bia Journal, 1998; 5:22-23.

11. Hammar L, Markarian S, Haag L, Lankinen H, Salmi A, and Cheng RH. Prefusion Rearrangements Resulting in Fusion Peptide Exposure in Semliki Forest Virus. J Biol Chem, 2003; 278:7189-98.

12. Haseltine WA, and Wong-Staal F. The molecular biology of the AIDS virus. Sci Am, 1988; 259:52-62.

13. He Y, Bowman V, Mueller S, Bator C, Bella J, Peng X, Baker T, Wimmer E, Kuhn R, and Rossmann M. Interaction of the poliovirus receptor with poliovirus. Proc Natl Acad Sci USA, 2000; 97:79-84.

14. He Y, Lin F, Chipman P, Bator C, Baker T, Shoham M, Kuhn R, Medof M, and Rossmann M. Structure of decay-accelerating factor bound to echovirus 7: A virus-receptor complex. Proc Natl Acad Sci USA, 2002; 99:10325-10329.

15. Heinz F, and Allison S. The machinery for flavivirus fusion with host cell membranes. Curr Opin Microbiol, 2001; 4:450-5.

16. Kolatkar P, Bella J, Olson N, Bator C, Baker T, and Rossmann M. Structural studies of two rhinovirus serotypes complexed with fragments of their cellular receptor. EMBO J, 1999; 18:6249-6259.

17. Lea S, Powell R, McKee T, Evans D, Brown D, Stuart D, and van der Merwe P. Determination of the affinity and kinetic constants for the interaction between the human virus echovirus 11 and its cellular receptor, CD55. J Biol Chem, 1998; 273:30443-30447.

18. Lescar J, Roussel A, Wien MW, Navaza J, Fuller SD, Wengler G, and Rey FA. The Fusion glycoprotein shell of Semliki Forest virus: an icosahedral assembly primed for fusogenic activation at endosomal pH. Cell, 2001; 105:137-48.

19. Malmqvist M. Biospecific interaction analysis using biosensor technology. Nature, 1993; 361:186-187.

20. McDermott B, Rux A, Eisenberg R, Cohen G, and Racaniello V. Two distinct binding affinities of poliovirus for its cellular receptor. J Biol Chem, 2000; 275:23089-23096.

21. Osinaga E, Bay S, Tello D, Babino A, Pritsch O, Assemat K, Cantacuzene D, Nakada H, and Alzari P. Analysis of the fine specificity of Tn-binding proteins using synthetic glycopeptide epitopes and a biosensor based on surface plasmon resonance spectroscopy. FEBS Lett, 2000; 469:124-8.

22. Rich R, and Myszka D. Advances in surface plasmon resonance biosensor analysis. Curr Opin Biotechnol, 2000; 11:54-61.

23. Rueckert R. Picornaviridae: The viruses and their replication. In T Monath (ed.), Virology, Raven Press, New York, 1996, pp 609-654.

24. Stenberg E, Persson B, Roos H, and Urbaniczky C. Quantitative determination of surface concentration of protein with surface plasmon resonance using radio labeled proteins. J Colloid Interface Sci, 1990; 143:513-526.

25. Tubulekas I, and Liljestrom P. Suppressors of cleavage-site mutations in the p62 envelope protein of Semliki Forest virus reveal dynamics in spike structure and function. J Virol, 1998; 72:2825-31.

26. van Regenmortel M, Altschuh D, Chatellier J, Christiansen L, Rauffer-Bruyere N, Richalet-Secordel P, Witz J, and Zeder-Lutz G. Measurement of antigen-antibody interactions with biosensors. J Molec Recognition, 1998; 11:163-167.

27. Xing L, Tjarnlund K, Lindqvist B, Kaplan GG, Feigelstock D, Cheng RH, and Casasnovas JM. Distinct cellular receptor interactions in poliovirus and rhinoviruses. EMBO J, 2000; 19:1207-16.


Chapter 19

PPiDB - A PROTEIN-PROTEIN INTERACTIONS DATABASE

Prasanna R. Kolatkar* and Lin Kui†

The development of the proteomics field in the post-genomics era has led to an accumulation of data concerning parameters affecting protein folding and assembly. Rationally implemented search-and-evaluation engines would provide a valuable tool for structural and functional predictions directly from the sequence of a protein. This would have an impact on many fields, such as protein folding, the regulation of protein function, and other biotechnical applications.

Keywords: Database, protein-protein interaction

PROTEIN-PROTEIN INTERACTIONS

Biological processes are a complex set of events involving many molecules that work together to carry out very specific reactions. DNA and RNA are some of the entities that serve as templates for creating the key molecules of life - the proteins. Proteins are a diverse breed of molecules with unique properties, which allow them to carry out a wide variety of reactions with high specificity. They are involved in all aspects of biological function, including regulation of DNA elements, circulating oxygen throughout our body, and even allowing us to taste and smell things. Although there are some events that a single protein can carry out individually, proteins often have to work with other proteins to carry out complex tasks. Thus there is an intricate series of protein-protein interactions that bring together many components to facilitate a myriad of reactions or pathways. Indeed, the whole field of proteomics is trying to understand this interesting maze that allows organisms to function and therefore exist.

*Genome Institute of Singapore, 1 Science Park Road, The Capricorn #05-01, Science Park II, Singapore 117528; Email address: [email protected]
†College of Life Sciences, Beijing Normal University, Beijing 100875, P. R. China; Email address: linkui@bnu.edu.cn

In-Vitro Systems

Many methods are being employed to understand protein-protein interactions. These methods include yeast two-hybrid systems (Fields, 1989), two-dimensional gel/mass spectrometry (Hillenkamp, 1990), and even protein-chip systems (Fung, 2001; Bruenner, 1996). All these in-vitro methods are quite powerful and can help to find interacting protein partners, but they all have certain drawbacks. The most significant drawback is that the proteome is a much larger entity than the genome, so the level of high-throughput analysis needed for proteomics is much greater than that employed in genomics. In addition, the array of experiments needed is much larger because of the diversity of reactions and of the corresponding data.

Existing Databases Using in Silico Methods

In silico methods can therefore provide an excellent alternative platform to guide the experimental methods by facilitating high-throughput analysis, which yields potential interacting partners that can subsequently be confirmed using in-vitro methods. In-silico systems designed to analyze protein-protein interactions include the Database of Interacting Proteins, DIP (Xenarios, 2001; Marcotte, 1999), the Biomolecular Interaction Network Database, BIND (Bader, 2001), and the Yeast Protein Database, YPD (Hodges, 1998). DIP uses an automatic algorithm that draws on sequence information, in addition to experimental data, to define putative protein-protein interactions. BIND is a repository of putative pathways built using information from an experimental database (Pruitt, 2001) as well as automatic information extraction from text (Donaldson, 2000). YPD is a large database of yeast proteome information that is curated manually by a team of people who validate the information by checking journals.


CREATING DATABASES

Biological data, and specifically protein-protein interaction data, are growing at incredible rates. Databases that can be built rapidly and automatically are extremely useful for staying abreast of all the information, especially the most recent. It is also important that the data are stored efficiently for subsequent retrieval and that a user-friendly design allows biologists to make full use of the information. Many tools are available for creating such databases, and this chapter describes a particularly elegant and powerful system.

Creation of Protein-Protein Interactions Database (PPiDB)

The Protein-Protein Interactions Database (PPiDB) was created using the KRIS (Davidson, 1997) database integration system. The database uses and integrates different types of information to form a comprehensive knowledge warehouse for understanding protein-protein interactions.

Computational Methods

Because of the importance and complexity of the pathways involving protein-protein interactions in living cells, the traditional approach to understanding protein function has been remarkably time-consuming. Researchers have used various "wet" biological methods and assays to understand function, as described earlier in this chapter. In addition, methods such as X-ray crystallography, NMR and electron microscopy have been employed to obtain a detailed understanding of protein-protein interactions. All these methods are laborious and involve significant sample preparation and subsequent experimentation.

Here, a different approach is presented that uses a computational platform merging two methods to infer protein functions and interactions from protein sequences, based on domain fusion information and knowledge of protein orthologs. The basic idea of the domain fusion method derives from the observation that some pairs of interacting proteins have homologs in another organism that are fused into a single protein chain. The core method described here is based on the Rosetta Stone method of Marcotte et al. (1999), which looks for proteins that are distinct in one organism but fused together in another. For instance, two independent proteins in the fly genome might be found as a single longer protein in the worm genome. If the proteins are fused in the worm, this suggests that the individual proteins function in the fly by interacting with each other. Using such methods, Eisenberg and his colleagues have already deduced the likely functions for over half of the 2,500 yeast proteins for which no function was known. The fused protein A-B is called the Rosetta Stone sequence and is described schematically in Fig. 1.

[Figure 1: schematic of the fused protein A-B shown alongside the separate proteins A and B.]

Fig. 1. A-B is the Rosetta Stone protein that suggests that proteins A and B are functionally related and have a better-than-random chance of interacting.
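The domain fusion idea can be illustrated with a minimal sketch. It assumes that each protein has already been reduced to a set of domain labels, whereas the method of Marcotte et al. works from sequence alignments; the function name, the protein names and the domain labels below are all hypothetical and are not taken from PPiDB.

```python
from itertools import combinations


def rosetta_stone_pairs(separate, fused_candidates):
    """Return (A, B, fused) triples where A and B are separate proteins in one
    organism and their domains co-occur in one protein of another organism."""
    hits = []
    for (a, doms_a), (b, doms_b) in combinations(sorted(separate.items()), 2):
        # Skip unannotated proteins and pairs that already share a domain.
        if not doms_a or not doms_b or doms_a & doms_b:
            continue
        for fused, doms_f in sorted(fused_candidates.items()):
            # The candidate is a "Rosetta Stone" if it contains both domain sets.
            if doms_a <= doms_f and doms_b <= doms_f:
                hits.append((a, b, fused))
    return hits


# Hypothetical toy data: protein name -> set of domain labels.
fly = {"flyA": {"kinase"}, "flyB": {"SH3"}, "flyC": {"zinc_finger"}}
worm = {"wormAB": {"kinase", "SH3"}, "wormC": {"zinc_finger"}}

print(rosetta_stone_pairs(fly, worm))
```

With this toy input only the pair flyA/flyB is reported, because wormAB is the only protein that carries the domains of both. Such a hit is a prediction of functional linkage, not evidence of physical contact.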

We also used the interologs idea of Walhout et al. (2000) to see whether we could find additional putative interacting pairs. In their study, Walhout et al. (2000) used large-scale two-hybrid analysis to functionally annotate large numbers of uncharacterized proteins predicted from complete genome sequences. Because the two-hybrid system is an artificial assay, the data should first be integrated with other information to evaluate the likely biological relevance of each potential interaction. They classified the potential interactions according to two-hybrid criteria and/or known biological information, and explored the possibility that interactions conserved in other organisms might provide useful biological information. Such conserved interactions are called interologs, defined as follows: the interaction of a pair X/Y is an interolog of the interaction of a pair X'/Y' in another species if X' and Y' are orthologs of X and Y, respectively. PPiDB was created by also computing some putative interactions using a similar algorithm. However, because of the limitation of our current computing capacity, we could not infer all possible interactions in the initial version 1.0.
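The interolog definition above translates directly into a small transfer rule. The following is a sketch only, assuming that an ortholog table is already available; the function name and the protein identifiers are hypothetical, and the actual PPiDB algorithm is described only as "similar".

```python
def infer_interologs(known_interactions, orthologs):
    """Transfer each known interaction X/Y to every pair X'/Y' in which
    X' is an ortholog of X and Y' is an ortholog of Y."""
    predicted = set()
    for x, y in known_interactions:
        for x_prime in orthologs.get(x, ()):
            for y_prime in orthologs.get(y, ()):
                # Store pairs in sorted order so that A/B and B/A are the same.
                predicted.add(tuple(sorted((x_prime, y_prime))))
    return predicted


# Hypothetical toy data: one known interaction and an ortholog table.
known = [("yeastX", "yeastY")]
ortholog_table = {"yeastX": ["wormX"], "yeastY": ["wormY1", "wormY2"]}

print(sorted(infer_interologs(known, ortholog_table)))
```

A protein with no entry in the ortholog table contributes no predictions, which is one reason why the number of interologs that can be inferred depends on how many genomes have been compared.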


Source Databases

The databases listed in Table 1 are used as data sources to create PPiDB. The KRIS data integration engine (see Section KRIS Engine and Fig. 2) is used to extract and integrate carefully selected data items from these data sources. The powerful data modeling functions in KRIS allow complicated relationships among the data objects to be modeled efficiently, even though the different databases employ heterogeneous data schemas (see Table 1).

Table 1. The list of main data sources which are downloaded and localized

Swissprot    An annotated protein sequence database
GenBank      NCBI gene data bank
PubMed       NCBI PubMed database
Pfam         The Protein Families database
DIP          The Database of Interacting Proteins

KRIS Engine

PPiDB has been implemented using a powerful data integration and analysis system, KRIS. KRIS facilitates efficient prototyping of complex biological data objects and their relationships, and the integration of information from distributed and heterogeneous data resources. There are three types of object containers in KRIS: set, bag and list. Objects in a container can be either homogeneous or heterogeneous; in the latter case the objects are tagged by different caps using a variant data type. Record objects are the main data models used to describe entities in the real world. Like general records in relational databases, these records consist of several fields. However, each field of a record can hold any type of data within KRIS: for instance, a bag, a list, a set, or another record. Hence, the data modeling supported by KRIS is more expressive than that of flat relational records. KRIS is not only a powerful data integration system with many built-in object operators and functions, but also a data analysis system containing many existing bioinformatics applications, including BLAST tool kits, HMM tools, and a graphics toolkit. It is highly suitable for addressing bioinformatics problems (Davidson et al., 1997; Lin et al., 1998).
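The nested data model described above can be mimicked in Python for illustration. This is not CPL or KRIS syntax; it is a sketch under the assumption that a set, a bag and a list correspond roughly to a Python set, a collections.Counter and a list. The record types, field names and values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A nested record: one domain occurrence on a protein."""
    name: str
    start: int
    end: int


@dataclass
class ProteinRecord:
    """A record whose fields are themselves containers or records."""
    accession: str
    keywords: set = field(default_factory=set)            # set: no duplicates, no order
    domains: list = field(default_factory=list)           # list: ordered, nested records
    phrase_hits: Counter = field(default_factory=Counter)  # bag: duplicates are counted


# Hypothetical example entry.
rec = ProteinRecord(
    accession="EXAMPLE1",
    keywords={"helicase", "DNA repair"},
    domains=[Domain("domain_1", 10, 120), Domain("domain_2", 150, 300)],
    phrase_hits=Counter(["bind", "bind", "inhibit"]),
)

print(rec.domains[0].name, rec.phrase_hits["bind"], len(rec.keywords))
```

A flat relational table would need separate tables and joins to hold the domain list and the phrase counts; here they sit inside the record, which is the point the text makes about nested fields.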

[Figure 2: layered diagram with the labels User Browsers; Apache WWW Server; CGIs and CPL scripts; KRIS Engine (CPL interpreter); PPiDB.]

Fig. 2. The schematic diagram of the PPiDB system

IMPLEMENTATION AND UPDATE

Fig. 2 shows the outline of the PPiDB system and the relationships between the data sources used for creating the database. Data sources including Pfam, Swissprot, and DIP have to be downloaded, transformed, and localized before PPiDB is created. The system creates CPL objects corresponding to the original data. The related IDs for the references are subsequently computed and the corresponding abstracts are automatically downloaded. This information is used by the validation subsystem to produce all the predictions of potential interaction pairs according to the computational methods described in the previous section, "Computational Methods". The database can subsequently be updated after downloading and localizing the four data sources mentioned previously. The system does not currently provide an incremental update strategy because of computational constraints.

VALIDATION OF THE PREDICTED RESULTS

The computational methods described previously usually produce an astoundingly large number of putative interaction pairs. In fact, Eisenberg's group has found some 50,000 Rosetta Stone sequences in organisms, each of which is composed of two fused sequences from the 6,200 proteins of the yeast genome. This would suggest that each yeast protein interacts with about 10 other proteins in the cell. The researchers therefore hypothesize that the proteins start off as one big protein and then split into two or more as the organisms evolve. In PPiDB, there are 753,508 predicted interacting pairs between 19,548 proteins before validation (see the statistics in Fig. 4). However, the computed result set is likely to contain many false positives. Hence, the predicted results need to be validated to improve their quality.

Combining different kinds of information before loading the results into PPiDB facilitates validation of the putative interactions. We first take interactions that have been examined by biological experiments and are kept in the Database of Interacting Proteins (DIP, Xenarios et al., 2001). Next, we use keywords in the literature to evaluate and validate putative interactions. This involves downloading all abstracts of the papers that are directly linked to the protein sequences or domains; the algorithm then checks the set of functional keywords shared by the putative pair of interacting proteins. Keywords are also parsed from the corresponding entries in the Swissprot database to augment the potential set of keywords. If the keyword sets of the two proteins share certain keywords, the pair is validated as a putative interaction in PPiDB. Finally, we apply a technique based on natural language processing (NLP) to the previously downloaded set of abstracts and use pre-defined protein-protein interaction key phrases to mine the real interactions from the sentences. Examples of such phrases are "interact", "interaction of", "bind", "inhibit" and "complex of". Each validation method is assigned a different confidence number, which in turn is used by the subsystem to group the validated interactions into different categories. The statistics of PPiDB are given in Fig. 4. More than 50% of the predicted interacting pairs are filtered out as false positives. The visualization subsystem uses different colors to represent the pairs of different confidence from the database (Fig. 3).
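The three validation steps can be sketched as a single scoring function. This is an illustration under stated assumptions, not the PPiDB implementation: the confidence values 3, 2 and 1 and their ranking are invented for the example, simple substring matching stands in for the NLP step, and all protein names and texts are hypothetical. Only the key phrases are taken from the text.

```python
import re

# Key phrases quoted in the text above.
KEY_PHRASES = ("interact", "interaction of", "bind", "inhibit", "complex of")


def mentions_interaction(a, b, abstracts):
    """True if one sentence names both proteins together with a key phrase."""
    for text in abstracts:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            s = sentence.lower()
            if a.lower() in s and b.lower() in s and any(p in s for p in KEY_PHRASES):
                return True
    return False


def confidence(a, b, dip_pairs, keywords, abstracts):
    """Assumed ranking: experimental record > sentence evidence > shared keyword."""
    if frozenset((a, b)) in dip_pairs:
        return 3
    if mentions_interaction(a, b, abstracts):
        return 2
    if keywords.get(a, set()) & keywords.get(b, set()):
        return 1
    return 0  # not validated; treated as a likely false positive


# Hypothetical toy data.
dip = {frozenset(("protA", "protB"))}
kw = {"protC": {"DNA repair"}, "protD": {"DNA repair", "helicase"}, "protE": {"transport"}}
texts = ["ProtE was purified. ProtE binds ProtF in vitro."]

for pair in [("protA", "protB"), ("protE", "protF"), ("protC", "protD"), ("protC", "protE")]:
    print(pair, confidence(*pair, dip, kw, texts))
```

Pairs that score 0 would be the ones filtered out, and the non-zero values could drive the color coding of the visualization subsystem.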

PUBLISHING THE DATABASE

After validation, there are 350,319 putative interactions between 15,301 proteins derived from 429 different species in PPiDB 1.0. The database also integrates related information extracted from the Pfam database, the Swissprot database, and others. PPiDB additionally provides a query engine and a BLAST search interface so that users can query PPiDB and find homologs of their proteins in it. Fig. 4 shows PPiDB's home page and Fig. 5 shows an example of TPR domain information and its potential interacting domains. Many internal and external cross-links can be used to retrieve other related information conveniently. The whole database is implemented as a multiple-tiered client-server system (Fig. 2). The objects are stored as CPL types in the database and are represented as HTTP objects after queries and searches.
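The figures quoted before and after validation can be checked against the statement that more than 50% of the predicted pairs are filtered out. The short calculation below uses only numbers given in the text; the variable names are illustrative.

```python
# Counts quoted in the text for PPiDB 1.0.
pairs_before, pairs_after = 753_508, 350_319
proteins_before, proteins_after = 19_548, 15_301

# Fraction of predicted pairs removed by validation.
filtered_fraction = (pairs_before - pairs_after) / pairs_before

# Average number of partners per protein; each pair involves two proteins.
partners_before = 2 * pairs_before / proteins_before
partners_after = 2 * pairs_after / proteins_after

print(f"pairs removed: {filtered_fraction:.1%}")
print(f"average partners per protein: {partners_before:.1f} -> {partners_after:.1f}")
```

About 53.5% of the predicted pairs are removed, which agrees with the statement in the validation section.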

UTILITY OF PROTEIN-PROTEIN INTERACTIONS DATABASE

A database of protein-protein interactions is extremely valuable because it provides the initial step in forming hypotheses about potential pathways. Many different types of databases are available for analyzing these protein-protein interactions. Some are constructed simply from genomic information, while others depend fully on experimental data. Increasingly, databases are being created that combine different types of information so that complex pathways can be better understood. The increased use of automation for finding new protein-protein interactions, along with the ability to integrate different types of data, is a critical advance for creating useful databases as the information grows exponentially. Although validation is still the bottleneck, many methods being developed will help streamline this task as well. The amalgamation of information will lead to increased efficiency in the laboratory as well as to novel discoveries that will greatly increase our understanding of the proteome.

[Figure 3: PPiDB query-result page with a species selector (Escherichia coli, E. coli retrons Ec67, Ec79, Ec86 and Ec107, E. coli plasmid pPY113), a network view, and an interaction list for P09097 (DNA gyrase subunit A) and P06982 (DNA gyrase subunit B). Listed partners include topoisomerase IV subunits A and B, DNA polymerase III alpha and epsilon chains, several ATP-dependent DNA and RNA helicases, excinuclease ABC subunit B and exodeoxyribonuclease I.]

Fig. 3. Some known proteins interacting with P09097 and P06982 of E. coli

[Figure 4: PPiDB home page, Protein-Protein Interaction Database (PPiDB v 1.0), last updated Dec 1 2000, with a navigation bar, an introductory paragraph on molecular interactions, a demo link for some known interacting proteins of E. coli, and a statistics panel. Statistics, given as two values per row that match the counts before and after validation quoted in the text: domains 709 / N/A; species 554 / 429; proteins 19,548 / 15,301. Footer: GIS, 1 Science Park Road, The Capricorn 05-01, Singapore 117528; comments and suggestions to Kui Lin.]

Fig. 4. The Home page of PPiDB 1.0

[Figure 5: PPiDB page "Information of TPR Domain" with DB links (Pfam, PubMed, Swissprot, PDB, Genbank, DIP) and the list of 12 interacting domains:
1. CheR methyltransferase
2. Cyclophilin type peptidyl-prolyl cis-trans isomerase
3. FKBP-type peptidyl-prolyl cis-trans isomerases
4. Glycosyl transferases
5. RanBP1 domain
6. Rhomboid family
7. SH3 domain
8. Ser/Thr protein phosphatase
9. Transcriptional regulatory protein, C terminal
10. Zinc finger, C3HC4 type (RING finger)
11. Zn-finger in Ran binding protein and others
12. jmjC domain]

Fig. 5. TPR domain and its putative interacting partners in PPiDB 1.0

REFERENCES

1. Bader GD, Donaldson I, Wolting C, Ouellette BF, Pawson T, Hogue CW. BIND-The biomolecular interaction network database. Nucleic Acids Res, 2001; 29(1):242-245.

2. Bruenner BA, Yip T-T, Hutchens TW. Quantitative analysis of oligonucleotides by matrix-assisted laser desorption/ionization mass spectrometry. Rapid Commun Mass Spectrom, 1996; 10:1797-1801.

3. Davidson S, Overton C, Tannen V, Wong L. BioKleisli: A Digital Library for Biomedical Researchers. International Journal of Digital Libraries, 1997; 1:36-53.

4. Donaldson I, Hogue C, Martin J, de Bruijn B, Wolting C, Baskin B. http://bioinfo.mshri.on.ca/prebind/ 2000.

5. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature, 1989; 340:245-246.

6. Fung ET, Thulasiraman V, Weinberger S, Dalmasso EA. Protein biochips for differential profiling. Current Opinion in Biotechnology, 2001; 12(1):65-69.

7. Hillenkamp F, Karas M. Mass spectrometry of peptides and proteins by matrix-assisted ultraviolet laser desorption/ionization. Methods Enzymol, 1990; 193:280-295.

8. Hodges PE, Payne WE, Garrels JI. The Yeast Protein Database (YPD): a curated proteome database for Saccharomyces cerevisiae. Nucleic Acids Res, 1998; 26:68-72.

9. Lin K, Ting A, Wang J, Wong L. Hunting TPR domains using Kleisli. In S. Miyano and T. Takagi (eds.), Genome Informatics. Universal Academic Press, Inc., Tokyo, 1998, pp. 173-182.

10. Marcotte EM, Pellegrini M, Ng H-L, Rice DW, Yeates TO, Eisenberg D. Detecting protein function and protein-protein interactions from genome sequences. Science, 1999; 285:751-753.

11. Pruitt KD, Maglott DR. RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res, 2001; 29(1):137-140.

12. Walhout AJ, Sordella R, Lu X, Hartley JL, Temple GF, Brasch MA, Thierry-Mieg N, Vidal M. Protein interaction mapping in C. elegans using proteins involved in vulval development. Science, 2000; 287:116-22.

13. Xenarios I, Fernandez E, Salwinski L, Duan XJ, Thompson MJ, Marcotte EM, Eisenberg D. DIP: The Database of Interacting Proteins: 2001 update. Nucleic Acids Res, 2001; 29(1):239-241.

14. Thermus thermophilus HB8. J Mol Biol, 1983; 168:449-450.


Chapter 20

VIRUS PARTICLE EXPLORER (VIPER): A REPOSITORY OF VIRUS CAPSID STRUCTURES*

Vijay S. Reddy†, Padmaja Natarajan, Gabriel Lander, Chunxu Qu‡, Charles L. Brooks, III and John E. Johnson

Virus structures represent mega-molecular nucleoprotein complexes. The three-dimensional structures of 74 unique virus capsids, from 21 families and 30 different genera, have been determined at near atomic resolution. We have devised a website and a database of high-resolution virus structures, namely VIrus Particle ExploreR (VIPER: http://mmtsb.scripps.edu/viper/), as a repository of virus structures, where all the structures are stored in a single (standard) icosahedral convention. Each capsid is shown pictorially along with a list of the physical properties. Furthermore, the derived results of structural and computational analyses on each capsid are provided, highlighting the inter-subunit residue-residue contacts, binding energies, quasi-equivalence and assembly pathways. The structural and analysis tools developed to analyze virus structures can be accessed through the VIPER web site. Efforts are ongoing to include cryo-EM reconstructions as well as the models fitted into these densities.

Keywords: Virus structure, capsid architecture, database, computational analysis, protein-protein interactions.

*From the Department of Molecular Biology, #MB31, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037.
†Corresponding author. E-mail address: reddyv@scripps.edu
‡Current address: Ludwig Institute for Cancer Research, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093.


INTRODUCTION

The number of viral capsid structures determined at near atomic resolution is growing in step with advances in synchrotron radiation sources, detectors, and computer hardware and software. We realized the need to organize these structures in one place, where all the capsid structures (coordinates) are oriented uniformly in a single icosahedral convention, as icosahedral symmetry is central to all the viral capsids characterized at high resolution. This is unlike the Protein Data Bank (PDB) (Berman et al., 2000), where capsid structures can be deposited in any orientation as long as they are consistent with the crystal symmetry and packing. We have taken all the capsid structures available in the PDB, and in some cases structures obtained through personal communication, and transformed them into the standard convention. These coordinates are available through a website and database of virus structures, VIrus Particle ExploreR (VIPER), which can be accessed through the World Wide Web (WWW) at http://mmtsb.scripps.edu/viper/ (Reddy et al., 2001). Having all the virus structures in a single orientation/convention facilitates collective analyses of virus structures, primarily because a single set of icosahedral matrices is used. In addition to being a repository of virus structures, the VIPER site is also being developed with educational and research needs in mind. A number of computational and structural tools that have been developed to analyze virus structures in terms of protein-protein interactions, quasi-equivalence, virus assembly and surface properties are made available through the VIPER site.

ORGANIZATION OF THE VIPER SITE

The organization of the VIPER site is shown as a flowchart in Fig. 1. Briefly, a weekly E-mail alert is set up to inform the VIPER developer team when a new virus structure is deposited in the PDB. The coordinates of the new structure are oriented into the VIPER convention (Fig. 2) by displaying the PDB structure in the graphics program O (Jones et al., 1991) and determining the appropriate transformation required. This determination is also aided by the information provided in the "REMARK" records in the PDB file as well as the original publication. In the VIPER convention, the icosahedral particle is located at the origin, (0,0,0), with a pair of icosahedral 3-fold and 5-fold axes positioned between the orthogonal Z and X axes in the XZ plane (Fig. 2). A particular icosahedral asymmetric unit is chosen as the standard for the capsids with different T numbers, and that convention is strictly followed. These coordinates form the basis for the rest of the VIPER analysis.

Each virus structure entry is represented by a web page showing i) a surface-rendered image of the entire capsid, color-coded according to the Z-coordinates of the atoms in the reference asymmetric unit, which are proportional to the particle radius; ii) ribbon and wire diagrams of the subunit tertiary and quaternary (capsomeric) structures, respectively; and iii) a schematic representation of the corresponding quasiequivalent/icosahedral lattice. Furthermore, each entry carries information on physical characteristics (e.g., diameter, T number), unit cell and crystallization information, and links to the amino acid sequence (Swiss-Prot), the primary citation (MEDLINE) and the derived results of structural analysis. The surface representations were generated on a single scale using the program GRASP (Nicholls, Sharp, and Honig, 1991), such that the relative differences in their dimensions are preserved. The ribbon and wire diagrams were generated using the programs MOLSCRIPT (Kraulis, 1991) and RASTER3D (Merritt and Bacon, 1997). A typical VIPER page for a viral capsid is shown in Fig. 3.

[Figure 1: flow chart of the VIrus Particle ExploreR (VIPER) site. Recoverable box contents: pictorial representations (overall capsid surface, subunit fold, quasiequivalent lattice, capsomeric organization); physical information (diameter, T number, family and genus name); links (MEDLINE link to primary citation, SWISSPROT link to amino acid sequence, Protein Explorer link for visualization, links to related entries); crystal information (space group, crystallization conditions, crystal contacts); derived results (inter-subunit contact tables, inter-subunit association energies, residues at inter-subunit interfaces, quasiequivalence (Q) scores); Tools and Utilities (Oligomer Generator, Icosahedral Server, Map a residue, Visual VIPER, VIPER analysis); Help and FAQs (Help & FAQs, search based on PDB ID, about calculations, how to download the Chime plugin).]

Figure 1. Flow chart showing the organization of the contents of the VIPER site. For each entry in the database, a web page showing the molecular structure and physical properties is created. In addition, links to relevant information on the WWW and to the derived results of various analyses are provided.

[Figure 2: icosahedron with the Z(2) and X(2) axes and the 3-fold and 5-fold axes marked.]

Figure 2. Schematic diagram showing the standard icosahedral orientation, Z(2)-3-5-X(2), used in the VIPER site. X, Y, Z are the orthogonal coordinate axes and the numbers 2, 3 and 5 correspond to icosahedral 2-fold, 3-fold and 5-fold axes, respectively. In this convention, the particle (icosahedron) is located at the origin, (0,0,0), and oriented such that a pair of icosahedral 2-fold axes are coincident with the Z and X axes, with a pair of 3-fold and 5-fold axes lying between the Z and X axes in the XZ plane. All the viral capsids in the VIPER site are placed in this orientation and positioned at the origin.
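The geometry of the standard orientation can be checked numerically. The sketch below is not VIPER code. It assumes that the label Z(2)-3-5-X(2) means that, sweeping from the Z axis toward the X axis in the XZ plane, the 3-fold axis is met before the 5-fold axis, and it uses the textbook vertex coordinates of a regular icosahedron; all function and variable names are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def icosahedron_vertices():
    """The 12 vertices (0, +-1, +-PHI), (+-1, +-PHI, 0), (+-PHI, 0, +-1)."""
    verts = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            verts.append((0.0, s1 * 1.0, s2 * PHI))
            verts.append((s1 * 1.0, s2 * PHI, 0.0))
            verts.append((s1 * PHI, 0.0, s2 * 1.0))
    return verts


def rotate(v, axis, angle_deg):
    """Rotate vector v about an axis through the origin (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = tuple(a / norm for a in axis)
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    cross = (k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0])
    dot = sum(ki * vi for ki, vi in zip(k, v))
    return tuple(v[i] * c + cross[i] * s + k[i] * dot * (1 - c) for i in range(3))


def is_symmetry_axis(axis, order, tol=1e-9):
    """True if a rotation by 360/order degrees maps the vertex set onto itself."""
    verts = icosahedron_vertices()
    for v in verts:
        r = rotate(v, axis, 360.0 / order)
        if not any(all(abs(r[i] - w[i]) < tol for i in range(3)) for w in verts):
            return False
    return True


def angle_from_z(axis):
    """Angle in degrees between the Z axis and an axis lying in the XZ plane."""
    return math.degrees(math.atan2(axis[0], axis[2]))


# Assumed axis directions in the XZ plane: (direction, rotation order).
AXES = {
    "Z(2)": ((0.0, 0.0, 1.0), 2),
    "3": ((1 / PHI, 0.0, PHI), 3),
    "5": ((PHI, 0.0, 1.0), 5),
    "X(2)": ((1.0, 0.0, 0.0), 2),
}

for name, (axis, order) in AXES.items():
    print(name, is_symmetry_axis(axis, order), round(angle_from_z(axis), 2))
```

Under these assumptions the 3-fold axis lies about 20.9 degrees from Z and the 5-fold axis about 58.3 degrees from Z, so the order met on the way from Z to X is 2, 3, 5, 2. Because every capsid shares this frame, one set of icosahedral rotation matrices can be applied to all entries.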

[Figure 3: screenshot of the VIPER web page for Black Beetle Virus. Panels: Rendered Surface, T=3 lattice, Subunit fold, Capsomeres. Menu bar: Resume (PDB ID, Resolution, SeqAcc# P04329, Family Nodaviridae, T number 3, # of Subunits 180, Diameter Ave 324 Å, Max 346 Å, Crystal Information, Primary Citation); Download Files (Header, PDB to VIPER matrix, VIPER Coordinates); Computations (Quasi Equivalence, Contact Tables, Inter Subunit Association Energies, Residues at the Interfaces); Interactive PE Graphics.]

Figure 3. A representative web page of an entry in the VIPER website. The web page for black beetle virus (PDB-ID: 2BBV) shows the molecular surface of the capsid, the tertiary fold of the capsid protein subunit, a geometrical representation of the quasi-equivalent lattice and the representative oligomeric organization of the subunits in the capsid. The molecular surface of the capsid is color-coded as a function of the Z-coordinate of the subunits in the asymmetric unit, which is proportional to the particle radius. Listed in the menu bar on the left is the related information on a particular entry along with the corresponding links to the PDB, Swiss-Prot and MEDLINE databases. Hyperlinks are also provided to the derived results of various analyses.
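The T number and subunit count listed for this entry are linked by the standard quasi-equivalence relation, which is general knowledge about icosahedral capsids rather than something specific to VIPER: T = h^2 + hk + k^2 for non-negative integers h and k, and a capsid with triangulation number T contains 60T subunits. A minimal check, with illustrative function names:

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice indices h and k."""
    return h * h + h * k + k * k


def subunit_count(t):
    """Number of subunits in an icosahedral capsid with triangulation number t."""
    return 60 * t


# T = 3 corresponds to h = k = 1, as for black beetle virus.
t = triangulation_number(1, 1)
print(t, subunit_count(t))
```

For h = k = 1 this gives T = 3 and 180 subunits, matching the values shown for black beetle virus.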

Table 1. Inter-subunit contacts for the entry 2BBV (a)

A1-B1                  B1-C1                  C1-A1
W:191-P:325 (H-H)      W:191-P:325 (H-H)      W:191-P:325 (H-H)
C:193-P:325 (H-H)      C:193-P:325 (H-H)      C:193-P:325 (H-H)
P:194-S:164 (H-P)      P:194-S:164 (H-P)      P:194-S:164 (H-P)
P:194-R:322 (H-B)      P:194-R:322 (H-B)      P:194-R:322 (H-B)
P:194-N:324 (H-P)      P:194-N:324 (H-P)      P:194-N:324 (H-P)
K:196-S:164 (B-P)      K:196-S:164 (B-P)      K:196-S:164 (B-P)
K:196-S:165 (B-P)      K:196-S:165 (B-P)      K:196-S:165 (B-P)
K:196-D:254 (B-A)      K:196-D:254 (B-A)      K:196-D:254 (B-A)
K:196-R:322 (B-B)      K:196-R:322 (B-B)      K:196-R:322 (B-B)
S:198-D:254 (P-A)      S:198-D:254 (P-A)      S:198-D:254 (P-A)
S:198-L:256 (P-H)      S:198-L:256 (P-H)      S:198-L:256 (P-H)
N:199-H:215 (P-B)      N:199-H:215 (P-B)      N:199-H:215 (P-B)
N:199-L:256 (P-H)      N:199-L:256 (P-H)      N:199-L:256 (P-H)
V:200-E:257 (H-A)      V:200-E:257 (H-A)      V:200-E:257 (H-A)
-                      -                      Q:201-V:214 (P-H)
Q:201-H:215 (P-B)      Q:201-H:215 (P-B)      Q:201-H:215 (P-B)
Q:201-G:258 (P-H)      -                      -
Q:201-I:259 (P-H)      Q:201-I:259 (P-H)      Q:201-I:259 (P-H)
Q:201-P:264 (P-H)      Q:201-P:264 (P-H)      Q:201-P:264 (P-H)
P:203-A:265 (H-H)      P:203-A:265 (H-H)      P:203-A:265 (H-H)
P:203-N:266 (H-P)      P:203-N:266 (H-P)      -

(a) Shown are representative residue-residue contacts that stabilize the different quasi 3-fold interfaces (A1-B1, B1-C1 and C1-A1) in the capsid of black beetle virus (PDB-ID: 2BBV). A1, B1 and C1 correspond to the subunits occupying the structurally unique environments in the T=3 icosahedral lattice (Figure 4). A1-B1, B1-C1 and C1-A1 refer to the inter-subunit interfaces between the A1 and B1, B1 and C1, and C1 and A1 subunits, respectively. The inter-residue contacts are also distinguished by the types of residues involved: hydrophobic (H), polar (P), acidic (A) and basic (B). That the same residue pairs align across the different interfaces suggests that the quasi 3-fold symmetry in 2BBV is quite well maintained. A dash marks a contact that is not listed for that interface. The complete lists of contacts are available at the VIPER website.

ANALYSIS AND WEB TOOLS

A number of structural and computational tools were developed for rapid, high-throughput analysis of virus structures. These tools are developed primarily as part of the parent Research Resource, Multi-scale Modeling Tools for Structural Biology (MMTSB) (http://mmtsb.scripps.edu). They are written in the CHARMM (Brooks et al., 1983) scripting language and augmented with Perl (Practical Extraction and Report Language) scripts. The currently available analyses are described below.

1) Contact tables: Contacting residue pairs at the subunit-subunit interfaces are identified using a residue pair-wise distance criterion (Godzik, Kolinski, and Skolnick, 1992). The pairs are arranged so that the interfaces related by the same quasi-symmetry are listed side by side as separate columns, with the individual residue pairs aligned. The contacting residue pairs are further characterized by the type of amino acid residues involved (e.g., polar, non-polar, charged). Table 1 shows an example of such a contact table for the quasi 3-fold interfaces of black beetle virus (PDB-ID: 2BBV).

2) Estimation of quasi-equivalence: Each of the interfaces in a quasi-equivalent capsid is represented as an N x N matrix, where N is the total number of ordered residues. The elements corresponding to contacting residue pairs are set to 1 and those corresponding to non-contacting residue pairs to 0. The quasi-equivalence between a pair of interfaces is estimated as the normalized dot product of the two matrices (Damodaran et al., 2002).

3) Amino acid propensities: The amino acid residue pair propensities at the subunit interfaces are calculated by mapping the residue type information into a 20x20 matrix for each interface (Lander et al., unpublished results).

4) Free energies of association: Free energies of subunit-subunit association are calculated from the loss of atomic solvent accessible surface area multiplied by the atomic solvation parameters (Eisenberg and McLachlan, 1986). The total association energy and the residue-wise contributions are calculated from the same information.
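
As a rough illustration of analyses 1) to 3), the sketch below builds 0/1 contact matrices from a few of the Q:201 and P:203 contacts listed in Table 1, compares the interfaces by a normalized dot product, and tallies the contacts by residue type in a 20x20 matrix. It is not the MMTSB/CHARMM implementation. The function names, the matrix size N = 330 and the cosine-style normalization are assumptions made for this example, since the text says only "normalized dot product".

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def parse_contact(text):
    """Turn a Table 1 style entry such as 'Q:201-H:215' into ('Q', 201, 'H', 215)."""
    left, right = text.split("-")
    type_a, num_a = left.split(":")
    type_b, num_b = right.split(":")
    return type_a, int(num_a), type_b, int(num_b)


def contact_matrix(contacts, n):
    """N x N matrix holding 1 for contacting residue pairs and 0 elsewhere."""
    matrix = [[0] * n for _ in range(n)]
    for _, i, _, j in contacts:
        matrix[i - 1][j - 1] = 1  # residue numbers are 1-based
    return matrix


def normalized_dot(a, b):
    """Element-wise dot product of two matrices divided by the product of their norms.

    Assumed normalization: identical contact patterns give 1.0.
    """
    dot = sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    norm_a = math.sqrt(sum(x * x for row in a for x in row))
    norm_b = math.sqrt(sum(y * y for row in b for y in row))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_type_counts(contacts):
    """20 x 20 matrix counting contacts by the residue types of the two partners."""
    counts = [[0] * 20 for _ in range(20)]
    for type_a, _, type_b, _ in contacts:
        counts[AMINO_ACIDS.index(type_a)][AMINO_ACIDS.index(type_b)] += 1
    return counts


# Q:201 and P:203 contacts copied from Table 1 (a subset, not the full interfaces).
A1_B1 = [parse_contact(c) for c in (
    "Q:201-H:215", "Q:201-G:258", "Q:201-I:259",
    "Q:201-P:264", "P:203-A:265", "P:203-N:266")]
B1_C1 = [parse_contact(c) for c in (
    "Q:201-H:215", "Q:201-I:259", "Q:201-P:264",
    "P:203-A:265", "P:203-N:266")]
C1_A1 = [parse_contact(c) for c in (
    "Q:201-V:214", "Q:201-H:215", "Q:201-I:259",
    "Q:201-P:264", "P:203-A:265")]

N = 330  # hypothetical matrix size, large enough for the residue numbers used here

m_ab, m_bc, m_ca = (contact_matrix(c, N) for c in (A1_B1, B1_C1, C1_A1))

print("A1-B1 vs B1-C1:", round(normalized_dot(m_ab, m_bc), 3))
print("A1-B1 vs C1-A1:", round(normalized_dot(m_ab, m_ca), 3))
print("B1-C1 vs C1-A1:", round(normalized_dot(m_bc, m_ca), 3))
```

For this small subset the three comparisons come out at about 0.91, 0.73 and 0.80. Values near 1 indicate that two interfaces use nearly the same residue pairs. With the full contact lists, the same functions would apply unchanged.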

2BBV: Subunit Association Energies

Subunit      Quasi/Icosahedral   Association energy   Buried surface
interface    symmetry            (kcal/mol)           area (Å²)
A1-B1        Quasi 3-fold        -84.0                4262.0
B1-C1        Quasi 3-fold        -70.0                3621.0
C1-A1        Quasi 3-fold        -72.0                3662.0
B5-A1        Quasi 2-fold        -36.0                1877.0
C1-C6        Icos. 2-fold        -21.0                1402.0
A1-A5        Icos. 5-fold        -36.0                1982.0
B5-C1        Quasi 5-fold        -37.0                1992.0
C1-B6        Quasi 6-fold        -26.0                1624.0
A1-C6        Quasi 6-fold        -18.0                1207.0
C1-R1        --                  -6.0                 562.0
C1-R6        --                  4.0                  119.0
R1-R6        --                  -4.0                 816.0

Figure 4. Inter-subunit association energies of the unique subunit interfaces present in the capsid of black beetle virus (PDB-ID: 2BBV). On the left is the table listing the different interfaces, the quasi-symmetry associated with each interface, the association energies and the buried surface areas. Shown on the right is the geometric representation of the T=3 quasi-equivalent lattice. Each trapezoid represents a subunit of the viral capsid. The letters A, B and C identify the trapezoids representing the 3 distinct subunit environments in the T=3 capsid. The numbers (1, 2, 3, ...) identify different icosahedral asymmetric units. The A1, B1 and C1 subunits form the reference asymmetric unit.
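
The association energies listed in Figure 4 come from the calculation described in item 4) above: the change in solvent accessible surface area of each atom on association, weighted by an atomic solvation parameter. The sketch below is a minimal illustration of that arithmetic, not the VIPER code. The parameter values are those commonly quoted from Eisenberg and McLachlan (1986) and should be checked against the paper before any real use; the atom records and their areas are invented for the example, and the function names are hypothetical.

```python
from collections import defaultdict

# Atomic solvation parameters in cal/(mol * A^2), as commonly quoted from
# Eisenberg and McLachlan (1986); an assumption here, verify before real use.
SOLVATION_PARAMETERS = {"C": 16.0, "N/O": -6.0, "O-": -24.0, "N+": -50.0, "S": 21.0}


def association_energy(atoms):
    """Return (total, per_residue) association energies in kcal/mol.

    atoms: iterable of (residue, atom_class, asa_free, asa_complex), areas in A^2.
    Each atom contributes sigma * (asa_complex - asa_free).
    """
    per_residue = defaultdict(float)
    for residue, atom_class, asa_free, asa_complex in atoms:
        delta_asa = asa_complex - asa_free  # negative when the atom becomes buried
        per_residue[residue] += SOLVATION_PARAMETERS[atom_class] * delta_asa / 1000.0
    return sum(per_residue.values()), dict(per_residue)


def buried_area(atoms):
    """Total solvent accessible area lost on association, in A^2."""
    return sum(asa_free - asa_complex for _, _, asa_free, asa_complex in atoms)


# Hypothetical interface atoms; the areas are made up for illustration only.
example_atoms = [
    ("K:196", "C", 30.0, 5.0),
    ("K:196", "N+", 20.0, 10.0),
    ("D:254", "O-", 15.0, 5.0),
    ("L:256", "C", 40.0, 0.0),
]

total, by_residue = association_energy(example_atoms)
print("total (kcal/mol):", round(total, 3))
print("per residue:", {k: round(v, 3) for k, v in by_residue.items()})
print("buried area (A^2):", buried_area(example_atoms))
```

With these parameters, burying carbon lowers the energy while burying charged nitrogen or oxygen raises it, so the per-residue terms show which residues favor or oppose the interface.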

In addition, the inter-particle contacts that occur in the crystal lattice, calculated by the method of Natarajan and Johnson (1998), are available for the relevant entries. The derived results of all these analyses are made available to users through the corresponding web pages of the VIPER website.

There are also a number of structural tools made available at the VIPER site.

1) Icosahedral server: This tool generates the quasi-equivalent lattice for a given T-number by folding the planar hexagonal lattice into a closed shell (Dr. Chunxu Qu, unpublished results), based on the principles of Caspar and Klug (1962). The T-number usually refers to the number of subunits in an icosahedral asymmetric unit. It is defined as T = h² + hk + k², where h and k are non-negative integers that are not both zero.

2) Oligomer Generator: This utility generates a user-specified oligomer of the capsid protein subunits in a given capsid. It also has options to generate complete or partial capsids.

3) Visual Viper: This tool fetches various pictorial descriptions of different capsids available at the VIPER site and displays them in the web browser.

4) Links to Protein Explorer: Server-side links to the visualization program Protein Explorer (Martz, 2002) are provided for each entry, so that the user can interactively display the subunits in the icosahedral asymmetric unit in real time within the web browser. However, the Chime plug-in needed to use this utility is currently available for PCs and Macintosh computers only.
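
The triangulation number given for the Icosahedral server above, T = h² + hk + k², is easy to tabulate. The short sketch below lists the allowed T numbers up to a limit and the corresponding subunit counts, assuming the 60 asymmetric units of icosahedral symmetry (60T subunits, which gives the 180 subunits of the T=3 black beetle virus capsid). The function names are illustrative only and are not part of any VIPER tool.

```python
def t_number(h, k):
    """Triangulation number T = h^2 + h*k + k^2 for non-negative h, k (not both zero)."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative integers, not both zero")
    return h * h + h * k + k * k


def allowed_t_numbers(limit):
    """Sorted list of all T numbers that do not exceed limit."""
    values = set()
    for h in range(limit + 1):
        for k in range(h + 1):  # T is symmetric in h and k
            if (h, k) != (0, 0) and t_number(h, k) <= limit:
                values.add(t_number(h, k))
    return sorted(values)


def subunit_count(t):
    """Number of subunits in a capsid with triangulation number t (60 per unit of T)."""
    return 60 * t


for t in allowed_t_numbers(13):
    print(f"T={t}: {subunit_count(t)} subunits")
```

Not every integer is a valid T number: values such as 2, 5 and 6 never appear, because no pair (h, k) produces them.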

DOWNLOADING THE DATA AND RESULTS

All the data and results, including the transformed coordinates in VIPER orientation, matrices, various images of the capsids and the derived results from various analyses, can be readily downloaded for research and education. Users are strongly encouraged to cite the VIPER site and the primary reference (Reddy et al., 2001) when they present data and images downloaded from the VIPER site.

FUTURE DIRECTIONS

New tools and derived results will be made available through the VIPER site as and when they are generated. We have begun to include and analyze the models fitted into densities derived from cryo-electron microscopy and image analyses that are available in the PDB. We have also started to create a relational database of virus structures and the derived results of various analyses, which will be made accessible through the VIPER site. Furthermore, efforts are underway to obtain and include cryo-electron microscopy reconstructions (densities) at the VIPER site.

ACKNOWLEDGMENTS

The VIPER site is being developed as the training, service and dissemination component of the Research Resource Multi-scale Modeling Tools for Structural Biology (MMTSB), which is fully supported by the National Center for Research Resources of the National Institutes of Health (RR12255).

REFERENCES

1. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, and Bourne PE. The Protein Data Bank. Nucl Acids Res, 2000; 28: 235-42.

2. Brooks B, Bruccoleri D, Olafson D, States D, Swaminathan S, and Karplus M. CHARMM: A program for macromolecular energy, minimization and dynamics calculations. J Comput Chem, 1983; 4: 187-217.

3. Caspar DL, and Klug A. Physical principles in the construction of regular viruses. Cold Spring Harbor Symp Quant Biol, 1962; 27: 1.

4. Damodaran KV, Reddy VS, Johnson JE, and Brooks III CL. A general method to quantify quasi-equivalence in icosahedral viruses. J Mol Biol, 2002; 324: 723-37.

5. Eisenberg D, and McLachlan AD. Solvation energy in protein folding and binding. Nature, 1986; 319: 199-203.

6. Godzik A, Kolinski A, and Skolnick J. Topology Fingerprint Approach to the Inverse Protein Folding Problem. J Mol Biol, 1992; 227: 227-38.

7. Jones TA, Zou J-Y, Cowan SW, and Kjeldgaard M. Improved methods for the building of protein models in electron density maps and the location of errors in these models. Acta Cryst, 1991; A47: 110-119.

8. Kraulis P. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr, 1991; 24: 946-950.

9. Martz E. Protein Explorer: Easy yet powerful macromolecular visualization. Trends Biochem Sci, 2002; 27: 107-9.

10. Merritt EA, and Bacon JD. Raster3D: photorealistic molecular graphics. Methods Enzymol, 1997; 277: 505-524.

11. Natarajan P, and Johnson JE. Molecular packing in virus crystals: geometry, chemistry, and biology. J Struct Biol, 1998; 121: 295-305.

12. Nicholls A, Sharp KA, and Honig B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins, 1991; 11: 281-96.

13. Reddy VS, Natarajan P, Okerberg B, Li K, Damodaran KV, Morton RT, Brooks CL III, and Johnson JE. Virus Particle Explorer (VIPER), a Website for Virus Capsid Structures and Their Computational Analyses. J Virol, 2001; 75: 11943-11947.

INDEX

β-propeller, in clathrin, 165
2-oxo acid dehydrogenase complex, 175, 182
2-oxoglutarate dehydrogenase, 176
5-fold symmetry, 49
5-fold symmetry, in virus crystallography, 12, 13, 42

Acetyl-CoA carboxylase, 187
Acetyl-CoA, 175
Actin, 345
Actinorhodin polyketide synthase, 191
Acyl carrier protein, ACP, 189f
Adenovirus, 22
Adhesion molecule-1, ICAM-1, as picornavirus receptor, 383, 387
Ageno, Mario, formula, 30
Allosteric enzymes, 222
Allosteric, 222f
Alpha virus, SPR analyses, 379
Alphavirus, 78, 83f
Alphavirus, genome organization, 87
Alphavirus, structural proteins, 87
Amorphous precipitates, 369
Amphiphile mixtures, 123
Antibiotics, interaction with rRNA, 245
Antibody affinity, SPR analyses, 379
Antibody epitopes, accessibility, 379
AP180, 161
APS kinase, APS 3'-phosphotransferase, 223
APS, see ATP sulfate adenylyltransferase, 223
Aquaporin fold, 152
Aquaporin-1, AQP1, structure, 152f, 154F
Aquifex aeolius, lumazine synthase, 204, 210f
Arms, attaching liposomes, 63
Arms, extended, in virus assembly, 63
Arms, in viral coat proteins, 56
Arms, pore-forming, in virus, 63
Arms, see also Swinging arms
Arrestin, 163
Asymmetric unit, in virus structure, 406
ATP sulfurylase, 222
ATP sulfurylase, ATP sulfate adenylyltransferase, 223
ATP sulfurylases, 222f
ATPase, in myosin, 346
Auxilin, 161, 165f

Bacterial elongation factor EF-Tu, 53
Bacteriophage T4, tail structure, 50
Bacteriophage ΦX174, 22
Bacteriorhodopsin, 123, 152
Bernal, J.D., 5
Bicelle crystallization method, 123
Bilayer membranes, 80
Bilayer, three-dimensional, 123
Binding interactions in actin filaments, 352, 353
Biosensors, 379f
Biotin dependent enzymes, 187
Biotin in carboxylases, 186
Biotinyl groups, 174, 186
BMCD, Biological Macromolecular Crystallization Data base, 362
Bragg rods, 138
BSV, see Tomato Bushy Stunt Virus

Canine parvovirus, see CPV
Cap, rotary, in cap-filament complex, 341
Capsid architecture, 403
Capsid protein, SFV, structure, 88
Caspar, Don, 9, 15
Caspar, formula, 31
Caspar-Klug, theory, 19, 35, 56
Cation channels, 91
Cauliflower mosaic virus, 47
Chaperone Hsc70, and Auxilin, 166
Cholesterol, in SFV fusion, 94, 97
Clathrin adaptor AP2, 161f
Clathrin, 161f
CoA, 175f
Coat proteins, in virus, 53
Coated vesicles, 161
Co-crystals, of ribosomal particles, 248
Companions, of clathrin, 165
Complex formation, entropy of, 54
Computational analysis of virus structure, 403f
Conformational mobility, in ribosome, 258
Conformational polymorphism, in virus assembly, 45
Conformational switch, in flagellin, 339
Cooperativity, in giant enzymes, 222
CPV, canine parvovirus, 65
Crane, Richard, 5
Crick, Francis, 15
Crick-Watson theory of virus structure, 4
Cross-bridge cycle, 355
CryoEM of rigor complex, 349
CryoEM, bacterial flagellum structure, 336f
CryoEM, of ribosome, 291f
CryoEM, see electron cryomicroscopy
Cryoprotectants, 124
Cryoprotectants, list of, 125
Cryoprotection, 136
Crystal structure, of ATP sulfurylases, 222f
Crystal structure, of sulfurylases, 222
Crystal structure, ribosome intermediates, 259, 262
Crystalline lattice, order of, 113
Crystallization methods, bicelle, 123
Crystallization, 361
Crystallization, of membrane proteins, 122
Crystallization, protocols for, 363, 369
Crystallization, screening for conditions, 370
Crystallographic phasing, 135
Crystallomic, 361
Crystalomics, 361
Crystals, type 1, 122
C-terminal arms, polyoma virus assembly, 69
Cubic phase crystals, 123
Curvature in capsomer contacts, 70
Curvature, virus particles, 70
Cytochrome c oxidase, bovine heart muscle, 116
Cytochrome c oxidase, mitochondrial, 122

DC-SIGN, 94
Dengue virus, 99
Desmodium yellow mottle virus, regulating arms, 68
Detergents, in crystallization, 365
Detergents, in membrane protein purification, 116
dihydrolipoyl acetyltransferase, 175
dihydrolipoyl dehydrogenase, 176, 184
Disassembly, control of, 62
Disorder, in virus proteins, 53, 56
Domain rotation, in virus proteins, 53
Dynamics, of ribosome, 245f, 291f, 296

E1 in SFV structure, 92
Ebola virus, 81
Ectodomain, of SFV E1, 84, 92
Edeine antibiotic, in ribosome, 261
Eigenmodes, in normal mode analysis, 301
Electron cryomicroscopy, clathrin, structure, 161
Electron cryomicroscopy, CryoEM, membrane proteins, 148
Electron crystallography, 148
Electron density distribution, x-ray crystallography, 114
Electron lasers, for structure determination, 133
Electron microscopy, virus morphology, 21, 22
Elongation cycle, 293f, 293, 295
Elongation factor G, 291f, 316f
Elongation factor G, crystal forms, 319
Elongation factor Tu, in ribosome, 291f, 314
Encephalitis, viral, mosquito borne, 84
Endocytic motif of receptors, 163
Endocytosis, 79
Endosequence, fusion peptide, 91
Entropy, of complex formation, 54
Evolution of sulfurylases, 222

Filament, flagellar, 336f
Flagellin, 339
Flagellum, bacterial, 333f
Flat and bent contacts, in virus assembly, 65, 67
Flaviviruses, 83
Flavodoxin, 203
fMet-tRNA, complex in ribosome, 292
Folds, in virus proteins, 60
Franklin, Rosalind, 6, 15
Fuller, Buckminster, architectural design, 2, 25
Fuller, Buckminster, formula, 30
Furin cleavage, 87, 89, 100
Fusion mechanisms, 78
Fusion mechanisms, class 1, influenza virus model, 81
Fusion mechanisms, class 2, alphavirus model, 83, 84
Fusion peptide, 80f, 97
Fusion peptide, in SFV, 89
Fusion protein, in alphavirus, 79, 90
Fusion protein, in HIV, 79, 82
Fusion protein, in influenza virus, 79
Fusion protein, postfusion structure, 99
Fusion, intermediates, 80

Galanthus nivalis lectin, GNA, 95, 381
Geodesic domes, as virus model, 25
Glycine cleavage system, glycine decarboxylase, 184, 185
Glycoconjugates of SFV, 92
GNA, see Galanthus nivalis lectin
G-protein-coupled receptors and Clathrin, 163
GTP hydrolysis, 245, 301, 307f

Hanging drop method, crystallization protocol, 363
Heavy riboflavin synthase, 199, 200F, 202
Helical bundle, in water channel, 152, 154F
Helical bundles, in virus fusion, 83
Helical bundles, influenza fusion mechanism, 82
Helical filaments, 333
Helical forms, switch in flagellum, 333
Helical, contacts in ribosomal subunits, 253
Helical, double helical region in RNA, 297
Helium cooled specimen stage, 148
Helix-helix interaction, in aquaporin fold, 153
Hemagglutinin, 82
Hemifusion, 80, 81
Heparan sulfate, 94
Hepatitis B fold, 58
Herpes simplex virus, 22
Herpes virus, micrograph of, 21
Heterotrimer formation, in virus fusion, 83f
High-affinity laminin receptor, 94
HK97 fold, 58
H-protein from pea leaf, 185, 186
Hydrophobic channel, in AQP1, 154

Ice crystals, 362
Icosahedral pyruvate dehydrogenase complex, 178
Icosahedral symmetry, in virus structure, 13, 42, 406
Icosahedron, Kepler's natural crystals, 42
Integral membrane proteins, structure, 134
Intensity of reflection, in x-ray crystallography, 113

Jelly roll fold, in viral coat proteins, 57, 58

Kiefersauer, crystal mounting system, 26
Klug, Aaron, 7, 15

L12, in ribosome, 245, 307
Lamellar structure, of bicelles, 123
Lectin binding, 381
Lectin binding, studied by SPR, 379f
Levivirus fold, in viral coat proteins, 57
Ligand binding kinetics, 381
Lipid bilayer, 97
Lipid cubic phase, 123
Lipoic acid, lipoyl groups, 174, 180
Liposomes, 82, 94
Locomotion, bacterial, 333
L-SIGN, 94
Lumazine synthase, β60 capsid of, 205
Lumazine synthase, 198f
Lumazine synthase, icosahedral particles, of, 201
Lumazine synthase, pentamers, 209
Lysozyme, crystal formation, 369

Macrolide antibiotics, 248
Mannose-binding proteins, 94
Maturation cleavage, 82, 91
McHale, John, 18, 27
Mechanical switching in bacterial flagellum, 333f
Membrane proteins, crystallization of, 122, 361
Membrane proteins, functional details, 148f
Membrane proteins, structure, 111f, 133, 148f
Metastable state, in crystallization, 366
Monoolein, 123
mRNA binding, 253, 293f
MS2 fold, in viral coat proteins, 58
Multienzyme complexes, 175, 198
Multifunctional enzyme complexes, 171
Multifunctional synthetases, 189
Myelin P2 protein, crystals, 362, 366
Myosin cross bridge, 345, 347
Myosin, 50K upper/lower domains, 356
Myosin, ATP, 345f
Myosin, crystal structures, 348f
Myxoviruses, 22

NωV, 60
Nanomachines, 171
NASA Archive for Protein Crystal Growth Data, 362
Neurovirulence, 91
Nodaviridae, internal terminals, 63
Normal mode analysis, of cryoEM data, 301
Nucleation of solid phase protein, 367
Nudaurelia capensis ω virus, see NωV

Octahedral symmetry, in giant enzymes, 176
Octahedral virus particles, polyomavirus VP1, 47
Osmotic pressure, in crystal formation, 367

Parainfluenza virus, 22
Paramecium bursaria Chlorella virus, 5
Pariacoto virus, 71
Pentagonal tessellation, 48
Pentameric capsomers, polyoma virus, 46, 47
Peptidyl transferase, in ribosome, 256, 267, 270, 307f
Peptidyl transferase, two-fold rotation center, 272
Phage MS2, translational repressor of virus gene, 56
Phases, in structure determination, 139, 149
Phosphopantetheine, 173
Phosphopantetheinyl groups, 174, 189
Picorna virus, connecting arms, 62
Picornavirus, 382
P-loop protein, myosin, 346
Point-group crystal, 16
Polio virus, arms in, 62
Poliovirus, receptor kinetics, 382f
Poliovirus, structure, historical, 17, 22
Polyamines, 71
Polymorphic Supercoiling, 333f
Polymorphic switching, 336
Polymorphism, in polyoma virus structure, 47
Polyoma virus, tubes, 47
Polyoma virus, 11, 68
Polyoma virus, capsomers, 46
Polyoma virus, reconstruction, 44
Pore formation, in NωV, 64
Pore formation, in parvovirus, 65
Post-fusion, structure, 98f
Power stroke, in myosin, 345f, 348f
Precipitants, 365
Precipitation diagram, 368
Prosthetic groups as swinging arms, 173
Protection of intermediates, 173
Protein disorder in virus, 53
Protein dissociation and association, 198
Protein exit tunnel, in ribosomes, 274
Protein shuttle, 163
Protein synthesis, 245, 291f, 307f
Protein-nucleic acid interaction, in virus, 53, 70
Protein-protein interaction database, PPiDB, 393
Protein-protein interactions, 403f
Proteins fold types in icosahedral viruses, 58
Puromycin, 247
Pyruvate dehydrogenase complex, 179

Quasi-equivalence theory of virus structure, historical, 28, 35
Quasi-equivalence, concept, 211
Quasi-equivalent, bonding, 45
Quasi-equivalent, surface lattice, 81
Quasi-symmetry, 177

Radiation damage, 136
Randome, 26
Ratchet motion, in ribosomes, 291
Real space refinement, 291, 297
Receptor binding domain, in alphaviruses, 90, 94
Receptor interaction, binding kinetics, 379, 382
Reductive acetylation, 179
Relay helix, in myosin, 345f
Reovirus nucleocapsid protein fold, 58
Reovirus, trimers, 70
Resolution, of x-ray diffraction, 113
Retrovirus nucleocapsid protein fold, 58
Rhinoviruses, receptor kinetics, 382f
Riboflavin synthase, 198
Riboflavin synthase, 199
Ribosome structure, historical, 249, 300
Ribosome, dynamics, 245f, 291f, 300f
Ribosome, structure, 245f, 291f
Ribosome, subunits, 252, 254, 292
Ribosomes, 245f, 291f, 307f
Rice yellow mottle virus, packing of arms, 66
Rigor complex, 349
RNA-binding, in virus, 71
Rosetta stone sequences, 397
Rotary cap mechanism, 341

Salts, buried, in myosin motor, 355
SASE, Self-Amplified Spontaneous Emission, 134
SBMV, see SCPMV
SCPMV, assembly, 65
SelB, 314, 315f
Self assembly mechanism, bacterial flagellum, 339f, CD animation
Self-assembly, of ribosome, 171f
Self-assembly, of bacterial flagellum, 333f
Self-assembly, of virus particles, 47, 48, 53f
Semliki Forest virus, antibody binding, 379
Semliki Forest virus, crystals, 362, 366
Semliki Forest virus, fusion, 83
Semliki Forest virus, SFV, structure, 85f
Sensor surface interactions, 379f
Serine hydroxymethyltransferase, 185
Serine protease fold, in viral coat proteins, 57, 58
Serine protease, of Sindbis virus, 56
SFV, see Semliki Forest virus
Sindbis virus, 56
Sindbis virus, serine protease, 56
Single particles reconstruction, ribosome, 293
Snapshot, cryoEM of ribosome, 302
Snelson, Kenneth, and tensegrity structures, 28
Sobemovirus assembly, 67
Soft X-ray microscopy, 136
Southern bean mosaic virus, see southern cowpea mosaic virus
Southern cowpea mosaic virus, see SCPMV
Space-group crystal, 16
Spherical viruses, historical, 12
Sphingolipids, in SFV fusion, 94
Spike, structure, in SFV, 90
SPR, see surface plasmon resonance
Stalk-pore model, 81
Structural switching in TMV, 48
Substrate channeling, 173
Subunit interaction, 264, 298
Sugar beet virus, 22
Sugar deletion mutants, SFV, 92
Sulfurylases, active site, 227, 229
Sulfurylases, allosteric domain, 222, 232
Sulfurylases, oligomer structure, 227
Sulfurylases, structural comparison, 222, 234, 236
Sulfurylases, structural domains, 226
Supercoiling, polymorphic, 333f
Supersaturation, in crystal formation, 367
Surface dynamics, studied by SPR, 381
Surface plasmon resonance, SPR, 376
SV40 polyomavirus VP1 configurations, 46, 47
SV40, connecting arms, 68
Swinging arms, in giant enzymes, 174, 180
Switch, in myosin, 345f, 350
Switching D- and L-forms, 338

Taylor cross-bridge cycle, 355
TBE, see tick-borne encephalitis virus
TBMV, assembly, 65
Temperature factor, B, in x-ray diffraction, 113
Tensegrity, structures, 27, 28, 100f
Tetrahydrofolate, 184
Tetraviridae, internal terminals, 63
tGTPases, functional cycle, 312, 320
tGTPases, functional properties, 313
Theory of quasi-equivalence, 43
Thermodynamics, in intermolecular interaction, 380
Thiamin diphosphate, 175
Thioester linkage, 175
Three-dimensional bilayer, 123
Tick-borne encephalitis virus, TBE, fusion, 84
Tiling theory, for virus shell organization, 4
Time-resolved dynamic states, in crystals, 142
TMV, helical symmetry, 5, 48
TMV, historical model, 4
TMV, RNA, 6
TMV, self assembly, 48
T-number, 36, 46, 56, 410
Tobacco Mosaic Virus, see TMV
Tomato Bushy Stunt Virus, see TBSV
Translation factors, 245, 294f, 309f
Translation, in ribosome, 245, 294f, 307
Translational operator, as packing signal, 71
Translocation, 291f
Transmembrane peptide, SFV, 91, 93
Triangulation, number, T, 36, 46, 56, 410
Triangulation, of sphere, 29, 33
tRNA mimicking factors, 310
tRNA, positioning in ribosome, 245f, 293f, 308
tRNA, release from ribosome, 263, 295
Turnip crinkle virus, 71
Turnip Yellow Mosaic Virus, see TYMV
Two-dimensional crystals, 133, 137
TYMV, historical, 23
TYMV, Turnip Yellow Mosaic Virus, 3f

Vicia villosa lectin, VVL, 96
VIPER, VIrus Particle ExploreR database, 403
Viral coat proteins, functions of, 55
Viral-RNA, protein interaction, 71
Viroporin, 91
Virus assembly, 47, 48, 53f, 67, 89
Virus dynamics, revealed by SPR analysis, 379f
Virus structure database, 403f
Virus structure, historical, 3f
Virus structure, principles of, 19, 403f
Virus, as surface crystal, 15
Virus, crystallization of, 361
Vitamin biosynthesis, 198
VVL, see Vicia villosa lectin

Water channel, 148
Water transport, 152
Watson, Jim, 8

X-ray crystallography of giant enzymes, 198
X-ray crystallography of membrane proteins, 111f
X-ray FEL, X-ray free electron laser, 133

Zinc ions, 97