journal of structural biology - wolfson centre home...

9
UNCORRECTED PROOF 1 2 High-throughput expression and purification of membrane proteins 3 Filippo Mancia a , James Love b, * 4 a Department of Physiology and Cellular Biophysics, Columbia University, New York, NY, USA 5 b New York Structural Biology Center, New York, NY, USA 6 8 article info 9 Article history: 10 Received 6 January 2010 11 Received in revised form 11 March 2010 12 Accepted 17 March 2010 13 Available online xxxx 14 Keywords: 15 Membrane proteins 16 Detergent 17 Expression 18 High-throughput 19 Structure 20 Structural genomics 21 22 abstract 23 High-throughput (HT) methodologies have had a tremendous impact on structural biology of soluble pro- 24 teins. High-resolution structure determination relies on the ability of the macromolecule to form ordered 25 crystals that diffract X-rays. While crystallization remains somewhat empirical, for a given protein, suc- 26 cess is proportional to the number of conditions screened and to the number of variants trialed. HT tech- 27 niques have greatly increased the number of targets that can be trialed and the rate at which these can be 28 produced. In terms of number of structures solved, membrane proteins appear to be lagging many years 29 behind their soluble counterparts. Likewise, HT methodologies for production and characterization of 30 these hydrophobic macromolecules are only now emerging. Presented here is an HT platform designed 31 exclusively for membrane proteins that has processed over 5000 targets. 32 Ó 2010 Published by Elsevier Inc. 33 34 35 1. Introduction 36 Every biological process requiring exchange of information be- 37 tween cells or intracellular compartments or detection of and re- 38 sponse to either extracellular stimuli or cues or nutrients, relies for 39 success on one or more proteins embedded in the cell membranes 40 and endowed with the capability of accurately carrying out such 41 task. Membrane proteins comprise approximately 30% of all pro- 42 teins in both prokaryotic and eukaryotic organisms (Wallin and 43 von Heijne, 1998), and, not surprisingly, represent by far the most 44 thought-after pharmacological targets. However, despite their 45 importance, they are but a fraction of a percent of those with a 46 known structure. As of December 2009, 214 unique entries in the 47 protein data bank (PDB; Berman et al., 2003), only 0.5% of the total, 48 were accounted for by membrane-spanning proteins. Indeed, inte- 49 gral membrane proteins present formidable, albeit not insurmount- 50 able, challenges for structural analysis. The field has progressed 51 tremendously since the first result in three dimensions, by electron 52 crystallography at 7 Å, on bacteriorhodopsin (Henderson and Un- 53 win, 1975) and the first high-resolution structure, at 3 Å by X-ray 54 crystallography on a photosynthetic reaction center (Deisenhofer 55 et al., 1984), and membrane protein structures have been deter- 56 mined at an accelerated pace in recent years. Nevertheless, struc- 57 tural output for this class of macromolecules remains negligible in 58 comparison to their soluble counterparts. The initial objective of 59 the New York Consortium of Membrane Protein Structure 60 (NYCOMPS) was to construct an automated HT pipeline for integral 61 membrane protein production and preliminary characterization 62 that would achieve a comparable level of productivity to that 63 observed for soluble proteins in the initial phases of the Protein 64 Structure Initiative (PSI). To fulfill this goal, we designed and imple- 65 mented a centralized Protein Production Facility at the New York 66 Structural Biology Center (NYSBC) to implement cloning and screen- 67 ing activities. The methodologies underpinning this pipeline, pre- 68 sented in detail below and in schematic form in Fig. 1, were 69 developed over a period of years through iterative rounds of testing 70 and optimization and have so far resulted in 7000 cloned and 71 screened targets, yielding crystals from 30 of these, and 24 struc- 72 tures from six unique novel membrane protein. 73 1.1. Overview of the field 74 The natural association of membrane proteins with lipid bilayers, 75 and the resulting need of detergents for extraction and purification 76 complicate their structural analysis. Once diffracting membrane- 77 protein crystals are obtained, for example, the diffraction analysis 78 is typically as straightforward as it is for aqueous soluble macromol- 79 ecules. Problems that arise in the recombinant expression of mem- 80 brane proteins are even more limiting than difficulties in 81 purification and crystallization. Furthermore, the establishment of 82 a suitable recombinant expression system is typically a prerequisite 83 for successful structure determination, and represents the first bot- 84 tleneck of the entire process (Grisshammer and Tate, 1995). There 85 are only few examples present in high-abundance from natural 86 sources, and membrane proteins are most often intolerant to 1047-8477/$ - see front matter Ó 2010 Published by Elsevier Inc. doi:10.1016/j.jsb.2010.03.021 * Corresponding author. Fax: +1 212 939 0863. E-mail address: [email protected] (J. Love). Journal of Structural Biology xxx (2010) xxx–xxx Contents lists available at ScienceDirect Journal of Structural Biology journal homepage: www.elsevier.com/locate/yjsbi YJSBI 5784 No. of Pages 9, Model 5G 17 April 2010 ARTICLE IN PRESS Please cite this article in press as: Mancia, F., Love, J. High-throughput expression and purification of membrane proteins. J. Struct. Biol. (2010), doi:10.1016/j.jsb.2010.03.021

Upload: others

Post on 09-Sep-2019

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

1

2

3

45

6

8

910111213

1415161718192021

2 2

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

Journal of Structural Biology xxx (2010) xxx–xxx

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

Contents lists available at ScienceDirect

Journal of Structural Biology

journal homepage: www.elsevier .com/locate /y jsbi

OFHigh-throughput expression and purification of membrane proteins

Filippo Mancia a, James Love b,*

a Department of Physiology and Cellular Biophysics, Columbia University, New York, NY, USAb New York Structural Biology Center, New York, NY, USA

a r t i c l e i n f o a b s t r a c t

E23242526272829303132

Article history:Received 6 January 2010Received in revised form 11 March 2010Accepted 17 March 2010Available online xxxx

Keywords:Membrane proteinsDetergentExpressionHigh-throughputStructureStructural genomics

1047-8477/$ - see front matter � 2010 Published bydoi:10.1016/j.jsb.2010.03.021

* Corresponding author. Fax: +1 212 939 0863.E-mail address: [email protected] (J. Love).

Please cite this article in press as: Mancia, Fdoi:10.1016/j.jsb.2010.03.021

DPR

O

High-throughput (HT) methodologies have had a tremendous impact on structural biology of soluble pro-teins. High-resolution structure determination relies on the ability of the macromolecule to form orderedcrystals that diffract X-rays. While crystallization remains somewhat empirical, for a given protein, suc-cess is proportional to the number of conditions screened and to the number of variants trialed. HT tech-niques have greatly increased the number of targets that can be trialed and the rate at which these can beproduced. In terms of number of structures solved, membrane proteins appear to be lagging many yearsbehind their soluble counterparts. Likewise, HT methodologies for production and characterization ofthese hydrophobic macromolecules are only now emerging. Presented here is an HT platform designedexclusively for membrane proteins that has processed over 5000 targets.

� 2010 Published by Elsevier Inc.

33

T

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

UN

CO

RR

EC1. Introduction

Every biological process requiring exchange of information be-tween cells or intracellular compartments or detection of and re-sponse to either extracellular stimuli or cues or nutrients, relies forsuccess on one or more proteins embedded in the cell membranesand endowed with the capability of accurately carrying out suchtask. Membrane proteins comprise approximately 30% of all pro-teins in both prokaryotic and eukaryotic organisms (Wallin andvon Heijne, 1998), and, not surprisingly, represent by far the mostthought-after pharmacological targets. However, despite theirimportance, they are but a fraction of a percent of those with aknown structure. As of December 2009, 214 unique entries in theprotein data bank (PDB; Berman et al., 2003), only 0.5% of the total,were accounted for by membrane-spanning proteins. Indeed, inte-gral membrane proteins present formidable, albeit not insurmount-able, challenges for structural analysis. The field has progressedtremendously since the first result in three dimensions, by electroncrystallography at 7 Å, on bacteriorhodopsin (Henderson and Un-win, 1975) and the first high-resolution structure, at 3 Å by X-raycrystallography on a photosynthetic reaction center (Deisenhoferet al., 1984), and membrane protein structures have been deter-mined at an accelerated pace in recent years. Nevertheless, struc-tural output for this class of macromolecules remains negligible incomparison to their soluble counterparts. The initial objective ofthe New York Consortium of Membrane Protein Structure

84

85

86

Elsevier Inc.

., Love, J. High-throughput ex

(NYCOMPS) was to construct an automated HT pipeline for integralmembrane protein production and preliminary characterizationthat would achieve a comparable level of productivity to thatobserved for soluble proteins in the initial phases of the ProteinStructure Initiative (PSI). To fulfill this goal, we designed and imple-mented a centralized Protein Production Facility at the New YorkStructural Biology Center (NYSBC) to implement cloning and screen-ing activities. The methodologies underpinning this pipeline, pre-sented in detail below and in schematic form in Fig. 1, weredeveloped over a period of years through iterative rounds of testingand optimization and have so far resulted in �7000 cloned andscreened targets, yielding crystals from �30 of these, and 24 struc-tures from six unique novel membrane protein.

1.1. Overview of the field

The natural association of membrane proteins with lipid bilayers,and the resulting need of detergents for extraction and purificationcomplicate their structural analysis. Once diffracting membrane-protein crystals are obtained, for example, the diffraction analysisis typically as straightforward as it is for aqueous soluble macromol-ecules. Problems that arise in the recombinant expression of mem-brane proteins are even more limiting than difficulties inpurification and crystallization. Furthermore, the establishment ofa suitable recombinant expression system is typically a prerequisitefor successful structure determination, and represents the first bot-tleneck of the entire process (Grisshammer and Tate, 1995). Thereare only few examples present in high-abundance from naturalsources, and membrane proteins are most often intolerant to

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 2: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

T

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

Fig. 1. Schematic representation of the NYCOMPS HT pipeline for the identificationof prokaryotic membrane proteins suitable for structural studies.

2 F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

UN

CO

RR

EC

standard, non-affinity based purification methods such as ion ex-change and hydrophobic interaction chromatography. Membraneproteins have the tendency to perform poorly in ectopic expressionsystems (Grisshammer and Tate, 1995). This may be caused by a ser-ies of reasons, including toxicity of the foreign protein to the host,specific requirements or differences in the translocation, membraneinsertion machinery or in the lipid composition of the membranes(Mancia and Hendrickson, 2007), and possibly the limited amountof free surface available for packing large amounts of recombinantprotein. The situation is further aggravated for eukaryotic mem-brane proteins, given the complexity of the milieu in which theyevolved, and their frequent need for chaperones, for specific lipidsand for precise post-translational modifications such as glycosyla-tion, sulfonation, palmitoylation and other covalent modifications(Mancia and Hendrickson, 2007; Mancia et al., 2004). In addition,post-translational modifications add possible heterogeneity to thesample, which may hinder downstream crystallization success. Typ-ically, prokaryotic membrane proteins are expressed in Escherichiacoli, although systems based on Lactococcus lactis have been devel-oped and successfully employed (Kunji et al., 2003). The success ofE. coli is due to the fact that it is robust, economical, well-understood,readily expandable, and easily manipulated for labeling of selectedamino acids (such as methionine with selenium for phasing ofdiffraction data from crystals (Hendrickson, 1991). Eukaryotic coun-terparts, with notable exceptions (Grisshammer and Tate, 1995;Mancia and Hendrickson, 2007; Grisshammer et al., 1993), aredependent for their successful expression on matching or similarlycomplex systems such as yeast, baculovirus-infected insect cellsand mammalian cells (Mancia et al., 2004; Midgett and Madden,2007). In general, eukaryotic membrane proteins have been recalci-trant in expression at the scale needed for structural analysis(Mancia and Hendrickson, 2007), although high-resolution struc-tures from recombinant sources are finally emerging at a much-needed accelerating pace (for examples, see Gonzales et al., 2009;Hanson and Stevens, 2009; Kawate et al., 2009; Long et al., 2005;Sobolevsky et al., 2009; Tao et al., 2009).

Choice and optimization of a suitable expression system are byno means the only requirements for successful structure determi-

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

ED

PR

OO

F

nation of a membrane protein. Detergents and their micelles arenotorious poor substitutes of lipids and their bilayer structures, of-ten leading to destabilization, denaturation and aggregation(Wiener, 2004). Unfortunately, detergents are required to extractand purify the target protein. The choice of detergent is a keyparameter of the entire process, further complicated by the factthat the shorter the aliphatic chain of the detergent, the moredestabilizing the effect on the protein but the better becomes theprobability of crystallization and X-ray diffraction to high resolu-tion. Furthermore, a detergent required for high yield extractionmay not be optimal in preserving functionality or oligomeric state,and may also have a detrimental impact on crystallization (Wiener,2004). Therefore, different detergents may be required for the var-ious, distinct phases of the necessary processes leading to structuredetermination. Extensive screening and optimization steps arethus required. These tedious procedures are time consuming andexpensive due the cost of reagents and the success rate is inevita-bly low.

How can the probability of success for membrane protein struc-tures be maximized? Following a conventional approach, onecould envision optimizing expression, extraction and purificationconditions in a tailor-made approach to maximize yields and sta-bility of the given protein, without any or with minimal interven-tion on the gene. Structures of membrane proteins isolated fromnatural sources are inevitably confined to being pursued by thisapproach. Alternatively, the expression and purification protocolscan be fixed, and genetic variants of a protein of interest clonedand screened to select the subset bearing the highest probabilityof crystallization. Here, expression conditions are set to maximizeyields, and chosen based on experience and consensus. Purificationparameters are defined to select expressing proteins that abide toone or more characteristics thought to be indicative, suggestive ofor necessary for successful crystallization. These include, for exam-ple, a sharp elution profile from size-exclusion chromatography(SEC), or stability in a short-chain detergent (Lemieux et al., 2003).

Homology screening is a powerful approach, and is the basis ofmany structural genomics initiatives. Homologues provide naturalvariation that may be less toxic, better expressed, more stable (par-ticularly in short-chain detergents) and may even be more likely tocrystallize. Screening can also be performed on synthetic variants,such as mutants with the goal of selecting those with enhancedproperties like expression levels and thermal stability (Serrano-Vega et al., 2008; Warne et al., 2009).

A high-throughput (HT) screening platform is essential for thisapproach to succeed. For soluble proteins, this has been achieved,and suitable robust platforms extensively tested, and thoroughlyoptimized, leading to the determination of thousands of high-res-olution structures (Dessailly et al., 2009; Joachimiak, 2009). Fur-thermore, technologies developed by structural genomics centershave also been invaluable to the community at large, and on aver-age, costs associated with every structure determination havedropped dramatically (Terwilliger et al., 2009).

In the use of HT methodologies, membrane proteins are onceagain lagging behind their soluble counterparts. However, severalreports on every stage of the process have facilitated the designof platforms and the adaptation of HT techniques for membraneprotein cloning, expression screening and purification.

In an excellent review of collective methods used in the expres-sion and purification of over 10,000 soluble proteins (Graslundet al., 2008), a set of consensus protocols were presented and someof these procedures could be adaptable to the HT production ofmembrane proteins. This is particularly true for the initial cloningsteps, where ligation-independent cloning (LIC; Aslanidis and deJong, 1990) is very useful as it is rapid, economical, and efficient,it requires no restriction digestion of the PCR-amplified product,and also can be designed to include no additional amino acids in

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 3: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

T

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx 3

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

UN

CO

RR

EC

the transcript. LIC has been successfully used for large-scale mem-brane protein cloning efforts (Clark et al., xxxx).

For HT expression, the consensus points to E. coli being the pre-ferred host for prokaryotic proteins, although L. lactis (Kunji et al.,2003) and cell-free systems and have also been proposed (Liguoriet al., 2007; Schwarz et al., 2008). Economical small-scale incuba-tors have been developed that are useful for high-density growthof E. coli in deep-well blocks. Optical densities measured at600 nm approaching 10 units are possible with just 600 mL of cul-ture (Page and R., 2004). Cultures grown under these conditionsappear also to be scalable with growth in both specialized 96-tubeairlift fermenters (Lesley et al., 2002) and ultraflasks (Brodsky andCronin, 2006).

Determination of optimal expression, extraction, and purifica-tion parameters has been the focus of several studies on varyingnumbers of membrane protein targets, ranging from several tensto few hundreds. In two manuscripts, Dobrovetsky et al. outline aHT process for membrane protein expression utilizing a singleaffinity tag and promoter system, one extraction and purificationdetergent followed by ion-exchange chromatography or SEC as a fi-nal purification step (Dobrovetsky et al., 2005; Dobrovetsky et al.,2007). Using these techniques they were able to screen 280E. coli and Thermotoga maritima integral membrane proteins. Theauthors conclude that, in a manner analogous to the techniques in-volved in HT platforms for soluble proteins, similar approaches formembrane proteins will succeed, but with higher attrition rates.Lewinson et al. investigate many of the necessary parameters nec-essary of a structural genomics type approach to the production ofmembrane proteins (Lewinson et al., 2008), focusing on the pro-karyotic P-type transporter family (Yatime et al., 2009). Theauthors set out to express and purify multiple homologues, in mul-tiple strains, temperatures, affinity tags, promoters and extractionand purification detergents. They find that many factors appear toimpact the final outcome, but conclude that (i) the closer phylumof the target gene to the expression host, the better the possibleoutcome; (ii) the promoter may affect the amount of protein pro-duced and the membrane incorporation levels; (iii) the positionand nature of the affinity tag may have a profound impact onexpression levels; (iv) that only a small subset of extraction deter-gents results in a high percentage of success in solubilizing amajority of the membrane proteins, with n-dodecyl-b-D-maltopyr-anoside (DDM) being the most favored. In a similar study by Esha-ghi et al. the expression profile of 49 E. coli membrane proteins wasanalyzed in this host varying, tag position, expression strain andextraction detergent (Eshaghi et al., 2005). Gateway (Hartleyet al., 2000) was utilized for the cloning, which may not be idealfor structural studies due to the introduction of additional aminoacids at the protein termini. These parameters were assessed forsuccess by dot blot and gel filtration analysis of some of the targets.The authors show that many assays can be conducted in a multi-well plate assay, and hence a high throughput may be achieved.The authors also settled on Fos-choline 12 (FC-12) as their stan-dard extraction detergent, due to its good solubilization properties.However, care should be used before considering the Fos-cholinedetergents as the panacea for membrane protein studies, becauseit’s suitability as a crystallization detergent has yet to be widely ac-cepted (Klock et al., 2008). Indeed, a review of all membrane pro-tein structures deposited in the PDB suggests that a majority oftargets are solubilized in just a small subset of detergents, withDDM being the most commonly used for extraction and purifica-tion (Willis and Koth, 2008). The same review also concludes thatfor recombinant proteins poly-histidines are the most commonlyused engineered-affinity tags, and that in �50% of the cases, SECis employed as the final purification step.

Whilst a single, monodisperse elution peak from SEC does notguarantee success, this technique is a very informative, predictive

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

ED

PR

OO

F

method for likelihood of crystallization of membrane proteins(Wang et al., 2003), as it is for soluble proteins (Klock et al.,2008). The elution profile of a membrane protein from a SECcolumn equilibrated in a given detergent can provide a reliableestimate on their aggregation state, and, in general, on their‘‘well-being” in that surfactant. SEC can be readily adapted to HTmethods with micro-volume HPLCs fitted with autoloaders andappropriately sized columns. It can also be given added value bybeing coupled to static-light scattering and refractive index detec-tors, allowing quantitative evaluation of excess detergent, micellesize and aggregation state (Veesler et al., 2009). SEC analysis canbe streamlined further by eliminating other time-consuming puri-fication steps, and using an in-line fluorescence detector to moni-tor the elution profiles of GFP-fusions directly from minisculeamounts of detergent-solubilized cell extracts (Kawate and Gou-aux, 2006). Fluorescence can also be used to estimate expressionlevels of GFP-membrane protein fusions, as shown by Hannonet al. on of �300 proteins from 18 bacterial and archeal extremo-philes (Hammon et al., 2009). The authors find that after ‘bench-marking’ levels of fluorescence from the fusion tag to levels oftarget protein expression, they can easily detect membrane pro-teins produced in amounts suitable for structural studies. The onlyrequirements are that the terminus of the protein that the GFP isattached to remains in the reducing environment of the cytoplasm,so that the GFP fluorescence develops properly (Daley et al., 2005).This may render a percentage of membrane proteins not amenableto this technique. In addition, it may still be necessary to performstandard SDS–PAGE analysis to verify that the fluorescence signalrecorded is from the intact, full-length fusion, rather than fromtruncation products.

The studies reviewed above and our own experience, allowed usto construct an expression and screening platform for prokaryoticmembrane proteins, presented here.

2. Materials and methods

2.1. Target selection protocol

In-depth discussion and experimental details for the targetselection process can be found in Punta et al. (2009). The initial NY-COMPS target set comprised more than 300,000 annotated se-quences from the RefSeq collection (Pruitt et al., 2007) belongingto 96 fully sequenced prokaryotic genomes. Since most targetshave no experimental annotation linking them to the membrane,the prediction program TMHMM2 (Krogh et al., 2001) was usedto predict transmembrane helices (TMHs). Although predictionmethods are estimated to be very accurate, they will inevitablymake helix prediction mistakes. Therefore, we retained only pro-teins with P2 predicted TMHs. Following this step, redundancywas reduced by filtering out targets with exceedingly similar se-quences, guaranteeing that no two proteins in our dataset sharedmore than 98% pairwise sequence identity (CD-hit; Li and Godzik,2006). Furthermore, in order to minimize the probability of intro-ducing water-soluble non-integral membrane proteins into ourpipeline, sequences with two predicted TMHs, for which the posi-tion of the most N-terminal TMH overlapped with a predicted sig-nal peptide sequence where excluded. Indeed, the most commonmistake of TMH prediction programs is to predict an N-terminalTMH in place of a signal peptide (Moller et al., 2001). Finally, targetsequences were excluded in proteins that were predicted to havemore than 15 consecutive disordered residues and hence mightbe problematic for crystallization (Esnouf et al., 2006). The remain-ing 39,037 sequences constitute the ‘‘NYCOMPS98” dataset.

The targets to be cloned are selected from the NYCOMPS98dataset of membrane proteins. Targets are selected following a

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 4: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

T

319

320

321

322

323

324

325

326

327

328

329

330

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440

441

442

443

444

445

446

447

448

4 F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

UN

CO

RR

EC

two-step process. (1) Valid targets that either constitute promisingcandidates for structure determination and/or are of utmost bio-logical interest are identified. These are referred to as ‘‘seeds”. (2)The seeds are expanded into families of typically homologous pro-teins that have a predicted membrane region structures similar tothat of the seed. Only sequences that are part of the NYCOMPS98set of valid targets were used for the expansion of the seeds.

NYCOMPS seeds were chosen according to two distinct tracksthat are referred to as ‘‘central selection” and ‘‘nomination”. Cen-trally selected seeds were selected from a list of proteins that havepreviously been successfully expressed in E. coli (Daley et al.,2005). On the other hand, nominated seeds were ‘‘hand-picked”by laboratories participating in the NYCOMPS consortium andadjunct members from the community. Most of these seeds arewell-characterized proteins of known function. One key differencebetween centrally selected and nominated seeds is that novelty isnot enforced on the latter set. Instead, the observed similarities toproteins in the PDB was checked, and reported to the nominatinggroup, which was ultimately responsible for the final decision onwhether or not to pursue that specific target.

The seed expansion procedure is the same for centrally selectedand nominated targets and is based on reciprocal sequence similar-ity in the predicted transmembrane (TM) region between the seedand the NYCOMPS98 set of target proteins. In particular, given aPSI-BLAST (Altschul et al., 1997) alignment between the seed anda NYCOMPS98 protein with E-value of <10�3, the requirement isthat more than 50% of the residues predicted to be in TMHs in bothproteins are aligned. After a seed is expanded into a family of pro-teins predicted to have similar TM cores, all family members aresubject to additional filters. Firstly, any centrally selected targetwith significant similarity in the predicted TM region to proteinsin the PDB (PSI-BLAST E-value <1 and alignment covering >25%of the target TM region) is filtered out. This filter is not appliedto nominated targets. Secondly, all candidates for which there isevidence that they might constitute individual subunits of het-ero-oligomeric complexes (using information extracted from Eco-Cyc; (Keseler et al., 2009) are removed from the list. Finally,there is a correction phase dedicated to ‘‘inconsistencies” withinthe families. Proteins for which the number of predicted TMHs dif-fers greatly from that of the seed are typically discarded, as well as,proteins that differ significantly in their length with respect to theseed. Additionally, proteins that align well with the consensusN-terminus of the family (when any such consensus can be identi-fied) but that have additional N-terminal amino acids, are typicallyexcluded, because they often constitute cases of proteins that mayhave been erroneously annotated. In-depth discussion and experi-mental details for the target selection process can be found inPunta et al. (2009).

2.2. Cloning procedures

Amplification primers are designed by ‘‘Primer prime’er”, a pro-gram for automated oligonucleotide construction developed by theNorth-Eastern Structural Genomics Consortium (NESG; Everettet al., 2004). Forward and reverse primers are supplied at a 5 lMconcentration in a pre-mixed form in 384-well blocks (IDT, Inc.).

In order to simplify our cloning process, we developed standardpNYCOMPS vectors for prokaryotic expression. These reagents andtheir sequence information are publicly available at the PSI Mate-rials Repository (http://psimr.asu.edu/). Each plasmid carries a cas-sette harboring kanamycin resistance and is based on the IPTGinducible T7 promoter pET vector system (Novagen, Inc.). pNY-COMPS plasmids encodes a Tev protease (Kapust et al., 2002)cleavable composite FLAG/deca-HIS affinity tag at either theN- or C-terminus. We have also introduced a ccdB ‘‘death gene”(Bernard and Couturier, 1992) within the plasmid to negatively

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

ED

PR

OO

F

select uncut or self-ligating vector thus abolishing background inthe bacterial transformation following annealing.

Ninety-six genomic DNA templates were purchased from ATCCand arrayed in a single 96-well plate at working concentration(10 ng/lL). A list of organisms from which the genomic DNAs wereobtained can be found in Punta et al. (2009). A Biomek FX liquidhandling robot (Beckman Coulter, Inc.) is used to assemble theamplification reaction in a 384-well plate by mixing a PCR mastermix, primers and required matching template DNA in a final vol-ume of 15 lL. KOD Hot Start DNA Polymerase (Novagen, Inc.) isused in the amplification reaction with one standard set of cyclingparameters. The high processivity and proofreading ability of thispolymerase ensures a success rate of greater than 95% withoutthe need for further optimization. Standard conditions for PCRare as follows: 1 � KOD Hot Start Buffer, 1.5 mM MgSO4, 0.2 mM(each) dNTPs, 0.3 lM (each) primers, 30 ng genomic templateDNA, 10% DMSO, 0.02 U/lL KOD Hot Start DNA Polymerase, andPCR Grade water to 15 lL. PCR reactions are carried out in 384-wellplates using standard cycling conditions: initial denaturation andactivation, 2 min at 95 �C, is followed by 40 cycles of denaturationfor 15 s at 95 �C, annealing for 30 s at 59 �C, and elongation for 30 sat 72 �C.

After PCR amplification, samples are removed from the thermo-cycler, and they are assayed for the presence or absence of prod-ucts of approximately the correct size by loading 2 lL of eachreaction onto a pre-cast, 96-well, ethidium bromide containing2% agarose gel (E-Gel�; Invitrogen, Inc.). Inserts are purified usingAgencourt Solid Phase Reversible Immobilization (SPRI) technol-ogy (Beckman Coulter, Inc.) in a 384-well format. SPRI uses selec-tive capture of nucleic acids onto magnetic micro-particles,which can readily be washed prior to an elution step containingpurified DNA (DeAngelis et al., 1995). Purified inserts are elutedin 15 lL of Agencourt RE buffer.

Constructs are generated by ligation-independent cloning (LIC);Aslanidis and de Jong, 1990), a technique which is both simple touse and highly cost effective. T4 DNA Polymerase is used to gener-ate complimentary single-stranded overhangs on both insert andlinearized vector. Following treatment, insert and vector are mixedand incubated at 22 �C for 1 h to allow spontaneous annealing ofcomplimentary single-stranded overhangs.

In preparation for T4 DNA Polymerase treatment, cloning vec-tors pNYCOMPS-LIC-ccdB-TFH10+(C-term.) and pNYCOMPS-LIC-ccdB-FH10T+(N-term.) are digested with restriction enzymes BfuAIor SnaBI, respectively. Following digestion, cut plasmids are run onpreparative 1% agrose TBE gels and the linearized plasmid is ex-cised from the gel and purified using the QIAquick Gel ExtractionKit (Qiagen, Inc.), eluted in low-EDTA TE buffer and diluted to aworking concentration of 50 ng/lL. LIC treatment reactions aregenerally performed in large batches, in 96-well thermocyclerplates. Standard LIC treatment conditions for the vectors are as fol-lows: 1 � New England Biolabs, Inc. (NEB) Buffer 2, 1 � NEB BSA,25 ng/lL vector, 2.5 mM dGTP (N-term vector) or dCTP (C-termvector), 0.0375 U/lL T4 DNA Polymerase, and PCR Grade water tofinal volume. Reactions are incubated at 22 �C for 1 h followed byheat inactivation at 75 �C for 20 min. Inserts are prepared in a man-ner similar to that of the cloning vectors. The only difference beingthat in order to make complimentary single-stranded overhangs,the purified PCR products are LIC treated in the presence of thecomplimentary nucleotide, dCTP for N-terminal inserts and dGTPfor C-terminal inserts. The LIC insert master mix is as follows:1 � NEB Buffer 2, 1 � NEB BSA, 2.5 mM dCTP (N-terminal inserts)or dGTP (C-terminal inserts), 0.0375 U/lL T4 DNA polymerase,and PCR Grade water to final volume. Eight microlitre of the stan-dard LIC insert master mix is combined with 2 lL of purified PCRproduct and incubated at 22 �C for 1 h followed by heat inactiva-tion at 75 �C for 20 min.

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 5: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495

496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx 5

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

After the vector and inserts have been LIC treated, 2 lL of vectorand 4 lL of insert are combined and incubated at 22 �C for 1 h.About 2 lL of 25 mM EDTA pH 8.0 is added to each annealing reac-tion and incubated at 22 �C for 5 min. Annealing reactions are typ-ically performed in either 96- or 384-well plates according to scaleof the experiment.

Following this step, 1–2 lL of circular plasmid is transformedinto 20 lL of phage-resistant competent cells (DH10B-T1R-1),and plated on selective agar containing 25 lg/mL kanamycin in24-well blocks. Blocks are incubated overnight at 37 �C. The fol-lowing day, 1 colony per target is picked manually and grown in800 lL of 2 � YT containing 25 lg/mL kanamycin in a 96-welldeep-well block at 37 �C in HT growth incubators (Vertiga; Thom-son Instrument Company, Inc.) for 16 h to amplify plasmid DNA.Agencourt CosMidPrep magnetic SPIR technology (Beckman Coul-ter, Inc.) is used to purify the DNA in an automated manner at arate of six 96-well blocks every 3 h. At this point, the identityand integrity of each construct is verified by sequencing by Agen-court Biosciences (Beckman Coulter, Inc.).

T526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

551

552

553

554

CO

RR

EC

2.3. Small-scale expression screens

In order to test for expression and purification potential of eachmembrane protein target, expression plasmids are transformedinto a phage resistant expression strain (BL21 (DE3) pLysS–T1R;Sigma–Aldrich, Inc.). About 0.6 mL cultures of each transformantare grown in 96-well deep-well block at 37 �C in a HT growth incu-bator to an OD600nm of 0.6–1. Protein expression is induced byaddition of IPTG to a final concentration of 1 mM, and growth is al-lowed to continue for an additional 4 h. Upon harvest, finalODs600nm are typically in the range between 5 and 20.

Cell pellets are harvested by centrifugation and resuspended ina standard lysis buffer that does not contain detergent. A 96-wellrobotic sonicator probe was constructed for the ad hoc purposeof lysing cells in each individual well of a 96-well block. The blockis maintained on a temperature-controlled platform during the en-tire sonication process.

Lysed cells are mixed with an n-dodecyl-b-D-maltopyranoside(DDM; Anatrace, Inc.) stock solution bringing the final concentra-tion of detergent to 2%. After an incubation period of 1 h at 4 �Cunder gentle agitation, the lysate is robotically transferred to a96-well filter plate, pre-loaded with metal-affinity resin (ThomsonInstrument Company, Inc. and GE lifesciences, Inc.). Lysates areincubated with the resin at 4 �C for a minimum of 1 h. Followingthis incubation step, lysates are removed by vacuum-assisted fil-tration, and the resin, which is retained in the filter plate, iswashed twice with 1 mL of wash buffer containing DDM at 0.02%and 50 mM imidazole. The purified proteins are then eluted witha high concentration of imidazole (500 mM). The proteins con-tained in the elution fraction are visualized by coomassie-blue R-250 staining of 26-well SDS–PAGE gels.

555

556

557

558

559

560

561

562

563

564

565

566

567

568

UN

2.4. Medium-scale expression tests

Plasmids harboring genes coding for proteins expressing at suit-able levels and purity as shown by the small-scale assay are re-ar-rayed in a 96-well format, transformed into expression strains, andthese grown to high density in a GNFermenter (GNF Systems, Inc.).Cell pellets harvested from this fermenter are treated via the sameprocesses as previously described for the small-scale expressionscreens, but the larger cell masses necessitate the use of 50 mLtubes and 10 mL drip columns, rather than 96-well blocks. Thesetubes and columns are conveniently arrayed in custom-made 96-well racks for processing. Purified proteins are eluted from the me-tal-affinity resin, and �1/40th of the volume is run on an SDS–

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

PAGE gel and proteins are detected, and yields estimated by stain-ing with coomassie blue.

ED

PR

OO

F

2.5. Detergent selection and stability assay

The eluted proteins from the previous step are each divided into4, 40 ll aliquots in a 96-well assay plate. The concentration ofDDM in the elution buffer is 0.02% w/v. The chosen detergents,b-octyl-glucopyranoside (b-OG), n-dodecyl-N,N-dimethylamine-N-oxide (LDAO), tetraethylene glycol monooctyl ether (C8E4) andDDM, are added, as 4 lL stock solutions, to 0.2% over the respectiveamount required to achieve a concentration of twice the CMC (i.e.the percentage for a 2 � CMC solution for b-OG is 0.5%, hence thisdetergent is added to a final concentration of 0.7%). This representsan excess of the second detergent and is meant to allow for at leastpartial replacement of the associated DDM molecules, carried fromthe metal-affinity chromatography elution step. After detergentaddition and mixing, the plate is incubated at 25 �C for 2 h. Subse-quently, the samples are centrifuged at 3000g for 20 min to pelletany precipitate that has formed and the clear supernatant is trans-ferred to the HPLC system. Samples are transferred to a thermo-stated holder, which is held at 4 �C, in an Agilent 1200 HPLCequipped with a 4.6 mm � 30 cm TSK-Gel SuperSW3000 column(Tosoh Biosciences, Inc.). The column is equilibrated with gel filtra-tion buffer containing 2 � CMC DDM. Using the autosampler, 5 lLof each sample is loaded onto the column and the elution from thecolumn monitored at 280 nm. The column has an approximate bedvolume of 5 mL and a flow rate of 0.25 mL/min is used. This processis repeated automatically until all of the samples have beenprocessed.

Alternatively, the protein of choice can be assayed in a 12 deter-gent screen where iterative injections (5 lL) of the eluted protein(in DDM buffer) are made onto a SEC column (4.6 mm � 30 cmTSK-Gel SuperSW3000 column) pre-equilibrated in the targetdetergent at 2 � CMC. The detergents used are DDM, n-decyl-b-maltopyranoside (DM), n-nonyl-b-maltopyranoside (NM), n-octyl-b-maltopyranoside (OM), OG, n-nonyl-b-glucopyranoside (NG),5-cyclohexyl-1-pentyl-b-D-maltopyranoside (Cymal-5), 4-cyclo-hexyl-1-butyl-b-D-maltopyranoside (Cymal-4), pentaethylene gly-col monooctyl ether (C8E5), C8E4, Fos-choline 10 (FC-10) andFC-12. Elution profiles are optionally monitored by UV absorbanceat 280nM, three angle static-light scattering and refractive indexdetection using a miniDAWN™ TREOS system and Optilab� rEX(Wyatt Instruments, Inc.).

3. Results and discussion

3.1. Target selection

Approximately 10,000 targets have been bio-informatically pro-cessed and selected from the ‘‘NYCOMPS98” dataset of �39,000integral membrane proteins, to date. Details can be found at thepublic databases TargetDB (http://targetdb.pdb.org/) and PepcDB(http://pepcdb.pdb.org/). The average number of membrane pro-teins we predicted in all 96 genomes is 24%, which is in-line withother predictions. This number varies from 19% (Methanocaldococ-cus jannaschii) to 30% (Clostridium perfringens) (Knight et al., 2004;Liu and Rost, 2001). Most of the targets selected are between 100and 500 amino acids in length and have between 2 and 12 pre-dicted TMHs. 81% of all targets align to one or more Pfam-A fami-lies (HMMER E-value <10�3), with these families collectivelycovering more than half of the target predicted TM region. Full de-tails of these experiments are published elsewhere (see Puntaet al., 2009).

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 6: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

6 F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

3.2. Automated cloning

As of winter 2009, 6667 cloning attempts have been conductedresulting in 5113 sequence-verified constructs (77% success). Fail-ures in cloning may result from any step in the process (PCR ampli-fication, purification of inserts, LIC treatments, transformation orsequencing failure). Failure may also at times results from bioinfor-matics selection of inexistent targets due to incorrect annotation ofsequences in public databases. Periodic re-processing of failedcloning efforts are made when sufficient numbers have built up,but these seem to recover only �5% of targets at most.

T

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

RR

EC

3.3. Small and medium expression and purification trails

The goal of the NYCOMPS pipeline for expression and purifica-tion of membrane proteins is to identify samples that not onlycan be over-expressed, but that are also suitable for structuralstudies. Therefore, the small-scale expression tests have been de-signed and are conducted not only to monitor expression levels un-der a fixed and optimized set of parameters, but also to identifytargets that can be purified to sufficient yields for the process tobe cost-effective. The threshold for selection at this stage is thatthe yields of the membrane protein extracted from a 0.6 mL culturein DDM and purified by metal-affinity chromatography be suffi-cient for detection on an SDS–PAGE stained with coomassie blue.Following this initial screening protocol, the lower expression limitis set to approximately 0.5 mg/L, a value deemed acceptable forcost-effective scale-up. DDM was selected as the ‘standard’ deter-gent due to its positive properties as determined by others (Lewin-son et al., 2008) and by in-house conducted optimizationexperiments (data not show), and cost effectiveness. All thesesmall-scale steps mirror the processes that are subsequently re-used at a production scale for structural studies. An SDS–PAGEgel from a representative experiment conducted at this scale isshown in Fig. 2.

A prerequisite for these experiments was to identify a methodfor lysing a large number of membrane protein samples in parallel,without causing denaturation or aggregation. Chemical lysis solu-tions are not ideal for membrane proteins as they frequently con-tain unsuitably harsh detergents. Lysis by French-pressure cellcannot be adapted to 96, 0.6 mL samples, and iterative rounds offreezing and thawing were deemed to be somewhat inefficient.Sonication was chosen as the most viable option. Plate sonicatorsor multi-head sonicators are commercially available, but the

UN

CO

Fig. 2. Representative results from small-scale expression tests. A 24-well SDS–PAGE gevolumes. An omni-present contaminant is marked.

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

ED

PR

OO

F

energy distribution has the tendency of being uneven, often lead-ing to over-heating of some samples and no lysis of others. Wetherefore designed a robotic sonicator that consists of an electron-ically controlled sonicator probe mounted on a robot arm that canenter sequentially each well of a 96-well deep-well block, energize,and move to the next well in turn (Fig. 3). The sonicator head iskept energized momentarily after leaving the liquid surface butstill within the confines of the block well to remove liquid attachedto the probe. As a result, no cross-well contamination could be de-tected by western blot (data not shown). The block is held on a me-tal platform at 2 �C to minimize heating and multiple, typicallythree at most, rounds of short bursts are used with a 13 minwell-to-well delay time to minimize over-heating. Processing of96 samples is achieved in 13–39 min according to the number ofrounds used. This robotic sonicator can also be used in conjunctionwith a robot-addressable 96-well ultra centrifuge to isolate mem-brane fractions in HT format, although this additional purificationstep was most often deemed unnecessary.

Protein stability and aggregation state in different, ‘‘crystalliza-tion-friendly” detergents are assayed by SEC following an HT pro-cedure. However, the amount of material necessary for theseexperiments requires a mid-scale growth culture. To this end, wehave made use of an airlift fermenter (GNFermenter, GNF systems,Inc.) that allows 96 independent cultures to be simultaneouslygrown to extremely high cell densities. The cell mass yield from�70 mLs of culture is typically equivalent to that obtained with�500 mL grown in a 2 L shake flask. Moreover, expression yieldsfor cells grown with the fermenter are comparable to thoseachieved in a 96-well deep-well block, thus minimizing issues ofscalability.

Typically, �23% of membrane protein targets pass the small andmid-scale selection steps. This success rate is in agreement withthat of other reports (Dobrovetsky et al., 2005). A representativegel of the material eluted from the metal-affinity chromatographystep for the mid-scale expression and purification step is shown inFig. 4. These samples are then passed onto the stability screens dis-cussed below.

3.4. Detergent selection and stability assays

The selection of purification detergent, and possibly moreimportantly the choice of crystallization detergent, is a criticaldecision in the entire process of membrane protein structuredetermination.

l stained with coomassie blue. Membrane proteins are purified from 0.6 mL culture

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 7: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

UN

CO

RR

EC

T

652

653

654

655

656

657

658

659

660

661

662

663

664

665

666

667

668

669

670

671

672

673

674

Fig. 3. Picture of custom-made sonicator robot. Key features are indicated witharrows. Sample holder, pictured here, can be interchanged with a 96-well deep-wellblock or with a custom-made holder for 6, 50 mL centrifuge tubes.

Fig. 4. Representative results from mid-scale expression and purification tests Commasspipeline. About 2.5% of the total volume eluted from the metal-affinity chromatography stcontaminants differ from those present in the small-scale expression screens. This diffe

Fig. 5. Stability in short-chain detergents. SEC elution profiles of two different proteinsThe SEC experiments are performed with DDM included in the mobile phase at twice its(B), as indicated by presents of a large percentage of aggregated material which elutes a

F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx 7

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

RO

OF

The capability of a given protein to withstand a ‘harsh’ deter-gent correlates with its stability, and stable proteins are more ame-nable to structural investigation. Therefore, we wished to selectproteins that are stable at moderately elevated temperatures inlarge excesses of short-chain detergents. Proteins yielding elutionprofiles from SEC that are unperturbed by this treatment are clas-sified as stable, and are likely to perform well at production scale-up and/or crystallization. Proteins that are detergent-selective areclassified as workable. Proteins that do not behave well in anydetergent are either discarded or redirected to a rescue pathway,depending on the importance of the target. Two examples derivedfrom this assay are shown in Fig. 5. In one case, the protein appearsto withstand all conditions-albeit with a somewhat diminishedyield in C8E4 and to a lesser extent, in bOG (Fig. 5A). In contrast,a second protein appears to only tolerate DDM, as sample treatedwith other detergents elutes from SEC as an aggregate, in the voidvolume of the column (Fig. 5B). We typically perform this stabilityassay with three short-chain detergents (see legend to Fig. 5),while we include only DDM in the mobile phase. This reduces run-ning costs and time, thus increasing the throughput. Unfolding oraggregation of a membrane protein in a given short-chain deter-gent is typically an irreversible process, unlikely to be rescued byremoval of the agent in a DDM-only containing mobile phase.

ED

P

ie-stained SDS–PAGE gel of 24 purified membrane proteins form the medium-scaleep was loaded in each lane. Two contaminant bands are marked. Interestingly, these

rence could be due to different growth conditions of the bacteria.

treated with large excess of DDM (blue), C8E4 (green), LDAO (red) and b-OG (pink).CMC. Stability of the protein shown in (A) is apparent when compared to the one infter approximately 7 min, in the void volume of the column.

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 8: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

TD

PR

OO

F675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

716

717

718

719

720

721

722

723

Fig. 6. SEC-mediated detergent exchange. The elution traces are shown for one example protein run on a SEC column equilibrated with 12 different detergents in the mobilephase. The traces are shown in a staggered array, and the matching detergent is indicated above each. This protein yielded diffracting crystals.

8 F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

OR

REC

A second detergent assay consists of iteratively injecting thepartially purified protein onto a SEC column pre-equilibriated inany one of 12 different detergent-containing mobile phases (seemethods section for list of detergents). Arrayed UV traces for oneprotein are shown in Fig. 6. Elution profiles are monitored by UVabsorbance at 280nM, and optionally also by three angle static-light scattering and refractive index detection. These data can beused to formulate an estimate of molecular weight, aggregationstatus and to determine the size of the protein/detergent micelleand build up a profile of detergent preference for a given protein.Adequate detergent exchange on the column often leads to a shiftin the retention time of the elution peak and this be confirmed bycalculations from the light scattering and refractive index detec-tion (Hayashi et al., 1989; Slotboom et al., 2008). These data alsoallow detection and quantification of multimeric species.

Approximately 40% of samples that enter the gel filtration assaysgive a ‘positive’ elution profile in at least one detergent. These pro-teins are distributed for scale-up and crystallization experiments.

724

725

726

727

728

729

730

731

732733734735736737738739740

UN

C

4. Conclusions

We present here a structural genomics pipeline for the HT tar-geting, cloning, expression, purification and biophysical character-ization in different detergent of integral membrane proteins toselect those most suitable for structural investigation. This pipelinehas been used to target and process thousands of integral mem-brane proteins and produces many suitable targets for scale up.This highly-automated pre-selection approach serves to concen-trate the resources for the labor-intensive downstream steps (crys-tallization and structure determination) on those proteins withhighest probability of success.

Retrospective data analysis of proteins that successfully emergefrom the screening processes provides precious feedback into ourselection procedures for continuous rounds of improvement. Fur-thermore, the technologies and processes described herein, could,

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

Eif deemed necessary, readily be modified to re-process targets un-der a different set of conditions.

The pipeline presented here has led to biologically-interestingstructures such as a bacterial homologue of the kidney urea trans-porter (Levin et al., 2009) and a pentameric formate channel(Waight and A.B., 2009), with several others on the horizon.

Finally, it may be worth noting that the NYCOMPS facilities areavailable to the general community via the Protein Structure Initia-tive community-nominated targets proposal system (http://cnt.psi-structuralgenomics.org/CNT/targetlogin.jsp).

Acknowledgments

The authors wish to thank the NYCOMPS team of researcherswho generated a substantial amount of the data presented in thismanuscript. We are particularly grateful to Marco Punta (ColumbiaUniversity and University of Munich) for the bioinformatics analy-sis, and to the following researchers at the NYCOMPS core labora-tory at the New York Structural Biology Center, for the cloning,expression and purification of the targets: Brian Kloss, PatriciaRodriguez, Arianne Morrison, Renato Bruni and Brandan Hillerich.We also wish to acknowledge Larry Shapiro and Wayne Hendrick-son for their invaluable contributions to the design and implemen-tation of this platform. This work was funded by NIGMSGM075026 (Wayne Hendrickson, P.I.).

References

Wallin, E., von Heijne, G., 1998. Genome-wide analysis of integral membraneproteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci. 7,1029–1038.

Berman, H., Henrick, K., Nakamura, H., 2003. Announcing the worldwide ProteinData Bank. Nat. Struct. Biol. 10, 980.

Henderson, R., Unwin, P.N., 1975. Three-dimensional model of purple membraneobtained by electron microscopy. Nature 257, 28–32.

Deisenhofer, J., Epp, O., Miki, K., Huber, R., Michel, H., 1984. X-ray structure analysisof a membrane protein complex. Electron density map at 3 A resolution and a

pression and purification of membrane proteins. J. Struct. Biol. (2010),

Page 9: Journal of Structural Biology - Wolfson Centre Home Pagewolfson.huji.ac.il/purification/Course92632_2014/Membrane proteins... · 12 Accepted 17 March 2010 13 Available online xxxx

T

741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811 Q1812813814815816817818819820821822823824825826

827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912

F. Mancia, J. Love / Journal of Structural Biology xxx (2010) xxx–xxx 9

YJSBI 5784 No. of Pages 9, Model 5G

17 April 2010ARTICLE IN PRESS

UN

CO

RR

EC

model of the chromophores of the photosynthetic reaction center fromRhodopseudomonas viridis. J. Mol. Biol. 180, 385–398.

Grisshammer, R., Tate, C.G., 1995. Overexpression of integral membrane proteins forstructural studies. Q. Rev. Biophys. 28, 315–422.

Mancia, F., Hendrickson, W.A., 2007. Expression of recombinant G-protein coupledreceptors for structural biology. Mol. Biosyst. 3, 723–734.

Mancia, F., Patel, S.D., Rajala, M.W., Scherer, P.E., Nemes, A., Schieren, I.,Hendrickson, W.A., Shapiro, L., 2004. Optimization of protein production inmammalian cells with a coexpressed fluorescent marker. Structure (Camb) 12,1355–1360.

Kunji, E.R., Slotboom, D.J., Poolman, B., 2003. Lactococcus lactis as host foroverproduction of functional membrane proteins. Biochim. Biophys. Acta1610, 97–108.

Hendrickson, W.A., 1991. Determination of macromolecular structures fromanomalous diffraction of synchrotron radiation. Science 254, 51–58.

Grisshammer, R., Duckworth, R., Henderson, R., 1993. Expression of a ratneurotensin receptor in Escherichia coli. Biochem. J. 295 (Pt 2), 571–576.

Midgett, C.R., Madden, D.R., 2007. Breaking the bottleneck: eukaryotic membraneprotein expression for high-resolution structural studies. J. Struct. Biol. 160,265–274.

Gonzales, E.B., Kawate, T., Gouaux, E., 2009. Pore architecture and ion sites in acid-sensing ion channels and P2X receptors. Nature 460, 599–604.

Hanson, M.A., Stevens, R.C., 2009. Discovery of new GPCR biology: one receptorstructure at a time. Structure 17, 8–14.

Kawate, T., Michel, J.C., Birdsong, W.T., Gouaux, E., 2009. Crystal structure of theATP-gated P2X(4) ion channel in the closed state. Nature 460, 592–598.

Long, S.B., Campbell, E.B., Mackinnon, R., 2005. Crystal structure of a mammalianvoltage-dependent Shaker family K+ channel. Science 309, 897–903.

Sobolevsky, A.I., Rosconi, M.P., Gouaux, E., 2009. X-ray structure, symmetry andmechanism of an AMPA-subtype glutamate receptor. Nature 462, 745–756.

Tao, X., Avalos, J.L., Chen, J., MacKinnon, R., 2009. Crystal structure of the eukaryoticstrong inward-rectifier K+ channel Kir2.2 at 3.1 A resolution. Science 326,1668–1674.

Wiener, M.C., 2004. A pedestrian guide to membrane protein crystallization.Methods 34, 364–372.

Lemieux, M.J., Song, J., Kim, M.J., Huang, Y., Villa, A., Auer, M., Li, X.D., Wang, D.N.,2003. Three-dimensional crystallization of the Escherichia coli glycerol-3-phosphate transporter: a member of the major facilitator superfamily. ProteinSci. 12, 2748–2756.

Serrano-Vega, M.J., Magnani, F., Shibata, Y., Tate, C.G., 2008. Conformationalthermostabilization of the beta1-adrenergic receptor in a detergent-resistantform. Proc. Natl. Acad. Sci. USA 105, 877–882.

Warne, T., Serrano-Vega, M.J., Tate, C.G., Schertler, G.F., 2009. Development andcrystallization of a minimal thermostabilised G protein-coupled receptor.Protein Expr. Purif. 65, 204–213.

Dessailly, B.H., Nair, R., Jaroszewski, L., Fajardo, J.E., Kouranov, A., Lee, D., Fiser, A.,Godzik, A., Rost, B., Orengo, C., 2009. PSI-2: structural genomics to cover proteindomain family space. Structure 17, 869–881.

Joachimiak, A., 2009. High-throughput crystallography for structural genomics.Curr. Opin. Struct. Biol. 19, 573–584.

Terwilliger, T.C., Stuart, D., Yokoyama, S., 2009. Lessons from structural genomics.Annu. Rev. Biophys. 38, 371–383.

Graslund, S., Nordlund, P., Weigelt, J., Hallberg, B.M., Bray, J., Gileadi, O., Knapp, S.,Oppermann, U., Arrowsmith, C., Hui, R., Ming, J., dhe-Paganon, S., Park, H.W.,Savchenko, A., Yee, A., Edwards, A., Vincentelli, R., Cambillau, C., Kim, R., Kim,S.H., Rao, Z., Shi, Y., Terwilliger, T.C., Kim, C.Y., Hung, L.W., Waldo, G.S., Peleg, Y.,Albeck, S., Unger, T., Dym, O., Prilusky, J., Sussman, J.L., Stevens, R.C., Lesley, S.A.,Wilson, I.A., Joachimiak, A., Collart, F., Dementieva, I., Donnelly, M.I.,Eschenfeldt, W.H., Kim, Y., Stols, L., Wu, R., Zhou, M., Burley, S.K., Emtage, J.S.,Sauder, J.M., Thompson, D., Bain, K., Luz, J., Gheyi, T., Zhang, F., Atwell, S., Almo,S.C., Bonanno, J.B., Fiser, A., Swaminathan, S., Studier, F.W., Chance, M.R., Sali, A.,Acton, T.B., Xiao, R., Zhao, L., Ma, L.C., Hunt, J.F., Tong, L., Cunningham, K., Inouye,M., Anderson, S., Janjua, H., Shastry, R., Ho, C.K., Wang, D., Wang, H.,. Jiang, M.,Montelione, G.T., Stuart, D.I., Owens, R.J., Daenke,.S., Schutz, A., Heinemann, U.,Yokoyama, S., Bussow, K., Gunsalus, K.C. (2008). Protein production andpurification. Nat. Methods 5, 135–146.

Aslanidis, C., de Jong, P.J., 1990. Ligation-independent cloning of PCR products (LIC-PCR). Nucleic Acids Res. 18, 6069–6074.

Clark, K.M., Fedoriw, N., Robinson, K., Connelly, S.M., Randles, J., Malkowski, M.G.,Detitta, G.T., Dumont, M.E. Purification of transmembrane proteins fromSaccharomyces cerevisiae for X-ray crystallography. Protein Expr Purif.

Liguori, L., Marques, B., Villegas-Mendez, A., Rothe, R., Lenormand, J.L., 2007.Production of membrane proteins using cell-free expression systems. ExpertRev. Proteomics 4, 79–90.

Schwarz, D., Dotsch, V., Bernhard, F., 2008. Production of membrane proteins usingcell-free expression systems. Proteomics 8, 3933–3946.

Page, R., Moy, K., Sims, E.C., Velasquez, J., McManus, B., Grittini, C., Clayton, T.L.,Stevens, R.C. (2004). Scalable high-throughput micro-expression device forrecombinant proteins. Biotechniques 37, 364, 366, 368 passim.

Lesley, S.A., Kuhn, P., Godzik, A., Deacon, A.M., Mathews, I., Kreusch, A., Spraggon, G.,Klock, H.E., McMullan, D., Shin, T., Vincent, J., Robb, A., Brinen, L.S., Miller, M.D.,McPhillips, T.M., Miller, M.A., Scheibe, D., Canaves, J.M., Guda, C., Jaroszewski, L.,Selby, T.L., Elsliger, M.A., Wooley, J., Taylor, S.S., Hodgson, K.O., Wilson, I.A.,Schultz, P.G., Stevens, R.C., 2002. Structural genomics of the Thermotogamaritima proteome implemented in a high-throughput structuredetermination pipeline. Proc. Natl. Acad. Sci. USA 99, 11664–11669.

913

Please cite this article in press as: Mancia, F., Love, J. High-throughput exdoi:10.1016/j.jsb.2010.03.021

ED

PR

OO

F

Brodsky, O., Cronin, C.N., 2006. Economical parallel protein expression screeningand scale-up in Escherichia coli. J. Struct. Funct. Genomics 7, 101–108.

Dobrovetsky, E., Lu, M.L., Andorn-Broza, R., Khutoreskaya, G., Bray, J.E., Savchenko, A.,Arrowsmith, C.H., Edwards, A.M., Koth, C.M., 2005. High-throughput productionof prokaryotic membrane proteins. J. Struct. Funct. Genomics 6, 33–50.

Dobrovetsky, E., Menendez, J., Edwards, A.M., Koth, C.M., 2007. A robust purificationstrategy to accelerate membrane proteomics. Methods 41, 381–387.

Lewinson, O., Lee, A.T., Rees, D.C., 2008. The funnel approach to theprecrystallization production of membrane proteins. J. Mol. Biol. 377, 62–73.

Yatime, L., Buch-Pedersen, M.J., Musgaard, M., Morth, J.P., Lund Winther, A.M.,Pedersen, B.P., Olesen, C., Andersen, J.P., Vilsen, B., Schiott, B., Palmgren, M.G.,Moller, J.V., Nissen, P., Fedosova, N., 2009. P-type ATPases as drug targets: toolsfor medicine and science. Biochim. Biophys. Acta 1787, 207–220.

Eshaghi, S., Hedren, M., Nasser, M.I., Hammarberg, T., Thornell, A., Nordlund, P.,2005. An efficient strategy for high-throughput expression screening ofrecombinant integral membrane proteins. Protein Sci. 14, 676–683.

Hartley, J.L., Temple, G.F., Brasch, M.A., 2000. DNA cloning using in vitro site-specificrecombination. Genome Res. 10, 1788–1795.

Klock, H.E., Koesema, E.J., Knuth, M.W., Lesley, S.A., 2008. Combining thepolymerase incomplete primer extension method for cloning andmutagenesis with microscreening to accelerate structural genomics efforts.Proteins 71, 982–994.

Willis, M.S., Koth, C.M., 2008. Structural proteomics of membrane proteins: a surveyof published techniques and design of a rational high-throughput strategy.Methods Mol. Biol. 426, 277–295.

Wang, D.N., Safferling, M., Lemieux, M.J., Griffith, H., Chen, Y., Li, X.D., 2003. Practicalaspects of overexpressing bacterial secondary membrane transporters forstructural studies. Biochim. Biophys. Acta 1610, 23–36.

Veesler, D., Blangy, S., Siponen, M., Vincentelli, R., Cambillau, C., Sciara, G., 2009.Production and biophysical characterization of the CorA transporter fromMethanosarcina mazei. Anal. Biochem. 388, 115–121.

Kawate, T., Gouaux, E., 2006. Fluorescence-detection size-exclusionchromatography for precrystallization screening of integral membraneproteins. Structure 14, 673–681.

Hammon, J., Palanivelu, D.V., Chen, J., Patel, C., Minor Jr., D.L., 2009. A greenfluorescent protein screen for identification of well-expressed membraneproteins from a cohort of extremophilic organisms. Protein Sci. 18, 121–133.

Daley, D.O., Rapp, M., Granseth, E., Melen, K., Drew, D., von Heijne, G., 2005. Globaltopology analysis of the Escherichia coli inner membrane proteome. Science 308,1321–1323.

Punta, M., Love, J., Handelman, S., Hunt, J.F., Shapiro, L., Hendrickson, W.A., Rost, B.,2009. Structural genomics target selection for the New York consortium onmembrane protein structure. J. Struct. Funct. Genomics 10, 255–268.

Pruitt, K.D., Tatusova, T., Maglott, D.R., 2007. NCBI reference sequences (RefSeq): acurated non-redundant sequence database of genomes, transcripts andproteins. Nucleic Acids Res. 35, D61–D65.

Krogh, A., Larsson, B., von Heijne, G., Sonnhammer, E.L., 2001. Predictingtransmembrane protein topology with a hidden Markov model: application tocomplete genomes. J. Mol. Biol. 305, 567–580.

Li, W., Godzik, A., 2006. Cd-hit: a fast program for clustering and comparing largesets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659.

Moller, S., Croning, M.D., Apweiler, R., 2001. Evaluation of methods for theprediction of membrane spanning regions. Bioinformatics 17, 646–653.

Esnouf, R.M., Hamer, R., Sussman, J.L., Silman, I., Trudgian, D., Yang, Z.R., Prilusky, J.,2006. Honing the in silico toolkit for detecting protein disorder. ActaCrystallogr. D Biol. Crystallogr. 62, 1260–1266.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J.,1997. Gapped BLAST and PSI-BLAST: a new generation of protein databasesearch programs. Nucleic Acids Res. 25, 3389–3402.

Keseler, I.M., Bonavides-Martinez, C., Collado-Vides, J., Gama-Castro, S., Gunsalus, R.P.,Johnson, D.A., Krummenacker, M., Nolan, L.M., Paley, S., Paulsen, I.T., Peralta-Gil, M.,Santos-Zavaleta, A., Shearer, A.G., Karp, P.D., 2009. EcoCyc: a comprehensive viewof Escherichia coli biology. Nucleic Acids Res. 37, D464–D470.

Everett, J.K., Acton, T.B., Montelione, G.T., 2004. Primer Prim’er: a web based serverfor automated primer design. J. Struct. Funct. Genomics 5, 13–21.

Kapust, R.B., Tozser, J., Copeland, T.D., Waugh, D.S., 2002. The P1’ specificity oftobacco etch virus protease. Biochem. Biophys. Res. Commun. 294, 949–955.

Bernard, P., Couturier, M., 1992. Cell killing by the F plasmid CcdB protein involvespoisoning of DNA-topoisomerase II complexes. J. Mol. Biol. 226, 735–745.

DeAngelis, M.M., Wang, D.G., Hawkins, T.L., 1995. Solid-phase reversibleimmobilization for the isolation of PCR products. Nucleic Acids Res. 23, 4742–4743.

Knight, C.G., Kassen, R., Hebestreit, H., Rainey, P.B., 2004. Global analysis ofpredicted proteomes: functional adaptation of physical properties. Proc. Natl.Acad. Sci. USA 101, 8390–8395.

Liu, J., Rost, B., 2001. Comparing function and structure between entire proteomes.Protein Sci. 10, 1970–1979.

Hayashi, Y., Matsui, H., Takagi, T., 1989. Membrane protein molecular weightdetermined by low-angle laser light-scattering photometry coupled with high-performance gel chromatography. Methods Enzymol. 172, 514–528.

Slotboom, D.J., Duurkens, R.H., Olieman, K., Erkens, G.B., 2008. Static-light scatteringto characterize membrane proteins in detergent solution. Methods 46, 73–82.

Levin, E.J., Quick, M., Zhou, M., 2009. Crystal structure of a bacterial homologue ofthe kidney urea transporter. Nature 462, 757–761.

Waight, A.B., Love, J., Wang, D.N. (2009). Structure and mechanism of a pentamericformate channel. Nat. Struct. Mol. Biol.

pression and purification of membrane proteins. J. Struct. Biol. (2010),