

Partial Acetylation of Lysine Residues Improves Intraprotein Cross-Linking

Xin Guo,†,‡ Pradipta Bandyopadhyay,‡,# Birgit Schilling,§ Malin M. Young,| Naoaki Fujii,‡ Tiba Aynechi,‡ R. Kiplin Guy,⊥ Irwin D. Kuntz,‡ and Bradford W. Gibson*,‡,§

Department of Pharmaceutics and Medicinal Chemistry, University of the Pacific, Stockton, California 95211, Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143, Buck Institute for Age Research, Novato, California 94945, Sandia National Laboratories, Livermore, California 94551-0969, and St. Jude Children's Research Hospital, Memphis, Tennessee 38105

Intramolecular cross-linking coupled with mass spectrometric identification of cross-linked amino acids is a rapid method for elucidating low-resolution protein tertiary structures or fold families. However, previous cross-linking studies on model proteins, such as cytochrome c and ribonuclease A, identified a limited number of peptide cross-links that are biased toward only a few of the potentially reactive lysine residues. Here, we report an approach to improve the diversity of intramolecular protein cross-linking starting with a systematic quantitation of the reactivity of lysine residues of a model protein, bovine cytochrome c. Relative lysine reactivities among the 18 lysine residues of cytochrome c were determined by the ratio of d0 and acetyl-d3 groups at each lysine after partial acetylation with sulfosuccinimidyl acetate followed by denaturation and quantitative acetylation of remaining unmodified lysines with acetic-d6 anhydride. These lysine reactivities were then compared with theoretically derived pKa and relative solvent accessibility surface values. To ascertain if partial N-acetylation of the most reactive lysine residues prior to cross-linking can redirect and increase the observable Lys-Lys cross-links, partially acetylated bovine cytochrome c was cross-linked with the amine-specific, bis-functional reagent, bis(sulfosuccinimidyl)suberate. After proteolysis and mass spectrometry analysis, partial acetylation was shown to significantly increase the number of observable peptides containing Lys-Lys cross-links, shifting the pattern from the most reactive lysine residues to less reactive ones. More importantly, these additional cross-linked peptides contained novel Lys-Lys cross-link information not seen in the non-acetylated protein and provided additional distance constraints that were consistent with the crystal structure and facilitated the identification of the proper protein fold.

Recently, intramolecular protein cross-linking coupled with identification of peptide cross-links using mass spectrometry and computational modeling (MS3D)1-3 has emerged as an alternative strategy for determining protein folding. Compared with X-ray crystallography and NMR, MS3D offers structures of lower resolution (4-5 Å rms) but is more rapid and less demanding on sample quantity and purity. Moreover, a unique feature of intramolecular cross-linking technology is that the reactions can be carried out under physiological conditions, low concentrations, and/or in mixed systems, such as the presence or absence of a ligand or a secondary binding protein. Such features make MS3D potentially suitable for the high-throughput structure determination of large numbers of proteins discovered from genomic and proteomic projects.4,5 In theory, MS3D should be a broadly applicable methodology, as long as the protein carries, on the majority of its surface, sufficient residues that can be cross-linked to generate ∼N/10 informative distance constraints, where N is the number of amino acids in the protein.2

Soluble proteins generally have a number of exposed primary amines including lysine side chains and, when not blocked, the primary amino group at the N-terminus. Each of these amino groups is potentially available for amine-specific cross-linking using reagents such as bis(sulfosuccinimidyl)suberate (BS3).6 BS3 is one of the most frequently used of the many commercially available cross-linking reagents and is a homobifunctional, water-soluble, and membrane-impermeable reagent containing an N-hydroxysulfosuccinimide (sNHS) ester at each end of an eight-carbon spacer arm. NHS esters selectively acylate primary amines at pH 7-9 to form stable amide bonds, along with release of the sNHS leaving group. However, recent studies3,7-9 using NHS-based cross-linking reagents including BS3 yielded only a very limited number of

* To whom correspondence should be addressed. E-mail: bgibson@buckinstitute.org. Phone: 415 209 2032. Fax: 415 209 2231.
† University of the Pacific, Stockton.
‡ University of California, San Francisco.
§ Buck Institute for Age Research.
| Sandia National Laboratories.
⊥ St. Jude Children's Research Hospital.
# Current address: School of Information Technology, Jawaharlal Nehru University, New Delhi, India 110067.

(1) Albrecht, M.; Hanisch, D.; Zimmer, R.; Lengauer, T. In Silico Biol. 2002, 2, 325-337.
(2) Young, M. M.; Tang, N.; Hempel, J. C.; Oshiro, C. M.; Taylor, E. W.; Kuntz, I. D.; Gibson, B. W.; Dollinger, G. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 5802-5806.
(3) Kruppa, G. H.; Schoeniger, J.; Young, M. M. Rapid Commun. Mass Spectrom. 2003, 17, 155-162.
(4) Elkin, P. L. Mayo Clin. Proc. 2003, 78, 57-64.
(5) Attwood, T. K.; Miller, C. J. Biotechnol. Annu. Rev. 2002, 8, 1-54.
(6) Kotite, N. J.; Staros, J. V.; Cunningham, L. W. Biochemistry 1984, 23, 3099-3104.
(7) Taverner, T.; Hall, N. E.; O'Hair, R. A.; Simpson, R. J. J. Biol. Chem. 2002, 277, 46487-46492.
(8) Pearson, K. M.; Pannell, L. K.; Fales, H. M. Rapid Commun. Mass Spectrom. 2002, 16, 149-159.

Anal. Chem. 2008, 80, 951-960

10.1021/ac701636w CCC: $40.75 © 2008 American Chemical Society. Analytical Chemistry, Vol. 80, No. 4, February 15, 2008, 951. Published on Web 01/18/2008.


peptide cross-links per protein, even though the proteins under investigation (e.g., cytochrome c, ribonuclease A, and ubiquitin) carried many potentially modifiable lysine pairs that might be expected to react based on distance considerations. Compounding this problem was the fact that very few unique distance constraints could be derived from the peptide cross-links because many of the experimentally observed cross-linked peptides were variations of modification of the same limited subset of lysine residues.10

In this report, we address the challenge of diversifying the cross-linking reactions to obtain sufficient distance constraints for protein fold determination. Since a cross-linking reagent first modifies the most reactive amino acid residues,11 we hypothesize that the uneven reactivity of the lysine residues may be a limiting factor for the diversity of cross-linking reactions in the previous studies, with the most reactive residues quenching most of the cross-linking reagent. As cytochrome c has often been used as a model protein for intramolecular cross-linking reactions,3,8-10,12 we used this protein to determine the relative reactivities of its 18 amino groups to acetylation reactions and then to generate sets of intramolecular cross-links for the protein that had first undergone partial acetylation. These data were then compared with calculated pKa and relative solvent accessibility surface values of the amino groups. Our aim is to better understand the reactivity of lysine residues in proteins and find ways to improve the experimental cross-linking profile.

EXPERIMENTAL SECTION

Materials. Bovine heart cytochrome c and reagents for protein chemistry including iodoacetamide and DTT were obtained from Sigma (St. Louis, MO). Sequencing-grade modified trypsin (porcine, Promega, Madison, WI) and pepsin (from porcine gastric mucosa, Sigma, St. Louis, MO) were utilized for in-solution digestion reactions. Sulfosuccinimidyl acetate (sNHS-Ac) and bis(sulfosuccinimidyl)suberate (BS3) were purchased from Pierce/Thermo-Fisher (Rockford, IL). The deuterated acetylation reagent, acetic-d6 anhydride, was obtained from Cambridge Isotopes (Andover, MA). High-performance liquid chromatography (HPLC) solvents, such as acetonitrile (ACN) and water, were obtained from Burdick & Jackson (Muskegon, MI).

Partial Acetylation of Bovine Cytochrome c by sNHS-Ac. Bovine heart cytochrome c was dissolved in buffer (20 mM Na2HPO4, 150 mM NaCl, pH 7.5) to a final protein concentration of 40 µM, which was then divided into 100 µL aliquots for the following acetylation reactions. sNHS-Ac was dissolved in 5 mM sodium citrate buffer (pH 5, 4 °C) to give a 100 mM stock solution. The sNHS-Ac stock solution was immediately diluted to yield reaction solutions of 1, 2, 5, 10, 20, 50, and 100 mM. For each of these seven concentrations, a 10 µL aliquot was mixed with a 100 µL aliquot of the 40 µM cytochrome c solution, yielding sNHS-Ac molar equivalents of 2.5, 5, 12.5, 25, 50, 125, and 250, respectively, relative to the protein. The reaction mixtures were agitated and incubated at 4 °C for 4 h.
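The molar equivalents quoted above follow directly from the stated volumes and concentrations. The short Python sketch below reproduces that arithmetic; the function name and argument layout are illustrative choices, not part of the published protocol.

```python
def molar_equivalents(reagent_mM, reagent_uL, protein_uM, protein_uL):
    """Return moles of reagent per mole of protein after mixing two aliquots."""
    reagent_nmol = reagent_mM * reagent_uL            # mM x uL = nmol
    protein_nmol = protein_uM * protein_uL / 1000.0   # uM x uL = pmol, converted to nmol
    return reagent_nmol / protein_nmol

# Working concentrations of sNHS-Ac (mM) listed in the text.
working_mM = [1, 2, 5, 10, 20, 50, 100]

# 10 uL of reagent added to 100 uL of 40 uM cytochrome c, as described above.
equivalents = [molar_equivalents(c, 10, 40, 100) for c in working_mM]

for conc, eq in zip(working_mM, equivalents):
    print(f"{conc:>4} mM sNHS-Ac -> {eq:g} equiv")
```

Each 10 µL aliquot of a 1 mM solution carries 10 nmol of reagent against 4 nmol of protein, which gives the 2.5 equiv step that the series scales from.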

Denaturation, Alkylation, and Subsequent Exhaustive Acetylation.13,14 An aliquot of each partially acetylated cytochrome c reaction mixture (55 µL) was mixed with 33.3 µL of acetonitrile and 9.2 µL of 100 mM DTT. The mixture was heated at 60 °C for 1 h and then cooled to room temperature. Iodoacetamide (8.5 µL, 0.1 g/mL in water) was added to the mixture followed by incubation in the dark at 37 °C for 1 h. To quantitatively acetylate all remaining lysine residues not reacted by sNHS-Ac treatment, the denatured and alkylated protein was treated with acetic-d6 anhydride under conditions that minimize acetylation of hydroxy residues (i.e., half-saturated sodium acetate buffer). Each of the denatured protein solutions was mixed with 70 µL of saturated NaOAc buffer and then mixed with 3.5 µL of acetic-d6 anhydride. The mixture was agitated on a vortex mixer at room temperature for 30 min, and its pH was adjusted to 8.0 with 10 N NaOH. Another aliquot (6 µL) of acetic-d6 anhydride was added to each sample mixture, and the agitation was repeated at room temperature.

Mass Spectrometric Analysis of d0- and d3-Labeled N-Acetylated Lysines. To ensure high protein coverage, the various cytochrome c samples that had been N-acetylated and cysteine alkylated were subjected to proteolytic digestion with pepsin. First, each protein mixture was acidified with 20 µL of formic acid, diluted with 460 µL of water, and left at 4 °C overnight to ensure the complete hydrolysis of the excess acetic-d6 anhydride. These protein mixtures were then adjusted to pH 3 with formic acid, and an aliquot of pepsin (24 µL of 7 µg/mL solution in water) was added to each sample followed by agitation in the dark (room temperature, 16 h). An 18 µL aliquot of each pepsin digestion sample was mixed with 2 µL of 1% TFA and desalted using C4 zip-tips (Millipore, Bedford, MA). The concentration of acetonitrile in the zip-tip eluants (10 µL each) was lowered by a 10-fold dilution with water followed by a 10-fold reconcentration on a Speed-Vac apparatus (Savant, Thermo-Fisher Scientific, San Jose, CA). The proteolytic peptide mixtures were then acidified with an equal volume of 0.2% formic acid and analyzed by reverse-phase nano-HPLC-ESI-MS/MS. Briefly, peptides were separated on an Ultimate nanocapillary HPLC system equipped with a PepMap C4 nano-column (75 µm i.d. × 15 cm) (Dionex, Sunnyvale, CA) and a CapTrap Micro guard column of 0.5 µL bed volume (Michrom, Auburn, CA). Peptide mixtures were loaded onto the guard column and washed with the loading solvent (0.05% formic acid, flow rate: 20 µL/min) for 5 min, then transferred onto the analytical C18-nanocapillary HPLC column and eluted at a flow rate of 300 nL/min using the following gradient: 2% solvent B in A (from 0 to 5 min) and 2-60% solvent B in A (from 5 to 55 min). Solvent A consisted of 0.05% formic acid in 98% H2O/2% ACN, and solvent B consisted of 0.05% formic acid in 98% ACN/2% H2O. The column eluant was directly coupled to a QSTAR Pulsar i quadrupole orthogonal TOF mass spectrometer (MDS SCIEX, Concord, Canada) equipped with a Protana nanospray ion source (ProXeon Biosystems, Odense, Denmark). Mass spectra (ESI-MS) and tandem mass spectra (ESI-MS/MS) were recorded in positive-ion mode with a resolution of 12000-15000 full-width half-maximum.
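The d0- and d3-acetylated peptide pairs analyzed in this section differ only in the isotopic composition of the acetyl group, so their expected spacing in the mass spectrum can be worked out from standard monoisotopic masses. The sketch below is illustrative only: the constant and function names are hypothetical, and the spacing it returns is the theoretical value for a fully d0 versus a fully d3 labeled peptide, not a measured one.

```python
# Monoisotopic masses in Da (standard values).
MASS_H = 1.00782503
MASS_D = 2.01410178
MASS_C = 12.0
MASS_O = 15.99491462

# Net mass added to an amine by acetylation: gain of C2H3O (or C2D3O), loss of one H.
ACETYL_D0_SHIFT = 2 * MASS_C + 3 * MASS_H + MASS_O - MASS_H
ACETYL_D3_SHIFT = 2 * MASS_C + 3 * MASS_D + MASS_O - MASS_H

def pair_spacing_mz(n_acetyl, charge):
    """m/z spacing between the all-d0 and all-d3 forms of a peptide with n acetyl groups."""
    return n_acetyl * (ACETYL_D3_SHIFT - ACETYL_D0_SHIFT) / charge

print(f"acetyl-d0 adds {ACETYL_D0_SHIFT:.4f} Da, acetyl-d3 adds {ACETYL_D3_SHIFT:.4f} Da")
print(f"one acetyl group, 1+ ion: spacing {pair_spacing_mz(1, 1):.4f}")
print(f"three acetyl groups, 2+ ion: spacing {pair_spacing_mz(3, 2):.4f}")
```

The nominal 3 Da difference per acetyl group corresponds to about 3.019 Da exactly, and the observed m/z spacing shrinks in proportion to the charge state of the ion.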

(9) Dihazi, G. H.; Sinz, A. Rapid Commun. Mass Spectrom. 2003, 17, 2005-2014.
(10) Schilling, B.; Row, R. H.; Gibson, B. W.; Guo, X.; Young, M. M. J. Am. Soc. Mass Spectrom. 2003, 14, 834-850.
(11) Novak, P.; Young, M. M.; Schoeniger, J. S.; Kruppa, G. H. Eur. J. Mass Spectrom. (Chichester, Eng.) 2003, 9, 623-631.
(12) Kalkhof, S.; Ihling, C.; Mechtler, K.; Sinz, A. Anal. Chem. 2005, 77, 495-503.
(13) Shen, S.; Strobel, H. W. Arch. Biochem. Biophys. 1993, 304, 257-265.
(14) Shen, S.; Strobel, H. W. Arch. Biochem. Biophys. 1992, 294, 83-90.


Data Analysis and Quantitation of Concentration-Dependent Lysine N-Acetylation. The HPLC-MS/MS raw data files of protein acetylation experiments using 250 equiv of sNHS-Ac were searched with an in-house Mascot 2.1 database search engine program (Matrix Science)15 to obtain a library of identified acetylated peptides (search parameters for pepsin digestion: no enzyme specificity, 100 ppm mass accuracy). A subpopulation of the observed acetylated peptides (Supporting Information Table S1) was selected with good ion abundances and signal-to-noise ratios in both their MS and MS/MS spectra. This subset of acetylated peptides was then used for quantitation and determination of the Ac% values of cytochrome c lysine residues that were subjected to the stepwise sNHS-Ac/acetic-d6 anhydride experiments described above. The monoisotopic ions of the resulting d0-acetylated peptides and their d3-acetylated isotopic counterparts were extracted from the total ion chromatograms (TIC) using the QSTAR Analyst QS software (see Supporting Information Figures S1 and S2 as examples). Under our HPLC separation and MS ion extraction conditions, the d0- and d3-labeled peptides coeluted. The ion abundance data were transferred onto a Microsoft Excel worksheet, and small contributions from (minimally) overlapping isotopic ion clusters (e.g., d0 vs d3) were subtracted to obtain the ratio of the d0- and d3-acetylated peptide pairs. For a peptide with one lysine side chain, the molecular ions containing an acetyl-d0 versus an acetyl-d3 group were observed with a mass difference of 3 Da, and these ion pairs were used to calculate relative ion abundances (see Supporting Information Figure S1). For a peptide containing more than one lysine (or also containing the N-terminal Ac-Gly residue), the relative ion abundances of peptides containing acetyl-d0 and acetyl-d3 modifications were calculated based on molecular ions that were separated by n × 3 Da, where n was the number of acetyl groups (see Supporting Information Figure S2). The relative abundance was then normalized to yield the percentage of acetylation of the peptide by sNHS-Ac. For some of the lysine residues, peptides with only one lysine residue were detected, and the percentage of acetylation of that residue equals the percentage of acetylation of the peptide (Supporting Information Figure S1). Other residues were observed only in peptides with multiple lysine groups, and their percentage of acetylation was calculated by using the percentages of acetylation of more than one (overlapping) peptide. For example, the acetylation percentage of Lys-86 was calculated by the following formula:

$$A\%_{\mathrm{K86}} = 3\,A\%_{83\text{-}94} - 2\,A\%_{87\text{-}96}$$

where $A\%_{\mathrm{K86}}$ is the acetylation percentage of Lys-86, $A\%_{83\text{-}94}$ is the acetylation percentage of peptide 83-94 (AGIKKKGEREDL), and $A\%_{87\text{-}96}$ is the acetylation percentage of peptide 87-96 (KKGEREDLIA).
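The quantitation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Excel workflow: it assumes that the Ac% of a peptide is the fraction of its acetyl groups that are d0, that abundances have already been corrected for isotopic overlap, and that peptide 83-94 carries three lysines (Lys-86, 87, 88) while peptide 87-96 carries two (Lys-87, 88). The function names and all numerical inputs are hypothetical.

```python
def peptide_ac_percent(abundances):
    """Ac% of a peptide from its d0/d3 isotopologue series.

    abundances[k] is the ion abundance of the species carrying k acetyl-d0
    groups (and n - k acetyl-d3 groups), for k = 0..n. Assumed definition:
    the fraction of all acetyl groups that are d0, i.e. installed by sNHS-Ac.
    """
    n = len(abundances) - 1
    total = sum(abundances)
    d0_groups = sum(k * a for k, a in enumerate(abundances))
    return 100.0 * d0_groups / (n * total)

def lys86_ac_percent(ac_83_94, ac_87_96):
    """Ac% of Lys-86 from the two overlapping peptides, as in the formula above."""
    # 3 * A%(83-94) is the summed Ac% of Lys-86, 87, 88;
    # 2 * A%(87-96) is the summed Ac% of Lys-87, 88.
    return 3 * ac_83_94 - 2 * ac_87_96

# Hypothetical single-lysine peptide: d3 form at 300 counts, d0 form at 100 counts.
single = peptide_ac_percent([300, 100])

# Hypothetical peptide-level values for the two overlapping peptides.
k86 = lys86_ac_percent(60.0, 55.0)

print(f"single-lysine peptide Ac% = {single:.1f}")
print(f"Lys-86 Ac% = {k86:.1f}")
```

The subtraction works because the two peptides share Lys-87 and Lys-88, so their summed contributions cancel and only Lys-86 remains.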

Cross-Linking of Bovine Cytochrome c. The cross-linking reactions were carried out as previously reported2,10 with minor modifications. For cross-linking with no prior acetylation, cytochrome c from bovine heart was dissolved in 20 mM Na2HPO4, 150 mM NaCl, pH 7.5, to give a 10 µM solution and reacted with a 50 mol equiv excess of freshly prepared BS3 solution (5 mM citrate, pH 5.0/10 mM BS3). For cross-linking reactions following partial acetylation, the acetylated bovine cytochrome c in 20 mM Na2HPO4, 150 mM NaCl, pH 7.5, was diluted with the same buffer to 10 µM and reacted with a 50 mol equiv excess of freshly prepared BS3 solution (5 mM citrate, pH 5.0/10 mM BS3). Each cross-linking reaction mixture was incubated for up to 24 h at 4 °C and quenched with an aliquot of 1 M Tris·HCl at pH 8.0 to a final concentration of 10 mM Tris·HCl.

Mass Spectrometry and Identification of Cross-Linked Peptides. Tryptic digestion of bovine cytochrome c proceeded at 37 °C with a trypsin/protein ratio of 1:20 (wt/wt). After 16 h, another aliquot of trypsin was added, and digestion continued for an additional 2 h. The enzymatic digestion was quenched with phenylmethanesulfonyl fluoride (PMSF). The tryptic hydrolysate, consisting of both unmodified and modified peptides, was then analyzed by LC/MS under the same conditions as those for analyzing the digestions of exhaustively acetylated bovine cytochrome c. The Automated Spectrum Assignment Program (ASAP; http://roswell.ca.sandia.gov/~mmyoung/asap.html) was used to first assign the MS spectral data as previously described.2 The proposed cross-linked peptides and other types of modified peptides as proposed by ASAP were then further interrogated by analyzing their corresponding MS/MS spectra using the MS2Assign program (http://roswell.ca.sandia.gov/~mmyoung/ms2assign.html).10 For final confirmation, MS/MS spectra were inspected "manually" using criteria for evaluation similar to those published by Link et al.,16 such as the presence of a sequence tag of several contiguous b- and/or y-ions, considering as well the general patterns of cross-linked peptide fragmentation as previously described.10

Calculation of Lysine pKa and Solvent Accessibility in Cytochrome c. Assuming that the sterically unhindered, free-base form of a lysine side chain is the reactive species toward acylation, we calculated the pKa and the relative solvent accessible surface area (SAS) of all 18 lysine residues in bovine cytochrome c to explore their effects on the lysine reactivity toward acetylation. The pKa calculations were carried out with the DELPHI software package,17 using a structure of horse-heart cytochrome c (pdb ID 1HRC), which has 97% sequence similarity to the bovine cytochrome c used in our acetylation/cross-linking experiments (see below). SAS calculations were performed using the DMS program package with a probe radius of 4.0. The relative SAS values of all lysine residues were determined. The DMS program package was obtained from the Computer Graphics Lab, University of California, San Francisco.

Constraint-Based Protein Structure Prediction and Modeling. To generate a 3D structural model of bovine cytochrome c, we first used the 123D+ threading algorithm18 (http://123d.ncifcrf.gov/123D+.html) to generate sequence-structure alignments for cytochrome c against a fold family library of 1125 unique proteins (hobohm_97_45.cath).19 The threading calculation was performed using the following parameters: Costmatrix = dayhoff, Match = 0.00, Mismatch = 0.00, Gapinsert = 15.00, Gapextend = 3.00, Window length = 3, Fold Library = hobohm97_45.cath, Homology-Level = 0.00, Used Mode = global, Cost Function =

(15) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551-3567.
(16) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R., III. Nat. Biotechnol. 1999, 17, 676-682.
(17) Honig, B.; Nicholls, A. Science 1995, 268, 1144-1149.
(18) Alexandrov, N. N.; Nussinov, R.; Zimmer, R. M. In Pacific Symposium on Biocomputing '96; Hunter, L., Klein, T. E., Eds.; World Scientific Publishing Co.: Singapore, 1996; pp 53-72.
(19) Hobohm, U.; Sander, C. Protein Sci. 1994, 3, 522-524.

$$A\%_{\mathrm{K86}} = 3\,A\%_{83\text{-}94} - 2\,A\%_{87\text{-}96}$$


threading by contact capacity. The top 20 models were then reordered according to constraint error violations, which we have previously defined as the extent of the model violation of the cross-link-derived distance constraints.2 Structural alignments were performed using the DaliLite server (http://www.ebi.ac.uk/DaliLite/).20
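The reordering of threading models by constraint error can be pictured with a small Python sketch. This is not the scoring function of the cited work: the 24 Å Cα-Cα cutoff, the choice to sum the excess distance over all cross-links, and the function names and toy coordinates are all assumptions made for illustration.

```python
import math

# Hypothetical maximum C-alpha to C-alpha distance (angstroms) allowed for a
# cross-linked lysine pair; an assumed value, not taken from the text.
MAX_CA_DISTANCE = 24.0

def constraint_error(ca_coords, crosslinks, max_dist=MAX_CA_DISTANCE):
    """Sum of the distances by which a model exceeds each cross-link constraint."""
    error = 0.0
    for res_i, res_j in crosslinks:
        d = math.dist(ca_coords[res_i], ca_coords[res_j])
        error += max(0.0, d - max_dist)   # satisfied constraints contribute nothing
    return error

def rerank(models, crosslinks):
    """Return model names ordered from lowest to highest constraint error."""
    return sorted(models, key=lambda name: constraint_error(models[name], crosslinks))

# Toy example: two models, each mapping residue number -> C-alpha coordinates.
models = {
    "model_A": {13: (0.0, 0.0, 0.0), 86: (30.0, 0.0, 0.0)},   # violates by 6 angstroms
    "model_B": {13: (0.0, 0.0, 0.0), 86: (10.0, 0.0, 0.0)},   # satisfies the constraint
}
crosslinks = [(13, 86)]

print(rerank(models, crosslinks))
```

A model that places two cross-linked lysines farther apart than the cross-linker can span accumulates error and drops in the ranking, which is the sense in which additional unique cross-links help discriminate between candidate folds.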

RESULTS AND DISCUSSION

To investigate the acetylation differences of protein amine groups, we first developed a strategy for systematically evaluating the relative reactivity of all lysine residues using bovine cytochrome c as a model protein (see Scheme 1). Lysine residues of bovine cytochrome c were partially acetylated with different equivalents (defined as the molar ratio of the reagent and the protein) of sulfosuccinimidyl acetate (sNHS-Ac), which carries the same lysine-selective acylating group, sNHS, as the bifunctional cross-linking reagent BS3. After this titrated initial acetylation with sNHS-Ac, the unreacted amino groups remaining from the partially acetylated lysine residues were exhaustively acetylated13,14 under denaturing conditions with deuterated (d6) acetic anhydride. Therefore, on any single lysine, one expects to see different amounts of d0 and d3 N-acetyl modifications, whose ratio is reflective of the initial (partial) reaction with sNHS-Ac-d0.

To obtain comprehensive lysine coverage at the peptide level for mass spectrometry analysis, we carried out an exhaustive protease digestion with pepsin of the fully acetylated proteins. These pepsin digestions were then analyzed by nano-HPLC-MS/MS to provide unambiguous peptide assignments by MS/MS and to simultaneously measure the relative isotopic abundance of the precursor ions relative to their differing deuterium content. A complete list of acetylated peptides that were used for quantitation purposes and details of their mass spectrometric identification is provided as Supporting Information Table S1, as well as a full display of their respective Mascot annotated spectra (see Supporting Information Figure S9). The ratios of the abundances of the molecular ions of all lysine-containing peptides were then used to calculate the percentage of acetylation (Ac%) of each lysine residue at different molar equivalents of sNHS-Ac (Figure 1A and Supporting Information Figures S1 and S2). Since the N-terminal glycine of native bovine cytochrome c is fully acetylated prior to any reactions, its Ac% was used as a positive internal

(20) Holm, L.; Park, J. Bioinformatics 2000, 16, 566-567.

Scheme 1. Experimental Determination of Relative Reactivity of Lysine Residues in Cytochrome c

Figure 1. Acetylation of bovine cytochrome c: (A) Percentage of acetylation (Ac%, y axis) of lysine residues and N-terminus of bovine cytochrome c (x axis) at different equivalences of sNHS-Ac ranging from 0 to 250 equiv of sNHS-Ac, defined as mole of sNHS-Ac per mole of protein (z axis). An averaged Ac% is reported for Lys-99 and Lys-100 (99/100). As the N-terminus of the protein is fully acetylated (post-translational modification in the native protein), it serves here as a positive control. (B) Uneven distribution of lysine acetylation on bovine cytochrome c. Key: Top 6, the 6 most acetylated lysine residues; Least 6, the 6 least acetylated lysine residues; Mid 6, the remaining 6 of the 18 lysine residues not listed in either Top 6 or Least 6. Percentage of total acetylation is defined as the sum of Ac% of the 6 lysine residues divided by the sum of Ac% of all 18 lysine residues times 100%. (C) CR of lysine residues by percentage of acetylation (Ac%) at different equivalences of sNHS-Ac. See eq 1 for definition of CR.


control, which showed values close to 100% at all equivalents of sNHS-Ac (see Figure 1A).

From the experimental data, one can easily see that N-acetylation was biased toward a subset of lysine residues (see Figure 1B), especially at low equivalences of the acetylation reagent sNHS-Ac. At 2.5 equiv of sNHS-Ac, for example, the 6 most reactive of the 18 lysines in bovine cytochrome c, i.e., Lys-13, 22, 25, 72, 86, and 87, accounted for 55% of all the acetylation events, whereas the 6 lysines showing the lowest level of acetylation, i.e., Lys-5, 7, 39, 73, 99, and 100, accounted for only 12% of all the lysine acetylations. At higher equivalences of sNHS-Ac, the acetylation pattern became more evenly distributed, with the less reactive lysine residues accounting for a larger share of the acetylations. For example, at 25 equiv of sNHS-Ac, where approximately half (47%) of all the lysine residues were acetylated, the 6 least acetylated lysine residues accounted for 21% of all the lysine acetylation while the 6 most acetylated lysine residues accounted for about 50% of all the acetylation. Moreover, the ranking of the lysine residues by Ac% changes at different equivalences of sNHS-Ac (Figure 1C). Figure 1C shows larger changes occurring at low sNHS-Ac equivalences (5 and 12.5 equiv), as indicated by a larger value of CR (change of reactivity) as defined by eq 1, where $R_{i,j}$ is the rank by Ac% (1 indicates the highest Ac% and 17 the lowest Ac%) of a lysine residue i (except for Lys-99 and Lys-100, for which an averaged Ac% was determined and ranked) at a certain equivalence (j) of sNHS-Ac, and $R_{i,j-1}$ is the rank of residue i by Ac% at the next lower equivalence of sNHS-Ac (j - 1) applied to bovine cytochrome c.
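The "percentage of total acetylation" figures quoted above (for example, 55% for the six most acetylated lysines at 2.5 equiv) follow the definition given in the Figure 1 caption: the summed Ac% of a group of six residues divided by the summed Ac% of all 18. The Python sketch below shows that bookkeeping on made-up numbers; the site labels and Ac% values are hypothetical placeholders, not data from the paper, and the function names are illustrative.

```python
def rank_groups(ac_percent, group_size=6):
    """Split residues into most, middle, and least acetylated groups by Ac%."""
    ranked = sorted(ac_percent, key=ac_percent.get, reverse=True)
    return ranked[:group_size], ranked[group_size:-group_size], ranked[-group_size:]

def share_of_total(ac_percent, residues):
    """Summed Ac% of the given residues as a percentage of the summed Ac% of all residues."""
    total = sum(ac_percent.values())
    return 100.0 * sum(ac_percent[r] for r in residues) / total

# Hypothetical Ac% values for 18 sites (placeholders, not measured values).
values = [20, 18, 16, 14, 12, 10, 8, 7, 6, 5, 4, 4, 3, 3, 2, 2, 1, 1]
ac_percent = {f"site{i + 1:02d}": v for i, v in enumerate(values)}

top6, mid6, least6 = rank_groups(ac_percent)
for label, group in (("Top 6", top6), ("Mid 6", mid6), ("Least 6", least6)):
    print(f"{label}: {share_of_total(ac_percent, group):.1f}% of total acetylation")
```

Repeating this calculation at each sNHS-Ac equivalence is what reveals the shift of acetylation from the most reactive residues toward the less reactive ones as the reagent dose rises.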

To better understand our experimental values of bovine cytochrome c lysine reactivities to acetylation by sNHS-Ac, we carried out a theoretical calculation of lysine pKa and relative SAS values. As described in the Experimental Section, the pKa values for each of the 18 lysine residues were calculated using the method described by Honig and Nicholls,17 based on the crystal structure of horse-heart cytochrome c (pdb ID 1HRC), which has 97% sequence similarity to bovine cytochrome c. In a similar fashion, the relative SAS values for each lysine residue were also calculated using the DMS program package. The lysine residues were grouped as most reactive (red), medium reactive (green), and least reactive (blue) based on their lysine Ac% at 50 equiv sNHS-Ac as shown in Figure 1A. Lysine residues with Ac% greater than 80% are considered most reactive, between 50 and 80% medium reactive, and less than 50% least reactive.

Figure 2 displays the theoretically calculated pKa and SAS values for each individual lysine residue in cytochrome c in relation to its experimentally determined acetylation reactivity. One arrow, pointing from right to left, predicts an increase of lysine reactivity with decreasing pKa values. The second arrow, pointing from bottom to top, predicts an increase of lysine reactivity with increasing surface accessibility. The calculated lysine pKa's show a better correlation to the experimental reactivities than the SAS values. The correlation of lysine reactivities to their predicted surface accessibilities is modest at best, while the majority of the most reactive species clearly had the lowest pKa values. For example, residues Lys-87 (pKa 6.6) and Lys-13 (pKa 6.7) were considered most reactive at 50 equiv sNHS-Ac with lysine Ac% > 80%. In contrast, Lys-86 (pKa 6.5) and Lys-25 (pKa 6.7) were considered medium reactive at 50 equiv sNHS-Ac, despite starting out with the highest reactivity at the lowest level of acetylation reagent (2.5 equiv sNHS-Ac). Lysine residues with higher pKa values were mostly observed to be slowly acetylated, matching expectations of a correlation between pKa and lysine reactivity. Prime examples of this correlation are residues Lys-5 (pKa 7.7) and Lys-99 (pKa 8.1), which are considered least reactive based on results shown in Figure 1A. Two other lysine residues that were determined to have slow reactivity to acetylation, i.e., Lys-73 and Lys-100, exhibited still moderately high pKa values at 7.3 and 7.2, respectively. However, there are exceptions and borderline cases where one cannot simply rely on the pKa values for reactivity prediction, as rates are clearly influenced by other factors. For example, Lys-7 (pKa 6.8) showed slow reactivity at 2.5 equiv sNHS-Ac, but at higher levels of the acetylation reagent it was considered medium reactive (Ac% at 50 equiv reagent = 50-80%) and correlates somewhat better to the low pKa value.
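One common way to rationalize the pKa trend is that only the unprotonated amine acts as a nucleophile, and its equilibrium fraction follows the Henderson-Hasselbalch relation, 1 / (1 + 10^(pKa - pH)). The sketch below evaluates that fraction for the predicted pKa values quoted above. The pH of 7.5 is an assumed value chosen purely for illustration, not the reaction pH of the study, and the calculation ignores the steric and accessibility effects the text notes.

```python
def unprotonated_fraction(pka, ph):
    """Equilibrium fraction of amine in the neutral (nucleophilic) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PH = 7.5  # illustrative assumption, not taken from the study

# Predicted pKa values quoted in the text.
predicted_pka = {
    "Lys-86": 6.5, "Lys-87": 6.6, "Lys-13": 6.7, "Lys-25": 6.7,
    "Lys-7": 6.8, "Lys-100": 7.2, "Lys-73": 7.3, "Lys-5": 7.7, "Lys-99": 8.1,
}

fractions = {
    res: unprotonated_fraction(pka, ASSUMED_PH)
    for res, pka in predicted_pka.items()
}
for res, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{res}: pKa {predicted_pka[res]:.1f}, unprotonated fraction {frac:.2f}")
```

Under this simple model the fraction falls monotonically as pKa rises, which matches the general trend but not the exceptions discussed above (for example, Lys-7).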

$$
\mathrm{CR} = \sum_{i=1}^{17} \left( R_{i,j} - R_{i,j-1} \right)^{2} \tag{1}
$$

Figure 2. Relationship between pKa, relative SAS, and experimentally determined reactivities of individual lysine residues in horse-heart cytochrome c toward acetylation (also see Supporting Information Table S2 for more detailed information). The crystal structure of horse-heart cytochrome c was used as a surrogate for bovine cytochrome c, which shares 97% sequence identity. Lysine reactivity as determined from Figure 1A is indicated by color coding and assigned based on the Ac% at 50 equiv sNHS-Ac: lysines are marked in red (most reactive, Ac% > 80%), green (medium reactive, Ac% = 50-80%), and blue (least reactive, Ac% < 50%).

Analytical Chemistry, Vol. 80, No. 4, February 15, 2008 955

Changes in the Ac% rank of various lysine residues in bovine cytochrome c suggest that partial acetylation of the protein prior to cross-linking could improve the availability of the less reactive lysine residues to amine-specific cross-linking reagents, thus facilitating more diverse and more informative cross-linking reactions. To examine this possibility, bovine cytochrome c was acetylated with 5 or 12.5 equiv of sNHS-Ac, followed by cross-linking with 50 equiv of BS3 (Figure 3, Table 1, and Supporting Information Table S3). The cross-linking of bovine cytochrome c using 50 equiv of BS3 without prior acetylation is also reported for comparison. The low equivalences of sNHS-Ac were chosen because they induce large changes in the relative reactivity of lysine residues (large CR values, Figure 1C) and would impose the least perturbation of the tertiary protein structure.13,14 No intermolecular cross-linking (i.e., dimers) was detected by gel electrophoresis analyses under such acetylation/cross-linking conditions (Supporting Information Figure S3), suggesting that low-level acetylation of cytochrome c did not induce any aberrant protein-protein interactions at these working concentrations.

Cross-linking of bovine cytochrome c using the BS3 cross-linking reagent without any prior modification of lysine residues yielded a large number of type 0 cross-linked peptides (one end of the cross-linker modifies the protein and the other end is hydrolyzed) as well as several cross-links between closely spaced lysine residues (less than two amino acid residues apart in primary sequence), neither of which offers informative distance constraints.2,10 Only two informative distance constraints (Lys-7 to Lys-27 and Lys-39 to Lys-53) were determined from two mass spectrometry-verified cross-linked peptides (see Table 1). The small number of distance-informative cross-linked peptides supports our hypothesis that the uneven reactivity of protein residues is a limiting factor for the diversity of cross-linking reactions.
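The distinction between dead-end modifications, closely spaced cross-links, and informative cross-links can be expressed as a simple rule on residue numbers. The sketch below uses the criterion stated in the Figure 3 caption (informative when y > x + 4, with y ≥ x); the function name and the `min_separation` parameter are hypothetical, and the residue pairs are those named in the text.

```python
def classify_modification(res_a, res_b=None, min_separation=4):
    """Classify a BS3 modification by the sequence positions it links.

    res_b is None for a type 0 (dead-end) modification. The default
    separation of 4 follows the Figure 3 caption (y > x + 4).
    """
    if res_b is None or res_a == res_b:
        return "type 0 (dead-end)"
    x, y = sorted((res_a, res_b))
    if y > x + min_separation:
        return "informative"
    return "closely spaced"


# Lysine pairs mentioned in the text, plus one dead-end example.
examples = [(7, 27), (39, 53), (5, 87), (5, 8), (72, 73), (13, None)]
labels = {pair: classify_modification(*pair) for pair in examples}
for pair, label in labels.items():
    print(pair, label)
```

Keeping the separation as a parameter makes it easy to apply a stricter or looser cutoff than the one used in the figure.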

Acetylation of bovine cytochrome c with sNHS-Ac (5 or 12.5 molar equiv) prior to cross-linking led to the identification of a series of novel cross-linked peptides. All these new cross-links (as well as the Lys-acetylation sites discussed earlier) were rigorously confirmed by tandem mass spectrometry. Tandem mass spectra of all potential cross-linked peptides were analyzed using the MS2Assign program10 and then inspected manually before inclusion, using several criteria described in the Experimental Section, including the presence of sequence tags.16

Indeed, three additional informative distance constraints not observed in the original cytochrome c cross-linking experiments without prior acetylation could now be unequivocally assigned (Table 1 and Figure 3). Combining all cross-linking experiments (with and without prior acetylation), a total of five useful distance constraints resulting from cross-links that span "long" sequence stretches have now been identified. For completeness, a full list of all BS3-modified peptides containing type 1 or type 2 cross-links, including those between closely spaced lysine residues that contain no useful distance information, is given in Supporting Information Table S3.

Specific examples of these new cross-linked peptides are shown in Figure 4. For example, Figure 4A shows a novel peptide consisting of K39TGQAPGFSYTDANK53 (α peptide) and K100ATNE104 (β peptide) cross-linked between Lys-39 and Lys-100. Fragment ions from both peptides were observed, displaying a dominant y series (ynα and ynβ). Figure 4B shows a new peptide consisting of G(Ac)1DVEK5GK7 (α peptide) and K87K(Ac)GER91 (β peptide) cross-linked between Lys-5 and Lys-87. This type 2 cross-linked peptide also displays an extensive fragment ion series from both peptide chains. This latter peptide is of particular interest as it contains two neighboring lysine residues, Lys-87 and Lys-88, only one of which is acetylated (Lys-88), thus demonstrating the influence of N-acetylation on directing cross-linking reactions. Further MS/MS spectra of novel cross-linked peptides are shown in the Supporting Information, i.e., Figure S4 (cross-link between Lys-7 and Lys-100), Figure S5 (a different peptide from the one above, also cross-linked between Lys-5 and Lys-87), Figure S6 (cross-link between Lys-5 and Lys-8, no distance constraint), and Figure S7 (cross-link between Lys-72 and Lys-73, no distance constraint). MS/MS spectra for the peptides establishing the cross-links Lys7-Lys27 and Lys39-Lys53 (no prior acetylation) were previously published by Schilling et al.10

The beneficial effects of acetylation are evident in the fact that several of the new cross-links formed after prior acetylation of cytochrome c involve lysine residues that were identified in the differential acetylation experiments as having low reactivities (see Figures 1A and 2). Two such residues, Lys-5 and Lys-100, also have relatively high predicted pKa values. Thus, it seems, such residues became available for cross-linking only after partial acetylation of other residues in bovine cytochrome c. In addition, prior partial acetylation can block one of two lysine residues that are close in primary sequence and thereby prevent the two from cross-linking with each other. This in turn frees the unblocked lysine residue to cross-link with alternate residues, reactions that could lead to informative distance constraints. This may have been the case for the lysine clusters Lys-5,7,8, Lys-86,87,88, and Lys-99,100.

Cross-linking with prior acetylation by 12.5 equiv of sNHS-Ac yielded the same four distance constraints identified from cross-linking with prior acetylation by 5 equiv of sNHS-Ac. All five distance constraints from all experiments combined, with or without acetylation (see Table 1), are consistent with the recently reported X-ray crystal structure of cytochrome c (PDB ID 2B4Z), indicating that the acetylation and cross-linking conditions did not perturb the native fold of the protein.

Figure 3. Distance constraints between lysine residues of bovine cytochrome c derived from peptide cross-link hits (see Table 1 and Supporting Information Table S3). The position (x and y values, where y ≥ x) of each symbol denotes the residue numbers of two lysine residues whose α-carbons are separated by ≤23.85 Å, the maximum distance allowed with fully extended cross-linker and lysine side chains. The type 0 modifications (dead-end lysine acylations) fall onto the diagonal line y = x. Informative distance constraints carry y values larger than x + 4 and are presented as points above the red dotted line.

Protein fold identification (Table 2) was carried out in a two-step threading process as previously reported.2 First, we carried out sequence threading of bovine cytochrome c with 123D+18 to find the best 20 structural models out of a fold family database of 1125 proteins sharing less than 45% sequence identity.19 Five cytochrome c proteins are ranked first to fourth and eighth. Second, we re-ranked the models by their agreement with the cross-link-derived distance constraints, which resulted in ranking of the cytochrome c proteins from first to fifth, as shown in Table 2. The threading models were scored by using eq 2,2 where n is the number of constraints; $d_i$ is the Cα-Cα distance in the model for the two residues in constraint i; and $d_0$ is 23.85 Å, the maximum Cα-Cα through-space distance between BS3-cross-linked lysines.
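The constraint error of eq 2 sums, over all constraints, the amount by which each model Cα-Cα distance exceeds the 23.85 Å limit; satisfied constraints contribute zero. The following sketch shows one way to compute it. The coordinates are hypothetical placeholders rather than values from any threading model, and the function names are ours.

```python
import math

D0 = 23.85  # Å, maximum Cα-Cα distance for a BS3 cross-link (from the text)


def ca_distance(coord_a, coord_b):
    """Euclidean distance between two Cα positions given as (x, y, z) in Å."""
    return math.dist(coord_a, coord_b)


def constraint_error(distances, d0=D0):
    """Eq 2: total excess of the model distances over d0, in Å."""
    return sum(max(0.0, d - d0) for d in distances)


# Hypothetical Cα coordinates (Å) for three cross-linked lysine pairs.
pairs = [
    ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0)),   # 10.00 Å, constraint satisfied
    ((0.0, 0.0, 0.0), (0.0, 25.85, 0.0)),  # 25.85 Å, exceeds d0 by 2.00 Å
    ((1.0, 2.0, 3.0), (1.0, 2.0, 33.0)),   # 30.00 Å, exceeds d0 by 6.15 Å
]
error = constraint_error(ca_distance(a, b) for a, b in pairs)
print(f"constraint error = {error:.2f} Å")
```

A model that satisfies every constraint scores exactly zero, which is why a small constraint set cannot discriminate between many candidate folds.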

With only the two spatial constraints derived from cross-linking without prior N-acetylation, the re-ranking reports zero constraint violation for the initially identified top nine models from five different protein families, which is insufficient for definitive fold identification. When the 3 additional spatial constraints obtained from cross-linking with prior acetylation of cytochrome c were included (for a total of 5 distance constraints), all 5 cytochrome c proteins were re-ranked into the top 5 of the 20 structural models. Compared to the initial ranking of the top 20 models by sequence threading, the re-ranking based on all 5 spatial constraints confirmed the cytochrome c fold by moving two alternative fold models (β-spectrin, 1DRO, PH domain-like; pseudoazurin, 1PAZ, cupredoxin-like) out of the top six. All top six ranked structures after the constraint-based reordering are alignable with both 2B4Z, a crystal structure of bovine cytochrome c, and 1CRC, a crystal structure of chicken cytochrome c, with RMSD values of less than 3 Å. None of the lower-scoring structures are alignable with 2B4Z or 1CRC except for β-spectrin, which is barely alignable with 2B4Z (3.8 Å RMSD over 46 amino acid residues) but not alignable with 1CRC.
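The re-ranking step can be illustrated with the initial top eight threading models and their five-constraint errors as listed in Table 2. Sorting by constraint error and breaking ties by the original threading rank is an assumption on our part; it reproduces the relative order of these eight models in Table 2 but is not stated explicitly in the text.

```python
# (PDB ID, original threading rank, error in Å with five constraints),
# taken from Table 2 for the eight models ranked highest by threading.
models = [
    ("1ccr", 1, 0.00),
    ("3c2c", 2, 0.00),
    ("1cxc", 3, 0.00),
    ("1cot", 4, 0.00),
    ("1dro", 5, 12.97),
    ("1paz", 6, 0.99),
    ("1kte", 7, 0.23),
    ("1ctj", 8, 0.00),
]


def rerank(model_list):
    """Order models by constraint error, then by threading rank (assumed tie rule)."""
    return sorted(model_list, key=lambda m: (m[2], m[1]))


reranked_ids = [pdb_id for pdb_id, _, _ in rerank(models)]
print(reranked_ids)
```

Within this subset the five cytochrome c models (1ccr, 3c2c, 1cxc, 1cot, 1ctj) move to the top, while β-spectrin (1dro) and pseudoazurin (1paz) drop below glutaredoxin (1kte).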

The only non-cytochrome c protein among the top six models is 1KTE (glutaredoxin). Glutaredoxin has a different function than cytochrome c and less than 10% sequence identity with 2B4Z. However, its structure is alignable with both 2B4Z (2.7 Å RMSD over 42 amino acid residues, Figure 5) and 1CRC (2.8 Å RMSD over 46 amino acid residues). Since the goal of MS3D is fold identification rather than functional annotation or sequence alignment, we consider 1KTE (glutaredoxin) a successful hit and a demonstration of the robustness of the MS3D methodology.

CONCLUSION

Taken together, the variable and uneven nucleophilic reactivity observed in this study among protein lysine amino groups toward NHS ester cross-linking reagents can be identified as a limiting factor behind the low diversity of intramolecular protein cross-linking that is typically observed. As determined by partial (incomplete) reactions with the water-soluble amine-specific acetylation reagent sNHS-Ac, there are reactive "hot-spots" that can dominate the reaction profile and therefore place severe restrictions on the observable lysine-lysine cross-links. Part of the variation in this reactivity can be directly attributed to pKa differences among lysine residues, which can vary over 2 units. Other factors are likely to contribute as well, such as steric considerations and relative solvent accessibility. Partial acetylation of the most reactive lysines prior to cross-linking was shown here to be a simple and effective strategy to improve the diversity of the cross-linking reactions and thereby to generate sufficient distance constraints needed for reliable protein fold identification. Nonetheless, it was the combination of cross-linking reactions, ranging from unmodified to low and moderate levels of prior protein acetylation, that provided the most complete data set of useful distance constraints for proper fold family recognition.

Table 1. Observed Lys-Lys Cross-Links and Corresponding Cross-Linked Peptides from Bovine Cytochrome c, with and without Prior Acetylation, Containing Useful Distance Constraints^a

| equiv Ac-sNHS | X-link type^b | X-link positions^c | sequence number | peptide sequences | modifications | M (exp) | M (theor) | m/z (z) | ΔM (ppm) |
|---|---|---|---|---|---|---|---|---|---|
| 0^d | type 2 | K7-K27 | 6-8, 26-38 | HKTGPNLHGLFGR-GKK | BS3 | 1901.97 | 1902.08 | 635.00 (3) | 53 |
| 0^d | type 1 | K39-K53 | 39-55 | KTGQAPGFSYTDANKNK | BS3 | 1963.93 | 1963.98 | 655.65 (3) | 24 |
| 5 | type 2 | K5-K87 | 1-7, 87-88 | G(Ac)DVEKGK-KK | BS3, Ac | 1185.62 | 1185.68 | 396.22 (3) | 46 |
| 5 | type 2 | K7-K100 | 6-8, 100-104 | GKK-KATNE | BS3 | 1030.53 | 1030.58 | 516.27 (2) | 51 |
| 5 | type 2 | (K7 or K8)-K100 | 6-13, 100-104 | GKKIFVQK-KATNE or GKKIFVQK-KATNE | BS3 | 1687.89 | 1687.97 | 563.64 (3) | 48 |
| 5 | type 2 | K39-K100 | 39-53, 100-104 | KTGQAPGFSYTDANK-KATNE | BS3 | 2283.02 | 2283.12 | 762.02 (3) | 41 |
| 5 | type 1 | K39-K53 | 39-55 | KTGQAPGFSYTDANKNK | BS3 | 1963.86 | 1963.98 | 982.94 (2) | 62 |
| 12.5 | type 2 | K5-K87 | 1-7, 87-91 | G(Ac)DVEKGK-KK(Ac)GER | BS3, 2xAc | 1569.79 | 1569.85 | 524.27 (3) | 41 |
| 12.5 | type 2 | K7-K100 | 6-8, 100-104 | KATNE-GKK | BS3 | 1030.54 | 1030.58 | 516.28 (2) | 38 |
| 12.5 | type 2 | K39-K100 | 39-53, 100-104 | KTGQAPGFSYTDANK-KATNE | BS3 | 2283.02 | 2283.12 | 762.01 (3) | 43 |
| 12.5 | type 1 | K39-K53 | 39-55 | KTGQAPGFSYTDANKNK | BS3 | 1963.91 | 1963.98 | 982.96 (2) | 36 |

^a All reactions used 50 molar equiv of the cross-linking reagent, BS3, relative to the total number of lysine residues (18) in cytochrome c with variable equivalents of sNHS-Ac (with x = 0, 5, and 12.5 equiv acetyl reagent). ^b Cross-link type 2 (inter-peptide cross-link) and type 1 (intra-peptide cross-link). ^c X-link positions in bold represent significant and new distance constraints that formed after cross-linking of partially acetylated bovine cytochrome c. ^d Values as previously reported by Schilling et al.10

$$
\sum_{i=1}^{n}
\begin{cases}
0 & \text{if } d_i < d_0 \\
d_i - d_0 & \text{if } d_i > d_0
\end{cases}
\tag{2}
$$

Figure 4. Representative tandem mass spectra of novel cross-linked peptides identified only in the partially acetylated cytochrome c, showing type 2 cross-links (interpeptide cross-links between two peptide chains). (A) ESI-MS/MS spectrum of peptides K39TGQAPGFSYTDANK53 (α-peptide) and K100ATNE104 (β-peptide) cross-linked between Lys-39 and Lys-100. The [M + 3H]3+ signal at m/z 762.02 (Mexp = 2283.02) was selected as the precursor ion. (B) ESI-MS/MS spectrum of peptides G(Ac)1DVEK5GK7 (α-peptide) and K87K(Ac)GER91 (β-peptide) cross-linked between Lys-5 and Lys-87. The [M + 3H]3+ signal at m/z 524.27 (Mexp = 1569.79) was selected as the precursor ion. Two additional acetyl groups are located on the N-terminal residue Gly-1 and on Lys-88. (Note: for cross-linked peptides, the larger of the two peptides is referred to as the α-peptide and the smaller as the β-peptide according to the nomenclature of Schilling et al.10)

After completion of this study, we became aware of a paper in press by Lee et al.21 that used high-resolution Fourier transform mass spectrometry (FTMS) coupled to ultrahigh-pressure chromatography to identify cross-linked peptides from horse-heart cytochrome c after treatment with BS3, the same amine-specific bifunctional cross-linking reagent used in our study. It is interesting to note that they identified significantly more cross-links than had previously been reported for horse-heart cytochrome c,8 presumably due to the higher mass accuracy of FTMS and its increased dynamic range of ion detection. It should be pointed out, however, that horse-heart cytochrome c has an additional lysine residue compared with the bovine form used here, which increases both the number of potential Lys-Lys cross-links and tryptic cleavage sites. Nonetheless, a comparison of our results (after taking into account the nonhomologous Lys-60) shows that the absolute number of informative cross-links obtained after FTMS analysis was higher, although there remained unique cross-links observed only in our data set both before and after prior N-acetylation, e.g., Lys39-Lys53 and Lys7-Lys100, respectively.

(21) Lee, Y. J.; Lackner, L. L.; Nunnari, J. M.; Phinney, B. S. J. Proteome Res. 2007, 6, 3908-3917.

The publication of the work by Lee et al.21 also provides an opportunity to better understand the two dynamic range limitations in cross-linking experiments: first, the ability to detect cross-linked peptides covering many orders of magnitude in relative molar abundance and, second, the relative reactivity of the combinatorial set of paired lysine residues in a given protein that minimally satisfy the appropriate distance constraint. The study by Lee and colleagues addressed the former issue and clearly demonstrated the advantages of optimizing chromatographic separation and mass spectrometry sensitivity and mass accuracy. In contrast, our approach using prior acetylation takes aim at the second dynamic range limitation by redirecting cross-linking away from the most reactive and entropically favored lysine pairs to those where at least one of the paired members is less reactive. Alternatively, the new cross-linked lysine pairs found only in the protein after partial N-acetylation could be the result, at least in some cases, of overcoming an entropic disadvantage due to a larger distance that the partially reacted and tethered cross-linking reagent needs to explore, a strategy which would be predicted to reduce the number of redundant and often uninformative lysine cross-links between closely spaced lysine residues. Clearly, a combination of the two approaches, prior N-acetylation followed by better-optimized mass spectrometry and separation analysis, has the potential to be a very powerful improvement in overall experimental design.

With regard to our choice of lysine-modifying reagent (sNHS-Ac), it may also be possible to use other types of pretreatment regimens, such as succinylation, which might have the additional value of maintaining a charged site at lysine residues (although in this particular example the charge changes from positive to negative) and therefore maintaining protein solubility and tertiary structure. Of course, care needs to be taken when carrying out any protein modification, including N-acetylation and bifunctional cross-linking, as such modifications have the potential to alter protein structure. Indeed, this concern was one of the reasons we carried out both N-acetylation and bifunctional cross-linking at several concentrations and used 1D gel electrophoresis to check for possible changes in protein dimerization, which could yield potentially confounding interprotein cross-links. Overall, these findings have profound implications for the development of the MS3D methodology for the determination of tertiary structure and fold-family assignments into a more broadly applicable strategy for protein structure characterization.

Table 2. Top 20 Threading Models Ranked by Distance Constraint Error^a

| protein name | PDB ID | fold family | %ID | threading rank^b | error [Å] by two constraints^c | error [Å] by five constraints^d |
|---|---|---|---|---|---|---|
| ferricytochrome C | 1ccr | cytochrome C | 57.52 | 1 (1) | 0.00 | 0.00 |
| cytochrome c2 (Rhodospirillum rubrum) | 3c2c | cytochrome C | 31.90 | 2 (2) | 0.00 | 0.00 |
| cytochrome c2 (Rhodobacter sphaeroides) | 1cxc | cytochrome C | 26.98 | 3 (3) | 0.00 | 0.00 |
| cytochrome c2 (Paracoccus denitrificans) | 1cot | cytochrome C | 28.46 | 4 (4) | 0.00 | 0.00 |
| cytochrome c6 | 1ctj | cytochrome C | 18.18 | 5 (8) | 0.00 | 0.00 |
| glutaredoxin | 1kte | thioredoxin fold | 9.76 | 6 (7) | 0.00 | 0.23 |
| NTRC receiver domain | 1ntr | flavodoxin-like | 9.23 | 7 (16) | 0.00 | 0.80 |
| pseudoazurin | 1paz | cupredoxin-like | 6.35 | 8 (6) | 0.99 | 0.99 |
| pseudoazurin | 1pmy | cupredoxin-like | 7.94 | 9 (9) | 1.14 | 2.11 |
| pleckstrin | 1pls | PH domain-like | 7.58 | 10 (20) | 2.27 | 2.27 |
| profilin Ib | 1acf | profilin-like | 8.66 | 11 (19) | 0.00 | 3.02 |
| heat-labile enterotoxin | 1ltsD | OB-fold | 4.20 | 12 (15) | 3.61 | 3.61 |
| flavodoxin | 5nul | flavodoxin-like | 9.29 | 13 (11) | 1.30 | 3.76 |
| kedarcidin | 1akp | IgG-like β-sandwich | 3.97 | 14 (14) | 0.00 | 5.32 |
| β-spectrin | 1dro | PH domain-like | 8.94 | 15 (5) | 12.97 | 12.97 |
| RANTES | 1hrjA | IL8-like | 2.86 | 16 (18) | 2.80 | 13.30 |
| rous sarcoma virus protease | 2rspA | acid proteases | 6.77 | 17 (13) | 14.08 | 14.08 |
| CD2, first domain | 1cdb | IgG-like β-sandwich | 7.56 | 18 (17) | 4.65 | 20.50 |
| lipoyl domain of dihydrolipoamide acetyltransferase | 1iyv | barrel-sandwich hybrid | 6.48 | 19 (10) | 6.12 | 34.01 |
| designed zinc finger protein | 1meyF | zinc finger design | 8.62 | 20 (12) | 6.55 | 45.88 |

^a Constraint error is the extent of model violation of the cross-link-derived distance constraints, as defined by eq 2. ^b Ranking is listed as after re-ordering with the five distance constraints and, in parentheses, the original ranking. ^c Distance constraints obtained from cross-linking experiments (50 equiv BS3). ^d Distance constraints obtained from cross-linking experiments (50 equiv BS3) without and, in addition, with prior partial protein acetylation (5 and 12.5 equiv of sNHS-Ac).

Figure 5. Structural alignment of bovine cytochrome c (2B4Z, gray) and glutaredoxin (1KTE, cyan). The RMSD is 2.68 Å over 42 α-carbons.

ACKNOWLEDGMENT

This work is supported by NSF Grant CHE-0118481. We thank Mr. Abraham Lo for technical assistance.

SUPPORTING INFORMATION AVAILABLE

Tandem mass spectra of cross-linked and acetylated peptides obtained from cytochrome c, corresponding tables of acetylated and cross-linked peptides (details of mass spectrometric identification), and a table showing cytochrome c lysine pKa and SAS values. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review August 1, 2007. Accepted November 16, 2007.

AC701636W
