immunoglobulin heavy chain gene expression … · vhsegments ofthree clones (am4.1, am92.1,...

13
Immunoglobulin Heavy Chain Gene Expression in Peripheral Blood B Lymphocytes Chichi Huang,* A. Keith Stewart,* Robert S. Schwartz,* and B. David Stollar* *Department ofBiochemistry, Tufts University Schools ofMedicine, Dental Medicine, and Veterinary Medicine and Sackler School of Graduate Biomedical Sciences, and tDivision ofHematology/Oncology, Department ofMedicine, New England Medical Center Hospitals, Boston, Massachusetts 02111 Abstract cDNA libraries for IgM heavy chain variable regions were pre- pared from unmanipulated peripheral blood lymphocytes of two healthy people. Partial sequencing of 103 clones revealed VH gene family use and complete CDR3 and JH sequences. The libraries differed in the two subjects. In one person's cDNA the VH5 family was overexpressed and the VH3 family underex- pressed relative to genomic complexity. In the second person's cDNA, VH3 was most frequently expressed. In both libraries, JH4 was most frequent. VH segments of several clones were closely related to those in fetal repertoires. However, there was also evidence of mutation in many cDNAs. Three clones dif- fered from the single nonpolymorphic VH6 germline gene by 7-13 bases. Clones with several differences from VH5 germline gene VH251 were identified. CDR3 segments were highly di- verse. JH portions of several CDR3's differed from germline JH sequences. 44% of the clones had DH genes related to the DLR and Dxp families, most with differences from germline se- quences. In 11 DLR2-related sequences, several base substitu- tions could not be accounted for by polymorphism. Thus, circu- lating IgM-producing B cell populations include selected clones, some of which are encoded by variable region gene seg- ments that have mutated from the germline form. (J. Clin. Invest. 1992. 89:1331-1343.) Key words: antibody. B cell cDNA library . diversity , repertoire Introduction The extensive diversity of antibody variable regions is due in large measure to the division of germline coding regions into segments, e.g., the VH, DH, and JH segments which together encode the heavy chain variable region (1, 2). Random combi- nations of the V gene segments give the immune system a vast potential repertoire. In the mouse, for example, the potential repertoire exceeds 109, and perhaps 10'°, different antigen bind- ing sites (3). But because there are only 108 B cells in a mouse, only certain elements of the potential repertoire are repre- sented at any given time in the actual repertoire of the animal. Our understanding of how B cells use the tremendous capacity Address reprint requests to Dr. Stollar, Department of Biochemistry, Tufts University School of Medicine, 136 Harrison Avenue, Boston, MA 02111. Receivedfor publication 11 September 1991 and in revisedform 25 November 1991. of the potential repertoire to generate the actual repertoire is limited. Results of previous studies suggest that the actual, or ex- pressed, immunoglobulin repertoire is not simply a random representation of the germline V gene potential. Nonrandom VH gene utilization is especially marked in the early stages of fetal development in both mice and humans (4-10), in malig- nant B cells (1 1, 12), in CD5' B cells (13, 14), and in autoanti- body-forming B cells (15, 16). However, the lack of informa- tion about the B cell repertoire in normal adults makes it diffi- cult to assess the significance of the restricted use of immunoglobulin V genes during development and in disease. It is not known, for example, whether the preferential expres- sion of VH5 and VH6 gene families early in ontogeny (4) is a peculiarity of fetal B cells, or whether the B cell repertoire of normal adults can also manifest such a bias. Until recently, investigations of the human B cell repertoire were, for technical reasons, confined to EBV-transformed B cell clones, neoplastic B cells, and a relatively small number of hybridoma-produced monoclonal antibodies (17, 18). In situ hybridization with VH gene probes can greatly increase the number of B cells that can be surveyed (19, 20), and polymer- ase chain reaction (PCR)'-based analyses have increased the number even more (21-26). Nevertheless, all these methods introduce their own bias. For example, neither the mitogen- stimulated B cells used for most in situ hybridization studies nor EBV-transformed B cells are representative of the entire population (27). cDNA amplification by PCR has allowed anal- ysis of CDR3 sequences of human immunoglobulin cDNA populations (21, 24, 26), and it has been used in mice to esti- mate the frequency of rearrangement or expression of members of a given gene family (23, 28). But the lack of univer- sal primers that would enable unbiased amplification of all V gene families has limited the scope of the PCR technique for studies of the expressed immunoglobulin repertoire. We have recently described a sensitive method for amplify- ing the variable regions of immunoglobulin cDNAs of all VH families in a diverse mixture of B cells (29). The cDNA is am- plified without using primers from variable region sequences, thus avoiding technical bias in the selection of amplified cDNA populations. The representative sampling allowed by the method permits analysis of immunoglobulin genes expressed by unmanipulated B cells, and gives a "snapshot" of the actual immunoglobulin repertoire at a given time. We report here an analysis of 103 unique clones from IgM libraries obtained by this method from two normal healthy adults. The clones were from cDNA libraries prepared from peripheral blood lympho- cytes that were not stimulated in vitro, and whose only manipu- lation was centrifugation through Ficoll-Hypaque. 1. Abbreviation used in this paper: PCR, polymerase chain reaction. B Lymphocyte Immunoglobulin cDNA Libraries 1331 J. Clin. Invest. © The American Society for Clinical Investigation, Inc. 0021-9738/92/04/1331/13 $2.00 Volume 89, April 1992, 1331-1343

Upload: danghanh

Post on 25-Aug-2018

220 views

Category:

Documents


0 download

TRANSCRIPT

Immunoglobulin Heavy Chain Gene Expressionin Peripheral Blood B LymphocytesChichi Huang,* A. Keith Stewart,* Robert S. Schwartz,* and B. David Stollar**Department of Biochemistry, Tufts University Schools of Medicine, Dental Medicine, and Veterinary Medicineand Sackler School of Graduate Biomedical Sciences, and tDivision of Hematology/Oncology,Department of Medicine, NewEngland Medical Center Hospitals, Boston, Massachusetts 02111

Abstract

cDNAlibraries for IgM heavy chain variable regions were pre-pared from unmanipulated peripheral blood lymphocytes oftwo healthy people. Partial sequencing of 103 clones revealedVHgene family use and complete CDR3and JH sequences. Thelibraries differed in the two subjects. In one person's cDNAtheVH5 family was overexpressed and the VH3 family underex-pressed relative to genomic complexity. In the second person'scDNA, VH3 was most frequently expressed. In both libraries,JH4 was most frequent. VH segments of several clones wereclosely related to those in fetal repertoires. However, there wasalso evidence of mutation in many cDNAs. Three clones dif-fered from the single nonpolymorphic VH6 germline gene by7-13 bases. Clones with several differences from VH5 germlinegene VH251 were identified. CDR3segments were highly di-verse. JH portions of several CDR3's differed from germline JHsequences. 44% of the clones had DHgenes related to the DLRand Dxp families, most with differences from germline se-quences. In 11 DLR2-related sequences, several base substitu-tions could not be accounted for by polymorphism. Thus, circu-lating IgM-producing B cell populations include selectedclones, some of which are encoded by variable region gene seg-ments that have mutated from the germline form. (J. Clin.Invest. 1992. 89:1331-1343.) Key words: antibody. B cellcDNA library . diversity , repertoire

Introduction

The extensive diversity of antibody variable regions is due inlarge measure to the division of germline coding regions intosegments, e.g., the VH, DH, and JH segments which togetherencode the heavy chain variable region (1, 2). Randomcombi-nations of the V gene segments give the immune system a vastpotential repertoire. In the mouse, for example, the potentialrepertoire exceeds 109, and perhaps 10'°, different antigen bind-ing sites (3). But because there are only 108 B cells in a mouse,only certain elements of the potential repertoire are repre-sented at any given time in the actual repertoire of the animal.Our understanding of how B cells use the tremendous capacity

Address reprint requests to Dr. Stollar, Department of Biochemistry,Tufts University School of Medicine, 136 Harrison Avenue, Boston,MA02111.

Receivedfor publication 11 September 1991 and in revisedform 25November 1991.

of the potential repertoire to generate the actual repertoire islimited.

Results of previous studies suggest that the actual, or ex-pressed, immunoglobulin repertoire is not simply a randomrepresentation of the germline V gene potential. NonrandomVH gene utilization is especially marked in the early stages offetal development in both mice and humans (4-10), in malig-nant B cells (1 1, 12), in CD5' B cells (13, 14), and in autoanti-body-forming B cells (15, 16). However, the lack of informa-tion about the B cell repertoire in normal adults makes it diffi-cult to assess the significance of the restricted use ofimmunoglobulin V genes during development and in disease.It is not known, for example, whether the preferential expres-sion of VH5 and VH6 gene families early in ontogeny (4) is apeculiarity of fetal B cells, or whether the B cell repertoire ofnormal adults can also manifest such a bias.

Until recently, investigations of the human B cell repertoirewere, for technical reasons, confined to EBV-transformed Bcell clones, neoplastic B cells, and a relatively small number ofhybridoma-produced monoclonal antibodies (17, 18). In situhybridization with VH gene probes can greatly increase thenumber of B cells that can be surveyed (19, 20), and polymer-ase chain reaction (PCR)'-based analyses have increased thenumber even more (21-26). Nevertheless, all these methodsintroduce their own bias. For example, neither the mitogen-stimulated B cells used for most in situ hybridization studiesnor EBV-transformed B cells are representative of the entirepopulation (27). cDNAamplification by PCRhas allowed anal-ysis of CDR3 sequences of human immunoglobulin cDNApopulations (21, 24, 26), and it has been used in mice to esti-mate the frequency of rearrangement or expression ofmembers of a given gene family (23, 28). But the lack of univer-sal primers that would enable unbiased amplification of all Vgene families has limited the scope of the PCRtechnique forstudies of the expressed immunoglobulin repertoire.

Wehave recently described a sensitive method for amplify-ing the variable regions of immunoglobulin cDNAs of all VHfamilies in a diverse mixture of B cells (29). The cDNAis am-plified without using primers from variable region sequences,thus avoiding technical bias in the selection of amplified cDNApopulations. The representative sampling allowed by themethod permits analysis of immunoglobulin genes expressedby unmanipulated B cells, and gives a "snapshot" of the actualimmunoglobulin repertoire at a given time. Wereport here ananalysis of 103 unique clones from IgM libraries obtained bythis method from two normal healthy adults. The clones werefrom cDNA libraries prepared from peripheral blood lympho-cytes that were not stimulated in vitro, and whose only manipu-lation was centrifugation through Ficoll-Hypaque.

1. Abbreviation used in this paper: PCR, polymerase chain reaction.

B Lymphocyte Immunoglobulin cDNA Libraries 1331

J. Clin. Invest.© The American Society for Clinical Investigation, Inc.0021-9738/92/04/1331/13 $2.00Volume 89, April 1992, 1331-1343

Methods

Preparation of cDNALibraries from human peripheral blood lympho-cytes. cDNA libraries were prepared from peripheral blood lympho-cytes as described previously (29). Lymphocytes were centrifugedthrough a Ficoll-Hypaque medium and washed with PBS; they werenot further manipulated before preparation of RNA. Double-strandedcDNA was synthesized from total cellular RNA according to themethod of Gubler and Hoffman (30) and blunt-ended with T4 DNApolymerase. The primer for cDNAsynthesis was complementary to asequence within the Cu1 region (29). Two steps of PCRamplificationwere performed, as described previously (29). The first step was primedby oligonucleotide linkers attached to the ends of the double-stranded(ds) cDNA. The products were ligated into Ml3mpl9 RF DNA. Asecond amplification used a downstream-nested Cu primer and an up-stream primer within the M13 vector DNA. The second PCRproductswere again ligated to M13 RFDNA. This ligation mixture was trans-formed into DH5abacteria to form the cDNA library for screening.

Analysis of the libraries. Libraries were screened for hybridizationwith a degenerate human JH gene oligonucleotide probe and VHfamily-specific oligonucleotide probes (17). M13 plaques were lifted ontoGeneScreen membranes, which were then prehybridized, hybridized,and washed as described (31). Inserts in M13phage were sequenced bychain termination with dideoxynucleoside triphosphates and Sequen-ase (US Biochemical Co., Cleveland, OH). For full V region analysis,sequencing was performed with two or three different primers, givinglarge overlaps that verified sequencing accuracy. Sequences were com-pared to those in the human Genbank database with the FAsTA pro-gram of the GCGsoftware package. The BESTFIT, LNEUP, and TRANS-LATE programs were used for further sequence analysis.

Results

Amplified IgM cDNA libraries were prepared from RNAofunstimulated peripheral blood lymphocytes from two healthyadult donors (35 and 36 yr old). In both libraries, > 85%of theplaques hybridized with a degenerate JH oligonucleotide probe.Sequencing of randomly picked JH-positive clones began fromthe 5'-end of the CMA region and continued through the JH,CDR3and at least the FR3. Complete V region sequences wereobtained in selected cases. The sequence data allowed assign-ment of VH families and full analysis of DHand JH gene seg-ments. All clones discussed below had unique CDR3 se-quences. All but four of the sequences corresponded to func-tional rearrangements with open reading frames through theVH, CDR3, and JH segments. A total of 103 clones from thetwo libraries were examined.

Use of VHgenefamilies in the normal adult repertoire. The54 randomly picked clones of the first normal subject (Au)included more VH1 than VH3 family genes-28% vs. 24% (Ta-ble I). This result was surprising because the VH3 gene familyhas the greatest genomic complexity and was the most frequentfamily detected in studies of expressed VHgenes from 104- and1 30-d fetal liver cells (5, 6), in adult peripheral B cells examinedby in situ hybridization (19, 20), and in EBV-transformed Bcells (17, 33). The higher frequency of VHl than VH3 familygenes in the AMcDNA library was confirmed by hybridizationto plaque lifts with FR3 specific oligonucleotide probes; withthis assay, 35% and 25% of 400 JH-positive clones weremembers of the VHl and VH3 families, respectively.

Another notable feature of the AMlibrary was that the two-member VHS family was highly represented (Table I), occur-ring in 10 (19%) of the 54 sequenced clones. The high represen-tation of this small family was confirmed in two different IgM

Table I. VHGene Family Usage in pcDNA clones

Library

VHgene Au TA Germline genefamily (n = 54) (n = 49) complexity*

1 28 25 332 4 4 113 24 49 404 15 17 125 19 6 46 5 0 1t 5 0 ?

* From Berman et al. (32).VVHgenes with < 78% identity to members of VHI to VH6 families.

libraries prepared from the same RNAsample. Results with thetwo AMu preparations are combined in Table I. 3 of the 54 AMclones were related to the single germline VH6gene. The distri-bution of VH gene family usage in the AM library was at theborderline of being significantly different from that expectedfrom the genomic complexity of the families (32) (P - 0.05).

In contrast with the AM library, the TMu library was a closerreflection of the genomic complexity of VHgene families; VH3members were most frequent, and the VH5 family was notprominent (Table I). Statistically, the distribution of VHgeneusage in the TMAlibrary was not different than expected from thegenomic complexity of the gene families (P > 0.05). Only 4 ofthe 103 IgM sequences in both cDNA libraries could be as-signed to the VH2 family. The frequency of expression of genesof the VH2 family (- 4%), which is estimated to contain fivegenes (34), was confirmed by plaque hybridization with aVH2-specific FR3 oligonucleotide probe (17) (not shown). Ourresult is consistent with previous observations (19).

A distinct subgroup or a possible new VHgenefamily. TheVH segments of three clones (AM4. 1, AM92. 1, and AM2.2) dif-fered substantially from any known member of the VHl to VH6families. These three clones were similar to each other in theVH segment but each used a different D gene, so they weredistinct clones. Their VH region sequences had 78% overallidentity with a known VHl gene, 20P3 (5), which was the mostclosely related gene among reported members of the 6 VHgenefamilies. Their FRl and FR2 sequences were, in fact, 93%iden-tical to highly conserved VH1 gene sequences. However, theyhad only 67% identity with any known VHl sequence in CDR2and FR3 (Fig. 1). These three VHsequences had closest overallidentity (96%) with that of a previously described autoantibodywith dual rheumatoid factor and anti-DNA activity, Ab47,which was considered to be a subgroup of the VHl family (18)(Fig. 1). PCRamplification of nonlymphoid genomic DNAwas used to test whether related genes were present in the germ-line or whether these novel sequences may have arisen from asomatic process such as gene conversion (35). A sequence re-lated to the three new clones was indeed found in the nonlym-phoid genomic DNAof the donor for the AM library (data notshown). One primer for this PCRwas in the unique region ofthe FR3 and the other was in the VHl-like FR1. This combina-tion of primers amplified a product of appropriate size fromgenomic DNA. By contrast, a control combination of the

1332 Huang et al.

1GCAACAGGTGCCCACTCCCAGGTGCAGCTGGT

--C- A----------C-- - A ---- --- - - - - - - -

CCAATCTGGGTCTGAGTTGAAGAAGCCTGGGGCCTCAGTGAAAATTTCCTGCGAGACTTCTTGATACACCTTCACTAGCTAG--------------------- ----------------- GG---- C--A- -G----G-------------------G--------------------- ----------------- GG-------- A-G -----G-------------------G--------------------- ----------------- GG-------- A-G -----G-G- - - -C-G- -G ----- G-----G------------------------- GG-C------ A-G -----G------------ CG- - - -

CDR1

Ai2.2144.1

A492. IAb4720P3

4A2.2A44.1A492.1Ab4720P3

4A2.2A44.1A492. 1Ab4720P3

TGCTATGAATTGGGTGCGACAGGCCCCTGGACAAGGGCTTGAGTGGATGGGATGGATCAACACCAACACTGGGAGTCCAAC

------ G----------- C------------------------------------- --------------- -AC ----CTA-- - -C-C--------------------------------------------------- C-T -- --G- --TGGCA- -- A

CDRI CDR2

TTATGCCCAGGGCTTCACAGGACGGTTTGTCTTCTCCTTGGACACCTCTGTCAGCACGGCATATCTTCAGATCAGCAGCCT

G---------------------------------------------------------- A-------C---- A- - -AAG- -TCAG- -CA- -G-CAC-A-GA- -AG ----- G- -CA ----- A- -C- -CA-GG- -C-G ---- G- -

CDR2

AAAGGCTGAGGACACTGCCGTGTATTACTGTGCGAGA

C---------------------------- C- -- -AG--A----C-----------G-------------- G

G-GAT ---- -C---G--------- -CG

unique FR3 primer with a VH3 FRI primer did not yield an

amplification product.Expressed VHgenes in circulating B cells have mutations in

CDRJ and CDR2. For several clones in which partial se-

quences were very similar to those of known germline VHgenes, sequencing was extended at least through the CDR1.

Three clones from the AM libraries (AM34.2, AM46.2, andAM51.1) differed from the single, highly conserved germlineVH6 gene by 7, 13, and 7 bases (Fig. 2 a). To ensure that thesevariations were indeed mutations, the germline VH6gene of theAM donor was cloned and sequenced. It was identical to thepublished sequence of germline VH6. In AMu34.2, five of theseven VHbase substitutions were in CDR2and 2 were in FR2.The JH segment had one difference from germline JH5; thischange was in the 5'-end, which forms part of CDR3. The DHsegment could not be assigned to a known germline gene.

In AM46.2, seven differences from VH6 were clusteredwithin an 1-base segment in FR3,2 The other differences were

scattered, with two in FR and four in CDRsequences. TheCDR3-encoding portion of its JH gene segment differed by one

base from the germline JH4 sequence. The heavy chain variableregion codons of clone AM46.2, therefore, contained 12 basesubstitutions from germline VH and JH gene segments; five ofthose differences were in CDRs. The D segment of this clonediffered by three bases from a 17-base portion of a germline

2. The clustering suggested that these differences may have arisen froma hybridization artifact that can arise when related genes are present(see reference 75). However, the 11 -base pair segment involved was notcloser to any other VHfamily, and the rest of the VHstructure, on bothsides of the cluster, had a sequence characteristic of VH6.

Figure 1. Three novel VHsequences in the AMcDNA library, comparedwith the sequence of au-

toantibody Ab47 de-scribed previously bySanz et al. (18) and thegermline VHl gene 20P3(5). The AM and TM se-

quences in Figs. 1 and 2have been submitted toGenbank, and have beenassigned accession num-

bers M82889 to M82899.

DNl gene. The third VH6-related clone, AM51.1, differed byseven bases from the germline VH6 sequence. One differencewas in CDR1, one was in CDR2, and five were in frameworkregions. The CDR3-encoding region of the JH segment differedby one base from a germline JH5 sequence. The D gene seg-

ment of AM51.1 could not be assigned to a known germline DHsequence.

Five clones were very closely related to either VH251 or

VH32, the two functional germline members of the small andminimally polymorphic VH5 family (Fig. 2, b and c). Twoclones from the AM library (Au59.1 and Au99. 1) differed by 15(AM59. 1) and 5 (AM99. 1) positions from VH251. The substitu-tions tended to occur in the hypervariable regions; 10 of the 20substitutions in these two clones were in either the CDRl or

CDR2. The D segment of A59. 1 could not be assigned, butthat of AM99.1 had a six-base sequence identical to part ofDQ52. Clone AMu2.1 in the AM library differed at one position (inCDR2) from VH32, had an unassignable Dsegment, and had a

JH segment differing by one base in the CDR3-encoding por-tion from JHl (Fig. 2 c).

There were fewer differences from VH5 germline genes

among the TMu library clones. One clone in this library, TM 16,differed by a single base from VH251. Its Dsegment differed bytwo bases from a 15-base portion of the DxpI sequence. It hadonly one "N" base, at the DHJ junction, and it had an unmu-

tated JH2 gene. A second TM clone, TMO, differed from VH251at four positions (one in CDR1). Its CDR3had an 11-basesequence identical to a portion of DXP'1, and its JH6 sequencehad one base change, which was in the CDR3-encoding por-tion of the gene.

cDNA of gene VH26 (also termed VH18/2 and 30P1 [36])was represented in two clones selected at random for sequenc-

B Lymphocyte Immunoglobulin cDNA Libraries 1333

4j2.2A;4.1A492. 1Ab4720P3

4I2. 2A44.1A492.1Ab4720P3

1V86 GGTGTCCTGTCACAGGTACAGCTGCAGCAGTCAGGTCCAGGACTGGTGAAGCCCTCGCAGACCCTCTCACTCACCTGTGCCAAu3 4 .2 --------------------- - - - - - - - - - - - - - -

4Aj46.2 ----------------------------------- G--------------------------------------------- GAju51. 1 --------------------------------------- A---------------------------- G--------

VH6 G V L S Q V Q L Q Q S G P G L V K P S Q T L S L T C A IA434.2A446.2 V4A51.1 R-

VH6 TCTCCGGGGACAGTGTCTCTAGCAACAGTGCTGCTTGGMCTGGATCAGGCAGTCCCCATCGAGAGGCCTTGAGTGGCTGGGMGGACATA434.2 --------------------------------------------C---- G-------------------------Aju46 .2 ---------------------- T-------------------------------------------------------------------As51.1 --------A------------ C---------------------------A----------------------------------------

V86 S G D S V S S N S A A W N W I R Q S P S R G L E W L G R T YA434.2- PA446.2A451.1 T

CDR1 CDR2

VH6 ACTACAGGTCCAAGTGGTATAATGATTATGCAGTATCTGTGAAAAGTCGAATAACCATCAACCCAGACACATCCAAGAACCAGTTCTCCCA434.2 ------------------T-----G----------------G--G-A------------------------------------------A446.2 --C_-____________________C-_________________T-________ TA--A--TGTT_------------------------Aju51 .1 ----- G------------------------------ ------------------------------------------------------

VH6 Y R S K W Y N D Y A V S V K S R I T I N P D T S K N Q F S LA434.2 --- F G --- E GA446.2 H- ----- N ---I--V-A4u51.1 -G

CDR2

VH6 TGCAGCTGAACTCTGTGACTCCCGAGGACACGGCTGTGTATTACTGTGCMGAN D NA434.22 ----------------------------------------------------- GGGAGAGATGGCTACA (*)4A46.2 ----------------------------------------------------- GATCCA TATAGCAtCAa&TGG (DN1)451.1 ------------T---------------------------------------- GAGGCGGGGAGGGCCACACAG (*)

CDR3VH6 Q L N S V T P E D T A V Y Y C A RA434.2A446.2A4u51.1 - - - F

Vl,6 inA434 .2 TcCGACTCCTGGGGCCA&GGAACCCTGGTCACCGTCTCCTCA(JH55)Ajs46 .2 TACgTTGACTcCTGGGGCCAgGGMCCCTGGTCACCGTCTCCTCA(J,4)Ais51. 1 CTGGTTCGACcCCTGGGGCCtgGGMCCCTGGTCACCGTCTCCTCA(JH5) a

CDR3

1VH32 GGAGTCTGTGCCGAAGTGCAGCTGGTGCAGTCCGGAGCAGAGGTGAAAAAGCCCGGGGAGTCTCTGAGGATCTCCTGTAAGGA42.1 ----------------------------------------------------------------------------------

V32 G V C A E V Q L V Q S G A E V K K P G E S L R I S C K G42.1

VH32 GTTCTGGATACAGCTTTACCAGCTACTGGATCAGCTGGGTGCGCCAGATGCCCGGGAAAGGCCTGGAGTGGATGGGGAGGATTGATA42.1 --------------------------------------------------------------------------------------

V32 S G Y S F T S Y W I S W V R Q N P G K G L E W M G R I D42.1 - - - - - - - - - -- -

CDR1 CDR2

VH32 CCTAGTGACTCTTATACCAACTACAGCCCGTCCTTCCAAGGCCACGTCACCATCTCAGCTGACAAGTCCATCAGCACTGCCTACCTA42.1 ----------------------------T---

Figure 2. The heavy chainV32 P S D S Y T N Y S P S F Q G H V T I S A D K S I S T A Y L variable region cDNAse-A42.1- - L quences of clones in the

CDR2 AM1 and TM libraries con-

VH32 GCAGTGGAGCAGCCTGAAGGCCTCGGACACCGCCATGTATTACTGTGCGAGAN D N taining VHsegments re-

A42.1 ---------------------------------------------------- CGGGGCTTCAATGGCCAACTGATTTT(*) lated to (a) the singleCDR3 germline VH6 gene, (b,

V32 Q W S S L K A S D T A M Y Y C A R opposite page) the germ-AA2.1 -line VH5 family gene

VH25 1, and (c) the germ-JH C line VH5 family gene

AIL2.1 C TGGGGCCAGGGaACCCTGGTCACCGTCTCCTCA(Jml ) V VH32.

1334 Huang et al.

1GGAGTCTGTGCCGAGGTGCAGCTGGTGCAGTCTGGAGCAGAGGTGAAAMAGCCCGGGGAGTCTCTGAAGATCTCCTGTMAGGGTTCTGGATACAGCTTTACCAG

---------------------------------------------------------G--------------C----------------- G--C

G V C A E V Q L V Q S G A E V K K P G E S L K I S C K G S G Y S F T S------ - - - - - - - - - - - - R - - - -A - - - - - S T

--T

--NCDR1

CTACTGGATCGGCTGGGTGCGCCAGATGCCCGGGAAAGGCCTGGAGTGGATGGGGATCATCTATCCTGGTGACTCTGATACCAGATACAGCCCGTCCTTCCMA---------------- T ------------------------------ AC--------- -C-C---- --------------- AC------------------

YWI GWVRQMPGKGLEWMGII YP GDSDTRYSPSFQ

A F E

CDR1 CDR2

GGCCAGGTCACCATCTCAGCCGACAAGTCCATCAGCACCGCCTACCTGCAGTGGAGCAGCCTGAAGGCCTCGGACACCGCCATGTATTACTGTGCGAGA----- - T------------------------------------------------------------------------- C-T ----------------------------T--------------------------------------

-~~~~~~~~~~~~~~~~~A--------------------------------------------------------------- A-

G Q V T I S A D K S I S T A Y L Q W S S L K A S D T A M Y Y C A R- - - - F - - - - - - - - - - - - - - - - - - - - - - - - F - - -

N-H-N

N

TTTGCGACTC

CTTTCTT

DCTCGGGTGGCCGGGCCAGGAACAATACC

CTGGGGGGTaCGGGGCaTTAT

TATGGTTCGGGCDR3

N

C

GAGGAC

(*)

(DQ52)(DXP'1)(DXP'1)

JHACTTTGACTgg TGGGGCCAgGGAACCCTGGTCACCGTCTCCTCA(JH4)

CTACTACTACGGTATACGTCTGGGGcaAAGGGACCACGGTCACCGTCTCCTCA(JH6)GGTACTTCGATCTCTGGGGCCGTGGCACCCTGGTCACTGTCTCCTCA(JH2)

TACTACTaAtTACTACGGTATGGACGTCTGGGGGCAAGGGACCACGGTCACCGTCTCCTCA(JH6)CDR3

Figure 2 (Continued)

b

ing. It was identified by hybridization of membrane lifts withCDR and CDR2oligonucleotide probes. Two clones isolatedon this basis differed by one and three bases from the VH18/2gene (not shown); VH18/2 is a highly conserved gene (36).

Biased use and diversity of the JH4 gene. The JH4 gene was

overrepresented in the cDNA libraries of both subjects. It oc-curred in 48% of all sequenced clones (Fig. 3). Use of other JHgenes differed in the two subjects. For example, JH5 occurred in24%of the Aju sequences and 13%of the Tu sequences. TheJHland JH2 families were present infrequently, an observation alsomade by others (19, 21).

Considerable diversity was found in the 5' end of the 49sequenced JH4 gene segments (Fig. 4). This region of the genecontributes to the CDR3 of the heavy chain. The diversityarose both from variation in the point of JH4 joining to the restof the CDR3and from nucleotide differences from the germ-

line sequence. Apart from a previously recognized polymor-phic G/A variation (37), 15/28 Au clones and 11/21 Tju clonescontained substitutions in JH4, ranging from one to three innumber, that cannot be accounted for by polymorphism. Most(81%) of the substitutions were clustered in the CDR3-codingregion of the gene and most led to amino acid substitutions(Fig. 4 b).

Heterogeneity of DHregions. The DHsegment sequences inboth libraries were highly diverse (Figs. 5-7). Parts of some

CDR3sequences, ranging in length from 6 to 26 bases, were

identical to portions of known germline DH gene segments;such germline sequences were found in 10/54 Au clones (18%)and in 16/49 Tg clones (30%). The DHregion sequences of 12other Au clones (21%) and 9 other Tu clones (19%), from 15 to29 bases long, had only one or two differences from germlineDHgenes. The remaining 56 clones had more substantial differ-

B Lymphocyte Immunoglobulin cDNALibraries 1335

V8251Au59. 1AIL99. 1T016TpO

VH251A&59. 1A.99. 1Tj1 6TAO

VH251Au59.1AF99. 1Ti1 6TIO

VH251AP59. 1A#99. 1T;,16TFO

VH251Aft59. 1AF99. 1Tj16TpO

VH251A#59. 1Aju99. 1Tj16TAO

VH251A&59.1A,99. 1T;16T11O

Aju59.1AF99. 1Tj16TpO

60

54

48

42

* 3603 300

* 24

18

1 2

6

02 3 4

JH Gone Families

Ap

I.-:STp

5 6

Figure 3. Preferential usage of JH4 genes in the 54 Au and 49 Tgclones of the cDNA libraries. JH gene use was assigned on the basisof the complete JH sequence for each clone.

ences from known germline DHsequences (Fig. 5 a). 21 (Fig. 6)were not assigned to known DHgenes because they had eitherno identifiable sequence identity or, in some cases, becausethey had <75% identity with a known DHgene.

Direct comparisons between observed and germline se-

quences were possible with the DLRandDxp families, for whichthe expected germline members are known (38, 39), and withthe single DQ52 gene (40). The majority of the expressedmembers of these families in both Cu libraries contained differ-ences, ranging from one to five bases, from germline sequences

(Fig. 5). All assignable clones used only part of the germline DHgene segments, and N insertions were observed in all of them(Figs. 5 and 7). The average length of N at the VH-DH junctionswas 5.7 and at the DH-JH junctions was 4.7 bases. Amongclones with long N regions, eight CDR3 sequences might beaccounted for by D-D or D-DIR fusion (Fig. 7).

DHgene usage was not random. In the combined libraries,the DLR andDxp gene families were used with high frequency;these two families accounted for 54% of the assignable clones(44% of all clones). Sequences related to the DLR2 gene alonewere present in 11 clones; sequences related to DK4 and DN1also occurred at high frequency (Fig. 5). The DQ52 gene seg-

ment, which is overrepresented in human fetal liver (5), was

present only once in the AMlibrary and twice in the TM library.

Discussion

The sampling procedure. Wehave used a sensitive cDNA/PCRcloning method to examine usage of Ig heavy chain variableregion genes in peripheral blood B cells of two normal adultdonors. The procedure uses no variable region primers and,therefore, does not itself bias the V gene sampling (29). More-over, since the lymphocytes were not stimulated in vitro, theresults provide an insight into the V gene repertoire of circulat-ing B cells in their native state at the time blood is drawn.

We do not know whether all B cells synthesize enoughmRNAfor a cell to be scored in this analysis. In humans, many

of the circulating human B cells appear to be resting cells. Only0.1-1% of peripheral blood mononuclear cells synthesize

mRNAat levels that can be detected by in situ hybridization(19, 41). PCRhas a high sensitivity and probably samples alarger population than is detected by in situ hybridization.

The inherent error in PCR-based sequencing. The totalnumber of nucleotides in the fully sequenced JH4, VHS, andVH6 genes was 4,065, among which there were 96 differencesfrom germline sequences, for a rate of 24 bases per 1,000 (4X 10-4 per nucleotide incorporated in the two PCRsteps of 30cycles each). That frequency of base substitutions is muchhigher than the error rate of the PCRtechnique, which is - 5X IO- per nucleotide incorporated, both in the reported experi-ence of others (25, 42, 43) and in our own experience. Forexample, several clones in different libraries from one individ-ual that we studied had identical CDR3sequences and wereprobably multiple copies of a single cDNA. The substitutionfrequency among those sequences was 1/300 bases (2 X 10-'per nucleotide incorporated), a level at which PCRerror couldnot be distinguished from clonal divergence.

VH gene family usage. Previous studies of VH gene familyusage by circulating B cells from human adults, carried out byin situ hybridization with VH family-specific oligonucleotideprobes, have drawn different conclusions, perhaps because ofvariations in technique and in sampling procedures. In the ex-periments of Guigou et al. (19), there were differences amongindividuals, but an average pattern of VHgene family expres-sion by unstimulated cells could be defined. VH3 family geneswere the most frequently expressed, and VHgene family usagecorrelated roughly with their genomic complexity. Zouali andTheze (20) averaged results of protein A-stimulated B cellsfrom eight adults. They observed that the VH gene familieswere not represented in a random way; the VHI family wasunder-represented, whereas the VH3 family was overrepre-sented relative to genomic complexity.

The results of our study emphasize that there are indeeddifferences among single samples from different normal indi-viduals, as Guigou et al. (19) found. In the cDNAlibrary of onesubject (Abu), a nonrandom representation was seen, with dis-proportionate representation of VH5, and fewer than expectedVH3 gene family members. The Asulibrary, which also con-tained 3 VH6 members, resembles, in its overall composition, afetal CQ library (5, 6). Further study will be required to deter-mine whether that pattern is stable for the donor of the Aulymphocytes. It is possible that the nonrandom VHdistributionin this library reflects an unknown, recent immunizing stimu-lUS. VH gene usage in the cDNAlibrary from the TI donor, bycontrast, more closely paralleled the genomic complexity ofthefamilies; however, that single library does not exclude a contin-uously changing pattern of VH gene usage.

The three novel sequences, with FRI and FR2 sequencescharacteristic of VH1 genes and unique CDR2 and FR3 se-quences, along with Ab47 (18), may represent a distinct sub-group of the VHl family, as suggested by Sanz et al. (18), or anew VH gene family. A closely related combination of se-quences exists in the germline DNA, as shown by PCRamplifi-cation. This subset of genes may have arisen from a gene con-version or recombination in evolutionary time rather than as asomatic event. The clones in this group have 78% overall se-quence identity with the mouse immunoglobulin gene VH9.

The normal VH gene repertoire contains genes used by fe-tuses andfor autoantibodies. Table II summarizes the findingsin seven clones (sequenced at least from CDR1 to the end ofFR3) with 97% or more identity to members of the set of VH

1336 Huang et al.

-7

LO f--n" F-: ..

CDR3

TACTTTGACTAC___________

TGGGGCCAAGGAACCCTGGTCACCGTCTCCTCA

--T--------- --------G--G---------------------________ --------G---------A--------------

-___--------G----------______________------CT- --------G--------________________

--A-------- --------G---T----------------T------C-- --------G---------_______________

---C---G-------------G--------________-_______

bJH4

As4. 1At49.1Aj52.1AM59.1A#61.1A#70.1At90O.1Au92.1Aj94.1At95. 1AM96. 1Aul100. 1AM2.2AM29. 2AM3.2AM31.2Aju37.2AM39.2Au4.2AM40.2AM42.2AM44.2AFt45.2Ayu46.2Aju47.2A#52.2Agu6.2Ayu8. 2Tjs5T;&10TA17TM19TA20T1t2 1Tu22TM23TM24TA29Ty41T;L42TFt61T;74TM75TA76TM84Tjt87TM90TM98Tu100

CDR3

TACTTTGACTACY F D Y

_ - w

_ - w

_ - F

- F_ - H

LG -y -

_

L -L - -

-V-s

- w

-GF

LG-

D D -

L - L

L - -

L- H

- VG-S

TGGGGCCAAGGAACCCTGGTCACCGTCTCCTCAWGQGTLV TV S S

_ _ _

-~~ -

- -A- _S.

Figure 4. (a) JH4-related base sequences in clones of the Au and TA cDNA libraries. Most differences from the germline JH4 gene occur in theportion that contributes to CDR3. (b) Translated amino acid sequences of JH4-related segments of cDNAclones. Most of the base substitutionsin the CDR3portion lead to amino acid substitutions.

genes, such as 58P2, that has been a feature ofthe immunoglob-ulin V gene repertoire of fetal B cells. Five of the seven VHgenes represented in these clones are known to be used to formautoantibodies such as rheumatoid factor, cold agglutinins,and anti-DNA and anti-cardiolipin antibodies (VH251,21/28,FL2-2, VH6, and VH32 (37). To those we can add the threegenes closely related to Ab47, a rheumatoid factor/anti-DNAantibody. VHgenes with one and three base differences fromthe germline VH26, used in anti-DNA autoantibodies, were

also identified in the Al library.B cells capable of forming such autoantibodies are highly

represented among human-human hybridomas (36), EBV-transformed cells (13, 14, 44), and B-cell malignancies (1 1, 12).Many of them, like those listed in Table II, use VH genes thatare also expressed by fetal B cells, with few or no mutations

from the germline VH, DH, or JH components. These results are

compatible with the conclusion, drawn from studies of EBV-transformed B cells, that cDNAs associated with IgM autoanti-bodies are highly represented in the normal B cell repertoire(44). Somesuch immunoglobulins, encoded by VHgenes withfew mutations, may bind to both autoantigens and foreign an-

tigens such as bacterial polysaccharides (45, 46).Evidence that circulating B cells have undergone selection.

Several lines of evidence, when taken together, strongly suggestthat many IgM+ B cells in the circulating blood are not naive,but instead have undergone selection and clonal expansion.Four aspects of our results support that conclusion: overrepre-sentation of V region gene families, or of individual V regiongene segments; somatic mutation of V genes; the high fre-quency of replacement substitutions compared to silent muta-

B Lymphocyte Immunoglobulin cDNA Libraries 1337

aJH4AM4.1AFs49.1A#52.1AM59.1A#61. 1A#70O.1AM9O.1A#92.1AA94.1AM95. 1A#96.1AA100. 1AM2.2AM29.2AyL3.2A#31.2A#37.2A#39.2AM4.2AA40O.2AM42.2AM44.2AM45.2AM46. 2A#47. 2Ay52.2AM6.2AM8. 2TM5TM1OTys17Tt19Tyt2OTM21Tys22TA23TA24TM29TA41TM42TM61Tyu74TM75TA76TM84Tyu87TM9OTM98TM100 L

I

I

-----------

-----

---------

----- G--T--------

---------

--A-GG---

Ia DA'TGACTACAGTMCTAC

4106.1 GACCCCCCCA -------G-CG-----492.1 T ------T-G-----A437.2 ACGAGTTT -- --G--G-TIL17 ACCCGCTACGG - -- -G- -G- - -Tp92 ACACCAGAGA --G--------

DKOGTGGATATAGTGGCTACGATTAC

TIL41.2 GATTT --C--A -----

D I

TTGGAGGCCGTTACTTCCG

TT

DK4GTGGATACAGCTATGGTTAC

--A ---A-------A -------- - T-

--- -CA ------ T---- -T- - -G- - - -

CCCCcGAGCTACATTTT

CATGATC

DM1GGTATAACTGGAACTAC

-----A- - --C--C-

CTGGTCT 2TGGGGT 3TAGACGGATCCTAC

Dm2GGTATAACCGGAACCAC

-...T- -

-C ---- -------

Dm5GGTATAACTGGAACAAC

TA82 GGTTTAG --G---------

GAAGGTGGGGGTTGGTGGCGCCCTCCCAAATCCGATCCAATAACGCACG

GGGTTCCGACCCGAAAAGGCAAACCGGAGG

T5L58 A

T5L61 CGAG

Du1GGGTATAGCAGCAGCTGGTAC

--------TG-G- - - -AC- -

-C----------

-- - - -- -T--AG- ---

DN4GAGTATAGCAGCTCGTCC- - - -C--------

-C--

D,12..... .AGCCTCCGGAGCCCCCGCAGAGACCC...

-----TA--A-C-T--

D,12.... . . AGCCCAGCCCCCCACCCAGGAG

--- -A-T----

TTTAC

TCGGAGTTTAAAC

CT

GTTCAGT

TCGAGGAGTCC

TTGGC

TTGAAG

Figure 5. (a and b) Relation-ship of CDR3base sequences

of clones in the A, and T,cDNA libraries to knowngermline DHgenes. Numeri-cally annotated clones: 1, Tg56may be assigned equally wellto DK4, DKl, or DK5; 2, cloneA,u29.2 may be assignedequally well to DMl or DM5; 3,clone AulO3.1 may be as-

signed equally well to DM1,DM2, or DM5; 4, clone Ag47.2/r indicates that the sequence

is reversed. Position 20 inDXP, I in b has been reportedas C (39) or A (38).

tions; and the clustering of nucleotide changes in hypervariableregions. Setting aside the question of preferential utilization ofcertain VH genes in pre-B cells-which occurs in fetal life (6,47, 48)-the biased representation of certain groups of Vgenes,

or the repeated use of individual or highly related Vgenes in therepertoire point to the effect of ligand selection on the popula-

tion (23). This was found for VHgene families in the AA library,where VH5 genes were present out of proportion to their ex-

pected frequency.There are probably more than 30 human DHgenes (37, 38).

Thus any individual DHgene in an unbiased population shouldhave a frequency of less than 1/30 (< 3.3%). Another indica-

1338 Huang et al.

TGTATATATTATATTGACGAGAGATCTTMGATACTGAGTGTCCG

44.2435.2451.2TIL1OTLI 2TL13TL56TL84

A429.2A41 03. 1Tp75

GGCAAGTCTTCGGGCCGA

447.2/rTL24TL59

GCCCGCCTATGAGGCGGACCGGT

TGCCCCCATC4

GGTATGAT

437.1481.1A485.1494.1A4I31.2A446.2T;L20T3L47

A498.1A432.2Tg4

D

AGGATATTGTAGTGGTGGTAGCTGCTACTCC-C------- A--C - G--

------C-----G---

-C--T ----------- C--A ------ A-----------A--C------

------C------T--------T-----C-G---

---TC--------------------------C----_-------------

----CC C---

J

GGGCMACGGAAACCTTGGGGTCTTTTGGGATCCTCCGCGGGAAGMGGGTTTCAGGGGAGGAGCCTAGATCGTCCCATCTCTTGGMCCGAGCACATC

TI86 CTGG

Aju22. 1AA91.1Aju93.1A44O.2Aj45 .2

CGATGTCGTGGACTCCCTCTAAGACTA

AA99.1 TTTGCGACTCTM76 CCA

A496. 1A4u60.2TgO0Tgs6Tj&16Tgs73TM74Tgs90

Ajs44.2Tjs1T;&60

DLR3AGCATATTGTGGTGGTGATTGCTATTCC

-- -C- A - -C- -

DLR4MGGATATTGTAGTAGTACCAGCTGCTATGCC

-G-----------C---G-C------------------------

----A------T-GA--------------------G------------

-G--------------------

DQ52CTAACTGGGGA

---C----

DxrlGTATTACTATGGTTCGGGGCGTTATTATAAC

-A-----A ---A-------------A----_________---------__--- C-

---A -----A-----T--------------____-----------A-

-------A---- C-

GGMCGAGGATGGCACTTTCTTCATGGGGA

CGGAGGC

D111GTATTACGATATTTTGACTGGTTATTATAAT

__________________________GGCTCCCGMTTATTGGTGGGGGCGG

Ts29 CTTs91 C

Tgs42TI46Tjs83Ts87Tg100

GCATGCTCCCGAGGCCGCCGCGAGAAAC

Dq,2GTATTATGATTACGTTTGGGGGAGTTATGCTTATACC

---C-T-------------------CG--C--C---

D1p3GTATTACTATGATAGTAGTGGTTATTACTAC

-G-------A-G---

__ _____-_---------------- G----------- G-GG---

CTCCTT

TGAGCGGGGGGGG

CcCC

CTGGCAGTG

ACGATCAGGACCAGMGCATCGG

CGGGCTTTCCCCCTAAAACCTT

CGGCTTCGGACA

GGGCATGT

GGATCGGCG

C

GTATTACGATTTTTGGAGTGGTTATTATACC---C-G-----------A ------------------_____

tion of selection in the libraries we tested is the over-representa-tion of the DLR2 genes, present in 6 of the 54 AM clones (1 1%)and 5 of the 49 TM clones (10%). The assignment of DLR2 genes

is possible because all five members of the DLR family are

known (38, 39).

C

TACCC Figure S (Continued)

Gu et al. (23) have analyzed members of a large VH gene

family (J558) expressed by B cells from three unimmunizedCB.20 mice. In contrast to populations of pre-B cells, whichexpressed the -100 J558 genes randomly, populations of ma-

ture surface IgM+ splenic B cells were found to express preferen-

B Lymphocyte Immunoglobulin cDNA Libraries 1339

I

Ajs41. 1Ajs78.1Aju95.1As1 .2Au3 .2AA8.2TA19TM22TM26TIA9TI&98/r

CATTCCCCTGGAGAGGGGCGGACTGGT

AGACGACTGTGMAAGGTCCAGGATACCCC

At0l. 1AiulO.2Tgs2l

A41.1 CCCCGTGACTTATGGGTCCACGATA42.1 CTGGGGCCAGGGAACCCTGGTCACCGTCTCCTCAA45.1 GGTTCGACCCCTGGGGCCAGGGAACCCTGGTCACCA449.1 TGGCCCGGCCAACCCGACTCCTCGCAACAGA4L51.1 GAGGCGGGGAGGGCCACACAGA452.1 GATCCTTTGAAGTCCGCGGA459.1 CTCGGGTGGCCGGGCCAGGAACAATACCA4L61.1 GTTGTAGGCGAAGTAAACTTTGGGAAAGTTGCGTTTTA4L70.1 TTCGCCGACGATGATCCCGAGA4L73.1 CGGGGCTTCAATGGCCAACTGATTTTA490.1 ATTTGCGGAATTAAGAACTGGCTCGGCCCA4100.1 CGGGGCTCGGCTGGTACAGGGTAA46.2 GTATTGTTCCTGGCCCGGCACCCGAGA434.2 GGGAGAGATGGCTACAA442.2 TACTTCGCGCAAA4L43.2 ACAGACAGGCAGTACGAATpi 1 CGAGGGCCAATCACGGTGGTAACTCCGGAGGTGCTIL25 GTCGATCCAGGATAACAGTGGCTGAAATGGACTK50 GCCAAGGACCGGCTGT1L69 GAGGACATGGTJL70 ATCGTCGAGTCTTTGAGTACC

Figure 6. CDR3base sequences that could not be assigned to germlineDHgene segments. These cDNAs do not contain a portion with morethan 75% identity with known germline DHgenes.

tially certain members of the J558 family. This finding is analo-gous to ours, above, and to the repeated expression of VH18/2,a member of the VH3 family, in humans (17). On the basis oftheir studies in the mouse, Gu et al. (23) proposed that theperipheral B cell population contains many B cells that haveundergone ligand selection, perhaps soon after they emergefrom the bone marrow. A similar conclusion can be drawnfrom our studies.

J D

DA A Dccc3TATAGTGG;CTACTCACCACC--- T-------- ---A---GA44. 1 GC

Somatic mutation of V genes is a cardinal manifestation ofclonal selection of B cells (49-56). In the case of the VHgenesegments we analyzed, definitive evidence of somatic mutationwas found in the case of the 3 VH6clones (Fig. 2 a). These threeclones differed from Au's own germline sequence by 7(Ag34.2), 13 (Au46.2), and 7 bases (Ag5 1.1). The AM34.2 andAg5 1.1 clones also had evidence of somatic mutation in theirCDR3sequences.

The sequences of the two functional VH5 genes, VH25 1 andVH32, are remarkably consistent in the human germline (37).The nucleotide sequences of all five examples of VH5 familygenes in both libraries differed from VH25 1 and VH32 by 1-15bases (Fig. 2 b). Given the conservation of VH25 1 and VH32 inthe germline, it is highly likely that the variations we observedcan be attributed to somatic mutation.

Adding to the evidence from the VHsequences for somaticmutation is the finding that the CDR3portions of many JH4sequences in the Au library differ from germline genes in a waythat cannot be explained by polymorphism (Fig. 4). Whereaspolymorphic sites, such as the G -> A substitution in JH4, areidentical in all clones from the same subject (Fig. 4), somaticmutations of V genes are typically clone-specific. Furthermore,the variations from the JH4 germline sequence were not ran-dom, but clustered at the 5' end of the gene; of the 42 bases thatdiffer from the germline (not counting the polymorphic G-0 Asubstitution), 81% occur at the 5' end of the gene, in the regionthat contributes to the CDR3.

A mechanism other than polymorphism is also required toaccount for the several DHsequences that are closely related toDLR and Dxp (Fig. 5). Even if subject Au had polymorphic

N

1/r:AC

GGGAC

D1,4 DNTATTACGATTTTTGGAGTGGTTATTACAGCTCGTCC----T-----------------C--- -C-T------

DM5 DN1TGGAACAACGCAGCAGCTGGTAC

- ---- --TG-----A439 .2 GGCGGAGGA

DN DIAATAGCAGCTCGTCC TTTGGGG

A452.2 ACG ---A----------GTCAC-------

DN1TATAGCAGCAGCTGGTAC--------TG--------Tp5 GATCA

Dj1xGAGGCCCC--T ----- TTTCG

T1s7 GATCTAACCTCTCT

DM1 DquGTATAGCAGCAGCCTAACGGGGG_- - - ---__---_ GTTTTCGGGAGAT

DXrl/r DN1ATAATMCGCCCCGAAGGGTATAGCAGCAGCTGGTAC---TGC---G------A-A---------TG--------

DLR4 D.2/rTTGTAGTAGTACCAGCTGCTATGCGGCTTGTGGGCG----------------A------- ----G-CA----

CGGCG

AGTAC

Figure 7. CDR3base sequences that may be accounted for by D-D or D-DIR fusions. These include examples in which one of the fused segmentsis reversed (DLR3/r, D'xpl/r, and DIR2/r).

1340 Huang et al.

A4u2.2 ACA GGGCTGA

TAGACG

T1s15 G

TM23 GT

Table II. cDNAs Closely Related to Germline VHGenes

Percent Related No. of basesClone No. identity gene VH family sequenced

TIA 16 99.6 VH251 5 292TM59 99.1 M60 2 222TM73 99.1 58P2 4 230TM74* 99.1 21/28 1 229Tg49 97.3 FL2-2 1 218A5Sl.l 97.8 VH6 6 315AMu2.1 99.7 VH32 5 303

* Clone TA74 has the same Nand DHsequence as the anti-DNA au-toantibody 21/28 (see reference 76).

differences from the published DLR2germline sequence in bothalleles-an unlikely proposition because a sequence identicalto DLR2 was found in one clone-that would account for onlytwo of the six Aq genes related to DLR2.

Apart from the evidence compiled from the nucleotide se-quences themselves, a cogent argument for the occurrence ofsomatic mutation in circulating IgM' B cells is that the major-ity of the base substitutions we observed were not silent butresulted in a changed amino acid sequence. In studies of V genesequences of antibodies produced during the secondary re-sponse of the mouse to several different classes of antigens,mutations causing amino acid substitutions ("replacementmutations") were found to exceed silent mutations by far, andwere characteristically located in the CDRs(50, 52, 53, 57-64).That pattern is striking in the JH4 gene of the Au library. Of the42 base differences from the germline sequence found among49 JH genes (not counting the polymorphic G -- A substitu-tion), 5 were silent and 37 were replacement variants; of those37 replacement variants, 31 (84%) occurred in the 5' CDR3-coding region of the JH genes (Fig. 4 b).

A similar, but less striking picture emerges from analysis ofthe seven VH5 and VH6 genes. The total number of base varia-tions from germline sequences was 54; ofthose, 41 were replace-ment and 13 were silent (of the latter, 10 occurred in the threeVH6 genes). And of the 22 amino acid replacements in the VHSgenes, 50%occurred in either CDRl orCDR2. Although frame-work mutations can affect antibody binding properties (65),mutations in the CDRs, which are largely responsible for theligand-binding surface of the immunoglobulin molecule, arethe principal molecular signs of clonal selection. It is thushighly likely that the mutations we observed are a reflection ofthe selective effect of a ligand on the circulating B cell popula-tion.

These findings, when viewed as a whole, suggest that li-gand-selected IgM+ B cells not only circulate in the blood ofnormal adults but they may comprise a substantial fraction ofthe B lymphocytes in human blood. They could correspond tolong-lived memory cells that have been rescued from pro-grammed cell death by contact with antigen (66). Indeed, it islikely that the B lymphocyte dies soon after it completes itsdifferentiation, as a result of apoptosis, unless it undergoes se-lection by antigen (67). Our finding of somatic mutation incirculating B cells is of interest because B cells engaged in re-sponses to specific antigens are generally thought to reside inthe germinal centers of the spleen and lymph nodes (68). How-

ever, it was recently shown that during the secondary immuneresponse of the mouse to horseradish peroxidase, B cells havebeen found to leave the germinal centers, enter the circulation,and seed the bone marrow where they mature into antibody-producing plasma cells (69). Presumably, those cells under-went at least the initial stages of antigen selection, although themolecular evidence to support that conclusion is presentlylacking.

Another noteworthy aspect of our results is that V regiongenes in a CQ library showed evidence of somatic mutation. Inthe experiments of Gu et al. (23), no somatic mutations werefound among 44 complete V region sequences in CQg librariesfrom young unimmunized mice; Manser and Gefter (70) alsofound no somatic mutations in naive mice. Even so, it isknown that IgM antibodies can be encoded by mutated V re-gion genes (71), and that somatic mutation can be detectedvery early in the immune response (63), independently ofheavy chain class switching (72).

The molecular signs of clonal selection in circulating B cellssuggest that the selective ligand was encountered after the pre-Bcell stage of differentiation, when the maturing B cell hasrearranged its V genes (73) and expressed at least a surfaceheavy chain. In the steady state, such B cells could representlong-lived circulating memory cells that provide an early de-fense against microbial reinvasion; or, in some instances theymay represent selection of the repertoire by idiotypes or anti-idiotypes (74). In either case, any analysis of V gene repertoiresin disease will have to take into account the variations in com-position and structure of the normal repertoire.

Acknowledgments

This research was supported by grants AI-19794 and AI-26450 fromthe National Institutes of Health. A. K. Stewart is a fellow of the Medi-cal Research Council of Canada.

References

1. Tonegawa, S. 1983. Somatic generation of antibody diversity. Nature(Lond.). 302:575-581.

2. Alt, F. W., T. K. Blackwell, and G. D. Yancopoulos. 1987. Development ofthe primary antibody repertoire. Science (Wash. DC). 238:1079-1087.

3. Berek, C., and C. Milstein. 1988. The dynamic nature of the antibodyrepertoire. Immunol. Rev. 105:5-26.

4. Cuisinier, A. M., V. Guigou, L. Boubli, M. Fougereau, and C. Tonnelle.1989. Preferential expression of VH5 and VH6 immunoglobulin genes in earlyhuman B-cell ontogeny. Scand. J. Immunol. 30:493-497.

5. Schroeder, H. W., Jr., J. L. Hillson, and R. M. Perlmutter. 1987. Earlyrestriction of the human antibody repertoire. Science (Wash. DC). 238:791-793.

6. Schroeder, H. W., Jr., and J. Y. Wang. 1990. Preferential utilization ofconserved immunoglobulin heavy chain variable gene segments during humanfetal life. Proc. NatL. Acad. Sci. USA. 87:6146-6150.

7. Yancopoulos, G. D., S. V. Desiderio, M. Paskind, J. F. Kearney, D. Balti-more, and F. W. Alt. 1984. Preferential utilization of the most JH-proximal VHgene segments in pre-B-cell lines. Nature (Lond.). 311:727-733.

8. Malynn, B. A., G. D. Yancopoulos, J. E. Barth, C. A. Bona, and F. W. Alt.1990. Biased expression of JH-proximal VHgenes occurs in the newly generatedrepertoire of neonatal and adult mice. J. Exp. Med. 171:843-859.

9. Cuisinier, A. M., F. Fumoux, D. Moinier, L. Boubli, V. Guigou, M. Milili,C. Schiff, M. Fougereau, and C. Tonnelle. 1990. Rapid expansion of humanimmunoglobulin repertoire (VH, Vkappa, Vlambda) expressed in early fetalbone marrow. NewBiol. 2:689-699.

10. Guigou, V., B. Guilbert, D. Moinier, C. Tonnelle, L. Boubli, S. Avrameas,M. Fougereau, and F. Fumoux. 1991. Ig repertoire of human polyspecific antibod-ies and B cell ontogeny. J. Immunol. 146:1368-1374.

11. Kipps, T. J., E. Tomhave, L. F. Pratt, S. Duffy, P. P. Chen, and D. A.Carson. 1989. Developmentally restricted immunoglobulin heavy chain variableregion gene expressed at high frequency in chronic lymphocytic leukemia. Proc.NatL. Acad. Sci. USA. 86:5913-5917.

B Lymphocyte Immunoglobulin cDNA Libraries 1341

12. Kipps, T. J., B. A. Robbins, A. Tefferi, G. Meisenholder, P. M. Banks, andD. A. Carson. 1990. CD5-positive B-cell malignancies frequently express cross-reactive idiotypes associated with IgM autoantibodies. Am. J. Pathol. 136:809-816.

13. Casali, P., and A. L. Notkins. 1989. Cloning the human B-cell repertoirewith EBV: polyreactive antibodies and CD5+ lymphocytes. Annu. Rev. Im-munol. 7:513-535.

14. Schutte, M. E. M., S. B. Ebeling, K. E. Akkermans, F. H. J. Gmelig-Mey-ling, and T. Logtenberg. 1991. Antibody specificity and immunoglobulin VHgene utilization of human monoclonal CD5' B cell lines. Eur. J. Immunol.21:1115-1121.

15. Logtenberg, T., F. M. Young, J. H. Van Es, F. H. J. Gmelig-Meyling, andF. W. Alt. 1989. Autoantibodies encoded by the most JH-proximal human immu-noglobulin heavy chain variable region gene. J. Exp. Med. 170:1347-1355.

16. Pascual, V., J. Andris, and J. D. Capra. 1990. Heavy chain variable regiongene utilization in human antibodies. Int. Rev. Immunol. 5231:231-238.

17. Guillaume, T., D. B. Rubinstein, F. Young, L. Tucker, T. Logtenberg,R. S. Schwartz, and K. Barrett. 1990. Individual VHgenes detected with oligonu-cleotide probes from the complementarity-determining regions. J. Immunol.145:1934-1945.

18. Sanz, I., P. Casali, J. W. Thomas, A. L. Notkins, and J. D. Capra. 1989.Nucleotide sequences of eight human natural autoantibody VH regions revealsapparent restricted use of VH families. J. ImmunoL. 142:4054-4061.

19. Guigou, V., A. M. Cuisinier, C. Tonnelle, D. Moinier, M. Fougereau, andF. Fumoux. 1990. Human immunoglobulin VHand VK repertoire revealed byin situ hybridization. Mol. Immunol. 27:935-940.

20. Zouali, M., and J. Theze. 1991. Probing VH gene-family utilization inhuman peripheral B cells by in situ hybridization. J. Immunol. 146:2885-2864.

21. Yamada, M., R. Wasserman, B. A. Reichard, S. Shane, A. J. Caton, andG. Rovera. 1991. Preferential utilization of specific immunoglobulin heavy chaindiversity and joining segments in adult human peripheral blood B lymphocytes.J. Exp. Med. 173:395-407.

22. Blaison, G., J.-L. Kuntz, and J.-L. Pasquali. 1991. Molecular analysis ofVkIII variable regions of polyclonal rheumatoid factors during rheumatoid arthri-tis. Eur. J. Immunol. 21:1221-1227.

23. Gu, H., S. Tarlinton, W. MUller, K. Rajewsky, and I. Fdrster. 1991. Mostperipheral B cells in mice are ligand selected. J. Exp. Med. 173:1357-137 1.

24. Bangs, L. A., I. E. Sanz, and J. M. Teale. 1991. Comparison of D, JH, andjunctional diversity in the fetal, adult, and aged B cell repertoires. J. Immunol.146:1996-2004.

25. Feeney, A. J. 1990. Lack of N regions in fetal and neonatal mouse immu-noglobulin V-D-J junctional sequences. J. Exp. Med. 172:1377-1390.

26. Sanz, I. 1991. Multiple mechanisms participate in the generation of diver-sity of human H chain CDR3regions. J. Immunol. 147:1720-1729.

27. Crain, M. J., S. K. Sanders, J. L. Butler, and M. D. Cooper. 1989. Epstein-Barr virus preferentially induces proliferation of primed B cells. J. Immunol.143:1543-1548.

28. Decker, D. J., N. A. Boyle, J. A. Koziol, and N. R. Klinman. 1991. Theexpression of the Ig Hchain repertoire in developing bone marrow B lineage cells.J. Immunol. 146:350-361.

29. Huang, C., and B. D. Stollar. 1991. Construction of representative immu-noglobulin variable region cDNA libraries from human peripheral blood lym-phocytes without in vitro stimulation. J. Immunol. Methods. 141:227-236.

30. Gubler, U., and B. Hoffman. 1983. A simple and very efficient method forgenerating cDNA libraries. Gene. 25:263-269.

31. Trepicchio, W., Jr., and K. J. Barrett. 1987. Eleven MRL-lpr/lpr anti-DNAautoantibodies are encoded by genes from four VHgene families: a poten-tially biased usage of VHgenes. J. Immunol. 138:2323-233 1.

32. Berman, J. E., S. J. Mellis, R. Pollock, C. L. Smith, H. Suh, B. Heinke, C.Kowai, U. Surti, L. Chess, C. R. Cantor, et al. 1988. Content and organization ofthe human Ig VHlocus: definition of three new VHfamilies and linkage to the IgCHlocus. EMBO(Eur. Mol. Biol. Organ.) J. 7:727-738.

33. Logtenberg, T. 1990. Properties of polyreactive natural antibodies to selfand foreign antigens. J. Clin. Immunol. 10:137-140.

34. Walter, M. A., U. Surti, M. H. Hofker, and D. W. Cox. 1990. The physicalorganization of the human immunoglobulin heavy chain gene complex. EMBO(Eur. Mol. Biol. Organ) J. 9:3303-3313.

35. Clarke, S. H., and S. Rudikoff. 1984. Evidence forgene conversion amongimmunoglobulin heavy chain variable region genes. J. Exp. Med. 159:773-782.

36. Dersimonian, H., A. Long, D. Rubinstein, B. D. Stollar, and R. S.Schwartz. 1990. VHgenes of human autoantibodies. Int. Rev. Immunol. 5:253-264.

37. Pascual, V., and J. D. Capra. 1991. Humanimmunoglobulin heavy-chainvariable region genes: organization, polymorphism, and expression. Adv. Im-munol. 49:1-74.

38. Ichihara, Y., H. Matsuoka, and Y. Kurosawa. 1988. Organization of hu-man immunoglobulin heavy chain diversity gene loci. EMBO(Eur. Mol. Biol.Organ) J. 7:4141-4150.

39. Buluwela, L., D. G. Albertson, P. Sherrington, P. H. Rabbitts, N. Spurr,and T. H. Rabbitts. 1988. The use of chromosomal translocations to study hu-

man immunoglobulin gene organization: mapping DHsegments within 35 kb ofthe Cu gene and identification of a new DH locus. EMBO(Eur. Mol. Biol.Organ) J. 7:2003-2010.

40. Siebenlist, U., J. V. Ravetch, S. Korsmeyer, T. Waldmann, and P. Leder.1981. Humanimmunoglobulin Dsegments encoded in tandem multigenic fami-lies. Nature (Lond.). 294:631-635.

41. Dilosa, R. M., K. Maeda, A. Masuda, A. K. Szakal, and J. G. Tew. 1991.Germinal center B cells and antibody production in the bone marrow. J. Im-munol. 146:4071-4077.

42. Eckert, K. A., and T. A. Kunkel. 1991. DNApolymerase fidelity and thepolymerase chain reaction. PCRMethods Appl. 1:17-24.

43. Ling, L. L., P. Keohavong, C. Dias, and W. G. Thilly. 1991. Optimizationof the polymerase chain reaction with regard to fidelity: modified T7, Taq, andVent DNApolymerases. PCRMethods Appl. 1:63-69.

44. Young, F., L. Tucker, D. Rubinstein, T. Guillaume, J. Andre-Schwartz,K. J. Barrett, R. S. Schwartz, and T. Logtenberg. 1990. Molecular analysis of agerm line-encoded idiotypic marker of pathogenic human lupus autoantibodies.J. Immunol. 145:2545-2553.

45. Kabat, E. A., K. G. Nickerson, J. Liao, L. Grossbard, E. Osserman, E.Glickman, L. Chess, J. B. Robbins, R. Schneerson, and Y. Yang. 1986. A humanmonoclonal macroglobulin with specificity for a (2 -a 8)-linked poly-N-acetylneuraminic acid, the capsular polysaccharide ofgroup B meningococci and Esche-richia coli K1, which cross-reacts with polynucleotides and with denatured DNA.J. Exp. Med. 164:642-654.

46. Atkinson, P. M., G. Lampman, B. C. Furie, Y. Naparstek, R. S. Schwartz,B. D. Stollar, and B. F. Furie. 1985. Homology of the NH2-terminal amino acidsequences of the heavy and light chains of human monoclonal lupus autoantibod-ies containing the dominant 16/6 idiotype. J. Clin. Invest. 75:1138-1143.

47. Teale, J. M. 1985. B cell immune repertoire diversifies in a predictableorder in vitro. J. Immunol. 135:954-958.

48. Jeong, H. D., and J. M. Teale. 1988. Comparison of the fetal and adultfunctional B cell repertoires by analysis of VHgene family expression. J. Exp.Med. 168:589-603.

49. French, D. L., R. Laskov, and M. D. Scharff. 1989. The role of somatichypermutation in the generation of antibody diversity. Science (Wash. DC).244:1152-1157.

50. Kocks, C., and K. Rajewsky. 1988. Stepwise intraclonal maturation ofantibody affinity through somatic hypermutation. Proc. Nat!. Acad. Sci. USA.85:8206-8210.

51. Rajewsky, K., I. Forster, and A. Cumano. 1987. Evolutionary and somaticselection of the antibody repertoire in the mouse. Science (Wash. DC). 238:1088-1094.

52. Wysocki, L., T. Manser, and M. L. Gefter. 1986. Somatic evolution ofvariable region structures during an immune response. Proc. Nat!. Acad. Sci.USA. 83:1847-1851.

53. Berek, C., G. M. Griffith, and C. Milstein. 1985. Molecular events duringmaturation of the immune response to oxazolone. Nature (Lond.). 316:412-418.

54. Manser, T., S. Y. Huang, and M. L. Gefter. 1984. Influence of clonalselection on the expression of immunoglobulin variable region genes. Science(Wash. DC). 226:1283-1288.

55. Baltimore, D. 1981. Somatic mutation gains its place among the genera-tors of diversity. Cell. 27:573-58 1.

56. Kim, S., M. David, E. Sinn, P. Patten, and L. Hood. 1981. Antibodydiversity: somatic hypermutation of rearranged VHgenes. Cell. 27:573-581.

57. Clarke, S. H., L. M. Staudt, J. Kavaler, D. Schwartz, W. U. Gerhard, andM. G. Weigert. 1990. V region gene usage and somatic mutation in the primaryand secondary responses to influenza virus hemagglutinin. J. Immunol.144:2795-2801.

58. Chien, N. C., R. R. Pollock, C. Desaymard, and M. D. Scharff. 1988.Point mutations cause the somatic diversification of IgM and IgG2a antiphos-phorylcholine antibodies (published erratum appears in 1990. J. Exp. Med.172:669). J. Exp. Med. 167:954-973.

59. Claflin, J. L., J. Berry, D. Flaherty, and W. Dunnick. 1987. Somaticevolution of diversity among anti-phosphocholine antibodies induced with Pro-teus morganii. J. Immunol. 138:3060-3068.

60. Cumano, A., and K. Rajewsky. 1986. Clonal recruitment and somaticmutation in the generation of immunological memory to the hapten NP. EMBO(Eur. Mo!. Biol. Organ.) J. 5:2459-2468.

61. Weiss, U., and K. Rajewsky. 1990. The repertoire of somatic antibodymutants accumulating in the memory compartment after primary immunizationis restricted through affinity maturation and mirrors that expressed in the second-ary response. J. Exp. Med. 172:1681-1689.

62. Sablitsky, F., G. Wildner, and K. Rajewsky. 1985. Somatic mutation andclonal expansion of B cells in an antigen-driven immune response. EMBO(Eur.Mol. Biol. Organ.) J. 4:345-350.

63. Levy, N. S., U. V. Malipiero, S. G. Lebecque, and P. J. Gearhart. 1989.Early onset of somatic mutation in immunoglobulin VH genes during the pri-mary immune response. J. Exp. Med. 169:2007-2019.

64. Clarke, S., K. Huppi, D. Ruezinsky, L. Staudt, W. Gerhard, and M.

1342 Huang et al.

Weigert. 1985. Inter- and intraclonal diversity in the antibody response to influ-enza hemagglutinin. J. Exp. Med. 161:687-704.

65. Panka, D. J., M. Mudgett-Hunter, D. R. Parks, L. L. Peterson, L. A.Herzenberg, E. Haber, and M. N. Margolies. 1988. Variable region frameworkdifferences result in decreased or increased affinity of variant anti-digoxin anti-bodies. Proc. Nat!. Acad. Sci. USA. 85:3080-3084.

66. Forster, I., and K. Rajewsky. 1990. The bulk of the peripheral B-cell poolin mice is stable and not rapidly renewed from the bone marrow. Proc. Nat!.Acad. Sci. USA. 87:4781-4784.

67. Schittek, B., and K. Rajewsky. 1990. Maintenance of B-cell memory bylong-lived cells generated from proliferating precursors. Nature (Lond.). 346:749-751.

68. Jacob, J., R. Kassir, and G. Kelsoe. 1991. In situ studies of the primaryimmune response to (4-hydroxy-3-nitrophenyl)acetyl. I. The architecture anddynamics of responding cell populations. J. Exp. Med. 173:1165-1175.

69. Long, A., J. Mueller, J. Andr'-Schwartz, K. J. Barrett, R. Schwartz, and H.Wolfe. 1992. High specificity in-situ hybridization: methods and application.Diagn. Mol. Pathol. 1:45-57.

70. Manser, T., and M. Gefter. 1986. The molecular evolution of the immuneresponse: idiotype specific suppression indicates that B-cells express germ-line-encoded V genes prior to antigenic stimulation. Eur. J. Immunol. 16:1439-1444.

71. Hartman, A. B., and S. Rudikoff. 1984. VHgenes encoding the immuneresponse to beta-(l,6)-galactan: somatic mutation in IgM molecules. EMBO(Eur. Mol. Biol. Organ.) J. 3:3023-3030.

72. Shan, H., M. Shlomchik, and M. Weigert. 1990. Heavy-chain class switchdoes not terminate somatic mutation. J. Exp. Med. 172:531-536.

73. Roes, J., K. Huppi, K. Rajewsky, and F. Sablitzky. 1989. Vgene rearrange-ment is required to fully activate the hypermutation mechanism in B cells. J.Immunol. 142:1022-1026.

74. Freitas, A. A., 0. Burlen, and A. Coutinho. 1988. Selection of antibodyrepertoires by anti-idiotypes can occur at multiple steps of B cell differentiation. J.Immunol. 140:4097-4102.

75. Shuldiner, A. R., A. Nirula, and J. Roth. 1989. Hybrid DNAartifact fromPCRof closely related target sequences. Nucleic Acids Res. 17:4409.

76. Dersimonian, H., R. S. Schwartz, K. J. Barrett, and B. D. Stollar. 1987.Relationship of human variable region heavy chain (VH) germline genes to genesencoding anti-DNA autoantibodies. J. Immunol. 139:2496-2501.

B Lymphocyte Immunoglobulin cDNA Libraries 1343