

Evolution of Immunoglobulin VH Pseudogenes in Chickens

Tatsuya Ota and Masatoshi Nei
Institute of Molecular Evolutionary Genetics and Department of Biology, The Pennsylvania State University

In chickens, there is a single functional gene (VH1) coding for the heavy chain variable region of immunoglobulins, and immunoglobulin diversity is generated by gene conversion of the VH1 gene by many variable region pseudogenes (ψVH's) that exist on the 5' side of the VH1 gene. To understand the evolution of this unique genetic system, we conducted statistical analyses of VH1 and ψVH genes together with functional VH genes from other higher vertebrate species. The results indicate, first, that chicken VH genes are all closely related to one another and were derived relatively recently from an ancestral gene belonging to one of the three major groups of VH genes in higher vertebrates. Second, the rate of nonsynonymous substitution is slightly higher than that of synonymous substitution in the complementarity-determining regions (CDRs), which suggests that diversity-enhancing selection has operated in the CDRs even for pseudogenes. However, the rates of both synonymous and nonsynonymous substitution are higher in the CDRs than in the framework regions (FRs), apparently because of an interaction between positive selection and meiotic gene conversion in the CDRs. Third, a dot matrix analysis of the ψVH genes and genomic diversity (D) genes has indicated that D-gene-like sequences are attached to the 3' end of ψVH genes, and this region of ψVH genes has high similarity with D gene sequences. This suggests that V and D genes were fused at some point in evolutionary time and that this fused element multiplied by gene duplication. Finally, two alternative hypotheses for explaining the evolution of the chicken VH gene system are presented.

Introduction

In most higher vertebrates immunoglobulin molecules are encoded by four different multigene families (i.e., the variable [V], diversity [D], joining [J], and constant [C] gene or gene segment families), and immunoglobulin diversity is generated by random combination of these genes and junctional variation that occur during the gene rearrangement events for producing complete heavy and light chain immunoglobulin genes. In addition, somatic mutation is known to generate further diversification of rearranged genes. In chickens and possibly many other avian species, however, there are single functional V and J genes (Reynaud et al. 1989), and immunoglobulin diversity is generated mainly by somatic gene conversion of this functional V gene by many V pseudogenes that exist on the 5' side of the functional gene (see fig. 1 and the review articles by McCormack et al. 1991, 1993). Therefore, the mechanism of generation of immunoglobulin diversity in chickens is radically different from that of nonavian vertebrate species. However, it remains unclear how this unique system has evolved.

FIG. 1.-Genomic organization of chicken immunoglobulin heavy chain genes. ψ, pseudogene; V, variable region gene; D, diversity segment gene; J, joining segment gene; C, constant region gene. There is only one functional variable region gene (VH1).

Recently, we (Ota and Nei 1994a) studied the phylogenetic relationships of heavy chain variable region (VH) genes from diverse species of vertebrates and showed that the functional VH genes in higher vertebrates (mammals and amphibians) can be classified into three major groups. It is therefore interesting to know how the heavy chain functional gene (VH1) and pseudogenes (ψVH's) in chickens are related to these three groups of genes.

Another interesting problem is whether positive Darwinian selection operates at the antigen-binding sites or the complementarity-determining regions (CDRs) of ψVH's. It has previously been shown (Tanaka and Nei 1989) that in the CDRs of functional VH genes from humans and mice, the number of nonsynonymous differences per nonsynonymous site is greater than the number of synonymous differences per synonymous site, and thus positive selection has operated at the CDRs. In chickens, pseudogenes are used as the source of immunoglobulin diversity by means of somatic gene conversion, so they are not really dead genes. If the CDRs of these genes are really important for generating immunoglobulin diversity, they are expected to show the same pattern of nucleotide substitution as that in the CDRs of mammalian functional genes.

Another interesting feature of the chicken ψVH gene system is that DNA segments on the 3' side of ψVH's show some sequence similarity with genomic D genes (Reynaud et al. 1989). However, no one has done a detailed analysis of this problem, and the extent of the similarity remains unclear. If the similarity is high, it raises the question of how such similar sequences on the 3' side of ψVH's have arisen.

The main purpose of this paper is to give some answers to the above three questions, considering all information available about chicken and other VH genes. We will also present some hypotheses about the origin of the chicken VH gene system.

Key words: chicken, gene conversion, immunoglobulins, pseudogenes, variable region genes.

Address for correspondence and reprints: Masatoshi Nei, Institute of Molecular Evolutionary Genetics, The Pennsylvania State University, 328 Mueller Laboratory, University Park, Pennsylvania 16802.

Mol. Biol. Evol. 12(1):94-102. 1995. © 1995 by The University of Chicago. All rights reserved. 0737-4038/95/1201-0009$02.00

Evolutionary Relationships of Chicken VH Pseudogenes with Other VH Genes

The chicken genome is considered to contain more than 80 ψVH's, but only 11 complete pseudogene sequences are available in the literature, though there are seven more truncated or incomplete sequences. In our studies, we used all of these 11 pseudogene sequences plus the functional VH1 gene (Reynaud et al. 1989). The alignment of these sequences is presented in figure 2. We constructed a phylogenetic tree for these sequences together with 10 representative VH genes from Xenopus laevis, 7 genes from humans, and 11 genes from mice. The sources of these sequences were presented elsewhere (table 1 of Ota and Nei 1994a). All the sequences used in this paper were germline sequences rather than those that were subjected to somatic modification.

The V region of immunoglobulins consists of the complementarity-determining regions (CDRs) and the framework regions (FRs). The CDRs are responsible for antigen binding and are highly variable; there are both amino acid differences and insertions/deletions. Because of this high variability, it is often difficult to align amino acid or nucleotide sequences in these regions. We therefore eliminated all the codons involved in the CDRs and used only the FRs for our phylogenetic analysis. Since some deletion/insertion sites were excluded even from the FRs, the total number of codons used was 64 (see fig. 2 of Ota and Nei 1994a).

We (Ota and Nei 1994a) previously used deduced amino acid sequence data for constructing our phylogenetic tree, because the sequence divergence of the VH genes was extensive. In the present paper, we are primarily interested in the divergence of chicken ψVH genes and their relationships with VH genes from higher vertebrates. Therefore, we used DNA sequence data, which are more informative for constructing trees for closely related genes. However, our statistical analysis has shown that the frequencies of nucleotides A, T, C, and G at the third codon position vary considerably among different species (table 1). In chickens, the G+C content is very high, but it is relatively low in Xenopus. Since this variation in G+C content can disturb phylogenetic inference, we decided to use only the first and second codon positions in this study. A phylogenetic tree for the 40 sequences was constructed by using the minimum-evolution method (Rzhetsky and Nei 1992) with Jukes and Cantor's (1969) distances. The phylogenetic tree obtained is presented in figure 3. The VH genes in higher vertebrates can again be classified into three major groups (i.e., the group A, B, and C genes). This branching pattern is essentially the same as that given previously (Ota and Nei 1994a). Chicken genes clearly belong to group C, but they form a tight cluster with a 99% confidence probability. They are also closely related to each other compared with human, mouse, and Xenopus genes. This suggests either that they are of recent origin or that the genes have been homogenized by concerted evolution. However, all ψVH genes seem to have been derived from a single group C gene that is ancestral to both mammalian and avian VH genes.
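The codon-position base composition discussed above can be tallied directly from aligned, in-frame sequences. The following Python sketch is illustrative only: the function names `position_frequencies` and `gc_content` are hypothetical, the sequences are assumed to be gap-free with lengths that are multiples of three, and frequencies are pooled over sequences. The third-position percentages are copied from table 1.

```python
from collections import Counter


def position_frequencies(seqs):
    """Percentages of A, T, C, G at codon positions 1, 2, and 3, pooled over seqs."""
    counts = [Counter(), Counter(), Counter()]
    for seq in seqs:
        for i, base in enumerate(seq.upper()):
            counts[i % 3][base] += 1  # i % 3 is the position within the codon
    freqs = []
    for c in counts:
        total = sum(c[b] for b in "ATCG")
        freqs.append({b: 100.0 * c[b] / total for b in "ATCG"})
    return freqs


def gc_content(freq):
    """G+C percentage from a dictionary of base percentages."""
    return freq["G"] + freq["C"]


# Third-codon-position percentages as reported in table 1.
TABLE1_THIRD = {
    "Xenopus": {"A": 32.0, "T": 27.0, "C": 17.3, "G": 23.6},
    "Chicken": {"A": 7.7, "T": 4.8, "C": 48.4, "G": 39.1},
    "Human": {"A": 12.7, "T": 12.7, "C": 32.4, "G": 42.2},
    "Mouse": {"A": 18.8, "T": 19.7, "C": 27.1, "G": 34.4},
}

gc_third = {sp: round(gc_content(f), 1) for sp, f in TABLE1_THIRD.items()}
for species, gc in gc_third.items():
    print(f"{species}: third-position G+C = {gc}%")
```

With the table 1 values, the third-position G+C content is 87.5% in chickens and 40.9% in Xenopus, which illustrates why third positions were excluded from the tree construction.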

One interesting feature of the chicken gene cluster is that, unlike many other mammalian ψVH genes (see Ota and Nei 1994a), the chicken ψVH genes do not have a long branch compared with the functional gene VH1. This is quite unusual but is understandable because the chicken ψVH's are not really dead genes. Although the DNA sequences for other ψVH genes are not available now, it seems that other chicken genes are also closely related to the genes used here (see Reynaud et al. 1989).

FIG. 2.-Nucleotide sequences of chicken immunoglobulin heavy chain variable region genes (VH1 and ψVH15-2, 15-3, 15-8, 15-9, 15-11, 57-1, 57-10, 57-11, 57-13, 57-15, and 57-16). All sequences were taken from Reynaud et al. (1989). The nucleotides underlined for the VH1 gene are the recombination signal sequences. A dash (-) indicates a nucleotide identical with that of the VH1 gene. Boxes indicate the complementarity-determining regions. ψ, pseudogene.

Table 1
Average Nucleotide Frequencies (in Percentage) at the First, Second, and Third Codon Positions of the VH Genes (a)

                   FIRST POSITION            SECOND POSITION           THIRD POSITION
SPECIES (b)        A     T     C     G       A     T     C     G       A     T     C     G
Xenopus (10)     32.3  22.0  17.8  27.8    31.9  24.8  23.6  19.7    32.0  27.0  17.3  23.6
Chicken (12)     26.7  14.5  23.3  35.5    25.4  21.5  23.3  29.8     7.7   4.8  48.4  39.1
Human (7)        26.3  21.9  23.4  28.3    26.8  24.8  27.7  20.8    12.7  12.7  32.4  42.2
Mouse (11)       26.6  22.6  22.4  28.4    29.5  23.3  24.7  22.4    18.8  19.7  27.1  34.4

(a) Only 64 codons in the FRs (see fig. 2 of Ota and Nei 1994a) were used.
(b) The numbers in parentheses refer to the numbers of sequences used to estimate the average nucleotide frequencies.

If we assume that the genomic ψVH genes diverged primarily by gene duplication, it is possible to estimate the divergence time between the functional VH gene and the most divergent ψVH gene (ψVH15-11). The Jukes-Cantor distance (d) for this pair of genes is 0.074. Gojobori and Nei (1984) previously estimated that the rates of nucleotide substitution per site per year for the first and second codon positions of the FRs in humans and mice are 1.01 × 10^-9 and 0.65 × 10^-9, respectively. Therefore, the average rate (λ) is 0.83 × 10^-9. If we use this rate, the time of divergence (t) between VH1 and ψVH15-11 is estimated to be 45 (= d/[2λ]) million years (MY). This estimate will be an overestimate if ψVH genes evolve faster than functional VH genes. However, chicken ψVH genes do not seem to have evolved particularly faster compared with the functional gene, as mentioned above. Therefore, our estimate appears to be acceptable as a first approximation. At any rate, our estimate suggests that the chicken pseudogenes evolved much later than the time of emergence of birds (about 150 MY ago). The duplications of ψVH's after the emergence of birds are consistent with another feature of the ψVH's. In chickens, a 60-80-kb DNA segment covers the entire VH gene cluster (Reynaud et al. 1989), whereas a 2.5-3.0-megabase DNA segment contains the VH gene cluster on chromosome 14 in humans (Matsuda et al. 1993). The physical distance between two contiguous VH genes in chickens is very short (~0.85 kb on the average) compared with either that of human (~10 kb; Matsuda et al. 1993) or of Xenopus (~5 kb; Haire et al. 1991). This suggests that the chicken ψVH genes have arisen relatively recently by tandem gene duplication.
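The divergence-time arithmetic above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the Jukes-Cantor formula is the standard d = -(3/4) ln(1 - 4p/3), and the distance and rates are the values quoted in the text. The helper that extracts first and second codon positions assumes gap-free, in-frame sequences.

```python
import math


def proportion_of_differences(seq1, seq2, positions=(0, 1)):
    """Proportion of differing sites at the chosen codon positions (0-based)."""
    pairs = [
        (a, b)
        for i, (a, b) in enumerate(zip(seq1.upper(), seq2.upper()))
        if i % 3 in positions
    ]
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p):
    """Jukes-Cantor distance from the proportion of differing sites p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time(d, rate):
    """t = d / (2 * rate): two lineages accumulate substitutions independently."""
    return d / (2 * rate)


# Values quoted in the text (substitutions per site per year).
RATE_HUMAN = 1.01e-9
RATE_MOUSE = 0.65e-9
mean_rate = (RATE_HUMAN + RATE_MOUSE) / 2  # 0.83e-9

t_years = divergence_time(0.074, mean_rate)
print(f"Estimated divergence time: {t_years / 1e6:.0f} MY")
```

With d = 0.074 and the averaged rate, the sketch gives roughly 44.6 million years, which rounds to the 45 MY quoted in the text.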

Synonymous and Nonsynonymous Substitutions

One of the simplest ways to study positive Darwinian selection for a given region of a protein is to compare the number of synonymous nucleotide differences per synonymous site (pS) and the number of nonsynonymous differences per nonsynonymous site (pN) (Nei 1987). To estimate these quantities, we used Nei and Gojobori's (1986) method allowing stop codons in the evolutionary pathways. The standard errors of mean pS and pN (p̄S and p̄N) were then computed by our method (Ota and Nei 1994b). It has already been shown (Tanaka and Nei 1989) that the excess of pN over pS owing to positive selection in the CDRs is observed only when pS is small, because multiple nucleotide substitutions disturb the relationship of the rates of synonymous and nonsynonymous nucleotide substitution. We therefore classified the comparisons of pS and pN into four groups according to the pS values for the entire V region (pST). Then p̄S and p̄N were computed for the CDRs and the FRs separately.
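To make the quantities pS and pN concrete, the following Python sketch follows the general outline of the unweighted-pathway approach of Nei and Gojobori (1986). It is an illustration, not the authors' program. As an assumption, a stop codon is treated as one more coding class, so pathways through stop codons are retained and changes to or from a stop count as nonsynonymous. Sequences are assumed to be aligned, in frame, and gap-free; all function names are hypothetical.

```python
from itertools import permutations, product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; "*" marks a stop codon (kept as its own class here).
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    s = 0.0
    for i in range(3):
        for base in BASES:
            if base != codon[i]:
                mutant = codon[:i] + base + codon[i + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1.0 / 3.0  # each of the 3 possible changes weighs 1/3
    return s


def count_diffs(c1, c2):
    """Synonymous and nonsynonymous differences, averaged over all shortest paths."""
    changed = [i for i in range(3) if c1[i] != c2[i]]
    if not changed:
        return 0.0, 0.0
    paths = list(permutations(changed))
    sd = nd = 0.0
    for order in paths:
        current = c1
        for i in order:
            step = current[:i] + c2[i] + current[i + 1:]
            if CODE[step] == CODE[current]:
                sd += 1
            else:
                nd += 1
            current = step
    return sd / len(paths), nd / len(paths)


def p_s_p_n(seq1, seq2):
    """Return (pS, pN) for two aligned coding sequences."""
    seq1 = seq1.upper().replace("U", "T")
    seq2 = seq2.upper().replace("U", "T")
    S = N = Sd = Nd = 0.0
    for k in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[k:k + 3], seq2[k:k + 3]
        s = (syn_sites(c1) + syn_sites(c2)) / 2  # average over the two codons
        S += s
        N += 3 - s
        sd, nd = count_diffs(c1, c2)
        Sd += sd
        Nd += nd
    return Sd / S, Nd / N


# TTT vs GTA has two pathways (via GTT or via TTA), giving on average
# 0.5 synonymous and 1.5 nonsynonymous differences.
print(count_diffs("TTT", "GTA"))
```

Restricting the codons passed to `p_s_p_n` to the CDRs or to the FRs would give region-specific values of the kind compared in this section.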

The results obtained are presented in table 2 and figure 4. In the CDRs, p̄N tends to be greater than p̄S when pST is less than 0.15. However, only when pST < 0.05 is p̄N significantly higher than p̄S at the 5% level. Figure 4 also shows that pN tends to be greater than pS in the CDRs when pS is small. However, pN tends to saturate more quickly than pS. By contrast, p̄N and p̄S are nearly the same in the FRs, though p̄S tends to be higher than p̄N. None of the differences between p̄S and p̄N in the FRs is statistically significant. This relationship between p̄S and p̄N is different from that for mouse and human VH genes, where p̄N is generally much lower than p̄S.
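The significance pattern in the CDRs can be roughly illustrated with a normal-approximation Z statistic computed from the means and standard errors in table 2. This is a simplified sketch, not the test used in the paper: it assumes that p̄S and p̄N are independent and approximately normal, whereas the standard errors themselves were obtained by the method of Ota and Nei (1994b). The names `CDR_ROWS` and `z_statistic` are hypothetical.

```python
import math

# CDR values from table 2: (range of pST, mean pS, SE, mean pN, SE).
CDR_ROWS = [
    ("0.00-0.05", 9.2, 4.4, 22.8, 5.1),
    ("0.05-0.10", 19.5, 7.0, 22.2, 5.0),
    ("0.10-0.15", 28.8, 7.0, 31.3, 6.8),
    (">0.15", 37.4, 9.6, 30.7, 7.0),
]


def z_statistic(p_s, se_s, p_n, se_n):
    """Z for the difference pN - pS, assuming independent estimates."""
    return (p_n - p_s) / math.sqrt(se_s ** 2 + se_n ** 2)


results = {label: z_statistic(ps, ses, pn, sen) for label, ps, ses, pn, sen in CDR_ROWS}

# Two-sided 5% critical value of the standard normal distribution.
significant = [label for label, z in results.items() if abs(z) > 1.96]

for label, z in results.items():
    print(f"pST {label}: Z = {z:.2f}")
```

Under these assumptions only the class with pST < 0.05 exceeds the 5% critical value (Z is about 2.02), in agreement with the asterisk in table 2.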

There is another important difference in the relationships of p̄S and p̄N between chicken ψVH's and mammalian functional VH's. In mouse VH genes, p̄S's are nearly the same for both the CDRs and the FRs. In chicken ψVH's, however, both p̄S and p̄N are considerably higher in the CDRs than in the FRs. The reason for this seems to be that germline ψVH genes in chickens have been subject to occasional gene conversion. Figure 5 shows the germline nucleotide sequences of VH1 and five ψVH's. Several pairs of these genes show fragments of identical or nearly identical sequences, which are probably generated by meiotic gene conversion. A study of polymorphism of VH1 and ψVH genes (Benatar and Ratcliffe 1993) has also suggested that ψVH genes are subject to meiotic gene conversion. If gene conversion occurs in the CDRs, the mutant allele generated will often be selected for, because it gives a new type of antibody. Therefore, the rate of nonsynonymous nucleotide substitution will be enhanced. In this case, however, the rate of synonymous substitution will also be enhanced because of the "hitchhiking effect." When gene conversion occurs in the FRs, there will be no enhancement of either the nonsynonymous or the synonymous rate, because there will be no positive selection in this case. Therefore, the higher values of p̄S and p̄N in the CDRs than in the FRs can be explained by the interaction of positive selection and meiotic gene conversion.

FIG. 3.-A phylogenetic tree of immunoglobulin heavy chain variable region genes. There are three major groups (A, B, and C) of VH genes among the tetrapods. Group A contains African toad and mammalian clan I genes; group B, African toad and mammalian clan II genes; and group C, African toad, chicken, and mammalian clan III genes. A number given to an interior branch is the probability at which the branch length is different from 0 (confidence probability). Confidence probabilities < 90% are not given. Xe, Xenopus laevis (African toad); Mm, Mus musculus (mouse); Hs, Homo sapiens (human). The branch lengths are measured in terms of the number of nucleotide substitutions per site with the scale given below the tree.

Table 2
Mean Numbers of Synonymous Nucleotide Differences per Synonymous Site (p̄S) and of Nonsynonymous Differences per Nonsynonymous Site (p̄N) between ψVH Genes in Chickens (a)

                    NUMBER OF              CDRs (c)                         FRs
RANGE OF pST (b)    COMPARISONS      p̄S            p̄N               p̄S            p̄N
0.00-0.05                3         9.2 ± 4.4    22.8 ± 5.1*       1.8 ± 1.0     3.1 ± 1.0
0.05-0.10               21        19.5 ± 7.0    22.2 ± 5.0        4.7 ± 1.7     4.9 ± 1.1
0.10-0.15               22        28.8 ± 7.0    31.3 ± 6.8        8.9 ± 2.4     6.3 ± 1.3
>0.15                    9        37.4 ± 9.6    30.7 ± 7.0       11.9 ± 3.1     6.9 ± 1.5
Total                   55        25.6 ± 6.7    27.3 ± 5.8        7.4 ± 2.0     5.7 ± 1.2

(a) pS and pN were estimated by Nei and Gojobori's (1986) method, and their standard errors were obtained by our earlier method (Ota and Nei 1994b).
(b) pST is the pS value of the entire variable region (both CDRs and FRs).
(c) Codons in CDR1 and CDR2 were used.
* p̄N is significantly higher than p̄S at the 5% level.

FIG. 4.-Relationships between the number of synonymous nucleotide differences per synonymous site (pS) and the number of nonsynonymous differences per nonsynonymous site (pN) between ψVH genes in chickens. The line represents the expected relationship for the case of no selection.

Diversity-like Segments Attached to the 3' End of ψVH's

Reynaud et al. (1989) noted that a diversity (D)-like segment is attached to the 3' end of ψVH's and contributes to the diversification of the rearranged VDJ gene in chickens. To visualize the similarity of D genes to the D-like sequences of ψVH genes, we conducted a dot matrix analysis comparing 11 ψVH genes and 16 D genes (Reynaud et al. 1991). This analysis showed that all D genes are quite similar with one another as well as with the 3' ends of ψVH genes. Figure 6 shows a few cases in which the similarity between D genes and the 3' ends of ψVH genes is particularly high. The first case is the similarity between ψVH57-1 and the D1, D4, D7, D8, and D11 genes. A D gene codes for about 10 codons, and the five D genes in this group are closely related to one another. The similarity of these D genes with the middle portion of the D-like segment of ψVH57-1 is very high. Case 2 shows the similarity between ψVH57-11 and the D5, D6, D9, D10, D12, D13, D14, and D15 genes. This group of D genes again shows high similarity with the beginning part of the D-like segment of ψVH57-11. Interestingly, ψVH57-11 also has partial similarity with the J gene. Case 3 shows similar nucleotide sequence similarity between ψVH15-9 and the D3 gene.

Evolution of the Chicken VH Gene System

The genomic structure of VH genes and the mechanism of generating immunoglobulin diversity in chickens are drastically different from those of mammals (McCormack et al. 1991). How did this unique system evolve? For now, this question remains virtually unanswered. In light of the present statistical analyses, however, it is possible to offer two alternative hypotheses.

First, one thing seems to be obvious. That is, the ancestral form of the VH gene system was the same as that of mammalian species, because fishes and Xenopus apparently have essentially the same system as

Page 7: Evolution of Immunoglobulin VH Pseudogenes in …igem.temple.edu/labs/nei/downloads/publications/1995...Evolution of Immunoglobulin VH Pseudogenes in Chickens Tatsuya Ota and Masatoshi

FIG. 5.-Segmental sequence identities among ψVH genes (nucleotide alignment of VH1 with ψVH57-3, ψVH57-6, ψVH57-11, ψVH15-7, and ψVH57-10; alignment not reproduced here). Boxes indicate the complementarity-determining regions. Sequence identity or semi-identity is indicated by underlining.

that of mammals (Litman et al. 1993). We have observed that all chicken VH genes have been derived from an ancestral VH gene belonging to the group C genes, and they are closely related to one another. This means that the group A and B genes in the ancestral species have been lost from the genome. It also means that many divergent genes belonging to group C have been lost. Therefore, it seems that a large portion of genetic variability among different loci was lost in the early stage of avian evolution and that the current VH gene system evolved to cope with the deficiency of immunoglobulin diversity due to loss of many functional genes. This loss of functional genes may have occurred by population size bottlenecks or an extended period of population size reduction that caused accumulation of deleterious mutations by genetic drift. Wyles et al. (1983) proposed the hypothesis that all the current orders and families of birds were derived from a single bird species or family that survived the worldwide catastrophe around 65 MY ago when an asteroid hit the earth and resulted in months of darkness and mass extinction (Alvarez 1983). This hypothesis is consistent with the observation that the molecular distances between different genera of birds are considerably smaller than those of other vertebrates (Avise and Aquadro 1982). It is also interesting to note that our estimate (45 MY ago) of the time of origin of chicken VH pseudogenes is somewhat more recent than 65 MY ago.

There are two explanations (hypotheses) for the evolution of VH pseudogenes in chickens. One is to assume that they originated from functional genes like those of mammals, and at the time of population size reduction all of them except one became nonfunctional owing to accumulation of deleterious mutations in the promoter region or coding region. The fact that virtually no polymorphism exists at the major histocompatibility complex loci of cheetah (O'Brien et al. 1985) and mouse populations living on small islands (Figueroa et al. 1986) suggests that loss of a certain extent of antigen recognition capacity for a short evolutionary period is not critical for the survival of the species. Indeed, complete suppression of the expression of immunoglobulin κ light chain genes in mice does not seem to impair the survival of the individual in laboratory conditions (Weiss et al. 1984). At any rate, once the genes became nonfunctional, a new genetic mechanism of gene conversion apparently evolved to compensate for the deficiency of immunoglobulin diversity. Initially gene conversion probably


(1)

ψVH57-1  AGT GCT GCT GGTTATGGTTGTGCTTATGGTTGGTGTGCTGCTGTTCTTGGTTACATGGAC
D1  GG-T--A-------C-----TG-----TA-
D4, D8, D11  ---A-------CT-----GA-----A-
D7  ---A-C-----CT------A-----A-

(2)

ψVH57-11  TGT GCT TAC AGTGGTTGTGGTGGTGGTTGGTGGGCTGGTGCTGCTGGTTACATGGAC
D5  GGT A-- --- --- T-----A------C-TA-
D6  GGT A-- -G- --- T-----A------C-TA-
D9, D12, D13  GGT A-- -G- --- T-----A---C--C-TA-
D10  =T A-- -G- --- T-------G----C-TA-
D14  GGT A-- -G- --- T-------GA---C-TA-
D15  GGT A-- -G- --- T-----A------C--A-
J  AG -T- TTG GGT GAAAAZUi- GC-GAT-TTGG--CATTG----A--------AG---CG--GCAT

(3)

ψVH15-9  GCT GCT GGT AGTTGTACTTACGGTTATAGTTGTGCAGGTGGTTGTGGTCAGCAC
D3  G--A--G-----T---G----G----TTA-

FIG. 6.-Similarity between the 3' ends of chicken ψVH's and diversity segment (D) genes. Dashes indicate nucleotides identical to the ψVH sequence at the top of each panel.

occurred by one or a few donor pseudogenes located upstream of the functional gene (VH1), and later one of these pseudogenes duplicated many times to produce the present group of pseudogenes. All the present ψVH genes are known to lack the promoter region, recombination signals, and leader peptide region, which can be explained if we assume that the original pseudogene already lacked these regions. This hypothesis also explains why the physical distance between two contiguous ψVH genes is so short in chickens.

Alternatively, one can assume that the gene conversion mechanism existed before many VH genes became nonfunctional. Indeed, Maizels (1989) pointed out the possibility that somatic mutation that occurs in mammalian mature VH genes is actually caused by somatic gene conversion. Note also that somatic gene conversion is known to occur to generate immunoglobulin diversity in rabbits (Knight 1992). If similar gene conversion occurs occasionally in the germline gene, it makes it easy to explain the evolution of chicken ψVH genes. That is, once a gene conversion mechanism similar to that of the rabbit VH genes has evolved, inactivation of the transcription system of functional VH genes except the VH1 gene would not be harmful to the individual as long as stop codons are prevented from accumulating in the coding region. However, even in this hypothesis it is necessary to assume that the current ψVH genes have arisen by recent gene duplication.

There is one problem in the above hypotheses. That is, they cannot explain the presence of a D-like segment attached to the 3' end of ψVH genes. To explain this peculiar feature of the chicken VH gene system, it is necessary to assume that at some point of evolutionary time a fused VD gene appeared, and this fused gene then multiplied many times by gene duplication. The fused gene may have occurred either by germline rearrangement of the V and D genes or by reverse transcription of a somatically rearranged VD gene. Although no direct evidence exists of germline rearrangement of V and D genes, some V and D genes are fused in cartilaginous fishes (Litman et al. 1993). In contrast, the scenario of reverse transcription is attractive, because the absence of the promoter region, recombination signals, and leader peptide region from ψVH can easily be explained by this process. However, it seems to be very difficult for the VD element to be reverse-transcribed precisely into the genomic location where the functional VH gene is located.
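The D-like segment discussed above was detected by a dot matrix comparison of ψVH and D gene sequences. As an illustration only, the following Python sketch shows one common form of such a comparison: a dot is placed at position (i, j) when the windows starting at i in one sequence and at j in the other share at least a threshold number of identical nucleotides. The function names, the window size, the threshold, and the toy sequences are all assumptions made for this sketch; they are not the parameters or data used in the present analysis.

```python
def dot_matrix(seq_a, seq_b, window=5, min_matches=4):
    """Return the set of (i, j) start positions whose windows are similar.

    A position pair is a hit when the window of length `window` starting
    at seq_a[i] and the one starting at seq_b[j] have at least
    `min_matches` identical nucleotides. Both parameters are
    illustrative assumptions.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= min_matches:
                hits.add((i, j))
    return hits


def render(seq_a, seq_b, hits):
    """Draw the matrix as text: rows follow seq_a, columns follow seq_b."""
    rows = []
    for i in range(len(seq_a)):
        rows.append(
            "".join("*" if (i, j) in hits else "." for j in range(len(seq_b)))
        )
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical toy sequences, NOT taken from the chicken VH locus.
    v_tail = "TGTGCCAAAGGTTATGGTTGT"
    d_like = "GGTTATGGTTGT"
    print(render(v_tail, d_like, dot_matrix(v_tail, d_like)))
```

In such a plot a run of dots along a diagonal marks a stretch of similarity between the two sequences, which is the kind of signal that would indicate a D-like region at the 3' end of a ψVH gene.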

At any rate, the above hypotheses are no more than speculations at this time. To understand the real process of evolution of the avian VH gene system, we need more information on the genomic organization of both VH and VL genes from related species.

Acknowledgment

This work is supported by research grants from the National Institutes of Health (GM-20293) and the National Science Foundation (DEB-9119802) to M.N.


LITERATURE CITED

ALVAREZ, L. W. 1983. Experimental evidence that an asteroid impact led to the extinction of many species 65 million years ago. Proc. Natl. Acad. Sci. USA 80:627-642.

AVISE, J. C., and C. F. AQUADRO. 1982. A comparative summary of genetic distances in the vertebrates: pattern and correlations. Evol. Biol. 15:151-185.

BENATAR, T., and M. J. H. RATCLIFFE. 1993. Polymorphism of the functional immunoglobulin variable region genes in the chicken by exchange of sequence with donor pseudogenes. Eur. J. Immunol. 23:2448-2453.

FIGUEROA, F., H. TICHY, R. J. BERRY, and J. KLEIN. 1986. MHC polymorphism in island populations of mice. Curr. Top. Microbiol. Immunol. 127:100-105.

GOJOBORI, T., and M. NEI. 1984. Concerted evolution of the immunoglobulin VH gene family. Mol. Biol. Evol. 1:195-212.

HAIRE, R. N., Y. OHTA, R. T. LITMAN, C. T. AMEMIYA, and G. W. LITMAN. 1991. The genomic organization of immunoglobulin VH genes in Xenopus laevis shows evidence for interspersion of families. Nucl. Acids Res. 19:3061-3066.

JUKES, T. H., and C. R. CANTOR. 1969. Evolution of protein molecules. Pp. 21-132 in H. N. MUNRO, ed. Mammalian protein metabolism. Academic Press, New York.

KNIGHT, K. L. 1992. Restricted VH gene usage and generation of antibody diversity in rabbit. Annu. Rev. Immunol. 10:593-616.

LITMAN, G. W., J. P. RAST, M. J. SHAMBLOTT, R. N. HAIRE, M. HULST, W. ROESS, R. T. LITMAN, K. R. HINDS-FREY, A. ZILCH, and C. T. AMEMIYA. 1993. Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Mol. Biol. Evol. 10:62-72.

MAIZELS, N. 1989. Might gene conversion be the mechanism of somatic hypermutation of mammalian immunoglobulin genes? Trends Genet. 5:4-8.

MATSUDA, F., E. K. SHIN, H. NAGAOKA, R. MATSUMURA, M. HAINO, Y. FUKITA, S. TAKA-ISHI, T. IMAI, J. H. RILEY, R. ANAND, E. SOEDA, and T. HONJO. 1993. Structure and physical map of 64 variable segments in the 3' 0.8-megabase region of the human immunoglobulin heavy-chain locus. Nat. Genet. 3:88-94.

MCCORMACK, W. T., L. W. TJOELKER, and C. B. THOMPSON. 1991. Avian B-cell development: generation of an immunoglobulin repertoire by gene conversion. Annu. Rev. Immunol. 9:219-241.

MCCORMACK, W. T., L. W. TJOELKER, and C. B. THOMPSON. 1993. Immunoglobulin gene diversification by gene conversion. Prog. Nucl. Acid Res. Mol. Biol. 45:27-45.

NEI, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.

NEI, M., and T. GOJOBORI. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.

O'BRIEN, S. J., M. E. ROELKE, L. MARKER, A. NEWMAN, C. A. WINKLER, D. MELTZER, L. COLLY, J. F. EVERMANN, M. BUSH, and D. E. WILDT. 1985. Genetic basis for species vulnerability in the cheetah. Science 227:1428-1434.

OTA, T., and M. NEI. 1994a. Divergent evolution and evolution by the birth-and-death process in the immunoglobulin VH gene family. Mol. Biol. Evol. 11:469-482.

OTA, T., and M. NEI. 1994b. Variances and covariances of the numbers of synonymous and nonsynonymous substitutions per site. Mol. Biol. Evol. 11:613-619.

REYNAUD, C.-A., V. ANQUEZ, and J.-C. WEILL. 1991. The chicken D locus and its contribution to the immunoglobulin heavy chain repertoire. Eur. J. Immunol. 21:2661-2670.

REYNAUD, C.-A., A. DAHAN, V. ANQUEZ, and J.-C. WEILL. 1989. Somatic hyperconversion diversifies the single VH gene of the chicken with a high incidence in the D region. Cell 59:171-183.

RZHETSKY, A., and M. NEI. 1992. A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9:945-967.

TANAKA, T., and M. NEI. 1989. Positive Darwinian selection observed at the variable-region genes of immunoglobulins. Mol. Biol. Evol. 6:447-459.

WEISS, S., K. LEHMANN, W. C. RASCHKE, and M. COHN. 1984. Mice completely suppressed for the expression of immunoglobulin kappa light chain. Proc. Natl. Acad. Sci. USA 81:211-215.

WYLES, J. S., J. G. KUNKEL, and A. C. WILSON. 1983. Birds, behavior, and anatomical evolution. Proc. Natl. Acad. Sci. USA 80:4394-4397.

JAN KLEIN, reviewing editor

Received April 6, 1994

Accepted August 25, 1994