

The Plant Cell, Vol. 6, 1829-1843, December 1994 © 1994 American Society of Plant Physiologists

A Superfamily of S Locus-Related Sequences in Arabidopsis: Diverse Structures and Expression Patterns

Kathleen G. Dwyer,a Muthugapatti K. Kandasamy,b Dusty I. Mahosky,a Joann Acciai,b Bela I. Kudish,a Janet E. Miller,a Mikhail E. Nasrallah,b and June B. Nasrallah b,1

a Biology Department, University of Scranton, Scranton, Pennsylvania 18510

b Section of Plant Biology, Division of Biological Sciences, Cornell University, Ithaca, New York 14853

Six sequences that are closely related to the S gene family of the largely self-incompatible Brassica species have been identified in self-fertilizing Arabidopsis. The sequences define four genomic regions that map to chromosomes 1 and 3. Of the four functional genes identified, only the previously reported Arabidopsis AtS1 gene was expressed specifically in papillar cells and may function in pollination. The remaining three genes, including two novel genes designated ARK2 and ARK3, encode putative receptor-like serine/threonine protein kinases that are expressed predominantly in vegetative tissues. ARK2 promoter activity was detected exclusively in above-ground tissues, specifically in cotyledons, leaves, and sepals, in correlation with the maturation of these structures. ARK3 promoter activity was detected in roots as well as above-ground tissues but was limited to small groups of cells in the root-hypocotyl transition zone and at the base of lateral roots, axillary buds, and pedicels. The nonoverlapping patterns of expression of the ARK genes and the divergence of their sequences, particularly in their predicted extracellular domains, suggest that these genes perform nonredundant functions in specific aspects of development or growth of the plant body.

INTRODUCTION

The S superfamily of genes is composed of genes that share sequence similarity with genes derived from the self-incompatibility (S) locus of Brassica. The S locus genes include the S Locus Glycoprotein (SLG) gene (Nasrallah et al., 1987) and the S locus Receptor Kinase (SRK) gene, which encodes a transmembrane protein with an extracellular domain that shares a high degree of sequence similarity with SLG (Stein et al., 1991) and a cytoplasmic domain that exhibits serine/threonine kinase activity (Goring and Rothstein, 1992; Stein and Nasrallah, 1993). This family of sequences appears to be ubiquitous in plants, and S gene family members that encode secreted glycoproteins and transmembrane receptor protein kinases have been identified in monocots as well as dicots (reviewed in Nasrallah and Nasrallah, 1993).

In crucifers, several S family members have been reported in addition to the S locus genes of Brassica. These include the Brassica S Locus-Related genes SLR1 (Lalonde et al., 1989; Trick and Flavell, 1989; Isogai et al., 1991) and SLR2 (Boyes et al., 1991) and the Arabidopsis AtS1 gene (Dwyer et al., 1992); these genes encode secreted glycoproteins that are expressed specifically in reproductive tissues. Also reported are the Arabidopsis Receptor Kinase gene ARK1 (Tobias et al., 1992) and the Receptor-Like Kinase genes RLK1 and RLK4 (Walker, 1993), which encode putative receptor protein kinases that are expressed predominantly in vegetative structures. Except for the latter two sequences, which were isolated as cDNAs by hybridization to a probe derived from the kinase domain of maize ZmPK1 (Walker and Zhang, 1990), all of these additional members share at least 60% sequence identity with the Brassica S locus genes. All family members share an "S domain" that comprises the coding region of each of the secreted glycoprotein-encoding genes and the extracellular domains of each of the receptor protein kinase-encoding genes. Furthermore, where investigated, the details of gene structure are conserved (Dwyer et al., 1992; Tobias et al., 1992).

1 To whom correspondence should be addressed.

With the exception of the SLG and SRK genes of Brassica, which have a role in cell-cell signaling between pollen and stigma and are required for the operation of the self-incompatibility system, the biological function of the various family members is not known. However, in view of similarities in primary sequence and gene structure, it is possible that these genes also function in cell-cell signaling at some as yet unknown stage of the plant life cycle. To understand the organization and evolution of the S gene family and eventually define the biological function of individual members, it is important first to define the composition of the gene family in a particular plant species and to describe the structure and expression of its members. We have focused on the S gene family of Arabidopsis for several reasons. First, Arabidopsis has well-known advantages as a model system for molecular and genetic studies. Second, Arabidopsis and Brassica are both members of the Brassicaceae, and the family members that are most closely related to the Brassica S locus genes are therefore easily identified by Brassica-derived probes. Third, unlike Brassica species, which are generally self-incompatible, Arabidopsis lacks a self-incompatibility system; it is therefore of interest to determine whether, in this self-fertile plant, an S locus complex has been maintained and whether pollen-stigma interactions are likely to be mediated by receptor-like protein kinases. The analysis of the S gene family in Arabidopsis thus provides the opportunity not only to investigate the evolutionary relationships of cell-cell recognition phenomena that operate during plant reproduction to those that operate in vegetative tissues, but also to understand the evolution of breeding behavior in a plant family.

In this study, we present the gene structure, sequence, and expression pattern of several members of the Arabidopsis S gene family. We show that the cloned sequences define four regions in the Arabidopsis genome. Of a total of six sequences identified by Brassica probes, four sequences have not been previously described and include two pseudogenes and two putative transmembrane protein kinase genes, designated ARK2 and ARK3. These two novel protein kinase genes are expressed vegetatively and are therefore likely to have a function very different from that of AtS1, the only cloned member of the family that we show in this report to be expressed specifically in the papillar cells of the stigma.

RESULTS

Isolation of Arabidopsis Sequences with Homology with the Brassica S Gene Family

The Arabidopsis genome contains several genes related to the Brassica S locus genes, as revealed by DNA gel blot analysis (Nasrallah et al., 1988; Dwyer et al., 1992). As a consequence, the initial screening of an Arabidopsis genomic library with Brassica SLG and SLR1 gene probes that had led to the identification of the AtS1 (Dwyer et al., 1992) and ARK1 (Tobias et al., 1992) genes had resulted in the isolation of a total of 18 positive recombinant clones, many of which had remained uncharacterized. Each of these positive clones contained sequences that are related to SLG/SLR1 and that will be referred to as S-related sequences or genes. Seven of these clones also hybridized to a 0.5-kb AccI fragment derived from the kinase region of the Brassica SRK gene, suggesting that they contained transmembrane protein kinase genes. Further analysis revealed that these recombinant clones spanned three distinct genomic regions containing several S-related sequences (Figure 1). Genomic region 1, diagrammed as a 12-kb SalI fragment in Figure 1, contains the previously reported Arabidopsis AtS1 gene (Dwyer et al., 1992). Genomic region 2, a 20.4-kb region defined by several overlapping clones, contains two tandemly repeated putative SRK-related receptor protein kinase genes; their protein coding regions are separated by only 848 bp. Of these two genes, the ARK1 gene has been described by Tobias et al. (1992), and we have designated the second, previously unreported gene as ARK2. Genomic region 3, a 19.2-kb region defined by several overlapping clones, spans four regions that contain S-related sequences, which we designated SΨ1 and SΨ2α, SΨ2β, and SΨ2γ, respectively, because, as shown below, they comprise nonfunctional pseudogenes.

DNA gel blot analysis using probes derived from the S-related sequences within these genomic regions of Arabidopsis demonstrated intense cross-hybridization between the region 1 and region 3 S-related sequences on the one hand and between the two S-related genes of region 2 on the other. In contrast, under the hybridization conditions used, the S-related sequences in regions 1 and 3 hybridized only weakly with those in region 2. To identify possible additional S-related sequences that might have been missed by the Brassica-derived probes, we screened another Arabidopsis genomic library with the 1.6-kb EcoRI-SstI fragment containing the AtS1 gene (probe A in Figure 1; representative of all of the S-related sequences in regions 1 and 3) and with the 4.2-kb PstI fragment containing ARK gene sequences (probe B in Figure 1; representative of both of the S-related sequences in region 2). Of 20 recombinant clones isolated, one clone defined a fourth distinct Arabidopsis genomic region containing S-related sequences (Figure 1). A 9.0-kb EcoRI fragment from this genomic region (region 4) was found to contain a third putative S-related receptor protein kinase, which we designated ARK3.

Genomic Organization of Arabidopsis S-Related Sequences

Only sequences most closely related to the Brassica S gene family members were expected to be identified by the probes used in this study. Thus, the Arabidopsis S-related putative receptor protein kinase genes RLK1 and RLK4 (Walker, 1993), which share only 37 and 38% sequence identity, respectively, with SRK would not have been detected by the hybridization conditions used in our screening of genomic and cDNA libraries. To determine whether the cloned genes represented the full complement of the "Brassica-like" S gene family members in Arabidopsis, we performed gel blot analyses of Arabidopsis (ecotype Columbia) genomic DNA using as probes the 1.6-kb EcoRI-SstI fragment, containing AtS1 gene sequences (probe A in Figure 1), and the 0.7-kb SalI-XbaI fragment, containing ARK1 S domain sequences (probe C in Figure 1). Figure 2 shows that in EcoRI-digested DNA, the AtS1 sequence probe hybridized with varying degrees of intensity to six restriction fragments of 8.8, 7.2, 6.0, 2.8, 2.0, and 1.8 kb. This probe hybridized to S pseudogene sequences as well as to the AtS1 gene. As predicted from the physical maps of genomic regions 1 and 3 in Figure 1, this probe is expected to hybridize with genomic EcoRI fragments of 7.1 (SΨ1), 5.8 (AtS1), 2.8 (SΨ2γ), 2.0 (SΨ2α), and 1.8 (SΨ2β) kb in length.


Figure 1. Partial Restriction Map of the Four Genomic Regions of Arabidopsis Containing S-Related Sequences.

[Restriction maps not reproduced. Panels: Region One (AtS1; probe A), Region Two (subclones pKD421 and pBL225; probes B and C), Region Three (subclones pKD411, pJA414, pKD419, and pJA413; the S pseudogenes), and Region Four (ARK3; probe D).]

The open reading frames and exons of the functional S genes in regions 1, 2, and 4 are depicted by black boxes and introns by open boxes. The nonfunctional S pseudogene and promoter (p) sequences of region 3 are also shown by black boxes. The orientations of the S-related genes are indicated. The lines above the maps indicate the positions of subcloned DNA fragments that were used to sequence the ARK genes and S pseudogenes. The lines below the restriction maps delineate the location of various DNA probes used in DNA gel blot analysis and/or to screen Arabidopsis genomic libraries. The regions are drawn to scale, with probe A representing 1.6 kb. B, BamHI; R, EcoRI; H, HindIII; K, KpnI; P, PstI; Sa, SalI; S, SstI.


The expected results agree well with the actual results except for the 8.8-kb EcoRI fragment, which is unaccounted for in our collection of clones. Our screens of genomic libraries failed to identify additional S-related genes that might correspond to this EcoRI fragment. Because the genomic DNA prepared for the blot analysis and the genomic libraries was derived from different Arabidopsis strains, this unique EcoRI fragment may be due to a restriction fragment length polymorphism (RFLP) found between the different Arabidopsis strains used.
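The comparison between predicted and observed EcoRI fragments for the AtS1 probe can be sketched in a few lines of Python. This is only an illustration: the fragment sizes are the ones quoted in the text, while the 5% sizing tolerance, the dictionary keys, and the function name are assumptions introduced here and are not part of the original analysis.

```python
# Expected EcoRI fragment sizes (kb) for the AtS1 probe, as predicted from the physical maps.
# Keys are illustrative labels: psi1 and psi2_* stand for the S pseudogene fragments.
expected_kb = {
    "psi1": 7.1,
    "AtS1": 5.8,
    "psi2_gamma": 2.8,
    "psi2_alpha": 2.0,
    "psi2_beta": 1.8,
}

# Fragment sizes (kb) observed on the genomic DNA gel blot.
observed_kb = [8.8, 7.2, 6.0, 2.8, 2.0, 1.8]

# Assumption: a 5% relative tolerance for gel sizing error (not a value from the paper).
TOLERANCE = 0.05


def match_fragments(expected, observed, tol):
    """Pair each expected fragment with the closest unused observed band within tol."""
    remaining = list(observed)
    matches = {}
    for name, size in expected.items():
        candidates = [o for o in remaining if abs(o - size) / size <= tol]
        if candidates:
            best = min(candidates, key=lambda o: abs(o - size))
            matches[name] = best
            remaining.remove(best)
    return matches, remaining


matches, unaccounted = match_fragments(expected_kb, observed_kb, TOLERANCE)
print("matched:", matches)
print("unaccounted bands (kb):", unaccounted)
```

Under this tolerance every predicted fragment finds a band, and only the 8.8-kb band is left over, which is the fragment attributed in the text to a possible RFLP between strains.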

The ARK1 S domain probe hybridized with varying degrees of intensity to three restriction fragments of 10, 6.2, and 3.9 kb (Figure 2). This probe has been shown to recognize the S domain sequences of all three ARK genes. As predicted from the restriction maps of genomic regions 2 and 4 in Figure 1, this probe is expected to hybridize with EcoRI fragments of 9.0 (ARK3), 5.3 (ARK1), and 3.4 (ARK2) kb. The expected and actual results were in approximate agreement. Note that the 6.2-kb fragment ascribed to the ARK1 gene was recognized most intensely by the ARK1-derived probe. The blot shown in Figure 2 was reprobed with sequences specific to the ARK3 gene and derived from its 3' untranslated region (probe D in Figure 1), and only the 10-kb EcoRI fragment was detected (data not shown).

Figure 2. DNA Gel Blot Analysis of the Arabidopsis S Gene Family.

(A) Genomic DNA (2 µg) was digested with EcoRI. The DNA gel blot was probed with probe A (Figure 1, genomic region 1), which consists of AtS1 sequences and hybridizes to genomic fragments containing both AtS1 and S pseudogene sequences.
(B) The same DNA blot was reprobed with probe C (Figure 1, genomic region 2), which consists of ARK1 S domain sequences and hybridizes to genomic fragments containing ARK1, ARK2, and ARK3 S domain sequences. Molecular length standards are indicated at left in kilobases.

RFLP mapping in recombinant inbred lines (see Methods) indicated that genomic regions 2 and 3 mapped to chromosome 1, separated by a distance of >2 centimorgans, whereas genomic regions 1 and 4 were within 1 centimorgan of each other on chromosome 3.

Sequence Analysis of Newly Identified Members of the Arabidopsis S Gene Family

The Expressed Genes

The DNA sequence and gene structure of the AtS1 and ARK1 genes were reported previously (Dwyer et al., 1992; Tobias et al., 1992). To determine the nucleotide sequence of the ARK2 gene, restriction fragments derived from genomic region 2 were cloned into plasmid vectors to generate the subclones shown in Figure 1. Subclone pBL225 contains a 4.2-kb PstI fragment that spans most of the ARK1 gene, the spacer sequences between the ARK1 and ARK2 genes, and the 5' region of the ARK2 gene; subclone pKD421 contains a 6.3-kb SalI-EcoRI fragment that spans most of the ARK2 gene and its downstream flanking sequences. Similarly, for sequence analysis of the ARK3 gene in genomic region 4, we generated subclone pDM2 that contains the 5.5-kb BamHI-PstI fragment that spans the entire ARK3 gene and its flanking regions.

Alignment of the ARK2 and ARK3 sequences with the ARK1 sequence revealed a conserved gene structure consisting of seven exons separated by six introns. Interestingly, the exon-intron boundaries are conserved among all three Arabidopsis ARK genes and the Brassica SRK6 and SRK2 genes as well (data not shown). Thus, in all these receptor-like genes, the first exon encodes the entire S domain, the second exon encodes the transmembrane domain, and exons 3 to 7 encode the kinase domain. The ARK2 gene, from the predicted translation initiation codon to the predicted termination codon, is 3358 bp in length, with introns 1 through 6 spanning 369, 93, 76, 85, 103, and 88 bp, respectively. The sequence predicts a spliced transcript of 2544 bp encoding a protein product of 847 amino acids with a molecular mass of 95,957 D. The ARK3 gene is 3871 bp in length, with introns 1 through 6 spanning 881, 89, 77, 91, 96, and 84 bp, respectively. The 2553-bp spliced transcript predicted from the ARK3 sequence would encode a protein product of 850 amino acids with a molecular mass of 96,456 D.
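The transcript and protein lengths quoted above follow from the gene and intron lengths by simple arithmetic, which the short Python check below reproduces. The numbers are those given in the text; the function names are illustrative, and the sketch assumes that the stated gene length runs from the initiation codon through the termination codon, so that the final codon is a stop.

```python
# Gene lengths (initiation codon through termination codon) and intron lengths, in bp.
genes = {
    "ARK2": {"gene_bp": 3358, "introns_bp": [369, 93, 76, 85, 103, 88]},
    "ARK3": {"gene_bp": 3871, "introns_bp": [881, 89, 77, 91, 96, 84]},
}


def spliced_length(gene_bp, introns_bp):
    """Length of the coding transcript once all introns are removed."""
    return gene_bp - sum(introns_bp)


def protein_length(coding_bp):
    """Number of amino acids encoded, assuming the last codon is the stop codon."""
    codons, remainder = divmod(coding_bp, 3)
    if remainder != 0:
        raise ValueError("coding length is not a multiple of 3")
    return codons - 1


summary = {}
for name, info in genes.items():
    coding_bp = spliced_length(info["gene_bp"], info["introns_bp"])
    summary[name] = (coding_bp, protein_length(coding_bp))
    print(name, "spliced bp:", coding_bp, "amino acids:", summary[name][1])
```

Both genes give coding lengths that are exact multiples of three, consistent with the reported 847- and 850-residue products.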

The amino acid sequences of the predicted ARK2 and ARK3 proteins are presented in alignment with the sequences of the predicted ARK1 and AtS1 proteins in Figure 3. The S domain is the only region of shared sequence similarity between AtS1 on the one hand and ARK1, ARK2, and ARK3 on the other. As has been noted previously for AtS1 and ARK1, the ARK2 and ARK3 S domains have conserved features first found associated with the Brassica SLG, SLR1, and SRK genes. At their N terminus lies a stretch of hydrophobic amino acids that may function as a signal sequence. Cleavage of this signal peptide would yield mature protein products of 92,751 and 93,035 D for ARK2 and ARK3, respectively. In addition, the proteins exhibit the 12 invariant cysteines that also occur within the C-terminal third of AtS1, ARK1, and all Brassica S domains analyzed to date. Finally, the ARK2 and ARK3 S domains exhibit several potential N-linked glycosylation sites with the consensus N-X-S/T, three of which are conserved in all four Arabidopsis genes and in the Brassica SRK6, SLR1, and SLG6 S domains as well. Interestingly, as was first described for the Brassica SRK6 gene, all three Arabidopsis ARK genes have an in-frame TAG stop codon within the first intron 2 bp downstream of the first splice junction. There is evidence (Tobias et al., 1992; C.M. Tobias and J.B. Nasrallah, manuscript in preparation) that these ARK genes produce alternative transcripts predicted to encode a truncated protein product consisting entirely of the S domain and presumably secreted into the extracellular matrix.

Figure 3. Amino Acid Sequence Alignment of the Predicted ARK1, ARK2, ARK3, and AtS1 Proteins.

[Sequence alignment not reproduced.]

Identical sequences are indicated by colons and deleted amino acids by dots. Vertical lines mark the sites of the six exon/intron boundaries for the ARK genes. The underlined residues represent hydrophobic regions thought to function as signal peptide and transmembrane domains. The numbers to the right refer to the positions of amino acid residues, with the first residue following the putative signal sequence designated as +1. Within the S domains (encoded by exon 1 of the ARK genes and the entire AtS1 gene), open boxes enclose potential N-linked glycosylation sites, and asterisks indicate the positions of the 12 conserved cysteine residues. Within the kinase domains (encoded by exons 3 through 7 of the ARK genes), the 15 amino acid residues that are invariant in protein kinases are marked by asterisks, and the consensus sequences DLKASN and GTYGYMSP that predict all three ARK genes to have serine/threonine protein kinase specificity are shown in boxes.
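Potential N-linked glycosylation sites of the kind described above are usually located by scanning a protein sequence for the sequon pattern. The Python sketch below shows one way to do this. The peptide is a made-up toy string, not an ARK or AtS1 sequence, and the exclusion of proline at the X position is a common convention assumed here rather than something stated in the text.

```python
import re

# Sequon: Asn, then any residue except Pro (assumed convention), then Ser or Thr.
# The lookahead lets closely spaced or overlapping sites all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of the Asn residue in each potential N-X-S/T site."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


# Hypothetical toy peptide for illustration only.
toy_peptide = "MKNGSLLNPTCANVTQNAS"
sites = find_sequons(toy_peptide)
print(sites)
```

In the toy peptide the Asn at position 8 is skipped because it is followed by proline, while the Asn residues at positions 3, 13, and 17 are reported.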

Hydrophobicity plot analysis of the predicted ARK2 and ARK3 proteins reveals a second concentration of 22 hydrophobic amino acids followed by a stretch of eight amino acids, six of which are basic. These sequences, which are encoded by the second exon in the three genes, are similar to the membrane-spanning and juxtamembrane domains of other transmembrane proteins (Weinstein et al., 1982). In each of the ARK proteins, the putative transmembrane domain is followed by the protein kinase catalytic domain. The amino acid sequences of the kinase domains contain the 11 subdomains characteristic of the catalytic core for all protein kinases (Hanks et al., 1988), including the 15 invariant residues that are essential for nucleotide binding and for the phosphate transfer reaction (Figure 3). Furthermore, the DLKASN sequence in subdomain VI and the GTYGYMSP in subdomain VIII are suggestive of serine/threonine rather than tyrosine kinase activity for all three Arabidopsis ARK proteins.
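A hydrophobicity plot of the sort mentioned above is typically a sliding-window average of per-residue hydropathy values. The text does not say which scale or window was used, so the following Python sketch assumes the Kyte-Doolittle scale, a 19-residue window, and a threshold of 1.6. The test sequence is a hypothetical construct (a polar stretch, 22 hydrophobic residues, and a basic stretch), not an ARK sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the paper does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein, window=19, threshold=1.6):
    """Return 0-based start indices of windows whose mean hydropathy meets the threshold."""
    hits = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        if mean >= threshold:
            hits.append(start)
    return hits


# Hypothetical toy protein: 10 polar residues, 22 hydrophobic residues,
# 8 residues of which 6 are basic, then 10 more polar residues.
toy_protein = "DSQENTSGDK" + "LLVIAGLVLAVILLFAVIGLLV" + "KRKRQKRS" + "DTNQSEGDSR"
hits = hydrophobic_windows(toy_protein)
print(hits)
```

Windows that fall inside the 22-residue hydrophobic core score well above the threshold, whereas windows dominated by the flanking polar or basic residues do not, which is the signature used to propose a membrane-spanning segment.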

Pairwise sequence comparisons of the Arabidopsis and Brassica S-related genes indicate that the ARK1 and ARK2 proteins are the most closely related and exhibit an overall sequence identity of 85.7%. The ARK3 protein exhibits approximately equal overall sequence similarity with ARK1 (75.3%) and ARK2 (76%). As expected for interspecific sequence comparisons, the Brassica SRK6 gene is more distantly related to the Arabidopsis ARK genes, sharing 65.5, 64.9, and 63.9% overall amino acid sequence identity with the ARK3, ARK1, and ARK2 genes, respectively. In Tables 1 and 2, the sequence relationships are broken down between the various S (Table 1) and kinase (Table 2) domains. The same pattern of relatedness observed overall between the receptor protein kinase sequences holds true for their S and kinase domains when considered separately. Interestingly, however, the kinase domains are consistently more conserved than the S domains. This difference is especially apparent in the relative drift between the ARK1 and ARK2 S and kinase domains, which exhibit 21.4 and 7% amino acid divergence, respectively.
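The divergence figures quoted above can be recovered from the percent identities in Tables 1 and 2 by taking 100 minus the identity. The Python sketch below does this for the three ARK pairs; the identity values are copied from the tables, and the variable and function names are illustrative.

```python
# Percent amino acid identity between ARK pairs, from Table 1 (S domains)
# and Table 2 (kinase domains).
s_domain_identity = {
    ("ARK1", "ARK2"): 78.6,
    ("ARK1", "ARK3"): 70.4,
    ("ARK2", "ARK3"): 70.9,
}
kinase_identity = {
    ("ARK1", "ARK2"): 93.0,
    ("ARK1", "ARK3"): 80.3,
    ("ARK2", "ARK3"): 81.3,
}


def divergence(identity_percent):
    """Percent divergence, taken here simply as 100 minus percent identity."""
    return round(100.0 - identity_percent, 1)


for pair in s_domain_identity:
    s_div = divergence(s_domain_identity[pair])
    k_div = divergence(kinase_identity[pair])
    print(pair, "S domain divergence:", s_div, "kinase domain divergence:", k_div)
```

For every pair the kinase domain diverges less than the S domain, with the largest gap for ARK1 and ARK2.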

Some interesting features emerge from the comparison of S domain sequences in the predicted Arabidopsis and Brassica proteins (Table 1). Of the three ARK proteins, the S domain of ARK3 is most similar to that of AtS1. Interestingly, the S domains of the three Arabidopsis ARK proteins share somewhat more sequence identity with the Brassica SRK6 protein than with Arabidopsis AtS1. In addition, the ARK proteins are slightly more similar to Brassica SLG6 than to SLR1, which is consistent with the fact that the Brassica SLG and SRK genes isolated from one haplotype show >90% sequence identity.

The S Pseudogenes

To determine the nucleotide sequence of the S-related sequences in genomic region 3, subclones pKD419, pKD411, pJA413, and pJA414 (shown in Figure 1) were used. All four S-related sequences within this region were found to represent nonfunctional pseudogenes. This conclusion is based on our inability to detect the corresponding transcripts in floral bud or leaf tissue by RNA polymerase chain reaction (PCR) with primers specific for each sequence and on the sequence analysis summarized in the following discussion.

The alignment of the four S-related sequences of genomic region 3 relative to AtS1 is shown diagrammatically in Figure 4. The SΨ1 sequence exhibited 92.2% nucleotide identity over the length of the AtS1 coding region, but it contained many insertions and deletions. Most notable was a 132-bp insertion at a nucleotide position corresponding to position 94 in the AtS1 gene. This insertion introduced two stop codons and interrupted the reading frame such that the SΨ1 sequence could encode only a truncated protein of 38 amino acids. Examination of the sequences flanking the SΨ1 pseudogene also revealed extensive sequence similarity with the corresponding regions of AtS1. The SΨ1 and AtS1 3' flanking regions exhibited 87.9% nucleotide identity for at least 600 bp after the stop codon used in AtS1. The 5' flanking region of the SΨ1 pseudogene shared sequence similarity with the corresponding AtS1 5' region until just 16 bp upstream of the putative TATA box (located at position -54 relative to the start of translation in AtS1). An additional 270 bp of promoter sequence exhibiting 94.3% identity with AtS1 promoter sequences immediately 5' of the TATA box (-55 to -315 bp in AtS1) were found 1.3 kb farther upstream of SΨ1.

Table 1. Percent Amino Acid Identity between the S Domains of the Following Arabidopsis and Brassica Genes

        ARK1 (%)   ARK2 (%)   ARK3 (%)   AtS1 (%)
ARK1    100        78.6       70.4       57.1
ARK2    78.6       100        70.9       57.0
ARK3    70.4       70.9       100        60.5
AtS1    57.1       57.0       60.5       100
SRK6    61.2       59.0       63.9       58.7
SLR1    52.3       55.8       57.6       63.2
SLG6    59.2       55.7       62.3       58.5
RLK1    28.2       27.9       24.2       26.2
RLK4    34.2       35.4       36.4       34.1

Table 2. Percent Amino Acid Identity between the Kinase Domains of the Following Arabidopsis and Brassica Genes

        ARK1 (%)   ARK2 (%)   ARK3 (%)
ARK1    100        93.0       80.3
ARK2    93.0       100        81.3
ARK3    80.3       81.3       100
SRK6    68.7       68.9       65.0
RLK1    33.0       34.1       34.0
RLK4    33.4       35.4       30.2

SΨ2α, SΨ2β, and SΨ2γ were found to be truncated S-related sequences. The SΨ2α sequence exhibited 91% nucleotide identity with the first 400 bp of the AtS1 coding region. No sequence similarity was detected between SΨ2α and AtS1 in regions flanking this 400-bp sequence. A potential TATA box was located at position -34 relative to the ATG initiating codon. However, the location of this TATA box was aberrant relative to the AtS1 gene, in which the TATA box is located at -54 relative to the start of translation. In addition, the SΨ2α sequence predicted a truncated 154-amino acid protein terminating in a stop codon located ~70 bp 3' of the AtS1-homologous 400-bp region.

The SY2p sequence exhibited 91.9% nucleotide sequence identity with the AfS7 gene from nucleotide positions 268 to 711 of the AfS7 coding region. As a result, the 5' end of the SY2p sequence overlapped over 141 bp with the 3'sequence of SY2a. A 1-bp deletion ata position corresponding to nucleo- tide position 468 in AS7 led to a frame shift and generated a stop codon 50 bp farther downstream. The SY2y sequence exhibited 91.3% nucleotide identity with the 3' region of the AfS7 coding sequence, beginning at AS7 nucleotide position 705 and continuing for 19 bp after the stop codon. The SY2b and SY2y sequences thus overlapped by 6 bp. A 2-bp deletion occurred 108 bp into the S-related sequence of SY2y (corre- sponding to nucleotide positions 813 and 814 in AW), shifting the reading frame, which then terminated in a stop codon 20 bp downstream. Interestingly, sequences 89.5% identical to the


3' flanking region of the AtS1 gene over a few hundred base pairs were found again almost 300 bp after the SΨ2γ stop codon.

The SΨ1 and the SΨ2α, SΨ2β, and SΨ2γ sequences considered together thus represent two duplicate genes that exhibit ~90% sequence identity with each other and >90% identity with the AtS1 gene. Interestingly, the SΨ1 sequence on the one hand and the SΨ2α, SΨ2β, and SΨ2γ sequences on the other hand share 35 point mutations and two deletions relative to the AtS1 gene and therefore appear to be more closely related evolutionarily to each other than to AtS1.
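The way a small deletion truncates a predicted protein, as described for the pseudogenes above, can be illustrated with a minimal Python sketch. The 24-nucleotide sequence used here is a hypothetical toy example, not AtS1 or pseudogene sequence, and the function names are introduced only for illustration. The sketch translates an intact reading frame, removes a single base, and shows that the shifted frame reaches a stop codon early.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def delete_base(dna, position):
    """Return the sequence with the base at a 1-based position removed."""
    return dna[:position - 1] + dna[position:]


# Hypothetical toy open reading frame (not taken from the paper).
intact = "ATGGCATTGAACTGGACGGATTAA"
shifted = delete_base(intact, 5)  # a 1-bp deletion shifts the frame

print(translate(intact))   # full-length toy peptide
print(translate(shifted))  # truncated peptide ending at a premature stop
```

In this toy case the deletion brings a TGA codon into frame immediately after the second codon, which is the same kind of event reported for the 1-bp and 2-bp deletions in the pseudogenes.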

Expression of the Arabidopsis AtS1, ARK2, and ARK3 Genes

The AtS1 Gene

We had previously shown that the AtS1 gene exhibits a floral bud-specific expression pattern by RNA gel blot analysis of floral bud and seedling tissues and by RNA PCR using gene-specific primers (Dwyer et al., 1992). More recently, we isolated several AtS1 cDNA clones from an Arabidopsis floral bud poly(A)+ RNA. Based on the similarity of the floral bud-specific expression pattern of AtS1 with that of the Brassica SLR1 gene and on the observation that the AtS1 and SLR1 genes

[Figure 5 image: immunoblot with lanes 1 to 3; molecular mass standards at 97, 66, 45, and 31 kD.]

Figure 5. Immunoblot Analysis of Total Protein Extracts from Pistils of Arabidopsis and Brassica.

Lane 1 contains protein extracts from Arabidopsis strain C24; lane 2, protein extracts from Arabidopsis strain RLD; lane 3, protein extracts from B. napus. The blot was treated with polyclonal antibodies produced against the AtS1 fusion protein. The AtS1 glycoforms are indicated by small circles and the SLR1 glycoforms by large circles. Molecular mass standards are shown at left in kilodaltons.

[Figure 4 image: diagram of the AtS1 gene with the aligned pseudogenes beneath it; scale bar = 200 bp; identity labels 92.2%, 91.0%, 91.9%, and 91.3%.]

Figure 4. Schematic Representation of the SΨ1, SΨ2α, SΨ2β, and SΨ2γ Pseudogene Structures Relative to the AtS1 Gene.

The pseudogenes are positioned below the diagram of the AtS1 gene and its flanking sequences and are in alignment with their regions of shared nucleotide sequence identity. The percent nucleotide sequence identity shared by the AtS1 gene and each S pseudogene region is indicated below each pseudogene. The numbers above the diagrams indicate the lengths, in base pairs, of the longer insertions (positive numbers) and deletions (negative number) found in the pseudogenes relative to the AtS1 gene.

share more sequence identity with each other than with other members of the S gene family (Dwyer et al., 1992; Table 1), we had suggested that AtS1 is a good candidate for the functional homolog of the Brassica SLR1 gene (Dwyer et al., 1992).

The relationship between the AtS1 and SLR1 genes was investigated further by comparative immunochemical analyses aimed at assessing the antigenic relatedness of the AtS1 and SLR1 proteins. In addition, reporter gene analyses were performed with AtS1 and SLR1 promoter sequences to compare the site of expression of the two genes in floral tissues. Figure 5 shows that, when used to probe pistil extracts of Arabidopsis strain C24 (lane 1) and strain RLD (lane 2), antibodies produced against a glutathione S-transferase (GST)-AtS1 fusion protein (see Methods) identified a cluster of bands of 51 to 53 kD; we believe that these are the products of the AtS1 gene. Molecular mass heterogeneity is a characteristic of glycoproteins in the S gene family and has been shown to be due to differential N-glycosylation of a single primary translational product (Umbach et al., 1990). Thus, AtS1 proteins exist as glycoforms that, interestingly, can exhibit strain-to-strain polymorphism, as shown by the molecular mass differences observed in the C24 and RLD strains (Figure 5). The anti-AtS1 antibodies also cross-reacted with stigma extracts of B. napus (lane 3), in which they bound to a series of protein bands similar in molecular weight to AtS1 and previously identified as SLR1 glycoforms (Umbach et al., 1990). Conversely, the Arabidopsis AtS1 proteins reacted with antibodies produced against


Brassica SLR1 protein (Umbach et al., 1990) (data not shown), further underscoring the relatedness of these two proteins.

The similarity between AtS1 and SLR1 was also supported by a comparison of AtS1 and SLR1 5' flanking sequences. The alignment in Figure 6 shows that the promoters share extensive sequence identity. In particular, several DNA elements previously shown to be conserved in the promoters of the SLR1 and SLG genes (Dzelzkalns et al., 1993) are also found in the AtS1 promoter. To determine whether the AtS1 and SLR1 genes are indeed expressed in the same cell types, we constructed a chimeric gene, AtS1::uidA, consisting of a 350-bp sequence derived from the promoter region of AtS1 fused to the Escherichia coli uidA gene that encodes β-glucuronidase (GUS). For comparative purposes, we also constructed a second chimeric gene, SLR1::uidA, which consists of 1.5 kb of 5' sequence from a B. oleracea SLR1 gene (Lalonde et al., 1989) fused to uidA. The two constructs were introduced into Arabidopsis plants by Agrobacterium-mediated transformation. Twenty and 12 transgenic plants were recovered for the AtS1::uidA and SLR1::uidA transformations, respectively. Of these, 15 AtS1::uidA and nine SLR1::uidA transformants exhibited GUS activity. Figure 7 shows that GUS activity was detected specifically in the papillar cells of the stigma in the AtS1::uidA (Figure 7A) as well as the SLR1::uidA (Figure 7B) transformants. No GUS activity was detected in any other cells of Arabidopsis transformants or in untransformed plants (Figure 7D). In papillar cells, the intensity of blue staining increased during the development of floral buds, and maximal activity levels were correlated with flower opening (Figure 7C). The intensity of staining subsequently decreased in older flowers, probably due to the postpollination autolysis of papillar cells. Interestingly, the AtS1 and SLR1 promoters were not detectably active in transgenic Arabidopsis anthers, unlike the Brassica SLG promoter, which was reported to be active in the anther tapetum (Toriyama et al., 1991). This result differs from the result obtained in transgenic tobacco, in which the SLR1 promoter was found to direct GUS activity in pollen (Hackett et al., 1992; B.A. Lalonde and J.B. Nasrallah, unpublished observations). This difference in SLR1 promoter activity may reflect different distributions of the relevant transcription factors in tobacco and Arabidopsis pollen.

The ARK Genes

Initial studies suggested that the ARK1, ARK2, and ARK3 genes were expressed in the same organ systems. cDNA clones derived from all three ARK genes were isolated from libraries derived from Arabidopsis floral bud or seedling poly(A)+ RNA. However, RNA PCR using primers specific for ARK2 and ARK3 suggested that the two genes did not have precisely overlapping spatial expression patterns (data not shown): ARK2 transcripts could be amplified from above-ground tissues but not from roots, whereas ARK3 transcripts could be amplified from above-ground tissues as well as from roots.

Here, we report on the activity of the ARK2 and ARK3 promoters in transgenic Arabidopsis plants. A similar experiment performed with the ARK1 gene together with a biochemical characterization of the ARK1 kinase will be described elsewhere (C.M. Tobias and J.B. Nasrallah, manuscript in preparation). To prepare the appropriate reporter gene constructs, PCR-generated fragments of 718 and 1600 bp containing 5' upstream sequences were derived from the ARK2 and ARK3 genes, respectively, and fused to the uidA gene (see Methods). Arabidopsis plants transformed with these constructs were analyzed histochemically for GUS activity in the T1 and T2 generations throughout the course of plant development.

The patterns of GUS activity for each promoter are illustrated in Figures 8 and 9. Transformation with the ARK2 promoter::uidA construct produced 25 kanamycin-resistant primary transformants, all of which exhibited strong and consistent GUS activity. The pattern of ARK2 promoter activity was analyzed in detail in eight independent T2 transgenic families. This promoter was inactive in germinating seed and 2-day-old seedlings (Figure 8A), and it became active only in 4-day-old seedlings (Figure 8B). GUS activity was first detected in a group of cells located at the distal tip of the cotyledon and subsequently spread in a basipetal direction to encompass all cells of the cotyledonary blade (Figures 8B to 8E). In older seedlings, ARK2 promoter activity was also detected in leaves (Figures 8F and 8G). Here again, a basipetally directed wave of promoter activity was associated with leaf maturation: only the tips of newly emerged leaves were visibly stained, whereas

[Figure 6 image: alignment of the AtS1 promoter (rows beginning at -250, -167, and -84) with the SLR1 promoter (rows beginning at -308, -225, and -142); the conserved regions are labeled box I to box V.]

Figure 6. Nucleotide Sequence and Alignment of the Promoter Regions of the Arabidopsis AtS1 and Brassica SLR1 Genes.

The five regions that are conserved among Brassica SLR1 and SLG promoters (Dzelzkalns et al., 1993) are indicated by boxes. The sequences are numbered from the translation initiation codon of each gene.


Figure 7. Reporter Gene Analysis of AtS1 and Brassica SLR1 Promoter Activity in Transgenic Arabidopsis.
(A) Histochemical localization of GUS activity in flowers of a transformant expressing the AtS1 promoter::uidA fusion. Bar = 0.2 mm.
(B) Histochemical localization of GUS activity in flowers of a transformant expressing the SLR1 promoter::uidA fusion. Bar = 0.2 mm.
(C) AtS1 promoter activity in stigmas at different stages of flower development. Bar = 0.5 mm.
(D) Histochemical staining of a flower from an untransformed plant. Bar = 0.2 mm.

all cells of the leaf blade stained intensely at leaf maturity. In inflorescences, intense blue staining was observed in the sepals of floral buds and flowers (Figure 8H). In addition, light blue staining was detected in the styles of mature flowers (Figure 8I). No GUS activity was evident in other cells of the flower and inflorescence or in petioles, stems, and roots at any stage of development.

The ARK3 promoter exhibited low and somewhat variable expression in transgenic plants. Of 20 kanamycin-resistant primary transformants generated with the ARK3::uidA construct, seven had histochemically detectable GUS activity. Three families exhibited a highly specific pattern of GUS activity, as shown in Figure 9. In the T2 progeny of these plants, GUS activity was first detected 1 week after seed germination. Intense blue staining was observed in the central core of the transition zone between the root and hypocotyl (Figure 9A), a zone from which adventitious roots originate at later stages of plant development. GUS activity was also observed in a small group of cells at the base of emerging lateral roots (Figures 9B to 9D) as well as at the base of lateral shoots and flower pedicels (Figures


Figure 8. Histochemical Localization of GUS Activity in Transgenic Arabidopsis Expressing the ARK2 Promoter::uidA Fusion.
(A) A vernalized seed and two 2-day-old seedlings. There is no detectable GUS activity. Bar = 0.5 mm.
(B) and (C) Four-day-old seedlings. Note the blue staining in the tip of the cotyledons in (B) and the basipetal progression of staining in (C). Bars = 1 mm.
(D) Five-day-old seedling. The entire cotyledonary blade stained blue. Bar = 1 mm.
(E) Seven-day-old seedling. Blue staining is visible in the tips of the first pair of leaves. Bar = 1 mm.
(F) and (G) Two- and four-week-old seedlings. Intense blue staining is visible only in the leaf blades of mature leaves. Bars = 1 mm.
(H) Inflorescence. Intense blue staining is restricted to the sepals of floral buds and flowers. Bar = 0.5 mm.
(I) Mature flower. In addition to strong staining in the sepals, weak blue staining is visible in the style (arrow). Bar = 0.5 mm.

9E and 9F). In addition, six of the seven transgenic families exhibited very low levels of GUS activity in leaves (data not shown).

DISCUSSION

We have shown that the Arabidopsis genome contains six genes that exhibit sequence similarity with members of the Brassica S gene family. Of the four functional genes contained in these regions, only the AtS1 gene was found to exhibit floral-specific expression. The observation that the AtS1 promoter is active specifically in papillar cells together with the immunological relatedness of AtS1 to Brassica SLR1 supports the hypothesis that the AtS1 and SLR1 genes are functional homologs and strongly suggests that their respective protein products perform a pollination-related function common to self-incompatible and self-fertile crucifers.

In contrast to AtS1, the ARK2 and ARK3 genes are expressed predominantly in vegetative tissues. We found that the activities of the ARK2 and ARK3 promoters differed both qualitatively


Figure 9. Histochemical Localization of GUS Activity in Transgenic Arabidopsis Expressing the ARK3 Promoter::uidA Fusion.

(A) Eight-day-old seedling. Blue staining is visible in the central core of the transition zone (arrow) between the root and hypocotyl. No staining is evident in the aerial parts of the seedling. Bar = 0.5 mm.
(B) and (C) Enlarged view of GUS staining in the root-hypocotyl transition zone (B) and at the base of lateral root initials (arrows in [B] and [C]). Bars = 1 mm.
(D), (E), and (F) Five-week-old plant. In (D), roots were stained blue at the base of emerging lateral roots. In (E), a nodal region was stained at the site of origin of an axillary bud. In (F), a portion of an inflorescence showed staining at the base of axillary buds and flower pedicels (arrow). Bars in (E) and (F) = 1 mm; the bar in (D) = 0.3 mm.


and quantitatively. The ARK2 promoter was found to direct consistent and relatively high expression in the leaf blade as well as in sepals, paralleling the maturation of these structures. No expression from this promoter was detected in anthers, petals, pedicels, petioles, stems, or roots. The ARK3 promoter was also active in leaves but at much lower levels than the ARK2 promoter, as revealed by the faint and variable blue staining observed in leaves of different transgenic plants. No overlap between the sites of ARK2 and ARK3 expression was observed in other organ systems of the plant. Thus, in roots and inflorescences, ARK3 promoter activity was limited to a small subset of cells at the root-hypocotyl transition and at the base of lateral roots, lateral shoots, and flower pedicels. Significantly, except for the low level of ARK2 promoter activity observed in the styles of mature flowers, the ARK2 and ARK3 promoters as well as the ARK1 promoter (C.M. Tobias and J.B. Nasrallah, manuscript in preparation) were not active in reproductive organs. In particular, the observation that the ARK promoters are not active in stigmatic papillar cells argues against a functional association between ARK and AtS1 gene products and suggests that, unlike Brassica SLG, AtS1 does not function in conjunction with a receptor-like protein kinase, at least not one with which it would share a substantial degree of sequence identity. More generally, the absence of ARK promoter activity in stigmas, ovaries, anthers, and pollen grains argues against a pollination- or fertilization-related function for these receptor-like protein kinases in Arabidopsis.

Rather, the data presented in this study are consistent with the notion that the ARK2 and ARK3 genes function during development of the Arabidopsis sporophyte, perhaps in processes related to organ maturation and/or the establishment of growth pattern transitions. Based on DNA sequence analysis, the ARK2 and ARK3 proteins are predicted to be transmembrane proteins, with extracellular S domains that are proposed to function in the binding of a specific ligand, thereby initiating signal transduction and eliciting the appropriate cellular response. Biochemical evidence will be required to determine whether the ARK2 and ARK3 proteins do in fact exhibit intrinsic kinase activity. Nevertheless, based on the presence of the DLKASN and GTYGYMSP consensus sequences in their kinase domains, these proteins are predicted to have serine/threonine rather than tyrosine kinase specificity. Indeed, the Brassica SRK proteins were shown to exhibit intrinsic serine/threonine protein kinase activity (Goring and Rothstein, 1992; Stein and Nasrallah, 1993), and a similar substrate specificity was shown for the Arabidopsis ARK1 protein (C.M. Tobias and J.B. Nasrallah, manuscript in preparation).

The relatively high degree of sequence conservation observed between the ARK kinase domains suggests that they may phosphorylate the same or related cytoplasmic substrates. In contrast, we found that the S domains of the ARK genes exhibit a higher degree of sequence divergence than their kinase domains (Tables 1 and 2). This was especially apparent in the comparison of the tandemly repeated ARK1 and ARK2 genes, for which the divergences in amino acid sequence for the S and kinase domains were 21.4 versus 7%, respectively. This divergence of the S domains among the various ARK receptor protein kinases suggests that they may be activated by different ligands. Taken together with the nonoverlapping expression patterns of the ARK genes, the data suggest that the putative receptor protein kinases encoded by these genes have distinct functions.
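Percent identity and divergence figures of the kind quoted above can be computed from a pairwise alignment in a few lines. The following is a minimal Python sketch under stated assumptions: the two sequences are already aligned to equal length, gapped columns are excluded from the comparison, and divergence is taken as 100 minus percent identity. The source does not state how gaps were scored, so that convention, the function names, and the toy sequences are illustrative assumptions only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def percent_divergence(aligned_a, aligned_b, gap="-"):
    """Divergence defined here as the complement of percent identity."""
    return 100.0 - percent_identity(aligned_a, aligned_b, gap)


# Hypothetical toy alignment (not sequence from the paper): one mismatch in ten columns.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAC"
print(percent_identity(seq_a, seq_b))
print(percent_divergence(seq_a, seq_b))
```

Applied separately to the S domain and kinase domain alignments of two genes, the same calculation would yield a pair of divergence values comparable to the ones reported for ARK1 and ARK2.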

The Arabidopsis S-related sequences define four genomic regions that occur as two genetically linked clusters on two distinct chromosomes. In Brassica, it has been shown that gene duplication events have occurred repeatedly during the evolution of the S gene family. In particular, a duplication apparently led to the generation of the S locus gene pair SLG and SRK (Tantikanjana et al., 1993). Duplication events appear to have occurred in the Arabidopsis genome as well. A relatively recent gene duplication event is suggested by the arrangement of the highly similar ARK1 and ARK2 genes as a tandem repeat. Gene duplications are also likely to have been involved in generating the S pseudogenes of region 3. These pseudogenes show >90% identity with the AtS1 gene. The SΨ1 gene also shows >87.9% identity in its flanking sequences relative to the corresponding flanking sequences of AtS1. When the S pseudogene sequences are compared with each other, 35 point mutations and two deletions relative to the AtS1 gene are found common to both SΨ1 and SΨ2α, SΨ2β, and SΨ2γ. This observation suggests that two duplication events may have occurred to generate the S pseudogenes: an initial duplication of the AtS1 gene to an unlinked chromosomal site would have generated SΨ1 and related flanking regions; a second duplication of SΨ1 to a linked site followed by a series of insertions of extraneous DNA would have subsequently generated the split SΨ2α, SΨ2β, and SΨ2γ pseudogenes. It is interesting to note that neither SΨ1 nor SΨ2γ contains the 1-bp insertion that occurs in AtS1 at nucleotide position 1252, which shifts the reading frame of the AtS1 gene relative to the Brassica SLR1 gene (Dwyer et al., 1992). This mutation must have occurred in the AtS1 gene after the initial duplication event that generated the S pseudogenes.

Interestingly, when S pseudogene sequences are compared with the Brassica SLR1 and SLG genes, all consistently show more nucleotide identity with SLR1 (high 70%) than with SLGs (low 70%). Furthermore, no kinase-homologous sequences occur in the vicinity of the pseudogenes, as determined by pulsed-field gel electrophoresis (data not shown). These two observations support the hypothesis that the S pseudogenes are derived from the AtS1 gene, as suggested previously, and intimate that these pseudogenes are not the remnants of an Arabidopsis equivalent of the Brassica S locus SLG and SRK genes. In fact, we found no evidence, based on either nucleotide sequence or expression pattern, for the existence of an SLG/SRK-like gene pair in the Arabidopsis genome that might correspond to the Brassica S locus or that might function in pollination. In particular, the AtS1 and ARK3 genes, which exhibit tight genetic linkage, share only 60% sequence identity, are expressed in different cell types, and are therefore functionally unrelated. Thus, if an ancestral S locus equivalent ever existed in the Arabidopsis lineage, it has apparently been


deleted during the evolution of this self-fertile genus. It is interesting to note in this context that deletion of Brassica S locus gene sequences has been reported to result in the breakdown of self-incompatibility (Nasrallah et al., 1994) and is therefore an important factor in shaping breeding behavior in the largely self-incompatible Brassica species.

In any event, the observation that the papillar cells of Arabidopsis apparently do not express an S-related receptor protein kinase suggests that pollination responses in this species, although perhaps requiring a secreted glycoprotein such as AtS1, are not likely to be mediated by signaling receptors closely related to the Brassica S locus gene products. This conclusion is consistent with the results of previously reported transgenic studies in which papillar cells were specifically ablated in Arabidopsis plants transformed with a gene fusion between the Brassica SLG promoter and subunit A of diphtheria toxin (Thorsness et al., 1993). These ablated papillar cells, although biochemically inactive, still allowed, albeit at reduced efficiency, the development of pollen grains in crosses to the wild type, indicating that successful pollen tube development in Arabidopsis does not require signaling by papillar cells. Thus, the comparative analysis of Arabidopsis and Brassica provides no evidence for the evolution of the Brassica self-incompatibility system from a signaling system operative in compatible pollination. However, the sequence and expression data presented here strongly suggest that the Brassica SLG/SRK gene pair operative in the highly specific discrimination between self- and cross-pollen was recruited from vegetatively expressed genes that have a function unrelated to pollination.

METHODS

Plant Material

Arabidopsis thaliana strain C24 was obtained from M. Jacobs (Vrije Universiteit, Brussels, Belgium) and strain RLD from C. Somerville (Carnegie Institution of Washington, Stanford, CA). For restriction fragment length polymorphism (RFLP) mapping, the recombinant inbred lines developed by C. Lister and C. Dean (John Innes Institute, Norwich, U.K.) were obtained from the Arabidopsis Biological Resource Center at Ohio State University (Columbus, OH).

Isolation of Arabidopsis S-Related Sequences and Nucleotide Sequence Analysis

An Arabidopsis Columbia genomic library consisting of partially digested MboI DNA fragments and constructed in the vector EMBL3 was obtained from Clontech (Palo Alto, CA) and used for the first screening with cDNA probes derived from the Brassica SLG (S Locus Glycoprotein) and SLR1 (S Locus-Related 1) genes. An Arabidopsis Columbia genomic library consisting of partially digested Sau3AI DNA fragments and constructed in the bacteriophage λ vector EMBL4 was used for the second screening with Arabidopsis AtS1 and Arabidopsis Receptor Kinase (ARK) gene probes. The AtS1 and ARK cDNA probes were obtained from a cDNA library derived from Arabidopsis floral bud poly(A)+ RNA and constructed in the λ vector Uni-ZAP XR (Stratagene). The libraries were screened as described by Sambrook et al. (1989). DNA probes were labeled with phosphorus-32 by the method of Feinberg and Vogelstein (1983) using the random primer labeling system of Boehringer Mannheim. Positively hybridizing phage from the genomic libraries were purified and their DNA isolated according to the method of Thomas and Davis (1975). Positively hybridizing phage from the cDNA library were purified and their DNA isolated as pBluescript phagemids according to protocols supplied by the manufacturer (Stratagene). The regions that hybridized to S domain and kinase probes found on the genomic clones were delineated by DNA gel blot analysis (Southern, 1975) and subcloned either into pUC118/pUC119 (Vieira and Messing, 1987) or pBluescript SK+ and KS+ (Stratagene) plasmid vectors. Nested deletions spaced at ~200-bp intervals were generated by ExoIII digestion using the Erase-A-Base system (Promega). Nucleotide sequence was determined by the dideoxynucleotide chain termination method (Sanger et al., 1977) modified for double-stranded DNA sequencing (Chen and Seeburg, 1985) and using the Sequenase V2 kit (U.S. Biochemical Corp.). For the DNA blot analysis used to correlate the cloned S-related regions with the organization of the S gene family in the Arabidopsis genome, genomic DNA from the Columbia ecotype was obtained from Clontech. The method used for this analysis and for the DNA gel blot analyses performed to determine the chromosomal location of the S-related regions by RFLP mapping was as described by Dwyer et al. (1992).

Detection of Gene Expression by RNA Polymerase Chain Reaction Amplification

Poly(A)+ RNA was isolated from the indicated Arabidopsis tissues using the MicroFast Track kit (Invitrogen, San Diego, CA). For amplification by polymerase chain reaction (PCR), poly(A)+ RNA was treated with RNase-free DNase I (Boehringer Mannheim), as suggested by Grillo and Margolis (1990). The resulting DNA-free poly(A)+ RNA was amplified with the GeneAmp RNA PCR kit (Perkin Elmer Cetus, Norwalk, CT) using gene-specific synthetic oligonucleotides (Midland Co., Midland, TX). PCR-generated fragments were detected by electrophoresis of one-tenth to one-fifth of the PCRs in 1 to 1.2% (w/v) agarose gels.

Protein Immunoblot Analysis

To produce an AtS1 fusion protein, we generated a 982-bp BglII fragment containing AtS1 coding region sequences from position +314 through +1291 bp and subcloned this fragment into the BamHI site of expression vector pGEX-3X (Pharmacia). The resulting clone, pKDX9, determined to have the correct orientation of the AtS1 insert by HindIII and SstI digestion, contained the glutathione S-transferase coding region fused to that of AtS1 commencing 106 codons downstream of the start codon and ending at codon 430. The pKDX9 fusion protein was expressed in Escherichia coli JM109 after induction with 0.1 mM isopropyl β-D-thiogalactopyranoside for 4 hr at 30°C. Cells were lysed by sonication in phosphate-buffered saline, pH 7.4, containing 1% (v/v) Triton X-100, and the bacterial extracts were subjected to SDS-PAGE on 10% (w/v) polyacrylamide preparative gels and detected by Coomassie Brilliant Blue R 250 staining. The fusion protein was isolated by electroelution according to the procedure of Hunkapiller et al. (1983) and was used to immunize a rabbit at the Cornell University Polyclonal Antibody Production Facility.


For immunoblot analysis, Arabidopsis stigmas were ground at liquid nitrogen temperature in an Eppendorf tube with a fitted pestle. Proteins were extracted either in 10 mM Tris-HCl buffer, pH 7.2, or in a sample extraction solution containing 80 mM Tris-HCl, pH 6.8, 1% (w/v) SDS, 1.5% (w/v) dithiothreitol, and 10% (v/v) glycerol. Routinely, three stigmas were extracted in 25 µL of buffer, and the entire extract was loaded in one gel lane for electrophoresis. Stigmas of Brassica napus were processed in the same manner, except that proteins from two stigmas were extracted in 50 µL of buffer and one-tenth of the final volume was loaded per lane. SDS-PAGE, electrophoretic transfer to nitrocellulose membrane, and treatment with primary antibody followed by treatment with alkaline phosphatase-conjugated anti-rabbit IgG secondary antibody were performed as described earlier (Umbach et al., 1990).
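The loading scheme described above implies very different amounts of stigma material per lane for the two species. The short Python sketch below works through that arithmetic; it assumes that the "final volume" of the Brassica extract equals the 50 µL of extraction buffer, and the function names are introduced here purely for illustration.

```python
from fractions import Fraction


def stigma_equivalents_per_lane(stigmas_extracted, fraction_loaded):
    """Number of stigmas' worth of extract loaded in one lane."""
    return stigmas_extracted * Fraction(fraction_loaded)


def volume_loaded_ul(extract_volume_ul, fraction_loaded):
    """Volume of extract loaded per lane, in microliters."""
    return extract_volume_ul * Fraction(fraction_loaded)


# Arabidopsis: three stigmas in 25 µL, entire extract loaded.
arabidopsis = stigma_equivalents_per_lane(3, 1)
# Brassica napus: two stigmas in 50 µL, one-tenth loaded
# (assumes the final volume is the 50 µL of buffer).
brassica = stigma_equivalents_per_lane(2, Fraction(1, 10))

print(arabidopsis)                           # stigma equivalents per Arabidopsis lane
print(brassica)                              # stigma equivalents per Brassica lane
print(volume_loaded_ul(50, Fraction(1, 10)))  # µL of Brassica extract per lane
```

Under these assumptions, each Arabidopsis lane carries extract from three stigmas, whereas each Brassica lane carries extract from one-fifth of a stigma.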

Construction of Reporter Gene Fusions, Plant Transformation, and Histochemical Analysis of β-Glucuronidase Activity

DNA fragments containing 5' upstream sequences of the AtS1, ARK2, and ARK3 genes were generated by PCR amplification using the GeneAmp PCR kit and the appropriate promoter-specific primers (Midland Co.). To facilitate cloning, a BamHI restriction site was incorporated into the synthetic primers. All of the antisense primer sequences were chosen to locate the ATG start codon of the β-glucuronidase (GUS) reporter gene at the same relative position as the ATG start codon of the corresponding S gene. The DNA fragments thus made were cloned into the pCRII vector using the TA Cloning kit (Invitrogen). Each of the promoter inserts was excised from this vector by BamHI digestion and cloned into the BamHI site of the GUS expression vector pBI101, 5' of the uidA gene encoding the GUS protein followed by a polyadenylation site (Jefferson et al., 1987). The correct orientation of the promoter relative to the GUS coding sequences was verified by restriction enzyme digestion, by PCR amplification with appropriate primers, and by DNA sequencing.

The resulting chimeric constructs were introduced into Agrobacterium tumefaciens pCIB542/A136 (derived from helper plasmid pEHA101; Hood et al., 1986). Transformation of Arabidopsis C24 was performed by the method of Valvekens et al. (1988), and transgenic plants were selected on the basis of kanamycin resistance. Stable integration of the transgene was confirmed by DNA gel blot analysis.

GUS assays were performed histochemically with the chromogenic substrate 5-bromo-4-chloro-3-indolyl glucuronide (X-gluc; Jefferson et al., 1987) as detailed by Toriyama et al. (1991). To monitor GUS activity at early stages of plant development, T2 seeds generated by the selfing of primary transformants were sterilized, sown on Murashige and Skoog plates (Murashige and Skoog, 1962), and vernalized for 3 days at 4°C. The plates were then transferred to a 25°C growth chamber for germination and subsequent growth. Staining for GUS activity was performed on developing seedlings starting 1 day after transferring the plates to 25°C. GUS activity at later stages of development was assayed in growth chamber- or greenhouse-grown plants.

ACKNOWLEDGMENTS

This work was supported by the National Science Foundation through RUI (Research at an Undergraduate Institution) Grant No. IBN-9108232 to K.G.D. and Grant No. IBN-9220401 to M.E.N. and by a grant from the U.S. Department of Agriculture to J.B.N.

Received August 22, 1994; accepted October 18, 1994.

REFERENCES

Boyes, D.C., Chen, C.-H., Tantikanjana, T., Esch, J.J., and Nasrallah, J.B. (1991). Isolation of a second S locus related cDNA from B. oleracea: Genetic relationship between the S locus and two related loci. Genetics 127, 221-228.

Chen, C.H., and Seeburg, P.H. (1985). Supercoil sequencing: A fast and simple method for sequencing plasmid DNA. DNA 4, 165-170.

Dwyer, K.G., Lalonde, B.A., Nasrallah, J.B., and Nasrallah, M.E. (1992). Structure and expression of AtS1, an Arabidopsis thaliana gene homologous to the S locus related genes of Brassica. Mol. Gen. Genet. 231, 442-448.

Dzelzkalns, V.A., Thorsness, M.K., Dwyer, K.G., Baxter, J.S., Belent, M.A., Nasrallah, M.E., and Nasrallah, J.B. (1993). Distinct cis-acting elements direct pistil-specific and pollen-specific activity of the Brassica S locus glycoprotein gene promoter. Plant Cell 5, 855-863.

Feinberg, A.P., and Vogelstein, B. (1983). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132, 6-13.

Goring, D.R., and Rothstein, S.J. (1992). The S-locus receptor kinase gene in the self-incompatible Brassica napus line encodes a functional serine/threonine kinase. Plant Cell 4, 1273-1281.

Grillo, M., and Margolis, F.L. (1990). Use of reverse transcriptase polymerase chain reaction to monitor expression of intronless genes. Biotechniques 9, 263-268.

Hackett, R.M., Lawrence, M.J., and Franklin, F.C.H. (1992). A Brassica S locus related gene promoter directs expression in both pollen and pistil of tobacco. Plant J. 2, 613-617.

Hanks, S.K., Quinn, A.M., and Hunter, T. (1988). The protein kinase family: Conserved features and deduced phylogeny of the catalytic domains. Science 241, 42-52.

Hood, E.E., Helmer, G.L., Fraley, R.T., and Chilton, M.D. (1986). The hypervirulence of Agrobacterium tumefaciens A281 is encoded in a region of pTiBo542 outside of T-DNA. J. Bacteriol. 168, 1291-1301.

Hunkapiller, M.W., Lujan, E., Ostrander, F., and Hood, L.E. (1983). Isolation of microgram quantities of proteins from polyacrylamide gels for amino acid sequence analysis. Methods Enzymol. 91, 227-236.

Isogai, A., Yamakawa, S., Shiozawa, H., Takayama, S., Tanaka, H., Kono, T., Watanabe, M., Hinata, K., and Suzuki, A. (1991). The cDNA sequence of NS1 glycoprotein of Brassica campestris and its homology to S locus related glycoproteins of B. oleracea. Plant Mol. Biol. 17, 269-271.

Jefferson, R.A., Kavanagh, T.A., and Bevan, M.W. (1987). GUS fusions: β-Glucuronidase as a sensitive and versatile gene fusion marker in higher plants. EMBO J. 6, 3901-3907.

Lalonde, B.A., Nasrallah, M.E., Dwyer, K.G., Chen, C.-H., Barlow, B., and Nasrallah, J.B. (1989). A highly conserved Brassica gene with homology to the S-locus-specific glycoprotein structural gene. Plant Cell 1, 249-258.

Murashige, T., and Skoog, F. (1962). A revised medium for rapid growth and bioassays with tobacco tissue cultures. Physiol. Plant. 15, 473-497.

Nasrallah, J.B., and Nasrallah, M.E. (1993). Pollen-stigma signaling in the sporophytic self-incompatibility response. Plant Cell 5, 1325-1335.


Nasrallah, J.B., Kao, T.H., Chen, C.-H., Goldberg, M.L., and Nasrallah, M.E. (1987). Amino-acid sequence of glycoproteins encoded by three alleles of the S-locus of Brassica oleracea. Nature 326, 617-619.

Nasrallah, J.B., Yu, S.M., and Nasrallah, M.E. (1988). Self-incompatibility genes of Brassica oleracea: Expression, isolation, and structure. Proc. Natl. Acad. Sci. USA 85, 5551-5555.

Nasrallah, J.B., Rundle, S.J., and Nasrallah, M.E. (1994). Genetic evidence for the requirement of the Brassica S locus receptor kinase gene in the self-incompatibility response. Plant J. 5, 373-384.

Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, 2nd ed. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).

Sanger, F., Nicklen, S., and Coulson, A.R. (1977). DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

Stein, J.C., and Nasrallah, J.B. (1993). A plant receptor-like gene, the S locus receptor kinase of Brassica oleracea, encodes a functional serine/threonine kinase. Plant Physiol. 101, 1103-1106.

Stein, J.C., Howlett, B., Boyes, D.C., Nasrallah, M.E., and Nasrallah, J.B. (1991). Molecular cloning of a putative receptor kinase gene encoded at the self-incompatibility locus of Brassica oleracea. Proc. Natl. Acad. Sci. USA 88, 8816-8820.

Southern, E.M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.

Tantikanjana, T., Nasrallah, M.E., Stein, J.C., Chen, C.-H., and Nasrallah, J.B. (1993). An alternative transcript of the S locus glycoprotein gene in a class II pollen-recessive self-incompatibility haplotype of Brassica oleracea encodes a membrane-anchored protein. Plant Cell 5, 657-666.

Thomas, M., and Davis, R.W. (1975). Studies on the cleavage of bacteriophage λ DNA with EcoRI restriction endonuclease. J. Mol. Biol. 91, 315-328.

Thorsness, M.K., Kandasamy, M.K., Nasrallah, M.E., and Nasrallah, J.B. (1993). Genetic ablation of floral cells in Arabidopsis. Plant Cell 5, 253-261.

Tobias, C.M., Howlett, B., and Nasrallah, J.B. (1992). An Arabidopsis thaliana gene with sequence similarity to the S locus receptor kinase of Brassica oleracea: Sequence and expression. Plant Physiol. 99, 284-290.

Toriyama, K., Thorsness, M.K., Nasrallah, M.E., and Nasrallah, J.B. (1991). A Brassica S locus gene promoter directs sporophytic expression in the anther tapetum of transgenic Arabidopsis. Dev. Biol. 143, 427-431.

Trick, M., and Flavell, R.B. (1989). A homozygous S genotype of Brassica oleracea expresses two S-like genes. Mol. Gen. Genet. 218, 112-117.

Umbach, A.L., Lalonde, B.A., Kandasamy, M.K., Nasrallah, J.B., and Nasrallah, M.E. (1990). Immunodetection and post-translation modification of two products encoded by two independent genes of the self-incompatibility multigene family of Brassica. Plant Physiol. 93, 739-747.

Valvekens, D., Van Montagu, M., and Van Lijsebettens, M. (1988). Agrobacterium tumefaciens-mediated transformation of Arabidopsis thaliana root explants by using kanamycin selection. Proc. Natl. Acad. Sci. USA 85, 5536-5540.

Vieira, J., and Messing, J. (1987). Production of single-stranded plasmid DNA. Methods Enzymol. 153, 3-11.

Walker, J. (1993). Receptor-like protein kinase genes of Arabidopsis thaliana. Plant J. 3, 451-456.

Walker, J.C., and Zhang, R. (1990). Relationship of a putative receptor protein kinase from maize to S locus glycoproteins of Brassica. Nature 345, 743-746.

Weinstein, J.N., Blumenthal, R., van Renswoude, J., Kempf, C., and Klausner, R.D. (1982). Charge clusters and the orientation of membrane proteins. J. Membr. Biol. 66, 203-212.

K G Dwyer, M K Kandasamy, D I Mahosky, J Acciai, B I Kudish, J E Miller, M E Nasrallah and J B Nasrallah. A superfamily of S locus-related sequences in Arabidopsis: diverse structures and expression patterns. Plant Cell 1994;6;1829-1843. DOI 10.1105/tpc.6.12.1829
