

Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 9470-9474, October 1993
Genetics

Identification in the HLA class I region of a gene expressed late in keratinocyte differentiation

(major histocompatibility complex/CpG island/epidermis/cDNA)

YIQING ZHOU AND DAVID D. CHAPLIN*
Howard Hughes Medical Institute and Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO 63110

Communicated by Emil R. Unanue, June 14, 1993

ABSTRACT A gene designated S has been identified in the class I region of the human major histocompatibility complex. The S gene is located 160 kb telomeric of HLA-C. It is expressed at high levels as 2.2-kb and 2.6-kb mRNAs in human skin. No homologous transcripts were detected in other tissues including placenta, liver, spleen, thymus, and brain. In situ hybridization showed that S gene expression was restricted to the differentiating keratinocytes in the granular layer of the epidermis. The predicted amino acid sequence of the S protein was remarkable for its high content of serine, glycine, and proline. There were significant similarities with the amino acid sequences of loricrin, keratin 1, and keratin 10, all major components of the granular-cell layer. The selective expression of the S gene in the granular-cell layer in the epidermis suggests a role in the developmental program of differentiating keratinocytes. Furthermore, in light of the recognized association of psoriasis vulgaris, a disorder of keratinocyte proliferation, with alleles of HLA-C, this gene may contribute primarily to the pathogenesis of this common disorder.

The human major histocompatibility complex (MHC) spans >4000 kb of genomic DNA at chromosome 6p21.3 (1-3). The complex was originally defined by the highly polymorphic class I and class II genes that are required for normal antigen recognition by CD8+ and CD4+ T lymphocytes. Important for the function of these genes are additional genes within the class II region that encode two components of the proteasome and two proteins with homology to the ATP-binding cassette family of transmembrane transporters (4-8). The MHC also contains genes encoding proteins active in noncognate host defense, the classical class III proteins C2, C4A, C4B, and factor B (9), as well as the potent immunomodulatory cytokines tumor necrosis factor α (TNF-α) and lymphotoxin (TNF-β) (10). Studies over the past 10 years have established that the MHC also contains genes with no apparent relationship to immune recognition. These include the adrenal steroid biosynthetic enzyme 21-hydroxylase (11, 12), two heat shock protein genes of the HSP70 class (13, 14), a valyl-tRNA synthetase locus (15), and a homologue of the Drosophila gene female sterile homeotic (16). Approximately 25 additional less-well-characterized genes have been identified within the complex (17-19). Together, these studies indicate that only a subset of genes within the complex are concerned directly with immune functions.

Studies of the distribution of class I and class II alleles within normal and diseased human populations have demonstrated that genes either within or linked closely to the HLA complex participate in the development of >200 human illnesses (20). In some cases, the specific gene that confers disease risk has been clearly identified (21-23). But, in most cases, it is far from clear whether the genetic predisposition is directly associated with a particular HLA class I, II, or III gene or is related to the presence of an unidentified gene tightly linked to the known class I, II, or III loci. The identification within the HLA complex of currently unrecognized genes will provide essential reagents to define the nature of the disease susceptibilities.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Progress toward the definition of the complete gene content of regions of genomic DNA has been dramatically facilitated by the development of large-scale genomic cloning strategies. We have recently isolated nearly the entire 4-megabase HLA complex as a collection of overlapping yeast artificial chromosome (YAC) clones (24-26). These YACs provide direct access to all of the genes contained within the complex.

Many methods have been developed to identify genes within genomic DNA. Some of these depend on structural features of the DNA, including the CpG islands that are often associated with the 5' ends of vertebrate genes (27) as well as evolutionarily conserved sequences that often are portions of genes themselves (28). Genes can also be detected based on their functions using direct cDNA selection (29) or exon trapping methods (30). We describe here the identification of a gene within the class I region of the HLA complex. This gene was detected by the presence of an evolutionarily conserved sequence near a CpG island.† This gene is structurally unrelated to other known HLA complex genes and is expressed exclusively in skin.

MATERIALS AND METHODS

Cosmid Cloning. Three overlapping clones that defined the physical relationship of the HLA-B and HLA-C genes (24) were isolated previously from the human YAC library at the Center for Genetics in Medicine at Washington University (31). The B209D7 YAC clone was subcloned into cosmids. The total yeast DNA containing the B209D7 YAC, prepared as described by Olson et al. (32), was partially digested with Mbo I and used to make a library with the SuperCos 1 vector and Gigapack II (Stratagene) according to the manufacturer's recommendations. Cosmid clones containing sequences near the third CpG-rich region were isolated by hybridization using the left-end probe from YAC B209D7 and the right-end probe from YAC B38D3 (24).

Southern and Northern Blot Analyses. Genomic DNA was isolated from livers of the indicated species according to a modification of the method of Blin and Stafford (33) as described (34). RNA was isolated as described by Chirgwin et al. (35). Southern and Northern blots were prepared and hybridization was performed as described (34) by using DNA probes labeled with [32P]dCTP using the Multiprime system (Amersham). Southern blots were washed in 0.1× standard saline citrate/0.1% SDS at 56°C. Northern blots were washed at 60°C in the same buffer.

Abbreviations: MHC, major histocompatibility complex; YAC, yeast artificial chromosome; RACE, rapid amplification of cDNA ends.

*To whom reprint requests should be addressed at: Division of Allergy and Immunology, Department of Internal Medicine, Howard Hughes Medical Institute, Washington University School of Medicine, 4566 Scott Avenue, Box 8022, St. Louis, MO 63110.

†The sequence reported in this paper has been deposited in the GenBank data base (accession no. L20815).

Isolation of cDNA Clones. The oligo(dT)-primed human fetal (18th week of gestation) skin cDNA library was in the pcDNAI vector (Invitrogen) and the randomly primed human neonatal foreskin keratinocyte cDNA library was in λgt10 (Clontech). The libraries were screened and clones were isolated as described (36).

Nucleotide Sequencing. Twelve positive cDNA clones in λgt10 were subcloned into pBluescript II KS+/- (Stratagene) and sequenced with the dideoxynucleotide chain-termination method by using Sequenase (United States Biochemical) with either polylinker primers T3 and T7 or synthetic oligonucleotide primers. The S cDNA sequence was determined on both strands.

Rapid Amplification of cDNA Ends (RACE). The 5' RACE was performed as described (37). cDNA synthesis was performed by primer extension of an oligonucleotide complementary to nt 99-125 of the final S cDNA sequence. A poly(dG) tail was introduced by terminal deoxynucleotidyltransferase (Stratagene). Two rounds of PCR amplification were performed with 5' anchor primers (AN primer plus ANpolydC primer at a 9:1 ratio) together with a specific 3' primer complementary to nt 56-80 of the S cDNA under conditions described by Loh et al. (37). For the 3' RACE, cDNA was synthesized with a (dT)17 adaptor primer as described (38). PCR was performed with the adaptor primer and a specific 5' S cDNA primer (nt 1577-1598). The PCR products were separated by electrophoresis in a 2% agarose gel, purified using Geneclean (Bio 101), and then cloned in the pCR II vector (Invitrogen).
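The RACE primers above are specified only by their coordinates on the S cDNA ("complementary to nt 99-125", "nt 1577-1598"). As a minimal sketch of how such 1-based, inclusive coordinates translate into primer sequences, the helpers below extract a sense primer or the reverse complement for an antisense primer. The template string and the function names are hypothetical and chosen for illustration; they are not the S cDNA or the authors' primers.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def sense_primer(template, start, end):
    """Primer identical to template nt start..end (1-based, inclusive)."""
    return template[start - 1:end]

def antisense_primer(template, start, end):
    """Primer complementary to template nt start..end, written 5'->3'."""
    return reverse_complement(template[start - 1:end])

# Hypothetical 40-nt template used only to exercise the helpers.
TEMPLATE = "ATGGCCTTAGCGTACGATCGATTGCAAGCTTGGCCAATGC"

print(sense_primer(TEMPLATE, 1, 4))       # 5' primer for nt 1-4
print(antisense_primer(TEMPLATE, 5, 12))  # 3' primer complementary to nt 5-12
```

With the real cDNA sequence in place of the toy template, the same calls with the coordinates given in the text would give candidate primer sequences, assuming the published numbering starts at nt 1 of the cDNA.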

In Situ Hybridization. A Bluescript plasmid containing most of the coding region (nt 190-945) of the S cDNA was linearized with a restriction enzyme that cuts once in the polylinker either upstream or downstream of the cDNA insert. Labeled sense and antisense RNA probes were synthesized using T7 or T3 RNA polymerase in the presence of uridine 5'-[α-[35S]thio]triphosphate (Amersham) under conditions recommended by Promega, yielding specific activities of 10^8 cpm/μg. In situ hybridization was performed using sections of human foreskin (39). After autoradiography for 6 days, the slides were developed in D19 (Kodak) and counterstained with hematoxylin and eosin, and silver grains were visualized by bright-field and dark-field microscopy.

RESULTS AND DISCUSSION

Detection of Evolutionarily Conserved Sequences and Transcripts. Three overlapping YAC clones that define 290 kb of genomic DNA containing the HLA-B and HLA-C genes have been isolated (24). These clones were used to prepare a map of the locations of infrequently cutting restriction endonuclease cleavage sites (Fig. 1A). At least three regions contain clustered cleavage sites suggesting the presence of CpG islands. Two of these regions are located at the 5' ends of the HLA-B and HLA-C genes and a third large CpG-rich region lies 80-160 kb telomeric of the HLA-C gene in a region not known to contain any gene-encoding sequences. Since CpG islands have been associated with the 5' ends of most housekeeping genes and many tissue-specific genes (27), we deduced that there may be unrecognized expressed genes within this region. Human genomic DNA fragments from this third CpG-rich region were isolated from cosmid clones prepared from one of the YAC inserts. These fragments were used to test for evolutionarily conserved sequences. A 2-kb BamHI genomic DNA fragment hybridized at high stringency to rabbit, guinea pig, sheep, and mouse DNAs, suggesting that it might contain gene exon sequences (Fig. 1B). Because this fragment showed strong evolutionary conservation, it was possible to perform a broad initial screen of primary tissues for homologous transcripts by preparing RNA from a comprehensive collection of mouse organs. Northern blot analysis of these tissues using the 2-kb BamHI fragment as a probe demonstrated a 2.4-kb transcript in mouse skin (Fig. 2). Subsequent analysis of human neonatal foreskin RNA showed abundant transcripts of 2.2 and 2.6 kb in this human tissue. Comigrating transcripts were also present at lower abundance in the human simian virus 40-transformed keratinocyte cell line RHEK. There was no specific hybridization to RNA from mouse fibroblasts, the human epidermal carcinoma cell line A431, or the monocytoid line THP1. In addition, no transcripts were detected in liver, spleen, kidney, heart, lung, or brain (data not shown).

FIG. 1. (A) Molecular structure of the HLA-B and HLA-C portion of the class I region. This 290-kb genomic DNA segment was defined by three overlapping YAC clones (24). The recognized genes are represented by solid boxes, with arrows indicating the direction of transcription. Restriction endonuclease cleavage sites detected in the YAC DNAs are indicated by B (BssHII), E (Eag I), M (Mlu I), N (Nru I), and S (Sfi I). The open box indicates the position of the evolutionarily conserved 2-kb BamHI fragment that was used as a hybridization probe. (B) Detection of conserved sequences by cross-species hybridization. Five micrograms of each genomic DNA was digested with HindIII and analyzed on a Southern blot with the 32P-labeled 2-kb BamHI fragment as the probe.

FIG. 2. Northern blot analysis. Fifteen-microgram samples of total RNA from mouse L-cell fibroblasts and skin tissue, human foreskin, the human epidermal carcinoma cell line A431, the human keratinocyte cell line RHEK (kindly provided by Thomas Kupper, Harvard Medical School, Boston), and the human mononuclear phagocyte cell line THP1 were analyzed on a Northern blot with the 32P-labeled 2-kb BamHI fragment as the probe. Lane human skin (B) is a shorter exposure for better visualization of the hybridization pattern.
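The mapping strategy above infers candidate CpG islands from clusters of sites for rare-cutting enzymes whose recognition sequences contain CpG. The authors did this by restriction mapping of YAC DNA; the sketch below only illustrates the same idea on sequence data. It scans a DNA string for the five enzymes named in Fig. 1A and groups nearby sites into clusters. The recognition sequences are the standard ones for these enzymes, while the clustering thresholds (`max_gap`, `min_sites`), the function names, and the toy sequence are assumptions made for this example.

```python
import re

# Recognition sequences of the rare-cutting enzymes named in Fig. 1A.
# The five unspecified bases (N) in the Sfi I site are a regex wildcard.
RARE_CUTTERS = {
    "BssHII": "GCGCGC",
    "EagI": "CGGCCG",
    "MluI": "ACGCGT",
    "NruI": "TCGCGA",
    "SfiI": "GGCC.{5}GGCC",
}

def find_sites(seq):
    """Return sorted (position, enzyme) pairs; positions are 0-based starts."""
    seq = seq.upper()
    hits = []
    for enzyme, pattern in RARE_CUTTERS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), enzyme))
    return sorted(hits)

def cluster_sites(hits, max_gap=1000, min_sites=2):
    """Group sites that lie within max_gap bp of the previous site."""
    clusters, current = [], []
    for pos, enzyme in hits:
        if current and pos - current[-1][0] > max_gap:
            if len(current) >= min_sites:
                clusters.append(current)
            current = []
        current.append((pos, enzyme))
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters

# Toy sequence (hypothetical): two sites close together, a long AT-rich
# stretch, then one isolated site that should not form a cluster.
toy = "AT" * 10 + "GCGCGC" + "AT" * 5 + "CGGCCG" + "AT" * 1000 + "ACGCGT"
sites = find_sites(toy)
clusters = cluster_sites(sites)
print(sites)
print(clusters)
```

On real genomic sequence the thresholds would need tuning, and a cluster of sites is only suggestive of a CpG island, as the text itself notes.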

Isolation of cDNA Clones and Sequence Analysis. The 2-kb BamHI fragment was then used as a probe to screen cDNA libraries prepared from the skin of an 18-week human fetus and from human neonatal foreskin keratinocytes. Screening 10^5 fetal skin cDNAs yielded no positive clones, whereas 30 positive clones were identified in 10^5 foreskin keratinocyte cDNAs. This may represent differential expression of this gene at these developmental stages.

Twelve of the isolated cDNA clones were analyzed by nucleotide sequencing. Two cDNAs showed the same 5' end. No farther-upstream transcriptional initiation site was demonstrated using primer-extension analysis or 5' RACE of anchored PCR (37, 38). Although we cannot rigorously exclude the usage of additional 5' untranslated sequences, it is most likely that the 5'-end sequence in these cDNA clones represents the transcriptional initiation site. The 3' end of the cDNA sequence was isolated using anchored PCR. When the cDNA inserts were used as probes for Southern blots of total human genomic DNA and of the CpG-containing YACs and cosmid clones, a single homologous gene was detected (data not shown). This gene was designated the S gene because it appears to be expressed exclusively in skin. It is located 160 kb telomeric of HLA-C within the class I region in a centromeric-to-telomeric transcriptional orientation (Fig. 3). Comparison of the cDNA sequence to the sequence of genomic subclones of the cosmids showed that the S gene consists of two exons that are separated by a 2.9-kb intron. This intron is flanked by the canonical consensus mRNA splice sequences (data not shown).

The S cDNA consists of 2547 bp. The predicted amino acid sequence is shown in Fig. 4. The cDNA contains two potential polyadenylylation signals (AATAAA) that lie 496 bp and 987 bp from the stop codon (data not shown). A DNA fragment between the stop codon and the first polyadenylylation signal and a fragment between the two polyadenylylation signals were isolated and used as probes for Northern blot analyses. Both the 2.2-kb and 2.6-kb keratinocyte mRNAs were identified with the former fragment but only the 2.6-kb mRNA was detected using the latter (data not shown). These results suggest that the two S mRNAs were generated by usage of alternative polyadenylylation sites in a single S gene. The 5' end of the S cDNA contains three in-frame ATGs at nt 15-17, 39-41, and 63-65. Each is associated with a Kozak consensus sequence (40); however, analysis using the Kyte-Doolittle hydrophobicity algorithm (41) indicates the presence of a hydrophobic 16-aa sequence after the third ATG. This sequence scores high using algorithms that identify sites of cleavage of signal peptidase (42, 43). Consequently, we postulate that the ATG in positions 63-65 defines the translational initiation site. Thus, the open reading frame from this initial methionine to the stop codon (TAG) at positions 1521-1523 predicts a protein of 486 aa. Processing at the predicted signal peptidase cleavage site (44) would leave lysine (+17) as the N-terminal amino acid of a mature protein consisting of 470 aa with a calculated molecular weight of 45,404. Attachment of carbohydrate at the two potential N-linked sites and the numerous O-linked glycosylation sites (45) could lead to a higher molecular mass. There was no indication of potential transmembrane regions.

FIG. 3. S gene is located 160 kb telomeric from HLA-C. Two exons (open boxes) of the S gene are separated by a 2.9-kb intron. The hatched boxes represent the 5' and the 3' untranslated regions. Restriction enzyme cleavage sites in the S cDNA are designated by B (BamHI), E (BstEII), M (Mlu I), R (EcoRV), S (Sac II), and X (Xho I).

ccgtgcagtccgagatgggctcgtctcgggca  M G S S R A
ccctggatggggcgtgtgggtgggcacgggatg  P W M G R V G G H G M
LALLLAGLLLPGTLAKSIGTFSDPCKDPTCITS
PNDPCLTGKGDSSGFSSYSGSSSSGSSISSARS
SGGGSSGSSSGSSIAQGGSAGSFKPGTGYSQVS
YSSGSGSSLQGASGSSQLGSSSSHSGSSGSHSG
SSSSHSSSSSSFQFSSSSFQVGNGSALPTNDNS
YRGILNPSQPGQSSSSSQTSGVSSSGQSVSSNQ
RPCSSDIPDSPCSGGPIVSHSGPYIPSSHSVSG
GQRPVVVVVDQHGSGAPGVVQGPPCSNGGLPGK
PCPPITSVDKSYGGYEVVGGSSDSYLVPGMTYS
KGKIYPVGYFTKENPVKGSPGVPSFAAGPPISE
GKYFSSNPIIPSQSAASSAIAFQPVGTGGVQLC
GGGSTGSKGPCSPSSSRVPSSSSISSSSGLPYH
PCGSASQSPCSPPGTGSFSSSSSSQSSGKIILQ
PCGSKSSSSGHPCMSVSSLTLTGGPDGSPHPDP
SAGAKPCGSSSAGKIPCRSIRIS

FIG. 4. Predicted amino acid sequence encoded by the S cDNA. The first 65 nt (lowercase type) of the S cDNA are shown to indicate the three potential in-frame initiation codons (underlined). The deduced protein sequence is shown in uppercase type. The hydrophobic sequence representing a potential signal peptide and two potential N-linked glycosylation sites are underlined. The methionine preceding the hydrophobic sequence is designated residue 1.
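The reasoning in this passage (three in-frame ATGs, a hydrophobic 16-residue stretch after the third one, and the resulting protein lengths) can be checked with a few lines of arithmetic. The sketch below is an illustration only and is not the authors' analysis: the 65-nt leader is written as read from Fig. 4 with the ATGs at the positions stated in the text (an assumption where the scanned figure is hard to read), the first 17 residues are taken from the same figure, and the mean Kyte-Doolittle hydropathy over the 16-residue stretch stands in for the sliding-window profile of ref. 41.

```python
# 5' leader of the S cDNA (nt 1-65), as read from Fig. 4; ATG positions follow
# the text (nt 15-17, 39-41, 63-65). Treat the exact string as an assumption.
LEADER = ("ccgtgcagtccgagatgggctcgtctcgggca"
          "ccctggatggggcgtgtgggtgggcacgggatg").upper()

# Residues 1-17 of the predicted protein (Fig. 4): 16-aa signal peptide + Lys-17.
N_TERM = "MLALLLAGLLLPGTLAK"

# Kyte-Doolittle hydropathy indices for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def atg_positions(seq):
    """1-based start positions of every ATG in seq."""
    return [i + 1 for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]

def mean_hydropathy(peptide):
    """Average Kyte-Doolittle index over a peptide."""
    return sum(KD[aa] for aa in peptide) / len(peptide)

starts = atg_positions(LEADER)
in_frame = len({s % 3 for s in starts}) == 1  # same reading frame for all ATGs

signal_peptide = N_TERM[:16]
signal_score = mean_hydropathy(signal_peptide)

# Codons from the third ATG (nt 63) up to the stop codon at nt 1521-1523.
orf_codons = (1521 - 63) // 3
mature_length = orf_codons - len(signal_peptide)

print(starts, in_frame)
print(round(signal_score, 2))
print(orf_codons, mature_length)
```

The codon count from nt 63 to the stop codon reproduces the 486 aa quoted in the text, and removing the 16-residue signal peptide gives the 470-aa mature protein.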

Search of the GenBank data base (May 31, 1993) indicated that this sequence has not been previously reported. The predicted S amino acid sequence did show homology to the human and mouse loricrin, keratin 1, and keratin 10 proteins. The identities were 23% over 438 aa for mouse loricrin, 25% over 235 aa for human loricrin, 29% over 160 aa for human keratin 1, and 30% over 158 aa for human keratin 10. Interestingly, all of these proteins are specifically expressed in the terminally differentiating epidermis and constitute the interfilamentous matrix and cell envelopes (46-48). Moreover, the predicted S protein has an extraordinarily high content of serine (30.9%), glycine (15.7%), and proline (10.2%). This composition is similar to that of purified human foreskin cell envelopes that are 18.9% serine, 33.9% glycine, and 7.7% proline (49).
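Composition figures such as the serine, glycine, and proline percentages quoted above are simple residue counts divided by sequence length. The following is a minimal sketch of that calculation; the function name is hypothetical, and the input is a 33-residue stretch copied from the Fig. 4 sequence purely as a short demonstration, so its percentages are not those of the whole protein.

```python
from collections import Counter

def composition_percent(protein):
    """Percentage of each residue in a one-letter amino acid string."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}

# Short serine-rich stretch from the Fig. 4 sequence (demo input only).
fragment = "SSSSHSSSSSSFQFSSSSFQVGNGSALPTNDNS"
comp = composition_percent(fragment)

for aa in "SGP":
    print(aa, round(comp.get(aa, 0.0), 1))
```

Passing the full predicted protein, or only the mature 470-aa form, would show which of the two the published percentages were computed from; the text does not say.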

Localization of S Transcripts in Epidermis. To determine at which stage of differentiation the S gene is expressed, we examined the distribution of its transcripts in sections of normal human foreskin by in situ hybridization. Fig. 5 shows a strong hybridization signal in the granular layer of the epidermis, with only a background signal detected in other portions of the dermis and epidermis. This result indicates that the S gene is expressed specifically in keratinocytes that are entering the terminal phase of their differentiative pathway. Expression of the S gene late in keratinocyte differentiation correlates with the very low level of S gene expression in the immature keratinocyte cell line RHEK (Fig. 2) and also with the failure to identify S cDNAs in the human fetal skin cDNA library.

These data predict that the protein encoded by the S gene is a major component of the granular layer of the epidermis. Although this protein shares significant structural homology with loricrin, the fact that it probably contains a signal peptide suggests that it may act in a different cellular compartment. Loricrin has no signal peptide and is targeted to the cytoplasmic face of the plasma membrane contributing to the formation of the cell envelope. The S protein, in contrast, may be a secreted protein or be targeted to a non-plasma-membrane intracellular membranous structure. Definition of its exact cellular localization awaits the preparation of specific anti-S antibodies that should permit its detection in sections of skin and in primary keratinocyte cultures.

Although the function of the S protein is unknown, the restriction of its expression to the granular-cell layer of the epidermis suggests that it participates in establishing the unique structure of terminally differentiated skin. It may also deliver an important regulatory signal as keratinocytes differentiate from the proliferating epidermal basal-cell layer to the final barrier layer of fully keratinized cells. Finally, it is of considerable interest that this gene, expressed exclusively in terminally differentiated keratinocytes, is closely linked to the HLA-C gene. Previous population studies have shown a strong association between the HLA-Cw6 allele and susceptibility to development of psoriasis vulgaris (20, 50, 51), a common cutaneous disease affecting ~1% of the population. In fact, psoriasis vulgaris is one of the few diseases showing a strong association with an antigen of the C locus. Although the association of HLA-C with psoriasis susceptibility is not absolute (20), it is intriguing to speculate that certain alleles of the S gene may contribute importantly to the pathogenesis of this common skin disorder. Preliminary studies have shown S gene-associated restriction fragment length polymorphism (RFLP). These RFLPs were demonstrated using 36 HLA-homozygous cell lines constituting a portion of the American Society of Histocompatibility and Immunogenetics panel of typing cells. Analysis of Southern blots of Msp I-digested genomic DNA from these cell lines by hybridization with the full-length S cDNA probe showed four RFLP alleles (Msp I a, 2.3, 0.9, and 0.6 kb hybridizing fragments; Msp I b, 1.6, 0.9, and 0.6 kb fragments; Msp I c, 1.6, 0.6, and 0.5 kb fragments; and Msp I d, 1.4, 0.6, and 0.5 kb fragments). Within this panel of 36 HLA-homozygous cell lines, HLA-C1 was always found in association with Msp I allele b. HLA-C4 was always found with Msp I allele c, and HLA-C10 and -C11 were seen with Msp I allele a. In contrast, HLA-C2, -C5, -C7, and -C8 were each found in association with at least two Msp I alleles. We anticipate that additional S gene DNA sequence polymorphisms will be identified and should provide information that supplements serological typing of HLA-C. They should, thus, be useful in the analysis of the inheritance of the class I region of the MHC and for the definition of HLA haplotypes that contribute to susceptibility to psoriasis.

FIG. 5. Localization of S transcripts in human epidermis by in situ hybridization. 35S-labeled sense (A and B) and antisense (C and D) S cDNA probes. Bright-field (A and C) and dark-field (B and D) photomicrographs are shown. (×140.)
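The four Msp I RFLP alleles are defined entirely by their hybridizing fragment sizes, so assigning an allele to a homozygous cell line amounts to a table lookup. The sketch below encodes the fragment patterns listed in the text and matches an observed set of band sizes against them. The function name and the 0.05-kb size tolerance are assumptions made for illustration; heterozygous samples, which would show the union of two patterns, are not handled.

```python
# Hybridizing fragment sizes (kb) for the four Msp I alleles, as listed in the text.
MSP1_ALLELES = {
    "a": (2.3, 0.9, 0.6),
    "b": (1.6, 0.9, 0.6),
    "c": (1.6, 0.6, 0.5),
    "d": (1.4, 0.6, 0.5),
}

def call_allele(observed_kb, tolerance=0.05):
    """Return the allele whose fragment pattern matches the observed bands.

    observed_kb: band sizes in kb from a homozygous sample, in any order.
    tolerance:   allowed size error per band in kb (assumed value).
    Returns None when no pattern matches.
    """
    observed = sorted(observed_kb, reverse=True)
    for name, pattern in MSP1_ALLELES.items():
        if len(observed) == len(pattern) and all(
            abs(o - p) <= tolerance for o, p in zip(observed, pattern)
        ):
            return name
    return None

print(call_allele([0.6, 0.9, 1.6]))    # pattern of allele b
print(call_allele([0.5, 0.62, 1.41]))  # small sizing errors, pattern of allele d
print(call_allele([3.0, 0.6]))         # no matching pattern
```

Because alleles b and c differ only in one small fragment (0.9 versus 0.5 kb), the tolerance has to stay well below the smallest size difference between patterns for the lookup to remain unambiguous.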

We gratefully acknowledge the assistance of Dr. William Parks, Jill Roby, and Teresa Tolley for in situ hybridization, Dr. Jay Ponder for discussion of S protein structure, and Drs. Helen Donis-Keller, Arthur Eisen, and Stanley Korsmeyer for comments on the manuscript. This study was supported in part by National Institutes of Health Grants AI-15322 and P50 HG00201 (D.D.C.).

1. Carroll, M. C., Katzman, P., Alicot, E. M., Koller, B. H., Geraghty, D. E., Orr, H. T., Strominger, J. L. & Spies, T. (1987) Proc. Natl. Acad. Sci. USA 84, 8535-8539.

2. Dunham, I., Sargent, C. A., Trowsdale, J. & Campbell, R. D. (1987) Proc. Natl. Acad. Sci. USA 84, 7237-7241.

3. Lawrance, S. K., Smith, C. L., Srivastava, R., Cantor, C. R. & Weissman, S. M. (1987) Science 235, 1387-1390.

4. Bahram, S., Arnold, D., Bresnahan, M., Strominger, J. L. & Spies, T. (1991) Proc. Natl. Acad. Sci. USA 88, 10094-10098.

5. Monaco, J. J., Cho, S. & Attaya, M. (1990) Science 250, 1723-1726.

6. Brown, M. G., Driscoll, J. & Monaco, J. J. (1991) Nature (London) 353, 355-357.

7. Trowsdale, J., Hanson, I., Mockridge, I., Beck, S., Townsend, A. & Kelly, A. (1990) Nature (London) 348, 741-744.

8. Glynne, R., Powis, S. H., Beck, S., Kelly, A., Kerr, L.-A. & Trowsdale, J. (1991) Nature (London) 353, 357-360.

9. Campbell, R. D., Carroll, M. C. & Porter, R. R. (1986) Adv. Immunol. 38, 203-244.

10. Spies, T., Morton, C. C., Nedospasov, S. A., Fiers, W., Pious, D. & Strominger, J. L. (1986) Proc. Natl. Acad. Sci. USA 83, 8699-8702.


11. White, P. C., Grossberger, D., Onufer, B. J., Chaplin, D. D., New, M. I., Dupont, B. & Strominger, J. L. (1985) Proc. Natl. Acad. Sci. USA 82, 1089-1093.

12. White, P. C., Chaplin, D. D., Weis, J. H., Dupont, B., New, M. I. & Seidman, J. G. (1984) Nature (London) 312, 465-467.

13. Sargent, C. A., Dunham, I., Trowsdale, J. & Campbell, R. D. (1989) Proc. Natl. Acad. Sci. USA 86, 1968-1972.

14. Harrison, G. S., Drabkin, H. A., Kao, F. T., Hartz, J., Hart, I. M., Chu, E. H., Wu, B. J. & Morimoto, R. I. (1987) Somatic Cell Mol. Genet. 13, 119-130.

15. Hsieh, S. L. & Campbell, R. D. (1991) Biochem. J. 278, 809-816.

16. Beck, S., Hanson, I., Kelly, A., Pappin, D. J. & Trowsdale, J. (1992) DNA Seq. 2, 203-210.

17. Spies, T., Bresnahan, M. & Strominger, J. L. (1989) Proc. Natl. Acad. Sci. USA 86, 8955-8958.

18. Sargent, C. A., Dunham, I. & Campbell, R. D. (1989) EMBO J. 8, 2305-2312.

19. Hanson, I. M., Poustka, A. & Trowsdale, J. (1991) Genomics 10, 417-424.

20. Tiwari, J. L. & Terasaki, P. I. (1986) HLA and Disease Associations (Springer, New York).

21. White, P. C., Werkmeister, J., New, M. I. & Dupont, B. (1986) Hum. Immunol. 15, 404-415.

22. Todd, J. A., Bell, J. I. & McDevitt, H. O. (1987) Nature (London) 329, 599-604.

23. Hammer, R. E., Maika, S. D., Richardson, J. A., Tang, J. P. & Taurog, J. D. (1990) Cell 63, 1099-1112.

24. Bronson, S. K., Pei, J., Taillon-Miller, P., Chorney, M. J., Geraghty, D. E. & Chaplin, D. D. (1991) Proc. Natl. Acad. Sci. USA 88, 1676-1680.

25. Kozono, H., Bronson, S. K., Taillon-Miller, P., Moorti, M. K., Jamry, I. & Chaplin, D. D. (1991) Genomics 11, 577-586.

26. Geraghty, D. E., Pei, J., Lipsky, B., Hansen, J. A., Taillon-Miller, P., Bronson, S. K. & Chaplin, D. D. (1992) Proc. Natl. Acad. Sci. USA 89, 2669-2673.

27. Bird, A. P. (1986) Nature (London) 321, 209-213.

28. Monaco, A. P., Neve, R. L., Colletti-Feener, C., Bertelson, C. J., Kurnit, D. M. & Kunkel, L. M. (1986) Nature (London) 323, 646-650.

29. Parimoo, S., Patanjali, S. R., Shukla, H., Chaplin, D. D. & Weissman, S. M. (1991) Proc. Natl. Acad. Sci. USA 88, 9623-9627.

30. Duyk, G. M., Kim, S. W., Myers, R. M. & Cox, D. R. (1990) Proc. Natl. Acad. Sci. USA 87, 8995-8999.

31. Brownstein, B. H., Silverman, G. A., Little, R. D., Burke, D. T., Korsmeyer, S. J., Schlessinger, D. & Olson, M. V. (1989) Science 244, 1348-1351.

32. Olson, M. V., Loughney, K. & Hall, B. D. (1979) J. Mol. Biol. 132, 387-410.

33. Blin, N. & Stafford, D. W. (1976) Nucleic Acids Res. 3, 2303-2309.

34. Chaplin, D. D., Woods, D. E., Whitehead, A. S., Goldberger, G., Colten, H. R. & Seidman, J. G. (1983) Proc. Natl. Acad. Sci. USA 80, 6947-6951.

35. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J. & Rutter, W. J. (1979) Biochemistry 18, 5294-5299.

36. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed., 16.60-16.67.

37. Loh, E. Y., Elliott, J. F., Cwirla, S., Lanier, L. L. & Davis, M. M. (1989) Science 243, 217-220.

38. Frohman, M. A., Dush, M. K. & Martin, G. R. (1988) Proc. Natl. Acad. Sci. USA 85, 8998-9002.

39. Prosser, I. W., Stenmark, K. R., Suthar, M., Crouch, E. C., Mecham, R. P. & Parks, W. C. (1989) Am. J. Pathol. 135, 1073-1088.

40. Kozak, M. (1991) J. Cell Biol. 115, 887-903.

41. Kyte, J. & Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132.

42. Folz, R. J. & Gordon, J. I. (1987) Biochem. Biophys. Res. Commun. 146, 870-877.

43. Folz, R. J. & Gordon, J. I. (1988) Comput. Appl. Biosci. 4, 175-179.

44. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.

45. Wilson, I. B., Gavel, Y. & von Heijne, G. (1991) Biochem. J. 275, 529-534.

46. Hohl, D., Mehrel, T., Lichti, U., Turner, M. L., Roop, D. R. & Steinert, P. M. (1991) J. Biol. Chem. 266, 6626-6636.

47. Schweizer, J., Kinjo, M., Furstenberger, G. & Winter, H. (1984) Cell 37, 159-170.

48. Eckert, R. L. & Green, H. (1986) Cell 46, 583-589.

49. Mehrel, T., Hohl, D., Rothnagel, J. A., Longley, M. A., Bundman, D., Cheng, C., Lichti, U., Bisher, M. E., Steven, A. C., Steinert, P. M., Yuspa, S. H. & Roop, D. R. (1990) Cell 61, 1103-1112.

50. Christophers, E. & Henseler, T. (1986) in Psoriasis, eds. Farber, E. M., Nall, L., Morhenn, V. & Jacobs, P. H. (Elsevier, New York), pp. 309-315.

51. Ozawa, A., Ohkido, M., Inoko, H., Ando, A. & Tsuji, K. (1988) J. Invest. Dermatol. 90, 402-405.
