
Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 11584-11588, December 1992
Genetics

Characterization by yeast artificial chromosome cloning of the linked apolipoprotein(a) and plasminogen genes and identification of the apolipoprotein(a) 5' flanking region

N. MALGARETTI*, F. ACQUATI†, P. MAGNAGHI*, L. BRUNO†, M. PONTOGLIO*, M. ROCCHI‡, S. SACCONE§, G. DELLA VALLE§, M. D'URSO¶, D. LE PASLIER‖, S. OTTOLENGHI†, AND R. TARAMELLI†**

†Dipartimento di Genetica e di Biologia dei Microrganismi, 26-20133 Milano, Italy; *Istituto San Raffaele, Milano, Italy; ‡Istituto di Genetica, Bari, Italy; ¶International Institute of Genetics and Biophysics, Napoli, Italy; §Dipartimento di Genetica e Microbiologia, Pavia, Italy; and ‖Centre d'Etude du Polymorphisme Humain, Paris, France

Communicated by Stanley M. Gartler, August 14, 1992 (received for review June 19, 1992)

ABSTRACT The apoprotein(a) [apo(a)] gene encodes a protein component of the circulating lipoprotein(a) [Lp(a)]. The apo(a) gene is highly homologous to the plasminogen gene. It encodes one of the most polymorphic human proteins, due to variability in the number of repetitions of structures called kringles. In addition, Lp(a) levels vary among individuals by more than two orders of magnitude, the high levels being highly correlated with predisposition to early atherosclerotic disease. To better understand the genetics and function of the apo(a) gene, we have cloned in yeast artificial chromosome vectors DNA fragments comprising the linked apo(a) and plasminogen genes and other members of the plasminogen family. By a combination of pulsed-field gel electrophoresis and genome walking experiments, we have identified the 5' portion and flanking regions of the apo(a) gene.

Lipoprotein(a) [Lp(a)] is a particle formed by the interaction of apoprotein(a) [apo(a)], apoprotein B, and various types of lipids, including cholesterol esters (1). Lp(a) is present in human plasma at a wide range of concentrations, from <0.1 mg/100 ml up to 100 mg/100 ml (1). This variability is inherited and is mainly due to a single locus that is closely linked to the apo(a) gene and that may be represented by the apo(a) gene itself (2, 3). Other unlinked loci may have minor effects on Lp(a) levels (2).

The variability of Lp(a) levels is clinically important (4). In fact, high levels of Lp(a) may have proatherogenic and prothrombotic roles (5). Epidemiologic studies have shown that high plasma levels of Lp(a) are associated with an increased risk of early atherosclerotic vascular disease and myocardial infarction (6, 7); it has been suggested that a high Lp(a) level is an independent risk factor for this disease (7).

A ≈15,000-nucleotide-long cDNA encoding apo(a) has been cloned and shown to be highly homologous to that of plasminogen (PMG) (8) (Fig. 1). The two cDNAs show almost complete identity in the 5' untranslated and signal peptide regions; 3' to these structures, the PMG cDNA continues with a tail region [missing in the apo(a) cDNA], five tandemly repeated homologous domains (kringles 1-5), and a protease domain. Apo(a) cDNA, on the other hand, contains multiple repeats of a sequence closely corresponding to that of kringle 4 of PMG, followed by a kringle 5 and a protease domain resembling those of PMG. The number of kringle 4-like repeats in the apo(a) gene has been shown to be highly variable among individuals, from 9 to >37 repeats. This results in >25 different recognizable apo(a) isoforms and an unusually high level of heterozygosity within populations (9, 10).

In view of the reported association between high plasma Lp(a) levels and inherited predisposition to early coronary disease, it is important to determine the molecular basis for the variable levels of Lp(a) (11). The apo(a) gene has not yet been cloned. Due to the large size of the gene, its internally repeated structure, and its high homology to PMG and to a family of related genes, cloning the apo(a) gene by conventional genome walking techniques has proven difficult. Here we report the cloning in yeast artificial chromosomes (YACs) of the linked apo(a) and PMG genes and the identification of the apo(a) 5' flanking region.††

MATERIALS AND METHODS

Isolation of YAC Clones. Clones were isolated from the YAC library of the Centre d'Étude du Polymorphisme Humain (CEPH) (Paris) (12). The library was screened by PCR using primers specific for apo(a) kringle 1 (primer pair 2, Fig. 1; for the sequence see below). YACs positive for apo(a) by PCR were subsequently hybridized with an apo(a)-specific probe containing intron sequences from the kringle region (probe 2, Fig. 1).

Preparation and Analysis of Yeast DNA. Total DNA of YAC-containing clones was prepared as described (13). This DNA was suitable for analysis by conventional Southern blotting, PCR, and PFGE. DNA digested with rare cutting enzymes was fractionated using a contour-clamped homogeneous electric field (CHEF; Bio-Rad) apparatus, transferred to a nylon membrane, and hybridized according to standard protocols (14). Apo(a) YAC-1 DNAs, which underwent indirect end-label mapping, were digested with the indicated enzymes for different lengths of time. The DNAs were fractionated by pulsed-field gel electrophoresis (PFGE) and hybridized with probes recognizing the left arm and right arm, respectively, of the YAC vector (i.e., the large and the small Pvu II-BamHI pBR322 fragments; see ref. 15).

Cloning of YAC-Derived DNA in λ Phage. DNA from apo(a) YAC-1 was partially cut with Mbo I and fractions ranging from 12 kb to 20 kb were ligated to λ Fix arms cut with BamHI (Stratagene). In vitro packaging, plating, and screening of the library were as described (14).

Cloning of EcoRI-Digested DNA. Genomic DNA cut with EcoRI and enriched, by gel electrophoresis, for DNA fragments of size corresponding to the four EcoRI bands detected

Abbreviations: PFGE, pulsed-field gel electrophoresis; apo(a), apoprotein(a); Lp(a), lipoprotein(a); PMG, plasminogen; YAC, yeast artificial chromosome.
**To whom reprint requests should be addressed.
††The sequences reported in this paper have been deposited in the GenBank data base [accession nos. M90078 (4.8-kb band), M90079 (3.6-kb band), M90080 (3.2-kb band)].


The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


FIG. 1. Schematic representation of different structures encoded within PMG and apo(a) cDNAs. Probes 1 and 3 were obtained by PCR using appropriate oligonucleotide pairs. Probe 2 is a genomic 1.2-kilobase (kb) Pst I fragment derived from the intron separating the two exons that constitute the apo(a) kringle A (see text). S, signal peptide; T, tail; P, protease domain; 3', untranslated region; 1, apo(a) kringle 1; 2, one of the repeated kringle As; 33, apo(a) kringle 33. Circled numbers represent oligonucleotide pairs.

by probe 1 (Fig. 1) was ligated to λgt11 EcoRI-cut arms. Subsequent manipulations were as described (14).

PCR. PCRs were carried out as follows. Sample DNAs (YAC or genomic) were subjected to 25 or 30 cycles of amplification by heating at 94°C for 1 min to denature the DNA, cooling to 55-65°C for 2 min to anneal the primers, and incubating at 72°C for 3 min to extend the annealed primers. The primer pairs (circled in Fig. 1) are as indicated:

1  5'-CTGGGATTGGGACACACTTTC-3'
   5'-TGATTTCAGAAATAAAAGAAG-3'

2  5'-TTTCTGTGGTCCTATTATGTTGA-3'
   5'-CACCTGAGCAAAGCCATGT-3'

3  5'-TCAGAAACAGCCGTTGACGTC-3'
   5'-GCTGAGATTAGTCCTTGGTGTTATACCATG-3'

4  5'-TTTGTCAGTCAGACCTTAAAAGC-3'
   5'-TCAACCTACTTAGAAGCTGAAC-3'

5  5'-GTCAAGGAGAGCCTCTGGAT-3'
   5'-CTGCAGGTGAATTCTTCGTC-3'

6  5'-GTCACCTTACTACAGAATCC-3'
   5'-CTGACTCACCTAGAGGCTGGG-3'
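The cycling conditions and primer list above can be captured in a short script for record keeping. The following is a minimal sketch, not part of the published protocol: the helper names are hypothetical, the 2/4 (Wallace) melting-temperature rule is a common rule of thumb chosen here as an assumption, the default annealing temperature of 60°C is an arbitrary value within the stated 55-65°C range, and ramp times between steps are ignored.

```python
# Primer pairs copied from the list above (5'->3'); keys follow the circled
# numbers in Fig. 1. The text does not say which primer of a pair is the
# forward one, so they are simply stored in the listed order.
PRIMER_PAIRS = {
    1: ("CTGGGATTGGGACACACTTTC", "TGATTTCAGAAATAAAAGAAG"),
    2: ("TTTCTGTGGTCCTATTATGTTGA", "CACCTGAGCAAAGCCATGT"),
    3: ("TCAGAAACAGCCGTTGACGTC", "GCTGAGATTAGTCCTTGGTGTTATACCATG"),
    4: ("TTTGTCAGTCAGACCTTAAAAGC", "TCAACCTACTTAGAAGCTGAAC"),
    5: ("GTCAAGGAGAGCCTCTGGAT", "CTGCAGGTGAATTCTTCGTC"),
    6: ("GTCACCTTACTACAGAATCC", "CTGACTCACCTAGAGGCTGGG"),
}


def cycling_program(anneal_c=60):
    """Return one PCR cycle as (step, temperature in deg C, minutes).

    The annealing temperature is given only as a 55-65 deg C range in the
    text, so it is a parameter here (default 60 is an assumption).
    """
    if not 55 <= anneal_c <= 65:
        raise ValueError("annealing temperature outside the stated 55-65 range")
    return [("denature", 94, 1), ("anneal", anneal_c, 2), ("extend", 72, 3)]


def program_minutes(cycles, anneal_c=60):
    """Total hold time for the given number of cycles (ramp times ignored)."""
    return cycles * sum(minutes for _, _, minutes in cycling_program(anneal_c))


def gc_fraction(seq):
    """Fraction of G and C bases in a primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature: 2 deg C per A/T, 4 deg C per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    for number, pair in PRIMER_PAIRS.items():
        for primer in pair:
            print(number, primer, len(primer),
                  round(gc_fraction(primer), 2), wallace_tm(primer))
    print("hold time (min):", program_minutes(25), program_minutes(30))
```

The rule-of-thumb temperatures are only a rough guide to choosing an annealing temperature within the stated range; they are not values reported in the paper.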


Apo(a)-Specific Genomic Probe. A 1.2-kb Pst I fragment (see ref. 11; probe 2, Fig. 1) was derived from a cosmid clone containing seven repeats of apo(a) kringle A (called kringle 4 in ref. 11). The fragment corresponds to part of the intron splitting kringle A. The sequences of the exons are identical to those predicted by the published cDNA (8). In genomic PFGE experiments this probe detects the typical polymorphic Kpn I fragment that is diagnostic of the apo(a) gene (11) but does not detect any other fragment (PMG, etc.), demonstrating its apo(a) specificity.

RESULTS

Apo(a) Probes Identify a Family of Closely Related Genes. Genomic blotting analysis with probes derived from the apo(a) cDNA invariably gives rise to multiple bands, representing the PMG and a related gene family. To identify the region including the apo(a) 5' flanking (promoter) sequences, we used a probe spanning the leader sequences common to PMG and apo(a) (8) (probe 1, Fig. 1) for Southern blotting experiments on total DNA. This probe detects multiple bands with several restriction enzymes (Fig. 2). To understand the nature of the genes detected by this probe, we cloned and sequenced the four EcoRI bands (4.8, 3.6, 3.2, 1.4 kb) shown in Fig. 2A. The 1.4-kb band is identified as a fragment of the PMG gene, containing 5' flanking sequences, the first exon

FIG. 2. (A) Southern blot analysis of genomic DNA. DNA was digested with the indicated restriction enzymes and hybridized to probe 1. (B) Homology with PMG of sequences in the 4.8-, 3.6-, and 3.2-kb bands. The 5' untranslated and the leader regions are indicated by black rectangles. The donor splice sites of the first intron are shown. The regions showing homology to the PMG gene are indicated with the degree of homology shown below them. Unique DNA probes derived from the 4.8-, 3.6-, and 3.2-kb bands are indicated by brackets.

(untranslated sequences and leader), and part of the first intron. In fact, the size and nucleotide sequences of this band correspond to those reported for the PMG gene (16); moreover, the same 1.4-kb band can be derived from large fragments cloned in λ phage, which additionally show the expected "tail" and "kringle 1" nucleotide sequences.

The 4.8-, 3.6-, and 3.2-kb bands show very high conservation, relative to the PMG gene, of the leader and 5' untranslated regions, up to the position corresponding to the cap site of the PMG gene (-160 from the initiator ATG) (Fig. 2B). Upstream of this position, a high level of homology is maintained by bands 3.6 and 4.8 kb (but not 3.2) up to nucleotides -610 and -730, respectively. Downstream (3') of the conserved leader domain the PMG gene is interrupted by an intron; the same holds true for the 3.2-, 3.6-, and 4.8-kb bands, which show a conserved donor splice site, followed by 150 highly homologous nucleotides and further conserved sequences in the intron (Fig. 2B). The detailed comparison of the upstream PMG and 4.8-kb nucleotide sequences is shown in Fig. 3.

The 4.8-, 3.6-, and 1.4-kb Bands Are Closely Linked Within ≈400 kb. The apo(a) leader probe was hybridized to EcoRI-digested DNA from a panel of hamster-human somatic cell hybrids (17). All four bands are detected using DNA from hybrids including human chromosome 6 but not from hybrids

apo(a)  GAATTCATTTGCGGAAAGATTGATACTATGCTTTTATTTTATTTTATT  -1401
apo(a)  TTATTTTATTTTATTTTATTTTATTTTATTGAGACTCTCACCCCGGTTGAAGT  -1348
apo(a)  GCACTGACGTGATTTTGGCTCACTGCAACTTCCACCTCCTGGGTTCAAGTGAA  -1295
apo(a)  TACTCCAGCCTCCCTAGTAGCTGGGATTACAGGTGCCCACCACCACGCCTGGC  -1242
apo(a)  TAATTTTTGTATTTTTAGTAGAGATGGGGTTTCACCACATTGGCCTGGCTGGT  -1189
apo(a)  CTCAAACTCCTGACCTTGTGATCCACCTGTCTTGGCCTCCCAAAGTGCTGGGA  -1136
apo(a)  TTACAGAGTTGAGCCACCGCACTCGACCCTATGTTTTATTTTTAAAAATATTT  -1083
apo(a)  ATTTATTTATTTAAGCCACAACTACTAGAATAGGAAGGATTGATATTTTATTA  -1030
apo(a)  ATTTTATTTGGTATTTATTATTTTTTTTTCTTTCCTGAGACATTCTTGCTCTG  -977
apo(a)  TCACCCAGGCTGGAGTGCAGTGGCACATTCTTGGCTCACTGCAACCTCCATCT  -924
apo(a)  CCTGTGTTCAAGCAATTCTAGTGCCTCAGCCTACTTAGTAGCTGGGATGACTG  -871
apo(a)  GCATGTGCCTCCACACCCAGCTAATTTTTGTATTTTTTGTAGAGACAGGGTTT  -818
apo(a)  TGCATGTTGCCCAGGCTTGTCTCAAACTCCTGGCCTCAGGTGATCCATCTGCC  -765
apo(a)  GTGCCTCCAAAATGCTGGGATTATAGCATGAGCCACCACCCCCTCCTGGAAGG  -712

apo(a)  ATTGATATCTTATAACATAATTTATAATTACAGAAAACATGTGAGTTCACTAG  -659
PLG     ATTGATGTCTTATAACATAATTTATAATTACAGAAAACATGTGAGTTCACTGG  -675

apo(a)  GAATAAATAAATTTTGAAGATAATAAAAGATTTTCACTTCTGTTGTCATTTCC  -606
PLG     GAATAAATAAATTTTGAAGATAATAAGATACTTTCACTTATGTCATAATTTCT  -622

apo(a)  GGCACAGTTTGGTATAGGATGTGGAGATGTTAACATTTATACCTAGCTTGCTC  -553
PLG     ATGTCATTTGGTGT-AGGATGTAGAGATATTAACGTTTACACCTAACTCAAGT  -570

apo(a)  GTA--AACTAAGACCTGAAAGGGTTGTGTCTATCAGCTGCACCCCTGGGTAGC  -502
PLG     TTGTCATCTAAGACCTGAAAGGGTTTTGTCTATCAGCTGCACCCCTGGGTAGA  -517

apo(a)  GACACAACCTCGGGAAG--CCTCAGCCCCCTCCT-CGTACAGCA---------  -461
PLG     GACACAACCTTGGGGAAGGCCTCAGCCCCATCCCTCGTACAGCAGGAATGAGA  -464

apo(a)  ------CTGCCTGTTGGAAAGCTTGAGGGAGGCTATGGATGTGCAGCACTTGG  -414
PLG     ACAGCCCTGCCTGTTGGGAAGCTTGAGGGAGGCTATGGACGTGCAGCGCTTGG  -411

apo(a)  CAGAGGGTCTGGTCATGGAAGTTACCAGCAAATATGAGCTACTTTTATGATTT  -361
PLG     CAGAAGGTCTCGTCATGGAAGGTTCCAGCAAATGTGAGATACTTTTATGATTT  -358

apo(a)  TATTTTATCCAAAAGAAAGAGAATGAAAGAAGAGGGGAGGAAACAAGACTAAT  -308
PLG     CATTTTCTCCAAAAGAAAGGGAATAAGAGAAGAGGGGAGGAAATAAGACTAAT  -305

apo(a)  CAGGAAAGATGAAGGTCTAGGGGTGAGGGAAGGAGTAATGGAGACCATAAAGG  -255
PLG     TGCGAGAGATAAAGTACAAGGGTG-AGGGAAGGAATAAGG-AGAC-ATGACGG  -255

apo(a)  CAATGTGGAGCAGCTGGAGGGGGAGAATGGCTTTCACCACCTTCCCAGCATCT  -202
PLG     CAGCGTGGAGCAGCCGAGGGGGGAGATTG-CTTTCACCACTT-CCCAGCATCT  -204

apo(a)  ATTG-ACATTGCACTCTCAAATATTTTATAAG-ACTCTATATTCAAGGTAATG  -151
PLG     ATTGCAGATTCCACCCTCAAACATTTTGTAAGGACTCTTTATTCAAGGTAACG  -151

apo(a)  TTTGAACCCTGCTGACGCAGTGGCATGGGTCTCTGAGAGAATCATTAACTTAA  -98
PLG     TTTGAACCCTGCTGAGCCAGTGGCATGGGTCTCTGAGAGAATCATTAACTTAA  -98

missing this chromosome. In situ hybridization further confirms that the 4.8- and 1.4-kb PMG bands hybridize to bands q26-q27 on chromosome 6, where the PMG gene had previously been thought to be located (18, 19) (data not shown).

These observations suggested that the whole cluster might be contained within a few hundred kilobase pairs. We therefore screened by PCR a human DNA library in YACs (12) using an oligonucleotide pair derived from the apo(a) kringle 1 (pair 2, Fig. 1). This primer pair yields a single band of the expected length and sequence upon amplification with total human DNA and is therefore specific for the apo(a) gene (not shown). Three independent clones were obtained. Hybridization of EcoRI-digested DNA from these clones with the apo(a) probe shows that bands 1.4 and 4.8 are contained in all three clones, whereas band 3.6 is present only in YAC1 and YAC2; band 3.2 is missing from all YACs (Fig. 4).
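The band content of the three YAC clones described above lends itself to a simple set comparison. The sketch below only restates the hybridization results given in the text; the variable and function names are hypothetical.

```python
# EcoRI bands (kb) detected by the apo(a) leader probe in genomic DNA.
GENOMIC_BANDS_KB = {4.8, 3.6, 3.2, 1.4}

# Bands reported in each YAC clone (from the text describing Fig. 4).
YAC_BANDS_KB = {
    "YAC1": {1.4, 3.6, 4.8},
    "YAC2": {1.4, 3.6, 4.8},
    "YAC3": {1.4, 4.8},
}


def shared_bands(yacs):
    """Bands present in every clone."""
    return set.intersection(*yacs.values())


def absent_bands(genomic, yacs):
    """Genomic bands not represented in any clone."""
    return set(genomic) - set.union(*yacs.values())


if __name__ == "__main__":
    print("in all clones:", sorted(shared_bands(YAC_BANDS_KB)))
    print("in no clone:", sorted(absent_bands(GENOMIC_BANDS_KB, YAC_BANDS_KB)))
```

The bands common to all clones (1.4 and 4.8 kb) are the ones that must lie closest to the apo(a) kringle 1 sequence used for screening, which is consistent with the mapping that follows.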

Identification of the apo(a) and PMG genes was confirmed (not shown) by PCR of YAC1 using primers specific for apo(a) kringles 1 and 33 (pairs 2 and 3, Fig. 1), for the apo(a) 3' untranslated region (pair 4, Fig. 1), and for PMG (pairs 5 and 6, Fig. 1). Importantly, most regions of the apo(a) gene are present, except the 3' untranslated sequences, indicating that the gene is truncated somewhere between kringle 33 and the 3' untranslated region.

Mapping of Apo(a) and PMG Genes on YACs. To locate the various genes on the YAC clones, we studied YAC1, which contains bands 1.4, 3.6, and 4.8. Partial DNA digests obtained with Kpn I, Sal I, and Sfi I were separated by PFGE and hybridized to probes derived from the left and right arms of the YAC vector, generating the restriction map shown in Fig. 5C.

Complete digests of YAC1 with Kpn I, Sal I, Sfi I, and Sal I plus Sfi I were then hybridized with total human DNA to detect all of the different bands based on their content of repetitive sequences (not shown). The filters were further hybridized with an apo(a)-specific probe (probe 2 in Fig. 1 from the kringle A intron), the PMG cDNA tail (probe 3 in Fig. 1), a unique DNA probe derived from the 4.8-kb band (Fig. 2B), and a unique DNA probe derived from the 3.6-kb band (Fig. 2B). The apo(a) probe hybridizes to a 140-kb Kpn I fragment (not shown). In addition, it hybridizes to a Sal I fragment of 135 kb and to the 175-kb Sfi I fragment that is also detected by the "left" vector probe; double digestion with these enzymes generates only the 135-kb band (Fig. 5 A and C). The same bands are also detected by the 4.8-kb band-specific probe (Fig. 5A). In contrast, the PMG cDNA probe detects fragments of 110, 70, and 70 kb in Sal I, Sfi I, and double Sal I-Sfi I digests, respectively (Fig. 5A). Finally, the 3.6-kb band-specific probe detects the same 110-kb Sal I

apo(a)  TTTGACTATCTGGTTTGTGGGTGCGTTTACTCTCATGTAAGTCAACAATGTCC  -45
PLG     TTTGACTATCTGGTTTGTGGATGCGTTTACTCTCATGTAAGTCAACAACATCC  -45

apo(a)  TGGGATTGGGACACACTTTCTGGGCACTGCTGGCCAGTCCCAAAATGGAACAT  +9
PLG     TGGGATTGGGACCCACTTTCTGGGCACTGCTGGCCAGTCCCAAAATGGAACAT  +9

apo(a)  (sequence to +49 not legible)  +49
PLG     (sequence to +49 not legible)  +49

FIG. 3. Nucleotide sequence comparison between the 5' flanking region of the 4.8-kb band and the corresponding region of PMG (PLG). Bold letters represent the leader domain. Position 1 indicates the putative translation start site, by homology to PMG; position 50 is the first nucleotide of the first intron. Upstream of -711, no homology is detectable between the two sequences. The 4.8-kb band corresponds to the 5' region of the apo(a) gene (see text).
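The degree of homology between aligned rows of Fig. 3 can be estimated by counting identical columns. The following is a minimal sketch, assuming that gap columns ("-") count as mismatches and that identity is expressed over all aligned columns; the paper does not state how its homology percentages were computed, and the function names are hypothetical. The example row is the first apo(a)/PLG pair of the alignment (ending at -659 and -675).

```python
# First aligned apo(a)/PLG row of Fig. 3 (ending at -659 and -675).
APO_A_ROW = "ATTGATATCTTATAACATAATTTATAATTACAGAAAACATGTGAGTTCACTAG"
PLG_ROW = "ATTGATGTCTTATAACATAATTTATAATTACAGAAAACATGTGAGTTCACTGG"


def identity(seq_a, seq_b, gap="-"):
    """Return (identical columns, total columns) for two aligned rows.

    Assumption: a column containing a gap is never counted as identical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have the same length")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != gap
    )
    return matches, len(seq_a)


def percent_identity(seq_a, seq_b, gap="-"):
    """Identity as a percentage of all aligned columns."""
    matches, columns = identity(seq_a, seq_b, gap)
    return 100.0 * matches / columns


if __name__ == "__main__":
    matches, columns = identity(APO_A_ROW, PLG_ROW)
    print(f"{matches}/{columns} identical columns "
          f"({percent_identity(APO_A_ROW, PLG_ROW):.1f}%)")
```

Applying the same function row by row gives a rough picture of where conservation falls off toward the upstream end of the alignment.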


FIG. 4. Southern blot analysis of apo(a) YAC1, YAC2, YAC3, and genomic DNAs, which were digested with EcoRI and hybridized to probe 1.

FIG. 5. Localization of YAC DNA fragments hybridizing to apo(a), PMG, and 4.8- and 3.6-kb probes. The DNAs were digested to completion with the indicated enzymes (with the exception of partial Kpn I digestions), fractionated by PFGE, and hybridized to the probes indicated (A and B). (C) Summary of the location of the fragments based on experiments in A and B.

fragment as the PMG probe and an ≈45-kb fragment with Sfi I and Sal I-Sfi I double digests.
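The co-localization argument above amounts to comparing, probe by probe, which fragment is detected in each digest. The sketch below tabulates the fragment sizes reported in the text and groups probes with identical patterns; the names are hypothetical, and the 45-kb value is the approximate size given for the 3.6-kb band probe.

```python
from collections import defaultdict

DIGESTS = ("Sal I", "Sfi I", "Sal I + Sfi I")

# Fragment sizes (kb) detected by each probe in complete digests of YAC1,
# in the order of DIGESTS, as reported in the text.
HYBRIDIZATION_KB = {
    "apo(a) probe 2": (135, 175, 135),
    "4.8-kb band probe": (135, 175, 135),
    "PMG probe 3": (110, 70, 70),
    "3.6-kb band probe": (110, 45, 45),  # 45 kb is approximate
}


def group_by_pattern(table):
    """Group probes that detect the same fragment in every digest."""
    groups = defaultdict(list)
    for probe, sizes in table.items():
        groups[sizes].append(probe)
    return dict(groups)


def same_fragment(table, probe_a, probe_b, digest):
    """True if two probes detect a fragment of the same size in one digest."""
    index = DIGESTS.index(digest)
    return table[probe_a][index] == table[probe_b][index]


if __name__ == "__main__":
    for sizes, probes in group_by_pattern(HYBRIDIZATION_KB).items():
        print(dict(zip(DIGESTS, sizes)), "->", ", ".join(probes))
```

Equal fragment sizes are only suggestive of co-localization on their own; in the paper the assignment also rests on the vector-arm probes and the partial digests.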

In conclusion, the PFGE experiments show that the apo(a) probe hybridizes to the same 175-kb Sfi I fragment as that detected by the left probe; the large kringle-containing Kpn I fragment that is typical of the apo(a) gene (11) is comprised within the Sfi I fragment. In contrast, the PMG and 3.6-kb bands lie outside of the 175-kb Sfi I fragment, in a centrally located region that is at least 50 kb apart from the apo(a) gene.

The 4.8-kb Band Is Adjacent and 5' to the Apo(a) Kringle 1. As shown above, the 4.8-kb band-specific probe hybridizes to the same 130-kb Sal I and 175-kb Sfi I fragments as the apo(a) probe. To better locate the region hybridizing to the 4.8-kb probe, partial and total Kpn I digests were used; it is known that the multikringle structure of the apo(a) gene is flanked by Kpn I sites (11). On the other hand, a vector left probe detects on the same blots an ≈12-kb Kpn I fragment that is clearly different from the fragment (10 kb) detected by the 4.8-kb probe. This indicates that the 4.8-kb probe must be hybridizing to a small centrally located Kpn I fragment. Indeed, the 4.8-kb probe detects on partial Kpn I digests a set of closely migrating fragments ranging between 115 and 140 kb that are


also detected by the left probe (Fig. 5B). To directly prove that the 4.8-kb fragment is linked and upstream of apo(a) kringle structures, partial Mbo I digests from YAC1 were cloned in λ phage, and clones hybridizing to the 4.8-kb band-specific probe were obtained. Fig. 6 shows the map of three overlapping clones. The clones link a region having perfect identity to the 4.8-kb band sequences (5' flanking, 5' untranslated, and leader sequences) with a region containing two exons encoding kringle 1 sequences identical to those of the apo(a) cDNA. The large intron separating the apo(a) leader from the kringle 1 first exon and the small intron separating the two kringle 1 exons show correct splicing signals.

Digestions with Kpn I and Xmn I demonstrate that the 4.8-kb band lies upstream of the kringle in the same transcriptional orientation (Fig. 6).

DISCUSSION

In this paper, we report the cloning in YACs of overlapping DNA fragments containing a large portion of the PMG/apo(a) multigene family. In particular, the apo(a) gene is contained

FIG. 6. Overlapping clones in λ phage from the region surrounding the 4.8-kb band. The kringle 1 sequences perfectly match those expected for the apo(a) cDNA. Conserved splice sites are indicated. The Sac I cloning sites are boxed. K, Kpn I; E, EcoRI; H, HindIII; X, Xmn I; B, BamHI.


within these clones. In fact, YAC1 demonstrates sequences perfectly identical (by sequence analysis of PCR-amplified fragments) to those reported for the 5' untranslated portion of the mRNA, leader peptide, and kringles 1 and 33. Kringles 1 and 33 are highly diagnostic due to their divergence from the corresponding PMG sequences. In addition, PFGE analysis of DNA digested with Sal I, Sfi I (Fig. 5), Kpn I, BssHII, and Sac II (data not shown) demonstrates a large block of DNA strongly hybridizing to a specific genomic probe derived from the highly repeated apo(a) kringle A. Such a block (obtained with Kpn I) has been demonstrated to be unique and diagnostic in genomic DNA for the apo(a) gene (11).

PFGE experiments also show that the multikringle structure of apo(a) lies in the left portion of YAC1, at a distance of at least 50 kb from the PMG gene. The orientation of the apo(a) gene can be deduced from the observation that the most 3' portion of the gene (3' untranslated sequences) is missing from all YACs; thus the promoter must lie between the apo(a) block and the PMG gene. PFGE experiments show that the 4.8-kb fragment, but not the 3.6-kb fragment, lies within this region (Fig. 5A) and is therefore a candidate apo(a) promoter.

Indeed, subcloning of the YAC1 insert into λ phage demonstrates that the 4.8-kb band is contiguous to, and in the same transcriptional orientation as, the typical sequences of the apo(a) kringle 1, from which it is separated by a 14-kb intron. The sequences of the 4.8-kb band, including leader peptide, 5' untranslated region, and 5' flanking region, are perfectly identical to those determined previously from an apo(a) cDNA clone (Fig. 3).

In conclusion, the apo(a) gene has been identified on the basis of sequence determination of diagnostic domains and the presence of the large Kpn I block of repeated kringles. PFGE and direct cloning experiments show that the 4.8-kb fragment is the 5' portion of the apo(a) gene. In agreement with the latter conclusion, PFGE experiments (H. G. Kraft, N.M., S. Kochl, F. Acquati, G. Utermann, and R.T., unpublished data) with genomic DNA from individuals with polymorphic apo(a) genes detect the same size variation using either the 4.8-kb or the apo(a) kringle probes.

On the other hand, the 3.6- and 3.2-kb bands can be ruled out as candidate apo(a) 5' regions. The 3.6-kb band may correspond to a pseudogene, as it shows a frameshift generating an in-phase stop codon within the coding sequence (unpublished). In addition, hybridization of the blots shown in Fig. 5A fails to demonstrate any kringles in fragments comprising the 3.6-kb band (not shown).

The 3.2-kb band, although not represented in YAC1-YAC3, is present in a further YAC clone that contains only the 3' portion of the apo(a) gene and lies 3' to it (data not shown). In addition, the 3.2-kb band is linked to an ≈20-kb Kpn I fragment containing kringle-like sequences. This is far too small a fragment to be compatible with the known apo(a) Kpn I kringle repeat.

Recently, Ichinose (20) reported the 5' sequences of two candidate apo(a) genes. His apo(a)I gene is closely related to our 4.8-kb band and likely represents the same gene. Ichinose's apo(a)II gene probably represents the 3.2-kb band.

As previously mentioned, the levels of Lp(a) are genetically determined, mainly by a locus that coincides with or is very close to the apo(a) gene itself. In turn, high Lp(a) levels are considered an independent risk factor for atherosclerotic cardiovascular disease (1, 4-6). The size of the apo(a) gene (that is related to the number of the kringles) and the levels of expression are highly variable among different individuals within the same population and among different populations (1). An inverse correlation between apo(a) size and Lp(a) concentration has been reported (2, 3), suggesting that posttranslational mechanisms [for example, Lp(a) removal from the circulation] might differentially affect the levels of various Lp(a) isoforms. However, this is not sufficient to explain the wide variability observed. In fact, individuals with the same Lp(a) isoform may still show Lp(a) concentrations differing by one or two orders of magnitude (1). This suggests that transcriptional mechanisms might be important in determining apo(a) levels. Indeed, the level of Lp(a) has been shown to be proportional to that of apo(a) mRNA in monkey studies (15).

The availability of the apo(a) gene will now allow definition of several aspects of the genetics of the apo(a) gene and of its linkage to predisposition to atherosclerosis. Multiple probes specifically recognizing the apo(a) region can be derived from the flanking sequences and from the large first intron and used to define patterns of polymorphic sites (haplotypes) in linkage disequilibrium with the gene and hence with its multiple alleles (molecular weight isoforms, high and low expressors). In addition, the promoter and other sequences controlling the restricted (mainly hepatic) expression of the apo(a) gene can be identified and further analyzed in different individuals and populations to define polymorphic variation and mechanisms for the widely different levels of Lp(a) expression.

We are grateful to Dr. D. Breviario for helpful suggestions. This research was supported by Progetti Finalizzati Genoma Umano, Ingegneria Genetica, and Biotecnologie e Biostrumentazione. L.B. was a recipient of a fellowship from Istituto Mobiliare Italiano; F.A. and P.M. received fellowships from Clonit SpA.

1. Utermann, G. (1989) Science 246, 904-910.
2. Utermann, G., Kraft, H. G., Menzel, H. J., Hopferwieser, T. & Seitz, C. (1988) Hum. Genet. 78, 41-46.
3. Utermann, G., Duba, C. & Menzel, H. J. (1988) Hum. Genet. 78, 47-50.
4. Brown, M. & Goldstein, J. (1987) Nature (London) 330, 113-114.
5. Scott, J. (1989) Nature (London) 341, 22-23.
6. Kostner, G. M. (1976) in Low Density Lipoprotein, eds. Day, C. E. & Levy, R. S. (Plenum, New York), pp. 229-269.
7. Armstrong, V. W., Cremer, P., Eberle, E., Manke, A., Schulze, F., Wieland, H., Kreuzer, H. & Seidel, D. (1986) Atherosclerosis 62, 249-257.
8. McLean, J. W., Tomlinson, J. E., Kuang, W. J., Eaton, D. L., Chen, E. Y., Fless, G. M., Scanu, A. M. & Lawn, R. M. (1987) Nature (London) 330, 132-139.
9. Lindahl, G., Gersdorf, E., Menzel, H. J., Seed, M., Humphries, S. & Utermann, G. (1990) Hum. Genet. 84, 563-567.
10. Kamboh, M., Ferrell, R. & Kottke, B. (1991) Am. J. Hum. Genet. 49, 1063-1074.
11. Lackner, C., Boerwinkle, E., Leffert, C., Rahmig, T. & Hobbs, H. (1991) J. Clin. Invest. 87, 2153-2161.
12. Albertsen, H. M., Abderrahim, H., Cann, H., Dausset, J., Le Paslier, D. & Cohen, D. (1990) Proc. Natl. Acad. Sci. USA 87, 4256-4260.
13. Little, R., Porta, G., Carle, G., Schlessinger, D. & D'Urso, M. (1989) Proc. Natl. Acad. Sci. USA 86, 1598-1602.
14. Maniatis, R., Fritsch, E. & Sambrook, J. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY), 2nd Ed.
15. Burke, D. T., Carle, G. F. & Olson, M. V. (1987) Science 236, 806-812.
16. Petersen, T. E., Martzen, M. R., Ichinose, A. & Davie, E. W. (1990) J. Biol. Chem. 265, 6104-6111.
17. Rocchi, M., Roncuzzi, L., Santamaria, R., Archidiacono, N., Dente, L. & Romeo, G. (1986) Hum. Genet. 74, 30-33.
18. Frank, S. L., Klisak, I., Sparkes, R. & Lusis, A. J. (1989) Genomics 4, 449-451.
19. Frank, S. L., Klisak, I., Sparkes, R. S., Mohandas, T., Tomlinson, J. E., McLean, J. W., Lawn, R. M. & Lusis, A. J. (1988) Hum. Genet. 79, 352-356.
20. Ichinose, A. (1992) Biochemistry 31, 3113-3118.
