Some basic processes of molecular biology

Upload: hohoansvvn

Post on 06-Jul-2015


Some Issues in Molecular Biology. Võ Thị Hương Lan. Vietnam National University, Hanoi Publishing House, 2007. 181 pp.

Keywords: DNA, gene, genome, chromosome, mitochondria, chloroplast, genomics, recombinant DNA, cDNA library, genomic DNA, PCR, genetic engineering, hybridization methods, protein, protein synthesis, protein transport, cell signals, cell signal transduction, receptor tyrosine kinase, G protein, growth, development, diploid genome, embryo, cell cycle, cell division.

Documents in the electronic library of the University of Science may be used for personal study and research. Any copying or printing for other purposes is strictly prohibited without the consent of the publisher and the author.

Contents

Preface

Chapter 1 DNA AND GENES
1.1 The concept of the gene
1.2 Genome
  1.2.1. Genome of prokaryotic cells
  1.2.2. Genome of eukaryotic cells
1.3 Structure of the chromatin fibre in eukaryotic cells
  1.3.1. Histones in the nucleosome structure
  1.3.2. DNA methylation
1.4 Genes in the eukaryotic genome
  1.4.1. Genes in the same gene family
  1.4.2. Tandemly repeated genes
  1.4.3. Pseudogenes
1.5 Repetitive DNA in the eukaryotic genome
  1.5.1. Satellite DNA and minisatellite DNA
  1.5.2. Mobile DNA segments
1.6 Interaction of T-DNA with the plant genome
1.7 DNA in mitochondria and chloroplasts
  1.7.1. Mitochondrial DNA
  1.7.2. Chloroplast DNA
1.8 Genomics
  1.8.1 Genome comparison
  1.8.2 The human genome
  1.8.3 Plant genomics research

Chapter 2 GENE ACTIVITY IN THE CELL
2.1 Control of gene activity at transcription
  2.1.1 Control of transcription initiation
  2.1.2 Control of transcription termination
  2.1.3 Regulatory proteins
2.2 Post-transcriptional control
  2.2.1 Inhibition of translation related to the structure of the 5'UTR of the mRNA
  2.2.2 Effect of poly(A) tail length on mRNA stability
  2.2.3 mRNA stability
  2.2.4 Antisense RNA
  2.2.5 RNA editing
2.3 Control during and after translation
2.4 Modification of mRNA in eukaryotic cells
  2.4.1 Intron removal and exon joining
  2.4.2 Introns that excise themselves from the mRNA: self-splicing
  2.4.3 Trans-splicing: joining exons from two mRNA molecules
  2.4.4 General structure of the mRNA molecule

Chapter 3 RECOMBINANT DNA TECHNIQUES
3.1 Cutting and separating DNA
3.2 Inserting DNA fragments into vectors
  3.2.1 Vectors used in cloning
  3.2.2 Inserting DNA into a vector
3.3 DNA libraries
  3.3.1 cDNA libraries
  3.3.2 Genomic DNA libraries
3.4 Screening a clone from a DNA library
  3.4.1 General screening methods
  3.4.2 Differential screening
  3.4.3 Chromosome walking
  3.4.4 Chromosome jumping
3.5 Hybridization methods
  3.5.1 Southern blots
  3.5.2 Northern blots
  3.5.3 In situ hybridization
  3.5.4 Hybridization conditions
3.6 RFLP in genome research and gene mapping
3.7 PCR (Polymerase Chain Reaction)
  3.7.1 Factors affecting the PCR
  3.7.2 Some forms of PCR
3.8 Genetic engineering
  3.8.1 Studying the role of regulatory DNA and the function of genes or proteins
  3.8.2 Gene replacement and mutagenesis
  3.8.3 Loss or gain of gene function
  3.8.4 Reporter genes
  3.8.5 Modifying the plant genome

Chapter 4 PROTEIN SYNTHESIS AND TRANSPORT
4.1 Role of transfer RNA (tRNA) in protein synthesis
4.2 Protein synthesis: the ribosome machinery
4.3 Protein transport
  4.3.1 Transport into the endoplasmic reticulum
  4.3.2 Transport of membrane proteins
4.4 Post-translational modification and protein quality control in the ER lumen
  4.4.1 Disulfide (S-S) bond formation and folding in the ER lumen
  4.4.2 Formation of multimers from peptide chains
  4.4.3 Protein glycosylation
4.5 Transport from the endoplasmic reticulum to the Golgi and lysosomes
4.6 Transport from the Golgi to the cell surface: the secretory pathway (exocytosis)

Chapter 5 CELL SIGNAL TRANSDUCTION
5.1 Cell-surface receptors
5.2 G protein-coupled receptors
  5.2.1 G proteins
  5.2.2 Activation or inhibition of cAMPase through G proteins
5.3 cAMP-dependent protein kinase (cAPK or kinase A)
5.4 Receptor tyrosine kinases and Ras proteins
  5.4.1 Receptor tyrosine kinases (RTKs)
  5.4.2 Ras proteins and the signalling cascade activated by receptor tyrosine kinases
5.5 The second messenger Ca2+ in the signalling chain
  5.5.1 Inositol phospholipids
  5.5.2 Inositol triphosphate (IP3) and the transport of Ca2+ out of the ER
  5.5.3 Calmodulin: a protein that forms a complex with Ca2+ in the cell
5.6 Amplification of extracellular signals
5.7 Signalling through enzyme-linked receptors on the cell surface
  5.7.1 Guanylyl cyclase receptors
  5.7.2 Oncogenes and signals transmitted from receptor tyrosine kinases
  5.7.3 MAP kinase proteins
5.8 Tyrosine kinase-associated receptors. Receptor tyrosine phosphatases

Chapter 6 THE CELL CYCLE AND CELL DIVISION
6.1 Basic features of the cell cycle
6.2 The cell cycle in early embryonic development
6.3 Cyclin proteins
6.4 Yeast and the cell cycle control system
6.5 Control of cell division in animals
6.6 Role of tubulin microtubules in cell division

Chapter 7 GROWTH AND DEVELOPMENT
7.1 Control of sex determination
7.2 Development of the fruit fly Drosophila
7.3 Activity of maternal genes in forming the anterior-posterior and dorsal-ventral axes
  7.3.1. Genes determining development of the head and thorax of the larva (anterior-group genes)
  7.3.2. Genes determining development of the tail region (posterior-group genes)
  7.3.3. Genes determining development of the dorsal-ventral axis (dorsoventral-group genes)
  7.3.4. Genes determining development of the terminal structures of the larva (terminal-group genes)
7.4 Activity of genes in the diploid (embryonic) genome
  7.3.5. Gap genes
  7.3.6. Pair-rule genes
  7.3.7. Segment polarity genes
7.5 Selector genes


Preface

Wishing to share with readers an interest in molecular biology, a field now being studied and researched in Vietnam, we are publishing the book "Some Basic Issues in Molecular Biology". It introduces the important processes that take place in the cell (presented in Chapters 1, 2, 4, 5, 6 and 7) and some of the basic techniques used to study those processes (Chapter 3).

Studying these processes at the molecular level sheds some light on the similarities and differences between prokaryotic and eukaryotic cells in the structure of the genome and of a single gene (Chapter 1). Those structures are tied to the ways in which gene activity is controlled at the stages of transcription, post-transcription and translation into protein (Chapter 2). Protein synthesis, modifications of protein structure, and the mechanisms that recognise proteins and transport them specifically to different target sites in the cell, or secrete them to the outside, are introduced in Chapter 4. In addition, the function and activity of the proteins involved in signal transduction are presented in Chapter 5, those of the proteins involved in the cell cycle in Chapter 6, and the proteins that control differentiation, development, growth and body formation in Chapter 7.

For the opportunity to acquire specialised knowledge in molecular biology, I express my deep gratitude to the teachers of the Faculty of Biology, Hanoi University (now the University of Science, Vietnam National University, Hanoi). I also sincerely thank Associate Professor Trương Nam Hải and Professor Nguyễn Mộng Hùng for their valuable comments and suggestions on the book.

As this is the first edition, the book certainly has shortcomings, and I very much welcome criticism and suggestions from readers and colleagues.

With sincere thanks!
The author


Chapter 1 DNA AND GENES

1.1 The concept of the gene

Over a long period, the concepts and definitions of the gene took shape gradually from experimental results, first of all from classical genetics experiments. Starting from crosses between pea plants with different traits and following how those traits were inherited, Mendel concluded that each trait is determined by the alleles of one gene. A gene can have many alleles. The degree to which a trait is expressed depends on the combination of two alleles. In the simplest case a gene has two alleles (A and a), and the trait can then appear at three levels: dominant (AA), semi-dominant (Aa) or recessive (aa).

Next, in a series of experiments on the fruit fly Drosophila, Morgan and colleagues found that some traits are determined not by the alleles of one gene but by several genes. More importantly, using the frequency of crossing-over between two homologous chromosomes during meiosis, Morgan was able to build a genetic map that locates genes on a chromosome. The closer two genes are, the lower the frequency of crossing-over between them. In practice, a genetic map gives the positions of genes associated with traits or mutations, with the distance between them calculated from the crossing-over frequency and expressed in centiMorgans (cM). However, crossing-over does not occur equally at every position along the chromosome, so distances on the genetic map are not always proportional to the physical distances between positions.

Before 1940, genes on a chromosome were pictured as beads on a string. Crossing-over was thought to occur only between genes, never within a gene. From experimental results, geneticists therefore proposed three properties that define a gene:

1. A gene determines an observable trait and occupies one position on the chromosome.
2. A gene is the smallest unit of heredity that can mutate.
3. A gene is the smallest unit of heredity within which crossing-over cannot occur; crossing-over takes place between corresponding genes.

It follows from these properties that two distinguishable traits must be determined by at least two different genes. This early concept of the gene therefore only allowed a relationship of the type one mutation - one trait - one gene to be established. In practice, locating a gene by measuring crossing-over frequency is very laborious, because mutant individuals have to be screened from very large numbers of progeny across different crosses.
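The link between crossing-over frequency and map distance described above can be illustrated with a small calculation. This is only a sketch: it uses the usual convention that 1 cM corresponds to 1% recombinant progeny, which holds approximately for closely linked genes, and the progeny counts are hypothetical.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Estimate genetic map distance in centiMorgans (cM).

    Assumes 1 cM = 1% recombinant progeny, a convention that is only
    a good approximation when the two genes are close together.
    """
    if total <= 0 or not 0 <= recombinant <= total:
        raise ValueError("need 0 <= recombinant <= total and total > 0")
    return 100.0 * recombinant / total


# Hypothetical cross: 170 recombinant flies among 1000 progeny scored.
distance = map_distance_cm(170, 1000)
print(f"Estimated distance: {distance:.1f} cM")
```

Because crossing-over is not uniform along the chromosome, a value obtained this way is a genetic distance, not a count of nucleotides.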
On the other hand, crossing-over analysis places genes on the genetic map but says nothing about their individual functions. This shortcoming is overcome by complementation tests. For example, when haploid mutant cells of the fungus Neurospora that share one phenotype, the inability to grow on medium lacking histidine, were combined, geneticists obtained some diploid cells that had regained the ability to grow on histidine-free medium. The results of crosses between the mutant strains showed that this trait involves two different genes in the histidine biosynthesis pathway. Crossing-over frequencies place these genes at different points on the genetic map. Complementation tests thus make it possible to tell apart the individual genes within a group of genes that determine the same trait.

Later studies on the fruit fly by the geneticists Clarence P. Oliver and Melvin M. Green found that crossing-over can occur within a gene. In other words, one gene can carry many different mutations. The geneticist Seymour Benzer identified 199 mutation sites in the rIIA gene of bacteriophage T4. In particular, once the structure of DNA had been discovered, Charles Yanofsky and colleagues, studying the gene encoding the tryptophan-synthesising enzyme in E. coli, gave the first clear evidence that crossing-over occurs between the nucleotides of a gene.

These key results refined the concept of the gene. A gene was now regarded as a stretch of nucleotides carrying the genetic code for the amino acids of a peptide chain. Starting from the original picture of each gene as a bead on a string (the string being a chromosome of the genome), with crossing-over and mutation occurring only between beads, geneticists had arrived at a linear relationship between the triplet codons of a gene and the order of amino acids in the polypeptide chain.

Figure 1.1: Two proteins encoded by a single DNA segment, either because transcription starts (or ends) at different positions on that segment, producing different mRNA strands (A), or because translation starts at different positions on one mRNA strand (B).

However, the concept of the gene described above cannot account for the following phenomena:

a/ Overlapping genes: on a DNA segment two genes do not lie one after the other; instead one gene overlaps the other. The DNA region where the two genes overlap therefore carries the genetic code for both. The following cases can occur:

* Two mRNA molecules are transcribed from different start or end positions on one DNA segment. As a result, the two proteins (translated from the two mRNAs) contain an identical stretch of amino acids even though they have different functions in the cell (Figure 1.1A).

* One mRNA molecule transcribed from a DNA segment can serve as the template for two different polypeptide chains because the translation start points (start codons) are offset from each other (a shift in reading frame). The two proteins can differ completely in amino acid sequence and in function in the cell (Figure 1.1B).

b/ One DNA segment carries the genetic code of two genes and is therefore transcribed into two different kinds of mRNA. This happens when the genetic code is laid out in different reading frames on the same DNA segment (Figure 1.2). As a result, two proteins with completely different amino acid composition and function are synthesised. A mutation at one position in this DNA segment can affect one or both genes, which makes it difficult to construct a map of traits.

Figure 1.2: Two genes encoding two proteins lie on the same DNA segment because their genetic codes are laid out in different reading frames.

c/ In eukaryotes, a gene usually consists of coding stretches of nucleotides (exons) interspersed with non-coding stretches (introns). Both exons and introns are transcribed into an RNA molecule (called the precursor of messenger RNA, or pre-mRNA). The introns are then cut out and the exons are joined in the correct order to give the mature mRNA. Either only some introns or all introns may be removed from the mRNA. Likewise, either all exons or only some exons may be joined together. The choice of which introns to cut produces different mRNA molecules even though they all derive from one kind of pre-mRNA transcribed from one DNA template (Figure 1.3). This phenomenon is called alternative splicing.


Figure 1.3: Selecting and cutting out introns (I) and joining exons (E) in different combinations produces mRNA molecules that share only some exons (E1 and E3). They encode two polypeptide chains with different functions in the cell.

d/ Genes encoding polyproteins: a polyprotein is the first product of translation from an mRNA molecule, but this protein is subsequently cut into smaller peptides. The polyprotein itself is inactive; only the peptides have their various functions. For example, adrenocorticotropic hormone (ACTH), the lipotropic hormones (LPHs), the melanocyte-stimulating hormones (MSHs) and enkephalin are produced from one initial proopiomelanocortin molecule (Figure 1.4). In practice, then, one DNA segment is transcribed into one kind of mRNA, yet several kinds of protein are produced.

e/ Some genes do not carry genetic information for a protein: ribosomal RNA (rRNA) and transfer RNA (tRNA) molecules are clearly transcribed from DNA, but they are not translated. In addition, the nucleus of eukaryotic cells contains small RNA molecules (small nuclear RNA, snRNA) that carry out many different functions, such as taking part in mRNA processing (cutting out introns and joining exons), checking the genetic information on the mRNA (the mRNA editing mechanism), affecting the stability of mRNA in the cytoplasm, or taking part in gene silencing (RNAi, interference RNA). The DNA segments encoding these kinds of RNA must therefore be defined as genes, because mutations in them can be associated with the appearance of new traits.
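The reading-frame cases in items a/ and b/ above can be made concrete by translating one short DNA sequence in each of its three forward frames. The sketch below assumes the standard genetic code and uses a made-up nine-nucleotide sequence; the function name `translate` is illustrative only.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate the coding strand starting at offset `frame` (0, 1 or 2).

    Stop codons are shown as '*'; trailing bases that do not fill a
    complete codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Hypothetical sequence: the same nucleotides read in three frames.
sequence = "ATGGCCTGA"
for frame in range(3):
    print(frame, translate(sequence, frame))
```

Shifting the starting point by one or two nucleotides changes every codon downstream, which is why two overlapping genes read in different frames can encode unrelated amino acid sequences.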

Figure 1.4: The proopiomelanocortin molecule is cleaved to produce the active hormones MSH, LPH, CLIP and β-endorphin.
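Alternative splicing, described in item c/ above, amounts to choosing a subset of exons from one pre-mRNA and joining them in their original order. The following sketch models that with short made-up exon sequences; the exon names follow the E1, E2, ... labelling of Figure 1.3, but the sequences and the fourth exon are assumptions for illustration.

```python
def splice(exons: dict[str, str], keep: set[str]) -> str:
    """Join the selected exons in their original 5'->3' order.

    `exons` maps exon names to sequences in transcript order
    (dicts preserve insertion order); `keep` names the exons retained.
    """
    unknown = keep - exons.keys()
    if unknown:
        raise KeyError(f"unknown exons: {sorted(unknown)}")
    return "".join(seq for name, seq in exons.items() if name in keep)


# Hypothetical pre-mRNA with four exons (introns already omitted).
pre_mrna_exons = {
    "E1": "AUGGCU",
    "E2": "GAUCCA",
    "E3": "UUCGGA",
    "E4": "AAGUAA",
}

isoform_a = splice(pre_mrna_exons, {"E1", "E2", "E3"})
isoform_b = splice(pre_mrna_exons, {"E1", "E3", "E4"})
print(isoform_a)
print(isoform_b)
```

Both isoforms come from the same template and share E1 and E3, yet they differ in the remaining exon, so they can encode polypeptides with different functions.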

From these concepts of the gene, formed and revised step by step to fit experimental results, molecular biology today defines a gene as follows: a gene is a segment of DNA required for the synthesis of an active polypeptide or of an RNA molecule needed for the cell's activity. A gene therefore comprises not only the coding region but also the DNA segments (regulatory elements) required for transcription (Figure 1.5). On the other hand, there are DNA segments whose structure or nucleotide sequence closely resembles a gene but which are not transcribed or show no function, so they cannot be regarded as genes.

Figure 1.5: Structure of a protein-coding gene in a eukaryotic cell. The first nucleotide transcribed into RNA is designated +1. The nucleotide immediately before position +1 is designated -1 (there is no position 0). Nucleotides upstream of position +1 belong to the promoter region. Introns are interspersed among the exons and are removed from the mRNA by the splicing reaction. The direction of transcription is indicated by the arrow.

1.2 Genome

The genome contains all the genetic information programmed to sustain the life of the cell. Most bacterial genomes lie on a single small, closed circular chromosome. By contrast, the part of the genome in the nucleus of a eukaryotic cell is usually very large and is distributed over linear chromosomes. Genetic information resides not only in the nucleotide sequence (genetic information) but also depends heavily on the spatial configuration of the chromosome (epigenetic information). The nucleotide sequence of the whole genome has been determined for a number of model organisms representing each kingdom, such as the bacterium E. coli, yeast, the fruit fly, the nematode, Arabidopsis and humans.

A map on which the distance between positions is measured in nucleotides is regarded as the most accurate. Such a map is called a physical map. There are other kinds of map as well. A genetic map, for example, shows the relative positions of groups of genes or of markers on a chromosome. These markers can be morphological (expression of a trait), protein polymorphisms, restriction fragment length polymorphisms (RFLPs), simple sequence length polymorphisms (SSLPs) and randomly amplified polymorphic DNA (RAPD). Distances between positions on a genetic map are given in cM (centiMorgans), based on crossing-over frequency. The closer two positions are, the less likely a crossover between them during meiosis. However, crossing-over does not occur equally at every position on the chromosome, so the centiMorgan does not accurately reflect the distance between positions. Combining the physical map with the genetic map gives the exact distances between genes (traits) and between the molecular markers linked to the traits under study.

A genome is not simply a collection of genes. The genomes of bacteria and lower eukaryotes are usually small, with the genes packed closely together. Most of these genes have only one copy in the genome and are rarely interrupted by non-coding DNA (introns). In contrast, the DNA that contains genes makes up only a very small fraction of the whole genome in cells of higher eukaryotes. Genes in higher eukaryotic cells usually contain many introns and lie far apart. Since the 1970s, saturation mutagenesis experiments have allowed geneticists to determine the number of genes on a given stretch of chromosome.


Today, modern DNA analysis techniques (Southern and northern hybridization, microarrays and so on) make it possible to determine the number of genes active in a cell. For example, a yeast cell (a lower eukaryote) has about 4,000 active genes, and a mammalian cell about 10,000-15,000. If the average gene is about 10,000 bp long, the total length of the genes active in one cell amounts to only a few percent of the genome. In other words, only a very small part of the genome carries the genetic information needed for the life of the cell.

Comparing genome sizes of species close to each other on the evolutionary ladder (that is, of similar complexity) and of species far apart (of different complexity) shows that genome size is not always proportional to the complexity of the species. For example, the human genome is about 3.3 x 10^9 bp, while amphibian genomes are of similar length, about 3.1 x 10^9 bp, and plant genomes can reach 10^11 bp. Can an amphibian really be as complex as the human body? Moreover, genome size is inconsistent even within a single group of organisms. The housefly (Musca domestica) has a genome of about 8.6 x 10^8 bp, six times the size of the fruit fly (D. melanogaster) genome at about 1.4 x 10^8 bp. Genome size among amphibians ranges from 10^9 bp to 10^11 bp, a 100-fold difference. Why does genome size vary so much within one group?

Initial comparisons of genomes between species point to three striking features. First, genes are distributed in the genome without any regular pattern. Second, genome size does not vary in proportion to the complexity of the species. Finally, chromosome number also differs widely, even between very closely related species. When a particular gene is analysed in detail, the positions of its introns and exons, the DNA segments that regulate its activity and so on are all important elements for comparison when working out the relationships between species. In addition, the total number of genes, the number of genes with multiple copies in the genome, the proportions and composition of the kinds of repetitive DNA, and the movement of genes from the separate genomes of organelles (mitochondria, chloroplasts) into the nuclear genome are all shaped by time and all reflect the evolution of species. For a more accurate and complete basis of comparison, the structure of the chromatin fibre and the three-dimensional configuration of the chromosomes and of the whole genome in the nucleus also need to be taken into account.
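The estimate of how much of the genome is taken up by active genes, and the genome size ratios quoted above, can be checked with a few lines of arithmetic. This is a rough sketch using the round figures from the text (10,000-15,000 active genes, 10,000 bp per gene); taking the 3.3 x 10^9 bp human genome as the mammalian genome size is an assumption made here for illustration.

```python
MAMMAL_GENOME_BP = 3.3e9   # assumed: human genome size quoted in the text
AVG_GENE_BP = 10_000       # average gene length assumed in the text


def active_fraction(n_genes: int,
                    avg_gene_bp: float = AVG_GENE_BP,
                    genome_bp: float = MAMMAL_GENOME_BP) -> float:
    """Fraction of the genome covered by `n_genes` genes of average length."""
    return n_genes * avg_gene_bp / genome_bp


def fold_difference(larger_bp: float, smaller_bp: float) -> float:
    """How many times larger one genome is than another."""
    return larger_bp / smaller_bp


for n in (10_000, 15_000):
    print(f"{n} active genes -> {active_fraction(n):.1%} of the genome")

print(f"Housefly vs fruit fly: {fold_difference(8.6e8, 1.4e8):.1f}x")
print(f"Amphibian range: {fold_difference(1e11, 1e9):.0f}x")
```

With these round numbers the active genes come out at a few percent of the genome, the same order of magnitude as the estimate in the text, and the housefly genome at roughly six times that of the fruit fly.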

1.2.1. Genome of prokaryotic cells

Because prokaryotic genomes are small, the number of bacterial genomes that have been sequenced keeps growing. As a result, data on the structure of prokaryotic genomes, the distribution of genes, the way their activity is controlled and their functions are becoming ever richer and clearer. Prokaryotic genomes are far smaller than eukaryotic genomes.

Besides the chromosome, which holds most of the genetic information, prokaryotic cells carry many kinds of plasmid. Plasmids used to be regarded as circular DNA molecules carrying non-essential genes. For example, plasmids often carry genes for antibiotic resistance, and the cell can survive without these genes. The concept of the plasmid was broadened, however, when experiments found that some prokaryotic cells contain small, linear DNA molecules carrying genes similar to those on circular plasmids. Plasmids are therefore understood as small DNA segments carrying genes that are not decisive for the survival of the cell. Furthermore, one kind of plasmid is sometimes found in different prokaryotic species, and plasmids can be transferred by transformation from one prokaryotic species to another. So although plasmids contain genes, they tend not to be regarded as part of the genome.


Most prokaryotic genomes are smaller than 5 Mb (5,000,000 nucleotides) and usually lie on one circular chromosome. A few prokaryotic cells have a genome consisting of a linear DNA molecule. Notably, some prokaryotic genomes are RNA molecules or a combination of both DNA and RNA. A prokaryotic genome can also comprise genes distributed on linear DNA segments or on both linear and circular DNA molecules.

For example, a linear chromosome was first discovered in Borrelia burgdorferi in 1989. This chromosome is 910 kb long and carries 853 genes. In addition, the Borrelia burgdorferi cell has as many as 17 circular and linear plasmids with a total length of 533 kb and 430 genes. Most genes on the plasmids are non-essential; only a few are needed for the synthesis of purines and cell membrane proteins. Of the 17 plasmids, the few that carry these genes are therefore regarded as part of the genome of Borrelia burgdorferi. The experimental data on the genome and plasmids of Borrelia burgdorferi provoked debate among biologists when its genome was compared with that of Treponema pallidum. Taxonomically, these two bacteria are closely related. Like most other prokaryotes, the second species has a genome consisting of one circular DNA molecule, 1138 kb in size with 1041 genes. Interestingly, not a single gene in Treponema pallidum corresponds to the genes on the plasmids of the first species. Could the plasmids have been acquired only recently by Borrelia burgdorferi through natural transformation?

The prokaryotic genome is not packaged into nucleosomes (as the eukaryotic genome is) and is not enclosed by a nuclear membrane. The circular chromosome has a spatial structure resembling the petals of a flower, each petal being a DNA segment in a supercoiled structure. The petals are not all alike and are anchored to a protein core. A bacterial genome has about 40-50 petals. This flower-like structure is called the nucleoid (Figure 1.6). The nucleoid structure allows the genome to occupy only a very small volume in the cell. This spatial structure of the chromosome is also maintained by small RNA molecules interacting with proteins. As a result, when a break occurs, the supercoiled structure of the chromosome opens only locally, in the damaged petal, and not across the whole genome.

Two enzymes, DNA gyrase and DNA topoisomerase, play the main role, working together with another protein complex to package bacterial DNA. Experiments have found at least four proteins in this complex, among them the HU protein, whose function is similar to that of histones in eukaryotic cells. Although its structure is very different from that of histones, HU in tetramer form makes up a core around which a DNA segment of about 60 bp is wound. HU thus resembles histones in strictly determining the spatial structure of the chromosome fibre. It has not yet been established, however, whether these cores are evenly distributed or concentrated at the "pistil" of the nucleoid.
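The sizes and gene counts quoted above for Borrelia burgdorferi and Treponema pallidum can be turned into gene densities for a quick comparison. The sketch below uses only the figures given in the text; the dictionary layout and function name are illustrative.

```python
# (size in kb, number of genes), as quoted in the text
replicons = {
    "B. burgdorferi linear chromosome": (910, 853),
    "B. burgdorferi plasmids (17 in total)": (533, 430),
    "T. pallidum circular chromosome": (1138, 1041),
}


def genes_per_kb(size_kb: float, n_genes: int) -> float:
    """Average number of genes per kilobase of DNA."""
    return n_genes / size_kb


for name, (size_kb, n_genes) in replicons.items():
    density = genes_per_kb(size_kb, n_genes)
    bp_per_gene = 1000 * size_kb / n_genes
    print(f"{name}: {density:.2f} genes/kb, about {bp_per_gene:.0f} bp per gene")
```

On these figures both chromosomes carry a little under one gene per kilobase, and the plasmids are only slightly less dense, which illustrates how tightly genes are packed in prokaryotic DNA.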

Figure 1.6: Model of the nucleoid structure in E. coli, consisting of 40-50 supercoiled loops attached to a protein core and RNA strands. When a break occurs in one supercoiled loop, the chromosome unwinds only locally, in that loop (after Snustad and Simmons, 2000).


In 1995 the genome of Haemophilus influenzae became the first genome to be sequenced in full. By 1998 more than 18 other bacterial genomes had been read completely. Among them, Mycoplasma genitalium has the smallest genome, at 580,070 bp, and Mycobacterium tuberculosis has a large one, at 4,411,529 bp. E. coli is the most thoroughly studied bacterium and serves as the model organism of genetics, biochemistry and molecular biology. Its genome consists of 4,639,221 bp and holds 4,288 sequences with the structural features of protein-coding genes (putative protein coding sequences). One third of these sequences have been identified as genes, while the function of 38% is still unknown.

Nucleotide sequences that are presumed to be genes, but whose protein product is not yet known, are collectively called open reading frames (ORFs). An open reading frame usually begins with the codon for methionine (the start codon) and ends with one of the three codons that stop protein synthesis (stop codons).

Although prokaryotic genomes are much smaller than eukaryotic genomes, their gene density is higher and they contain fewer non-coding DNA segments. In other words, the distance between genes is shorter. In E. coli, for example, the average distance between two genes is 118 bp. Genes and ORFs make up 87.8% of the genome, genes coding for RNAs make up 0.8%, and repetitive DNA that contains no genes accounts for only about 0.7%. In addition, most prokaryotic genes are present as a single copy (single-copy genes) and contain no introns.

Comparing the nucleotide sequence of the whole E. coli genome with the DNA sequences stored in databases revealed 6 new tRNA genes, 12 genes involved in the biosynthesis and assembly of flagella, and 2 genes coding for enzymes of the pathway that degrades cyclic organic compounds. Comparing the genome sequence of E. coli with those of other prokaryotes is therefore particularly useful for identifying new genes and their functions. Moreover, comparing the number of genes that take part in a given biological process in different bacteria allows the minimum number of genes needed for that process to be estimated. For example, metabolism involves about 243 genes in E. coli and 112 genes in Haemophilus influenzae, but only 31 genes in Mycoplasma genitalium.

Likewise, comparing the genes found in the smallest genomes, such as those of M. genitalium and M. pneumoniae, allows the minimum number of genes needed to sustain life in the simplest organism to be estimated. M. genitalium has 470 genes and M. pneumoniae has 679. Comparing the genes of these two bacteria and their functions gives an estimate of 256 genes as the minimum required for life. However, experiments that used molecular genetic techniques to mutate individual genes precisely, and that gradually increased the number of mutated genes, showed that at least 300 genes are needed to keep the simplest microorganism alive.

Finally, comparing the genes that are shared and not shared between evolutionarily close bacteria is especially useful for identifying species-specific genes, that is, marker genes that distinguish one species from another. For example, of the 470 genes of M. genitalium, 350 are also present in Bacillus subtilis, so only 120 genes account for the difference between the two species. Nevertheless, studies of how these specific genes operate, of the function of each protein they encode and of the biochemical processes in which they take part have not yet led to clear conclusions about the role of the 120 genes specific to M. genitalium.
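The definition of an open reading frame given above (a start codon for methionine followed, in the same reading frame, by one of three stop codons) can be illustrated with a minimal sketch. The function name `find_orfs`, the `min_codons` threshold and the toy sequence are assumptions made for this illustration only; the sketch scans a single strand and uses the stop codons of the standard genetic code, whereas real gene prediction also examines the complementary strand and applies further criteria.

```python
START_CODON = "ATG"                   # codes for methionine
STOP_CODONS = {"TAA", "TAG", "TGA"}   # the three stop codons (standard code)


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each complete ORF on one strand.

    Coordinates are 0-based and half-open; the stop codon is included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three reading frames
        pos = frame
        while pos + 3 <= len(dna):
            if dna[pos:pos + 3] != START_CODON:
                pos += 3
                continue
            end = None
            # Walk codon by codon until the first in-frame stop codon.
            for j in range(pos + 3, len(dna) - 2, 3):
                if dna[j:j + 3] in STOP_CODONS:
                    end = j + 3
                    break
            if end is None:
                break                           # no stop codon left in this frame
            if (end - pos) // 3 >= min_codons:
                orfs.append((pos, end, dna[pos:end]))
            pos = end                           # resume after the stop codon
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))
```

For the toy sequence the only ORF lies in the third reading frame and spans the start codon, two sense codons and the stop codon.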

1.2.2. Genome of eukaryotic cells

The genome of a eukaryotic cell comprises the chromosomes located in the nucleus and the DNA found in certain organelles, such as chloroplasts and mitochondria. Because most of the DNA, and most of the genes, are concentrated in the nucleus, nuclear DNA (the chromosomes) has received the most attention from biologists. Chromosomes are linear DNA molecules bound to proteins.

There is no fixed relationship among three biological parameters: chromosome number, genome size and the complexity of the species. For example, the yeast S. cerevisiae is regarded as a lower eukaryote, yet it has four times as many chromosomes as the fruit fly D. melanogaster. The salamander genome is 30 times larger than the human genome but has only half as many chromosomes. Furthermore, some chromosomes are very small (minichromosomes) but have a very high gene density. The chicken genome, for example, consists of 39 chromosomes: 6 ordinary chromosomes account for 66% of the DNA but carry only 25% of the genes, while the remaining thirty-three are all minichromosomes that account for one third of the DNA and carry as much as 75% of the genes. These striking comparisons point to a relationship between evolution and genome structure in different organisms that biology cannot yet explain.

The size of the nuclear genome in eukaryotes ranges from 12 Mb (the yeast S. cerevisiae) to 120,000 Mb (the plant F. assyriaca). The genome consists of non-repetitive DNA and repetitive DNA. Most genes lie in the non-repetitive fraction, and their number increases with the complexity of the species. It is important to note, however, that complexity depends not only on the number of genes but also on the repetitive DNA fraction. Genome size is therefore not always proportional to the complexity of the species. For example, the human genome is on the order of 10^9 bp, whereas the genomes of some amphibians and plants can reach 10^11 bp. The human genome, completely sequenced in 2001, consists of the DNA components shown in Figure 1.7.

Figure 1.7: Types of DNA in the human genome (after Brown, 2001).
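The chicken genome figures quoted earlier in this section can be turned into a relative gene density with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name `relative_gene_density` is hypothetical, and the shares are the rounded percentages given in the text.

```python
def relative_gene_density(gene_share, dna_share):
    """Share of the genes divided by share of the DNA (1.0 = genome average)."""
    return gene_share / dna_share


# 6 ordinary chromosomes: 66% of the DNA, 25% of the genes
macro = relative_gene_density(0.25, 0.66)
# 33 minichromosomes: one third of the DNA, 75% of the genes
mini = relative_gene_density(0.75, 1 / 3)

print(round(macro, 2), round(mini, 2), round(mini / macro, 1))
```

Under these figures, a stretch of minichromosome DNA carries roughly six times as many genes as a stretch of the same length on the ordinary chromosomes.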

1.3 Structure of the chromatin fiber in eukaryotic cells

In the nucleus of eukaryotic cells, DNA is bound to proteins to form a structure called chromatin (the chromatin fiber). These proteins fall into two main groups: histones and non-histones. The non-histone composition varies between tissues, organs and species. Each kind of non-histone protein makes up only a very small amount compared with the total non-histone protein or with any kind of histone. Non-histone proteins nevertheless play a very important role in determining the specific spatial structure of each chromosome region. The activity of many genes, particularly genes involved in embryonic development, depends not only on the nucleotide sequence but also on chromosome structure. The genetic information held in the spatial structure of the chromosome is called epigenetic information.

Using solutions of low ionic strength, chromatin can be isolated from the nucleus as single fibers about 30 nm in diameter, made of small particles that look like beads on a string (each bead about 10 nm in diameter). These particles are called nucleosomes. They are not evenly distributed along the chromatin fiber. When the DNA is cut by a nuclease, the nucleosomes separate from one another. Each nucleosome consists of a 146 bp stretch of DNA wound twice around a protein core that contains 8 subunits of 4 kinds of histone: H2A, H2B, H3 and H4. The NH2-terminal part of each histone does not lie inside the nucleosome structure but remains free. The DNA between two nucleosomes is called linker DNA and is about 50-70 bp long. Histone H1 binds to the linker DNA among 6 nucleosomes and packs them into a special structure called a solenoid (resembling a six-petalled flower). Solenoids stack on one another to form a coiled fiber. This structure greatly reduces the volume that DNA occupies in the nucleus.

DNA wound around the histone core in the nucleosome and packed into solenoids is very highly ordered. An interesting question is how these structures change when a stretch of DNA is used for transcription (to make mRNA) or is repaired after damage. Are they temporarily disrupted, as they are during DNA replication? Are the histones released, or do they stay bound to the DNA? Studies addressing these questions have given differing results, which show that transcription does not necessarily require the chromatin structure to be broken down. Changes in protein-DNA, histone-histone and histone-non-histone interactions nevertheless certainly occur. Of particular importance, adding or removing functional groups on the individual proteins that hold the nucleosome together changes the nucleosome structure. The histone core may not dissociate into its subunits but may instead be shifted locally along the chromatin fiber by chromatin remodelling proteins. These proteins help to release the DNA locally from the nucleosome without affecting other regions.

DNA released from the nucleosome is not left in a free state, because free DNA would be readily destroyed by various agents in the cell, especially nucleases. Instead, the DNA is usually bound to specific proteins (factors) needed for transcription, or to the complexes needed for DNA repair. Experiments show that when transcription factors interact with DNA they can prevent histones from binding to it. This may explain why nuclease-resistant regions are found interspersed with nuclease-sensitive sites within a gene: when a protein is bound to DNA, the DNA is protected from enzymatic cleavage.
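The nucleosome dimensions given in this section (146 bp of core DNA plus 50-70 bp of linker DNA) allow a rough estimate of how many nucleosomes a stretch of DNA can carry. The sketch below is illustrative only: the function name `nucleosome_count` and the one-megabase example length are assumptions, and nucleosomes are treated as evenly spaced even though the text notes that they are not.

```python
CORE_DNA_BP = 146  # DNA wound around one histone core


def nucleosome_count(dna_length_bp, linker_bp):
    """Whole nucleosome repeats (core + linker) that fit in the given DNA length."""
    return dna_length_bp // (CORE_DNA_BP + linker_bp)


# One megabase of DNA with the shortest and the longest linker quoted in the text
print(nucleosome_count(1_000_000, 50), nucleosome_count(1_000_000, 70))
```

On these assumptions, one megabase of DNA would be organized into somewhere between about 4,600 and 5,100 nucleosomes.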

1.3.1. Histones in the nucleosome structure

Each nucleosome has 146 bp of DNA wound twice around a histone core of 8 subunits, 2x[H3-H4] and 2x[H2A-H2B]. The core histones have similar structures, consisting of an NH2-terminal peptide segment (the N-terminal domain), a common domain that allows the histone to fold, and a COOH-terminal part (the C-terminal domain). The folding domain is also involved in interactions between histones and between histones and DNA. Nucleosomes are joined together by histone H1. This protein links the histone core to the linker DNA that lies between nucleosomes. The length of the linker DNA is not constant. Histone H1 establishes the highly ordered structure of the chromatin fiber.

All core histones have their NH2-terminal part outside the core, extending freely in different directions (Figure 3-GT). The length of this free segment ranges from 16 to 44 amino acids (H3: 44; H2B: 32; H4: 26; H2A: 16 amino acids). These segments play an important role in the condensation of the chromatin fiber. Kinetic studies of the conformational changes of chromatin show that the fiber can exist in three states: unfolded, moderately folded and extensively folded. The free segments of H3 and H4 are required for the fiber to reach the moderately folded state. The free segment of H3 in particular cannot be replaced: its length and the position at which it leaves the core are required for the spatial structure of the chromatin fiber to form. Thus, together with histone H1, the free segments of all four core histones are needed to maintain the three-dimensional structure of the nucleosome, to maintain the extensively folded state of the chromatin fiber, and for interactions between chromatin fibers. They are also sites of interaction with non-histone proteins.

After synthesis, all four kinds of histone undergo modifications such as ubiquitination, phosphorylation and glycosylation, and, most significantly, methylation and acetylation. Most of these modifications occur in the N-terminal region. Phosphorylation and methylation can influence each other and affect the condensation of the chromosomes as the cell enters mitosis. Ubiquitination is the exception: it occurs on the C-terminal tail of the histone and helps the nucleosome structure to be disrupted temporarily during replication or RNA synthesis. The chemical modifications of histones thus affect the spatial structure of the nucleosome and the activity of genes, above all transcription.

The core histones are acetylated at specific lysine residues in the N-terminal part. With the exception of histone H2A, the core histones usually have 4 to 5 sites that can carry an acetyl group. A single nucleosome can have up to 26 acetylated sites.
Histone acetylation plays an important role in determining the structure of the chromatin fiber. Through acetylation the fiber decondenses, DNA is released from the nucleosome, and the interactions between nucleosomes are disrupted, which alters the binding of the N-terminal domains of histones to non-histone proteins and the binding of proteins to DNA. These changes help to activate RNA synthesis. Acetylation of only 46% of the 26 specific sites is enough to disrupt the ordered structure of the chromatin fiber and to enhance transcription of genes. In general, histone acetylation and deacetylation are associated with the activation and repression of gene activity, respectively. Each kind of histone can be acetylated at specific sites by separate enzymes, which produces different effects on gene expression.

Acetylation also changes the structure of the chromatin remodelling complexes. These complexes temporarily disrupt the histone core or shift nucleosomes along the chromatin fiber. They usually interact with the N-terminal region of the histones. When this region carries acetyl groups, the remodelling complex can displace the H2A-H2B histones from the nucleosome core. As a result, promoters are exposed and mRNA synthesis can begin.

The kinetics of acetylation and deacetylation are highly dynamic and complex and depend on the activity of the enzymes involved. The reversible conversion between the acetylated and deacetylated forms of a histone depends on two kinds of enzyme, histone acetyltransferase (HAT) and histone deacetylase (HDAC), together with coactivator proteins that act with HAT and corepressor proteins that act with HDAC. Acetylation and deacetylation clearly have opposite effects on chromatin structure and gene activity. HDAC enzymes lower the level of histone acetylation, which represses transcription. Conversely, HAT enzymes increase acetylation and stimulate transcription. The competition between the two reactions lets the chromatin fiber change its structure flexibly and respond promptly when gene activity has to be increased or repressed.

In vertebrates the four histones H2A, H2B, H3 and H4 vary little between species. Histone H1, however, exists in several forms, designated H1a to H1e, H1t and H5. The distribution of these H1 variants has not yet been clearly established. In sperm cells, histones are replaced by protamines. Histones are basic proteins because 20-30% of their primary structure consists of arginine and lysine, which are positively charged amino acids. Changes in the charge of histones are therefore closely linked to their ability to interact with DNA and to the stability of that interaction, since nucleic acids carry a negative charge conferred by their phosphate groups.

1.3.2. DNA methylation

DNA itself is also modified by the addition of various functional groups, for example by methylation of cytosine or adenine. These changes are specific to each chromosome region, affect the spatial structure of the chromatin fiber and take part in controlling gene activity. The specific features of each chromosome region are passed on to the next generation. An alteration in the spatial structure of the chromatin fiber can give rise to a new trait even when the nucleotide sequence is intact. The DNA molecule therefore carries two kinds of information: genetic information, determined by the nucleotide sequence, and epigenetic information, determined by the complexity and spatial configuration of the genome.

Methylation occurs in both prokaryotic and eukaryotic DNA. In prokaryotes DNA methylation is regarded as a mechanism that protects the genome, whereas in eukaryotes it plays an important role in the second kind of information. It is one of the mechanisms that control DNA imprinting, in which the expression of a gene depends on whether it was inherited from the father or from the mother. Note that DNA imprinting is entirely different from sex-linked inheritance. DNA imprinting is discussed in detail in a later section.

DNA methylation is of particular importance for the activity of eukaryotic genes, especially genes involved in the development of the organism. The methylation reaction occurs at specific sites. About 2-7% of the DNA of animal cells is methylated. Most methyl groups are found on cytosine (C) in the dinucleotide CpG. The proportion of methylated cytosine varies widely between species. Methylcytosine is almost undetectable in the yeast S. cerevisiae, while about 10% of cytosines are methylated in vertebrates and 30% in plants. Only in the last years of the twentieth century was DNA methylation confirmed in Drosophila, and only about 0.4% of the fruit fly genome is methylated. Moreover, the methylated cytosines there lie in CpT and CpA rather than in CpG (as they do in higher animals). In higher plants, notably, methylation can occur on cytosine in all of the contexts CpG, CpNpG and CpNpN, where N = A, T or C.


When methylated or demethylated genes are introduced into the genome of recipient cells (gene transfer experiments), only the genes without methyl groups are active. In addition, unmethylated DNA regions usually coincide with regions that contain DNase-sensitive sites. Experiments show that many genes, while they are being transcribed, carry no methyl groups in the region containing the promoter and the first exon (the 5' end), even though the downstream exons toward the 3' end are methylated. Methylation clearly prevents a gene from being active, and removing the methyl groups reactivates it.

To distinguish them from methylated CpG, the unmethylated CpG dinucleotides that are repeated many times over about 1-2 kb upstream of the 5' end of a gene are called a CpG island. About 56% of the genes in the human genome lie close to a CpG island. Genes that are active in every cell type (housekeeping genes) all have unmethylated CpG islands. For tissue-specific genes (active only in specialized tissues), by contrast, methylation of the CpG island is tightly controlled: the island is unmethylated in cells that need the gene product and methylated in cells where the gene is not expressed. For a gene to be active, therefore, nuclease-sensitive sites must appear near the promoter and the DNA in the region containing the gene must also be demethylated.

When methylated DNA is introduced into a cell, it remains methylated through every round of DNA replication. Conversely, when unmethylated DNA is introduced, it is not methylated after replication. The addition of a methyl group to cytosine is catalyzed by methyltransferases, which fall into two groups. The first group maintains the methyl groups at cytosine positions on the strand newly synthesized during DNA replication. The new methyl groups are added using the mCpG on the template strand as a guide. These enzymes are collectively called maintenance methyltransferases. The second group consists of enzymes that add a methyl group to a cytosine that was not previously methylated, for example to the CpG island at a promoter when the gene has to be repressed. In addition, methylation of cytosine in the CpNpG or CpNpN context requires proteins and short RNA molecules (20-25 nucleotides) that recognize the cytosines concerned. In this way both the structure of the chromatin fiber and gene activity are altered.

A demethylase could carry out the removal of methyl groups, but such an enzyme has not yet been found in animal cells. Recent results (2002-2005) show that some DNA repair enzymes are involved in removing methylated cytosine: the DNA segment containing mC is excised and replaced with unmethylated cytosine.
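The behaviour described above, in which methylated DNA stays methylated through each round of replication while unmethylated DNA stays unmethylated, can be sketched as a toy model of the maintenance methyltransferase. Everything named here (`cpg_sites`, `replicate`, `propagate`, the example sequence) is hypothetical and chosen for illustration; the model follows a single lineage in which the newly made strand serves as the template in the next round, and it ignores the second group of enzymes that add methyl groups to previously unmethylated sites.

```python
def cpg_sites(seq):
    """Positions of the C in every CpG dinucleotide of the sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate(seq, methylated, maintenance=True):
    """Return the CpG sites that carry a methyl group on the new strand."""
    template_marks = set(methylated) & set(cpg_sites(seq))
    new_strand_marks = set()  # a freshly synthesised strand has no methyl groups
    if maintenance:
        # The maintenance enzyme reads each mCpG on the template strand
        # and methylates the matching CpG on the new strand.
        new_strand_marks |= template_marks
    return new_strand_marks


def propagate(seq, methylated, generations, maintenance=True):
    """Follow the methylation pattern through several rounds of replication."""
    marks = set(methylated)
    for _ in range(generations):
        marks = replicate(seq, marks, maintenance)
    return marks


seq = "ACGTTCGACCGG"                      # toy sequence with three CpG sites
print(cpg_sites(seq))
print(sorted(propagate(seq, {1, 9}, generations=5)))
print(sorted(propagate(seq, set(), generations=5)))
print(sorted(propagate(seq, {1, 9}, generations=1, maintenance=False)))
```

With the maintenance step, the pattern on sites 1 and 9 survives five rounds of replication, and the unmethylated site 5 stays unmethylated. Without the maintenance step, the new strand carries no marks after a single round.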

1.4 Genes in the eukaryotic genome

One of the fundamental differences in gene structure between prokaryotes and eukaryotes is the interrupted gene. This phenomenon was discovered in 1977 and is found throughout the eukaryotes. Curiously, it has also been found in some bacteriophages. When the nucleotide sequence of a gene was compared with the mRNA transcribed from it, the gene was found to contain segments that carry no coding information. These segments are absent from the mRNA that serves as the template for protein synthesis. They are called introns. Thus, alongside the segments that carry coding information (called exons), most eukaryotic genes also contain introns. Although introns carry no coding information and are removed from the mRNA, a mutation in an intron can prevent the exons from being joined and so produce a defective mRNA that cannot be translated into protein.


Once an RNA molecule has been transcribed from a gene, its introns must be removed and its exons joined together (the splicing reaction). Splicing occurs in mRNA, rRNA and tRNA. To produce the mature RNA, the removal of introns and the joining of exons follow strict and precise rules that preserve the order of the exons. Notably, it is the exons of one and the same mRNA molecule that are joined; exons from different mRNA molecules are only rarely joined. Because introns carry no coding information, mutations within them usually do not show up in the structure of the polypeptide chain. Such mutations can, however, affect splicing when they occur at the sites needed for cutting and joining at the intron-exon boundaries. It is worth noting that in mitochondrial and chloroplast DNA, an intron of one gene can be an exon of another gene, and the protein products of the two genes have entirely independent functions.

Some genes are transcribed into mRNA that is never translated. These mRNA molecules still undergo intron-exon splicing to give short RNA segments, which are further processed into small RNA molecules of 22-25 nucleotides (miRNAs, micro RNAs). miRNAs take part in many processes that control the activity of certain genes in the genome, mainly after transcription. In this mechanism, miRNAs recognize the mRNA of other genes so that those mRNAs are degraded. This post-transcriptional mechanism of gene control was discovered at the end of the twentieth century.

In higher organisms, almost all genes coding for proteins, tRNA or rRNA are interrupted. The average exon is about 200 bp long, whereas an intron can exceed 10 kb or even reach 50-60 kb. Overlapping genes are very rare in the nuclear DNA of eukaryotic cells; they are common in prokaryotic genomes and among genes located in the organelles of eukaryotic cells. One gene may, however, lie inside another, with the second gene located in an intron of the first. A typical example of such genes within genes in the human genome is the gene for neurofibromatosis type I. Intron 27 of this gene contains 3 other genes, each with its own exons and introns (Figure 1.8).

Figure 1.8: Gene-within-gene structure in intron 27 of the neurofibromatosis gene. Intron 27 contains 3 small genes, OMGP, EVI2B and EVI2A. Each of these genes has its own introns (I) and exons (dark regions).
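The splicing step described earlier in this section (introns removed, exons joined in their original order) can be sketched as a simple string operation. The function name `splice`, the exon coordinates and the toy transcript are assumptions made for this illustration; in the cell the exon boundaries are recognized from signals in the RNA itself, not supplied as coordinates.

```python
def splice(pre_mrna, exons):
    """Join the exons of one transcript in order and drop everything between them.

    `exons` is a list of (start, end) pairs, 0-based and half-open.
    """
    ordered = sorted(exons)                     # exon order must be preserved
    for (_, end1), (start2, _) in zip(ordered, ordered[1:]):
        if start2 < end1:
            raise ValueError("exons must not overlap")
    return "".join(pre_mrna[start:end] for start, end in ordered)


# Toy transcript: exon 1 (6 nt), one intron (12 nt), exon 2 (6 nt)
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
print(splice(pre_mrna, [(0, 6), (18, 24)]))
```

Sorting the coordinates before joining mirrors the rule stated in the text that exon order is strictly maintained. A mutation that shifted one of the boundaries would change the joined sequence and could disrupt the reading frame.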

Genes can be classified either by their structure or by the function of the products they encode. In the haploid genome of multicellular organisms, about one quarter to one half of the protein-coding genes are single-copy genes with no second copy. The remaining genes are usually present in two or more copies. The copies of a gene need not be completely identical, because during evolution they undergo mutations such as insertion, deletion, substitution or translocation of nucleotides. Genes that arose from one ancestral gene are grouped into a gene family. The members of a family may be clustered together (on one chromosome) or dispersed in the genome (on different chromosomes). The products of the members of a family have identical or related functions, although the genes are usually active at particular times and in different differentiated cell types. For example, the globin proteins (encoded by genes of the same family) are synthesized at particular stages of embryonic and fetal development and in the adult.

There are also nucleotide sequences that resemble a known gene but are not transcribed or not translated. They are called pseudogenes. Some genes consist of many identical copies repeated in tandem over a chromosome region (tandemly repeated genes), for example the genes for rRNA, tRNA and histones. Eukaryotic genes can therefore be divided into the following main classes: single-copy genes, genes belonging to a gene family, tandemly repeated genes and pseudogenes.

1.4.1. Genes within a gene family

Most of the protein-coding genes studied so far in eukaryotes are not single-copy genes. About half of the known genes in vertebrate genomes have identical or similar copies (from 2 to 20 copies). The existence of several identical or similar copies of a gene can result from unequal crossing over between two homologous chromosomes during meiosis, which leaves one chromosome with more copies and the other with fewer (Figure 1.9).

Figure 1.9: Unequal crossing over between two chromosomes (each carrying two copies of a gene) leaves one chromosome with a single copy and the other with three copies.
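The outcome shown in Figure 1.9 can be reproduced with a small sketch in which each chromosome is a list of gene copies and the crossover exchanges the segments to the right of the breakpoints. The function name `unequal_crossover` and the copy labels are hypothetical, and breakpoints are placed between copies for simplicity, whereas a real exchange can fall inside a copy.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the right-hand segments of two chromosomes.

    `break_a` and `break_b` give the number of gene copies to the left of the
    breakpoint on each chromosome. Equal values model a normal crossover;
    unequal values model misaligned pairing.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


a = ["a1", "a2"]   # two copies on the first chromosome
b = ["b1", "b2"]   # two copies on its homolog

print(unequal_crossover(a, b, 1, 1))   # aligned pairing: 2 + 2 copies
print(unequal_crossover(a, b, 1, 2))   # misaligned pairing: 1 + 3 copies
```

The total number of copies is conserved; misalignment only redistributes them, which is how one chromosome gains a copy while its partner loses one.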

The products of the members of a gene family have the same function but are usually used at different stages of development or in different differentiated cell types. Their amino acid sequences are similar but not identical. When one member of the family is inactivated, another member can be activated to replace it, even though the second member is not normally active together with the first. The globin genes are a typical example of a gene family (Figure 1.10). In all animal species these genes have similar structures because they derive from a common ancestral gene. In the cells of the adult, globin exists as a tetramer of two α polypeptide chains and two β chains. The genes for these chains lie on two different chromosomes, so their activity must be coordinated to yield matching amounts of the two polypeptides. Embryonic blood cells also contain tetrameric globin, but made of two α-like chains and two β-like chains. The genes for the α chain and the α-like chains belong to one gene family, while the genes for the β chain and the β-like chains belong to another. Each family also contains pseudogenes and some further members whose products are occasionally used.


Figure 1.10: The human α and β globin gene families are clustered on two chromosomes. They comprise genes coding for globin and pseudogenes (ψ). The genes are active in order from left to right, in step with development from the embryo to the adult.

The α globin family occupies 28 kb on chromosome 16 and comprises the genes ζ, α1, α2 and θ. The products of the two genes α1 and α2 are identical. The β globin family occupies 50 kb on chromosome 11 and comprises 5 active genes (ε, Gγ, Aγ, δ, β) and one pseudogene, ψβ. The products of the two γ genes differ by a single amino acid at position 136 (glycine and alanine). The polypeptide chains combine to give different forms of globin that are used at different stages of development (Table 1.1).

Table 1.1: Forms of globin during human development

Stage of development        Hemoglobin
Embryo (< 8 weeks)          ζ2ε2, ζ2γ2, α2ε2
Fetus (3-9 months)          α2γ2
Adult (from birth)          α2δ2 (~2%), α2β2 (~97%), α2γ2 (~1%)

Whereas the globin gene families are clustered in two regions on chromosomes 11 and 16, the aldolase gene family is regarded as a typical example of a family dispersed over different chromosomes. This family has 5 members located on 5 chromosomes: 3, 9, 10, 16 and 17. Although scattered throughout the genome, these genes are highly similar in nucleotide sequence and in the corresponding amino acid sequence.

1.4.2. Tandemly repeated genes

The members of a gene family are usually not identical. The differences between them ensure that each gene functions independently, and they are maintained by selection. There are a few special cases, however, in which the family has a very large number of identical members, usually grouped into clusters on different chromosomes. Each cluster may contain from two to several hundred genes arranged one after another. Repeating the copies of a gene in tandem over a stretch of DNA (a chromosome region) may serve to supply a very large amount of the gene product quickly when the cell requires it, for example rRNA for a period of rapid growth (the embryo) or histones for DNA replication.

Genes coding for rRNA: Ribosomal RNA makes up 80-90% of the total RNA of the cell. The number of rRNA genes ranges from 7 in E. coli and 100-200 in lower eukaryotes to several hundred in higher animals. In the nucleus of eukaryotic cells, most rRNA genes are grouped into clusters that occupy a chromosome region (the rDNA region). The main kinds of rRNA are 5S, 5.8S, 18S and 28S rRNA (belonging to the small and large ribosomal subunits). 5S rRNA is encoded by a separate gene and is synthesized by RNA polymerase III.


The human genome contains about 2000 genes for 5S rRNA, all clustered in one region of chromosome 1. The three rRNAs 5.8S, 18S and 28S are synthesized from a single gene by RNA polymerase I (Figure 1.11). A precursor rRNA is transcribed from the gene and then cleaved by ribonucleases into the 18S, 5.8S and 28S rRNA molecules. The nucleotide segments between these RNA molecules are degraded. Each rRNA gene cluster consists of many identical genes, and the spacing between genes varies between species and even within a species. The human genome has about 280 copies of the gene for the three rRNAs, concentrated in 5 regions (each with 50-70 copies) on the 5 chromosomes 13, 14, 15, 21 and 22. In mammals each gene usually occupies 13 kb, and neighboring genes are about 30 kb apart. This spacer plays a part in initiating rRNA synthesis or helps RNA polymerase to bind to the promoter.

Figure 1.11: One transcription unit (one rRNA gene) carries the information for the 18S, 5.8S and 28S rRNA molecules. This gene is repeated in tandem. The distance between genes varies from species to species.

Genes coding for histones: Histones bind to DNA to form the nucleosome. Four different histones, H2A, H2B, H3 and H4, interact to form the core. A 146 bp stretch of DNA is wound around this core to form the nucleosome. Histone H1 binds to the linker DNA between nucleosomes. Histones make up about 0.5-1% of the total protein of eukaryotic cells. They are synthesized throughout one third of the cell cycle (in S phase), yet histone mRNA has a short half-life (a few minutes). This is probably why there are so many histone genes (50-500), arranged in clusters on the chromosomes. The genes lie one after another, and each cluster occupies about 5-6 kb (in vertebrates) (Figure 1.12). As with the rRNA gene clusters, the distance between genes within a cluster and between clusters varies between species and even between individuals.

Histone genes can be divided into two groups. The first group consists of genes for the histones used during DNA replication. These genes have no introns, and the mRNA transcribed from them has no poly(A) tail, which sets it apart from other eukaryotic mRNAs. The second group consists of genes for histones that take part in changing the spatial structure of the chromosome (related to epigenetic information). The genes of this group contain introns, and the corresponding mRNA carries a poly(A) tail.


Figure 1.12: Map of the histone genes in the sea urchin (A) and the fruit fly (B). Each cluster is repeated over a chromosome region. The direction of mRNA synthesis differs from one histone gene to another (direction of the arrows), which shows that the genes within a cluster are active independently of one another.

1.4.3. Pseudogenes

Any member of a gene family may be active, depending on the state of the cell. For some members, however, no product is ever detected, even though they are identical or highly similar in nucleotide sequence to the other members. These genes are called pseudogenes and are usually denoted ψ. A pseudogene does not give rise to a final protein product, although it may be transcribed into mRNA. A pseudogene may consist of exons only, or of exons and introns, or it may have a nucleotide sequence identical or similar to that of an active gene but lack a promoter. Experiments show that the mutations found in pseudogenes prevent transcription from starting, or make mRNA synthesis stop at the wrong place, or block the intron-exon splicing that produces the mRNA. Even when an mRNA is made, it contains signals that stop protein synthesis earlier than required.

Most gene families contain pseudogenes, though in very small numbers. Pseudogenes can arise from unequal crossing over between alleles on two homologous chromosomes. Over time, insertions, deletions, translocations and substitutions of nucleotides accumulate in them. It also cannot be ruled out that reverse transcriptase synthesizes DNA on an mRNA template and that these DNA copies are inserted into the genome. For this reason pseudogenes often lack a promoter, contain no introns, and lack the 5' and 3' nucleotide segments that lie before the start codon and after the stop codon of protein synthesis. These two flanking segments are called the 5' and 3' untranslated regions.

1.5 Repetitive DNA in the eukaryotic genome

1.5.1. Satellite DNA and minisatellite DNA

In addition to gene families and tandemly repeated genes, the genome of eukaryotic cells contains DNA regions made of oligonucleotides (usually from 5-10 bp up to 150-300 bp) that are repeated a great many times. This gives such DNA distinctive physical properties, which can be used to fractionate and separate it from the rest of the genomic DNA. This DNA is called satellite DNA. The proportion of satellite DNA varies between species, from 10 to 30% of the genome.

In most mammalian cells, satellite DNA is concentrated around the centromere and at the two ends of the chromosome (the telomeres). Its distribution plays a part in cell division and in maintaining telomere length through successive rounds of DNA replication. When the chromosomes move to the two poles during cell division, specific proteins attach to special sites at the centromere to monitor and control this movement. Satellite DNA provides these special sites. As a rule it is not transcribed into RNA. Centromeric satellite DNA is also the last DNA to be copied during chromosome replication. The repetition of one kind of DNA at the centromere may well serve to prevent an origin of replication from arising at this position.

In insects, satellite DNA usually consists of very short nucleotide segments (about 5-15 bp), whereas in mammals it is more diverse and is usually arranged in clusters on the chromosomes. Humans have more than 10 kinds of satellite DNA. Each kind can make up 0.5-1% of the genome, equivalent to about 10^7 bp. Within an individual, the oligonucleotide units of a given satellite DNA may be repeated exactly, or a few nucleotides may be substituted, deleted or inserted. These variations depend on the chromosome region. The function of satellite DNA scattered through the genome has not been established. At the end of the twentieth century it was shown that repeats located near or within a gene help to control the activity of that gene. Repetitive DNA is usually not transcribed. It is inactivated because its cytosines are methylated and histone H3 is methylated at lysine 9, while histone H4 is deacetylated.

When oligonucleotides of about 25-50 bp are repeated many times over a DNA segment of 1 to 5 kb, or even up to 20 kb, they are called minisatellite DNA or polymorphic tandem repeat DNA, VNTR (variable number tandem repeat). Like satellite DNA, minisatellite DNA is related to chromosome structure, because it is often found at the telomeres. The function of minisatellite DNA scattered through the genome, however, has not yet been clarified.
When very short units (1-4 bp) are repeated many times to form segments of about 200 bp, they are called microsatellite DNA. A microsatellite usually consists of a unit of 1 to 4 nucleotides repeated about 10 to 20 times. This kind of DNA is very abundant in the genome and is therefore used as a molecular marker for locating genes on a map. In the human genome, for example, the CA microsatellite (CACACA...) accounts for about 0.5% (15 Mb), while repeats of the single nucleotide A (AAA...) account for as much as 0.3%. Although the function of microsatellite DNA is unknown, it is of great importance in mapping whole genomes. Within a population the microsatellites are similar, but the number of repeats and the sequence variations within each kind depend on the individual. In other words, each kind of microsatellite is present in every individual of the population, but the number of repeats and the variations in the nucleotide sequence are characteristic of each individual. This property is used to tell individuals apart and to analyze kinship (DNA fingerprinting).
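The description of a microsatellite given above (a unit of 1 to 4 nucleotides repeated about 10 times or more) translates directly into a regular expression with a backreference. The sketch below is a minimal illustration: the function name `find_microsatellites`, the `min_repeats` default and the toy sequence are assumptions, only perfect repeats are detected, and dedicated tools handle the imperfect repeats that the text also mentions.

```python
import re


def find_microsatellites(seq, min_repeats=10):
    """Return (start, unit, repeat_count) for perfect repeats of a 1-4 nt unit."""
    # ([ACGT]{1,4}?) captures the shortest possible unit (lazy quantifier);
    # \1{n,} then requires at least n further copies of that same unit.
    pattern = re.compile(r"([ACGT]{1,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence: a (CA)12 repeat and an (A)15 repeat separated by unique DNA
sequence = "GGT" + "CA" * 12 + "TTG" + "A" * 15 + "C"
print(find_microsatellites(sequence))
```

Because the repeat count varies between individuals, comparing the counts reported at the same locus in different samples is the basis of the fingerprinting application mentioned in the text.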

1.5.2. Mobile DNA elements


Crossing over between minisatellite DNA occurs about 10 times more frequently than crossing over between homologous chromosome segments in meiosis. This is one of the causes of the differences between the genomes of individuals of the same species. Genome diversity also arises from DNA segments that are able to move (usually called transposons). When they move, transposons rearrange and reorganize the genome of the individual, for example by creating new DNA segments or by changing the function of the DNA at the sites where they insert and excise. They can move to any site, and no relationship is required between the new site and the old one. When it excises from the old site, a transposon may carry adjacent DNA with it, causing a deletion there. Conversely, when it inserts at a new site, it causes an insertion or a translocation at that site. Transposons therefore act like vectors that carry DNA from one place to another within a genome or from one genome to another. Crossing over between homologous transposons at two different sites, on one chromosome or on two, produces similar changes. These changes rearrange the genome and create diversity between genomes and the individuality of each organism.

A change in the position of a transposon can also affect the activity of nearby genes even when it does not alter their nucleotide sequence. The activity of the genes involved in transposition (usually genes located within the transposon) is therefore very tightly controlled. Control is exerted mainly through changes in the spatial structure of the chromosome region containing the transposon, such as DNA methylation, methylation of histone H3 and deacetylation of histone H4.

These elements move and insert into the genome in one of two ways, through a DNA intermediate or through an RNA intermediate. DNA segments whose movement involves an RNA intermediate are called retroelements or retrotransposons. Retroelements move in a way that resembles infection by viruses whose genome is RNA (retroviruses). Once a retrovirus has infected a cell, its RNA is copied by reverse transcriptase into DNA, and this DNA is inserted into the genome of the host cell.
When the virus multiplies, this DNA is transcribed to produce the new RNA molecules needed to package new virus particles.

Among the retroelements, ERVs (endogenous retroviruses) and the retrotransposons deserve particular mention. Both are DNA segments able to move within the genome. ERVs share the feature that both ends terminate in long repeated nucleotide segments (long terminal repeats, LTRs), which play a decisive role in movement. The retrotransposons comprise LINEs (Long Interspersed Nuclear Elements) and SINEs (Short Interspersed Nuclear Elements), long and short repeats scattered over the chromosomes. LINEs contain no LTRs but carry a gene for reverse transcriptase, whereas SINEs lack that gene but can "borrow" the enzyme made by other retroelements. In the human genome, LINE-1 is present in up to 3500 full-length copies of 6.1 kb and in hundreds of thousands of shorter copies. The Alu sequence, with its millions of copies, is the typical example of a SINE. Although RNA is synthesized from Alu, no protein product is made. The presence of this RNA nevertheless increases the chance that Alu will insert into the genome.

DNA transposons, which change position in the eukaryotic genome without an RNA intermediate, are less abundant than retroelements. The human genome, for example, has only about 100 kinds of DNA transposon. DNA transposons are nonetheless particularly important for the diversification of genomes. Some transposons are present in the genomes of different species. The mariner element, for example, is 1250 bp long and is found in the fruit fly Drosophila as well as in many other animals, including humans. Could it be that the natural role of these transposons in evolution is to carry genes between different genomes?

Transposons share the feature that their two ends carry oligonucleotide segments repeated in opposite orientation (inverted repeats). They can be divided into two classes according to whether they move autonomously or depend on the presence of another transposon.

*The first class consists of DNA segments able to move autonomously. They contain genes for the proteins that direct the process, for example the enzyme that recognizes the two ends of the transposon, cuts it out of the old site and inserts it at the new one. They therefore excise from the old site and insert at the new site entirely independently. Because of this ability they produce unstable mutations.

*The second class consists of transposons that cannot act on their own: they are unable to move because they do not carry the genes for the required enzymes. Their movement depends on the presence of an autonomous transposon (a class 1 transposon) of the same family. Two transposons belong to the same family when they have homologous structures, especially the oligonucleotide segments at their two ends. These ends are the sites that the enzyme recognizes in order to cut and join the transposon at the old and new positions. When transposons of this class move, they produce stable mutations, provided that in subsequent generations they segregate independently (according to Mendel's laws) from the autonomous transposon of the same family.

The simplest transposons in bacteria are called insertion sequences (IS). They may lie on the chromosome or on plasmids. A double colon (::) is used to denote the insertion of an IS at a given site. For example, λ::IS1 denotes the transposon IS1 inserted into the genome of bacteriophage λ. Bacterial transposons have no function in the cell. The nucleotide sequence at one end of an IS is usually repeated at the other end in the opposite orientation. The two sequences at the ends of an IS are called inverted repeats.
For example, an IS may have the following structure: GGTAT-Xn-ATACC (where n is the number of nucleotides lying between the two inverted repeats). When the double-stranded IS separates into two single strands, the two ends of each strand can therefore base-pair with each other, forming a stem-loop structure (Figure 1.13).

Figure 1.13: Stem-loop structure formed by complementary base pairing between the two inverted repeats of an IS on a single DNA strand.
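The reason the ends GGTAT and ATACC can pair on a single strand is that one is the reverse complement of the other. The sketch below checks this for the example in the text. The function names and the 12-base interior are illustrative assumptions, not part of the source.

```python
# Complement table for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_inverted_repeat_ends(element: str, arm: int) -> bool:
    """True if the last `arm` bases are the reverse complement of the first `arm`."""
    return element[-arm:] == reverse_complement(element[:arm])


# Hypothetical IS: the terminal repeats from the text around an arbitrary
# 12-base interior (written N because the text does not specify it).
is_element = "GGTAT" + "N" * 12 + "ATACC"

print(reverse_complement("GGTAT"))              # ATACC
print(has_inverted_repeat_ends(is_element, 5))  # True
```

Only the two arms are compared, so the unspecified interior never has to be complemented.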

Besides IS elements, bacteria also have longer mobile DNA segments, called Tn transposons. Tn elements are usually located on plasmids (circular DNA molecules, usually of modest size) and can insert at any position in the genome. They often carry genetic information encoding antibiotic-resistance proteins.


IS and Tn elements are related in nucleotide sequence. A Tn is usually bounded at both ends by an IS of some kind.

Figure 1.14: Structure of transposon Tn-9.

Figure 1.14 shows the structure of transposon Tn-9. This transposon carries two genes: one confers resistance to chloramphenicol (Rch) and the other encodes a protein required for movement. Tn-9 is bounded at both ends by IS-1, and the two IS-1 copies are arranged in the same orientation.

Some transposons contain a gene for a transposase, an enzyme that recognizes the inverted repeats in order to cut out the transposon. The DNA at the new site is cut so that the two single strands are offset by a few nucleotides (a staggered cut). The transposon is joined to the cut ends, leaving two gaps. The gaps are repaired by complementary base pairing. As a result, the nucleotides of the staggered ends at the new site are copied into two copies, one at each end of the transposon, with identical nucleotide order. They are therefore called direct repeats (Figure 1.15). Their length is usually about 7-9 bp. The presence of direct and inverted repeats makes it possible to identify the site where a transposon has inserted or from which it has left.

Figure 1.15: A transposon whose two ends consist of 7 nucleotides (1234567) in inverted repeat inserts at a 5-nucleotide site (ATGCA) in the genome. After insertion, the short segment ATGCA is duplicated, with both copies in the same orientation.
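The duplication of the target site can be sketched as a string operation: the bases between the two staggered nicks end up on both sides of the inserted element once the gaps are filled. The following is a minimal model of the Figure 1.15 example. The lowercase flanking bases, the `xxxx` interior, and the function name are hypothetical, and only one strand is represented.

```python
def insert_transposon(genome: str, transposon: str, cut_pos: int, stagger: int) -> str:
    """Insert `transposon` at a staggered cut starting at `cut_pos`.

    The `stagger` bases between the two nicks are single-stranded after the
    cut; gap repair copies them, so they flank the insert as a direct repeat.
    """
    target = genome[cut_pos:cut_pos + stagger]
    return (genome[:cut_pos] + target + transposon + target
            + genome[cut_pos + stagger:])


# Target site ATGCA from Figure 1.15, with hypothetical lowercase flanks.
genome = "cccc" + "ATGCA" + "gggg"
# Ends written as in the figure: 1234567 ... 7654321 (inverted repeat).
transposon = "1234567" + "xxxx" + "7654321"

result = insert_transposon(genome, transposon, cut_pos=4, stagger=5)
print(result)  # ccccATGCA1234567xxxx7654321ATGCAgggg
```

The genome grows by the length of the transposon plus one extra copy of the target site, which is why flanking direct repeats mark an insertion.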

A transposon moves from the old site (donor) to the new site (recipient) by one of two mechanisms: replicative transposition, in which the transposon ends up at both sites, and excision from the old site followed by movement to the new one.

In the first mechanism, the nucleotide sequence of the transposon is copied from the donor site and inserted at the recipient site, so the number of copies increases with each transposition. This process involves two enzymes: transposase (which acts on the two ends of the original transposon) and resolvase (which acts on the copy).

In the second mechanism, the transposon is excised from the old site and inserted at the new one, so the number of transposons does not change. This kind of movement requires only transposase. When the transposon leaves, the old site is broken; it is rejoined by the DNA repair mechanisms of the cell.

In eukaryotes, transposons are also called controlling elements. They have been studied since the 1940s, but how they work at the molecular level has been clarified only in recent years. The classic studies were carried out on transposons in maize and in the fruit fly Drosophila. Transposons move, rearrange, and switch on genes at times characteristic of the growth and development of the organism.

Two transposons, Ac and Ds, have been studied in detail in maize. They belong to the same family and have identical inverted repeats. Movement of Ds depends on the presence of Ac. The Ac sequence is 4563 bp long and is bounded at both ends by 11-bp inverted repeats, followed by 8-bp direct repeats of genomic sequence. All Ds elements have the same inverted repeats although their lengths vary (Figure 1.16).


Figure 1.16: Structure of the Ac/Ds transposons. Ds elements differ in length (because they derive from Ac by deletion), may contain DNA entirely unrelated to Ac, or may be inserted into one another. All of these transposons, however, are bounded by 11-bp inverted repeats.
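The dependence of Ds on Ac follows the general rule given for the two transposon types: a non-autonomous element can move only when an autonomous element of the same family is present in the genome. A minimal sketch of that rule is shown below. The class, field, and function names are illustrative assumptions, and the P element is included only as an example of an autonomous element from a different family.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transposon:
    name: str
    family: str                # elements with homologous ends share a family
    encodes_transposase: bool  # True for autonomous elements


def can_transpose(element: Transposon, genome: list) -> bool:
    """Apply the rule from the text to one element in a given genome."""
    if element.encodes_transposase:
        return True  # autonomous: supplies its own enzyme
    # Non-autonomous: needs an autonomous partner of the same family.
    return any(other.family == element.family and other.encodes_transposase
               for other in genome)


ac = Transposon("Ac", "Ac/Ds", True)
ds = Transposon("Ds", "Ac/Ds", False)
p = Transposon("P", "P", True)

print(can_transpose(ds, [ds]))      # False: no Ac in the genome
print(can_transpose(ds, [ds, ac]))  # True: Ac supplies the transposase
print(can_transpose(ds, [ds, p]))   # False: P belongs to another family
```

If Ac and Ds segregate apart in a later generation, `can_transpose` for Ds becomes false again, which corresponds to the stable mutations described for non-autonomous elements.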

Maize transposons usually insert near genes and disturb their activity, producing new traits without causing lethal mutations. When a transposon inserts at an allele of a gene on a chromosome in a somatic cell, it affects the expression of that allele during the development of the plant. After mitosis, the descendants of the cell carrying the mutant allele show the new trait (usually visible in the shape or colour of the maize kernel). Such a change arising during somatic development is called variegation, or mosaicism (the appearance of spots).

In the fruit fly Drosophila melanogaster, the mobile P element was discovered in crosses between males of the P strain and females of the M strain. Most of the hybrids are sterile and show chromosome breakage and mutations. This genetic disorder occurs in one direction only: the cross between P-strain females and M-strain males still yields normal hybrids. The phenomenon arises because the genome of P-strain individuals contains the mobile P element. The longest element is 2907 bp and contains a gene encoding transposase. Notably, although P elements differ in length, all carry the sequences recognized by transposase. In natural fruit fly populations the number of P elements ranges from a few copies to 50 copies per genome. Moreover, fruit fly strains collected before 1950 have no P elements in their genomes. Did P appear in the fly genome only in the latter part of the 20th century? Could its presence be the result of viral infection of fruit flies? A similar phenomenon is observed in bacteria infected by bacteriophages carrying an IS: the IS element enters the bacterial genome through transduction.

Control of P movement depends on a factor present in the egg cytoplasm (maternally inherited). When this factor is present, it represses the movement of P. Thus an egg from a P-strain female fertilized by an M-strain male still gives normal hybrids, because the factor in the egg prevents P from transposing. An M-strain egg fertilized by a P-strain male, however, allows P to move, causing abnormal disruption of genome structure. This makes the hybrids sterile or gives rise to unusual traits.
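The one-directional outcome of P and M crosses can be summarized as a small truth table: P elements transpose in the offspring only when the father contributes them and the egg lacks the maternal repressing factor. The sketch below encodes that reasoning. The function name and the single-letter strain labels are illustrative assumptions, and the model ignores everything except the two conditions named in the text.

```python
def is_dysgenic_cross(mother: str, father: str) -> bool:
    """Return True if the cross is expected to give dysgenic hybrids.

    Strains are "P" (carries P elements) or "M" (lacks them).
    """
    for strain in (mother, father):
        if strain not in ("P", "M"):
            raise ValueError(f"unknown strain: {strain!r}")
    egg_has_repressor = mother == "P"     # cytoplasmic factor, maternally inherited
    p_elements_from_sperm = father == "P"
    return p_elements_from_sperm and not egg_has_repressor


for mother in ("P", "M"):
    for father in ("P", "M"):
        print(mother, "female x", father, "male ->",
              "dysgenic" if is_dysgenic_cross(mother, father) else "normal")
```

Of the four possible crosses, only M female by P male returns True, matching the direction in which sterility and chromosome breakage are observed.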

1.6 Interaction of T-DNA with the plant genome


The transfer of DNA from a bacterial genome to a plant genome has been studied in detail in the interaction of Agrobacterium tumefaciens or A. rhizogenes with most dicotyledonous plants. This DNA transfer causes genetic changes, seen as galls (tumours) on the stem or as abundant hairy roots at the site of bacterial infection. Galls or hairy roots on the stem appear only when Agrobacterium is present. Afterwards, however, the disease persists independently of the bacteria, because some bacterial genes have been transferred into the genome of the host plant and are active in causing the disease.

The bacterial genes that can move into and function in plant cells lie on the Ti (tumour-inducing) plasmid of A. tumefaciens, which causes crown gall, or on the Ri (root-inducing) plasmid of A. rhizogenes, which causes hairy root disease. As in animal tumours, plant cells with bacterial DNA integrated into their genome shift to a new state: their growth and differentiation differ completely from those of normal cells. This results from the activity of bacterial (prokaryotic) genes within the plant (eukaryotic) genome. These genes are normally present in the bacterial genome but are switched on only after integration into the plant genome. The process is specific: a given bacterial species can induce galls on certain host plants but cannot interact with others.

Gall formation, which is in essence the transfer of genes from the bacterium to the plant genome leading to a change in the physiological state of the plant cells, requires the following conditions:

a/ Genes in three chromosomal regions of the bacterium, chvA, chvB, and pscA, must be active to initiate attachment of the bacteria to the plant stem.

b/ The Ti plasmid must carry the vir region (located outside the T-DNA). This region carries the genes required for excising the T-DNA and transporting it from the bacterium to the plant cell. The bacteria infect host cells at wound sites on the stem. Wounds arise from accidental damage to plant cell membranes or from a mixture of substances, encoded by the vir genes, that the bacteria secrete. The vir genes are activated by phenolic compounds from the plant (for example acetosyringone, catechol, and chalcone derivatives). In addition, monosaccharides such as glucose and arabinose in the medium make the bacterial vir genes more sensitive to the phenolic compounds released by the plant.
The products of the vir genes are chiefly involved in cutting the T-DNA out of the plasmid and transporting it into the host cell. Complementation tests have identified at least 21 polypeptide products of the vir genes and established the function of most of them in T-DNA transfer.

The VirA protein plays an important role in determining host specificity between plant species and Agrobacterium. In practice, Agrobacterium generally does not infect monocotyledonous plants, possibly because VirA does not recognize the signals released by monocots.

The VirC1 protein recognizes and interacts with the nucleotides at the right end of the T-DNA. Although the two ends of the T-DNA are very similar in sequence (differing in only 2 of the 25 nucleotides required for T-DNA transfer), it is the right-end nucleotides that are decisive for cutting the T-DNA out of the plasmid. Mutations at the right end prevent excision of the T-DNA, whereas mutations at the left end have no effect on the transfer of T-DNA from the bacterial cell into the nucleus of the host cell. This shows that cutting begins on the right side and proceeds towards the left.

It is particularly noteworthy that only a single strand of the T-DNA is cut out and transferred to the plant cell. This single strand is called the T-strand. The 5' end of the T-strand corresponds to the right end of the T-DNA. The VirD1 and VirD2 proteins are involved in cutting the T-strand out of the plasmid. The VirE2 protein then binds the T-strand along its length. The VirD2 protein plays an important role in transport.


VirD2 consists of several domains with different activities, related to functions such as cutting and transporting the T-strand. Besides taking part in the cutting reaction at the right end of the T-DNA, VirD2 also binds the 5' end of the T-strand to form a complex. The T-DNA is thus exported from the bacterial cell as a complex and enters the nucleus of the host cell. Experiments with transgenic plants have detected VirD2 in the plant cell nucleus.

Moving the T-strand out of the bacterial cell and integrating it into the host genome is a complex process requiring many proteins. Among them, the virB operon in the vir region has a special role. This operon is 9.5 kb long and encodes 11 proteins, most of which are secreted or located in the cell membrane. They include an ATPase (VirB11) and hydrophobic proteins that form a channel in the membrane. Researchers believe that one of the proteins encoded by this operon is located on the outer side of the membrane, where it interacts with plant cell proteins to form a channel that delivers the T-DNA to the nucleus of the host cell.

c/ Genes in the T-DNA region are integrated into the plant cell genome and change the state of these cells. The T-DNA is a DNA segment about 23 kb long (depending on the A. tumefaciens strain) located on the Ti plasmid. Its two ends contain 25-bp repeats that are almost identical, differing in only two nucleotides (an imperfect repeat sequence). The nucleotides at the right end are important for cutting the T-DNA. The nucleotides at the left end play a role in integrating the T-DNA into the host genome.

The T-DNA contains two groups of genes. The first group consists of oncogenes whose mode of action differs between A. tumefaciens and A. rhizogenes, leading to either galls or hairy roots. In the case of galls, the T-DNA carries three oncogenes encoding enzymes that take part in the synthesis of the growth hormones auxin and cytokinin. The oncogenes on the T-DNA become active only once the T-DNA has been integrated into the plant genome. Any host cell with T-DNA integrated into its genome therefore immediately grows abnormally, owing to the disturbance of growth hormones encoded by the T-DNA. A gall at the site of A. tumefaciens infection is a collection of normal cells and cells with an altered genome.
In hairy root disease, the transferred DNA of A. rhizogenes (R-DNA) contains oncogenes whose products change the sensitivity threshold of plant cells to the hormone concentration in their environment. This disturbs the development of cells carrying the integrated R-DNA, so that many roots appear at the infection site.

The second group of genes on the T-DNA encodes enzymes that synthesize nutrient compounds needed for the growth and development of the bacteria. These compounds are collectively called opines. The bacteria use opines as a source of carbon and nitrogen. Notably, when the T-DNA carries the gene for a particular opine, the Ti plasmid itself, outside the T-DNA, carries genes for the catabolism of that opine, which help the bacteria grow and develop. The synthesized opine in turn acts as a signal that stimulates the operon containing the opine-assimilation genes on the Ti plasmid. Because opines are synthesized by enzymes encoded by bacterial genes, their presence in plant cells is taken as a marker of successful integration of the T-DNA into the plant genome.

The T-DNA is transported into the nucleus of the host cell and integrated into the genome. Several copies of the T-DNA may be present in one genome. A circular form of T-DNA is sometimes found in host plant cells. Whether this is an intermediate or merely a chance joining of the left and right ends of the T-DNA remains to be clarified. The genes on the T-DNA become active only after integration into the host genome. It is therefore noteworthy that the (prokaryotic) genes on the T-DNA function only under the control of transcription factors of the eukaryotic genome. In general, a gene is controlled by a promoter.

Agrobacterium is able to introduce foreign genes into the plant genome. It is therefore used as a gene transfer vector once the gall-inducing genes on the T-DNA have been replaced by the gene under study. In addition, the promoters of the T-DNA genes are all strongly active in eukaryotic cells. They are therefore used as promoters for reporter genes or to drive foreign genes in gene transfer techniques. This technique is applied very widely in agriculture, for example to introduce genes for resistance to pests and diseases, or for tolerance of harsh growing conditions, into rare and valuable or high-yielding crops.

1.7 DNA in mitochondria and chloroplasts

In eukaryotic cells, DNA is found not only in the nucleus but also in mitochondria and chloroplasts. Most DNA molecules in these organelles are circular. In some cases, however, organelle DNA is linear: the mitochondrial DNA of Paramecium, Chlamydomonas, and some yeasts is always a linear molecule.

Each cell usually contains many mitochondria or chloroplasts, and each organelle may contain several DNA molecules. The number of mitochondrial DNA (mtDNA) or chloroplast DNA (cpDNA) molecules can therefore reach thousands of copies per cell. For example, each human cell has up to 8000 mtDNA molecules, with about 10 molecules per mitochondrion. Mammalian egg cells contain up to 10^8 copies of mtDNA. The microalga Chlamydomonas contains about 1000 chloroplast DNA molecules per cell.

The size of organelle DNA is not proportional to the complexity of the organism. The mtDNA molecule varies very widely, from 16-17 kb in vertebrates to 2500 kb in some flowering plants. Being much smaller than the nuclear genome, organelle DNA contains fewer genes, and these are more tightly packed (the distance between two genes is very small, sometimes only a few nucleotides). mtDNA and cpDNA contain genes encoding proteins that carry out the specific metabolic functions of the mitochondrion or chloroplast, such as proteins of the respiratory chain. In addition, mitochondrial and chloroplast DNA contain genes for the rRNAs, tRNAs, and ribosomal proteins used by the organelle itself.
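The copy-number figures for human cells can be cross-checked with simple arithmetic. The sketch below derives the implied number of mitochondria per cell and the total amount of mtDNA per cell. The variable names are illustrative, and the mtDNA length of 16,569 bp is taken from the human genome discussion in section 1.8.2 rather than from this paragraph.

```python
# Figures quoted in the text for a human cell.
MTDNA_PER_CELL = 8000          # mtDNA molecules per cell (upper figure)
MTDNA_PER_MITOCHONDRION = 10   # approximate molecules per mitochondrion
MTDNA_LENGTH_BP = 16_569       # human mtDNA length (section 1.8.2)

# Implied number of mitochondria per cell.
mitochondria_per_cell = MTDNA_PER_CELL // MTDNA_PER_MITOCHONDRION

# Total mitochondrial DNA per cell, in base pairs.
total_mtdna_bp = MTDNA_PER_CELL * MTDNA_LENGTH_BP

print(mitochondria_per_cell)  # 800
print(total_mtdna_bp)         # 132552000
```

The implied 800 mitochondria per cell agrees with the figure quoted in section 1.8.2, and the total of roughly 133 Mb shows that many small molecules can add up to a substantial amount of DNA.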

1.7.1. Mitochondrial DNA

The nucleotide sequence of mtDNA has been determined for a number of organisms. These results give a clearer picture of the structure and gene order of the organelle DNA. In vertebrates, mitochondrial DNA is small; its genes have no introns and there is almost no space between genes. For example, human mtDNA comprises 16,569 bp and 37 genes, of which 22 encode tRNAs, 13 encode polypeptides involved in oxidation-reduction reactions, and the remaining 2 encode rRNAs. In yeast, mtDNA is larger than in animals (78,000 bp) because some genes have introns and the distances between genes are fairly large. Yeast mtDNA has at least 33 genes: 2 encoding rRNAs, 23 encoding tRNAs, 1 encoding a ribosomal protein, and 7 encoding polypeptides involved in oxidation-reduction reactions.

Plant mtDNA is the largest and the most complex and diverse in structure. The mtDNA of Marchantia polymorpha, a primitive non-vascular plant, has been completely sequenced. It is a circular molecule of 186 kb containing 94 open reading frames (ORFs).


Of these 94 ORFs, experiments have so far identified only some as genes, with the number of introns reaching 32. In vascular plants, mtDNA is much larger still. In maize and watermelon, for example, the mtDNA is 570 kb and 300 kb respectively. In higher plants, genes whose products have the same function in the cell may be located at different positions on the mtDNA molecule.

1.7.2. Chloroplast DNA

Plants have three different types of plastid, depending on the compounds they store: starch, pigments, or lipids. All three contain DNA molecules (cpDNA), whose size ranges from 85 to 292 kb in algae and from 120 to 160 kb in higher plants. In some organisms, such as the green alga Acetabularia, cpDNA is as large as 2000 kb. The cpDNA of several plants has been sequenced. The chloroplast of tobacco, Nicotiana tabacum, has a cpDNA of 155,844 bp containing about 150 genes.

The number of cpDNA molecules per cell depends on the number of chloroplasts per cell and the number of cpDNA molecules per chloroplast. For example, a cell of the unicellular alga Chlamydomonas reinhardtii has a single chloroplast containing about 100 cpDNA molecules. The genes on cpDNA include those encoding rRNAs, tRNAs, ribosomal proteins, and a number of polypeptides involved in photosynthesis and the capture of solar energy.

1.8 Genomics

1.8.1 Comparing genomes

Using the nucleotide sequences of a number of representative genomes, biologists can analyse the structure, activity, and function of genes and clarify the roles of repetitive DNA, intergenic DNA, non-coding DNA (the untranslated 5' and 3' regions), and the introns of each gene. Of particular significance, comparing genomes with one another provides an overview of how genomes function in different organisms, of the relationships between them, of biodiversity, and of evolutionary level.

For example, the complete nucleotide sequence of the Arabidopsis genome was determined at the end of 2000, with the aim of finding and isolating important genes of agricultural crops on the basis of their homology with Arabidopsis genes. Arabidopsis was the first plant to have its genome completely sequenced, because its genome is relatively small (130-140 Mbp, about 200 times smaller than the genomes of some other plants). The haploid chromosome set of Arabidopsis consists of 5 chromosomes. In addition, Arabidopsis has a short life cycle, is easy to grow, and can be grown all year round. The plant is small and takes up very little space, which makes it well suited to cultivation in the laboratory.

The genome sequences of model organisms are deposited in different DNA databases according to the purpose of the research. The three main databases that currently store most DNA information are EMBL (at the European Bioinformatics Institute), GenBank (at the US National Center for Biotechnology Information), and DDBJ (the DNA Data Bank of Japan). Besides whole-genome sequences, other types of DNA such as cDNA and expressed sequence tags (ESTs) are also stored, to support the comparison, analysis, and functional identification of genomes, genes, and the corresponding products (protein or RNA).


Comparing the genomes of different species reveals three striking features.

First, chromosome number varies greatly, even between very closely related species.

Second, genes are usually distributed without any regular pattern. A gene, or a family of many genes encoding products with the same function, may be located on different chromosomes, clustered or scattered through the genome. The distribution of the genes encoding rRNA, for example, is shown in Figure 1.17.

Third, genome size is not strictly proportional to the complexity of the species. In general, genome size tends to reflect complexity, but an increase in the number of genes is not the same thing as a higher evolutionary level. Only by comparing the complete genome sequences of several organisms, together with the activity of some genes important in growth and development, did biologists recognize that genome complexity is associated mainly with an increase in the number of repeated DNA segments. For example, the genomes of some amphibians and plants are about 10^11 bp in size, of which repetitive DNA makes up more than 60-70%. The human genome is smaller, only about 3 x 10^9 bp. Clearly, genome size alone cannot determine the complexity or evolutionary level of a species.
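The size comparison in the third point can be made concrete with the figures quoted above. The sketch below computes how many times larger a 10^11 bp genome is than the human genome, and how much of it remains once 60-70% repetitive DNA is set aside. The variable names are illustrative and the calculation uses only the numbers given in the text.

```python
LARGE_GENOME_BP = 10**11               # some amphibians and plants
HUMAN_GENOME_BP = 3 * 10**9            # human genome, as quoted in the text
REPEAT_FRACTION_RANGE = (0.60, 0.70)   # repetitive DNA in the large genomes

# How many times larger the large genome is than the human genome.
size_ratio = LARGE_GENOME_BP / HUMAN_GENOME_BP

# Non-repetitive remainder for each end of the repeat-fraction range.
non_repeat_bp = [LARGE_GENOME_BP * (1 - f) for f in REPEAT_FRACTION_RANGE]

print(round(size_ratio, 1))                 # 33.3
print([f"{bp:.1e}" for bp in non_repeat_bp])
print(min(non_repeat_bp) > HUMAN_GENOME_BP)  # True
```

A roughly 33-fold difference in size, most of it repetitive DNA, illustrates why size alone is a poor indicator of the complexity of a species.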

Figure 1.17: Distribution of the rDNA corresponding to 45S rRNA and 5S rRNA in Triticeae species.

Besides overall comparison of whole genomes between species, detailed analysis of an individual gene also considers the positions of its introns and exons and the DNA segments that control its activity. These are important elements for comparison when looking for relationships between species. In addition, the total number of genes, the number of genes present in multiple copies, the proportion and composition of repetitive DNA, and the movement of genes from the separate DNA of the organelles (mitochondria, chloroplasts) into the nuclear genome are all shaped over time, that is, they all reflect the evolution of the species. For a more accurate and comprehensive comparison, the structure of the chromatin fibre and the three-dimensional configuration of the chromosomes, and of the whole genome within the nucleus, must also be taken into account.

1.8.2 The human genome

A project to sequence the human genome (the nuclear genome) was first discussed in the years 1984-1988. The project began in the early 1990s with the participation of more than 20 research groups from the USA, Japan, Germany, the United Kingdom, France, and China, carried out independently by the international Human Genome Organization (HUGO) and the private company Celera Inc. It proceeded in three basic steps: first, mapping all the genes (about 70,000 to 100,000 genes); next, establishing the physical map of the 24 chromosomes at the greatest level of detail that current techniques allowed; and finally, reading the nucleotide sequence of the whole genome.

The human genome is regarded as having two parts, one in the nucleus and one in the mitochondria. The mitochondrial DNA molecule is circular and 16,569 bp in size. This is so small as to be negligible compared with the nuclear genome. However, because mitochondria have only limited DNA repair capacity, mutations (insertions, deletions, or inversions) tend to accumulate in the DNA of this organelle. Moreover, each cell has about 800 mitochondria, and each mitochondrion has more than 10 DNA molecules. These molecules are not identical, because they carry mutations, which gives mitochondrial DNA very high diversity among cells even within a single individual.

By the end of 2000, more than 96% of the nucleotide sequence of the human genome had been published. The human genome is about 3.2 x 10^6 kb in size, that is, 3.2 Gb (the gigabase is the largest unit used to measure length on a physical map). Of this, about 2.95 Gb is euchromatin. Only 1.1 to 1.4% contains genes, encoding about 30,000-40,000 proteins, of which only one third have been identified; the rest are predicted proteins. The human genome has up to 1.4 million SNP markers. Repetitive DNA (SINEs, LINEs, LTRs, and transposons) makes up almost half of the genome (~43%). Most transposons and LTRs, however, are inactive.
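The percentages quoted for the human genome are easier to compare when converted to base pairs. The sketch below does this conversion using only the figures given in this section. The constant names, the dictionary keys, and the function name are illustrative assumptions, and the SNP spacing is a simple genome-wide average rather than a statement about how SNPs are actually distributed.

```python
GENOME_BP = 3.2e9               # total size, 3.2 Gb
EUCHROMATIN_BP = 2.95e9         # euchromatic portion
GENE_FRACTION = (0.011, 0.014)  # 1.1 to 1.4% contains genes
REPEAT_FRACTION = 0.43          # SINEs, LINEs, LTRs and transposons
SNP_COUNT = 1.4e6               # SNP markers


def summarize_human_genome() -> dict:
    """Convert the quoted fractions into base-pair quantities."""
    low, high = GENE_FRACTION
    return {
        "non_euchromatic_bp": GENOME_BP - EUCHROMATIN_BP,
        "gene_bp_low": GENOME_BP * low,
        "gene_bp_high": GENOME_BP * high,
        "repeat_bp": GENOME_BP * REPEAT_FRACTION,
        "mean_bp_per_snp": GENOME_BP / SNP_COUNT,
    }


for key, value in summarize_human_genome().items():
    print(f"{key}: {value:,.0f}")
```

On these figures, gene-containing sequence amounts to roughly 35 to 45 Mb, repeats to about 1.4 Gb, and there is on average one SNP marker about every 2.3 kb.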

1.8.3 Plant genomics research

The number of genes and the information available on the genomes of many organisms in general, as well as of