Anatomy of a Gene

Upload: mskiki

Post on 10-Apr-2018


  • 8/8/2019 Anatomy of a Gene

    1/33

    BASIC GENETIC MECHANISMS

    How did we know that genes are made of DNA?

    Streptococcus pneumoniae comes in two forms that differ in their microscopic appearance and in their ability to cause disease. Cells of the pathogenic strain, which are lethal when injected into mice, are encased in a slimy, glistening polysaccharide capsule; because its colonies look smooth, this strain is designated the S form. The harmless strain lacks this protective coat and forms colonies that appear flat and rough; it is referred to as the R form.

    Fred Griffith found in the 1920s that a substance present in the virulent S strain could permanently change, or transform, the nonlethal R strain into the deadly S strain.

    Avery, MacLeod, and McCarty reported in 1944 that they had prepared an extract from the disease-causing S strain and identified the transforming principle, the substance that permanently changes R-strain pneumococci into the lethal S strain, as DNA. This was the first evidence that DNA could serve as the genetic material.

    (A) In 1952, Hershey and Chase worked with T2 viruses, which are made of protein and DNA. (B) To determine whether the genetic material of the T2 virus is protein or DNA, the researchers radioactively labeled the DNA in one batch of viruses with 32P and the proteins in a second batch of viruses with 35S. These labeled viruses were then allowed to infect E. coli, and the mixture was disrupted by brief pulsing in a Waring blender to separate the infected bacteria from the empty viral heads. When radioactivity was measured, they found that most of the 32P-labeled DNA had entered the bacterial cells, while most of the 35S-labeled proteins remained in solution with the spent viral particles.

    WHAT IS A GENE?

    In molecular terms, a GENE is the entire DNA sequence required for synthesis of a functional protein or RNA molecule.

    A gene includes exons (coding), introns (noncoding), and control or regulatory regions.

    Most bacterial and yeast genes lack introns, whereas most genes in multicellular organisms contain them. The total length of intron sequences is often much longer than that of exon sequences.

    A simple eukaryotic transcription unit produces a single monocistronic mRNA, which is translated into a single protein.

    A bacterial operon comprises a single transcription unit, which is transcribed from a particular promoter into a single primary transcript. Because one operon usually contains several genes, genes and transcription units are distinguishable in prokaryotes.

    In eukaryotes, most genes and transcription units are identical, and the two terms are used interchangeably.

    A complex eukaryotic transcription unit is transcribed into a primary transcript that can be processed into two or more different monocistronic mRNAs, depending on the choice of splice sites or polyadenylation sites. Eukaryotic transcription units are classified into two types, depending on the fate of the primary (1°) transcript:

    1. The primary transcript produced from a simple transcription unit is processed to yield a single type of mRNA, encoding a single protein.

    2. In complex transcription units, the primary RNA transcript can be processed in more than one way, leading to formation of mRNAs containing different exons. Each mRNA is monocistronic, with translation usually initiating at the first AUG in the mRNA.
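The rule that translation of a monocistronic mRNA usually starts at the first AUG can be illustrated with a short Python sketch. It is a simplified model, not a description of the initiation machinery: the function name `translate_from_first_aug` and the example sequence are hypothetical, and the sequence context that influences start-site choice in real cells is ignored.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_first_aug(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon: no protein in this simplified model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the single reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: the second AUG lies downstream of the stop codon
# and is not used, so only one protein is produced.
example_mrna = "GGCAUGGCUUGGUAAAUGCCC"
print(translate_from_first_aug(example_mrna))
```

Only the reading frame set by the first AUG is translated here, which is the sense in which each mRNA is monocistronic.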

    (Top) If a primary transcript contains alternative splice sites, it can be processed into mRNAs with the same 5′ and 3′ exons but different internal exons.

    (Bottom) If a primary transcript has two poly(A) sites, it can be processed into mRNAs with alternative 3′ exons.
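The two processing routes in this caption can be sketched in Python as different choices of which exons are joined. This is a toy model under stated assumptions: the exon names and sequences are invented, and the helper `join_exons` simply concatenates exons rather than modeling the splicing or cleavage reactions.

```python
def join_exons(exons, order):
    """Build an mRNA by joining the named exons in order; introns are left out."""
    return "".join(exons[name] for name in order)


# Hypothetical exons of one complex transcription unit (made-up sequences).
exons = {
    "E1": "AUGGC",    # shared 5' exon
    "E2a": "CUUAG",   # alternative internal exon
    "E2b": "GGACA",   # alternative internal exon
    "E3a": "UUUAA",   # 3' exon used with the first poly(A) site
    "E3b": "CCGUGA",  # 3' exon used with the second poly(A) site
}

# Alternative splice sites: same 5' and 3' exons, different internal exon.
splice_variant_1 = join_exons(exons, ["E1", "E2a", "E3a"])
splice_variant_2 = join_exons(exons, ["E1", "E2b", "E3a"])

# Alternative poly(A) sites: same upstream exons, different 3' exon.
poly_a_variant_1 = join_exons(exons, ["E1", "E2a", "E3a"])
poly_a_variant_2 = join_exons(exons, ["E1", "E2a", "E3b"])

print(splice_variant_1, splice_variant_2)
print(poly_a_variant_1, poly_a_variant_2)
```

Each resulting mRNA is still a single message encoding a single protein; the variants differ only in which exons they carry.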

    If alternative promoters (f or g) are active in different cell types, mRNA1,

    produced in a cell type in which f is activated, has a different exon (1A) than

    mRNA2 has, which is produced in a cell type in which g is activated (and

    where exon 1B is used). Mutations in control regions (a and b) and those

    designated c within exons shared by the alternative mRNAs affect the proteins encoded by both alternatively processed mRNAs. In contrast,

    mutations (d and e) within exons unique to one of the alternatively processed

    mRNAs affect only the protein translated from that mRNA. For genes that are

    transcribed from different promoters in different cell types (bottom),

    mutations in different control regions (f and g) affect expression only in the

    cell type in which that control region is active.

    (a) The tryptophan (trp) operon is a continuous segment of the E. coli chromosome containing 5 genes (blue) that encode the enzymes necessary for the stepwise synthesis of tryptophan. The order of the genes in the bacterial genome parallels the sequential function of the encoded proteins in the tryptophan pathway. (b) The 5 genes encoding the enzymes required for tryptophan synthesis in yeast (Saccharomyces cerevisiae) are carried on 4 different chromosomes. Each gene is transcribed from its own promoter to yield a primary transcript that is processed into a functional mRNA encoding a single protein.

    MAJOR CLASSES OF EUKARYOTIC DNA AND THE HUMAN GENOME

    LINEs, SINEs, retroviral-like elements, and DNA-only transposons are all mobile genetic elements that have multiplied in our genome by replicating themselves and inserting the new copies in different positions. Simple sequence repeats are short nucleotide sequences (less than 14 nucleotide pairs) that are repeated for long stretches. Segmental duplications are large blocks of the genome (1000 to 200,000 nucleotide pairs) that are present at two or more locations in the genome. Over half of the unique sequence consists of genes, and the remainder is probably regulatory DNA. Most of the DNA present in heterochromatin has not yet been sequenced.

    PROTEIN-CODING GENES

    1. Solitary genes: roughly 25 to 50% of protein-coding genes are represented only once in the haploid genome.

    2. Duplicated genes constitute the second group of protein-coding genes. They have close but nonidentical sequences and generally are located within 5 to 50 kb of one another. In vertebrate genomes, duplicated genes constitute half the protein-coding DNA sequences.

    3. A gene family is a set of duplicated genes that encode proteins with similar but nonidentical amino acid sequences. The encoded, closely related, homologous proteins constitute a protein family. A few protein families, such as protein kinases, transcription factors, and vertebrate immunoglobulins, include hundreds of members.

    Numbers of gene families, classified by function, that are common to all 3 domains of the living world

    GENE FAMILY FUNCTION                              NUMBER
    Translation, ribosomal structure and biogenesis   61
    Transcription                                     5
    Replication, repair, recombination                13
    Cell division and chromosome partitioning         1
    Molecular chaperones                              9
    Outer membrane, cell-wall biogenesis              3
    Secretion                                         4
    Inorganic ion transport                           9
    Signal transduction                               1
    Energy production and conversion                  18
    Carbohydrate metabolism and transport             14
    Amino acid metabolism and transport               40
    Nucleotide metabolism and transport               15
    Coenzyme metabolism                               23
    Lipid metabolism                                  8
    General biochemical function predicted;
      specific biological role unknown                33
    Function unknown                                  1

    TANDEMLY REPEATED GENES encode rRNAs, tRNAs, and histones.

    rRNAs are encoded in tandem arrays in genomic DNA. Multiple copies of tRNA and histone genes also occur, often in clusters, but not generally in tandem arrays.

    REPETITIOUS DNA is concentrated in specific chromosomal locations.

    1. Simple-sequence or satellite DNA consists largely of quite short sequences repeated in long tandem arrays. It is preferentially located in centromeres (where it assists in attaching chromosomes to spindle fibers during mitosis), telomeres, and specific locations within the arms of particular chromosomes.

    Repeats containing 1 to 13 bp are often called microsatellites; their expansion causes about 14 neuromuscular diseases (for example, myotonic dystrophy and spinocerebellar ataxia).

    The length of a particular simple-sequence tandem array is quite variable between individuals in a species. These differences form the basis for DNA fingerprinting.
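The idea that array length differs between individuals can be sketched in Python by counting back-to-back copies of a repeat unit. This is an illustration only: the two allele sequences, the CAG repeat unit, and the function name `longest_tandem_run` are assumptions made for the example, and laboratory fingerprinting compares fragment lengths rather than scanning sequence text.

```python
import re


def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Two hypothetical alleles of the same locus: identical flanking DNA,
# different numbers of the repeat unit.
allele_1 = "GGATC" + "CAG" * 5 + "TTGCA"
allele_2 = "GGATC" + "CAG" * 9 + "TTGCA"

print(longest_tandem_run(allele_1, "CAG"))
print(longest_tandem_run(allele_2, "CAG"))
```

Comparing such repeat counts across several loci is what makes the combined pattern distinctive for an individual.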

    2. Mobile DNA elements are moderately repeated DNA sequences interspersed at multiple sites throughout the genomes of higher eukaryotes. They are less frequent in prokaryotes.

    a. DNA transposons are mobile DNA elements that transpose to new sites directly as DNA.

    b. Retrotransposons are first transcribed into an RNA copy of the element, which is then reverse-transcribed into DNA.

    A common feature of all mobile elements is the presence of short direct repeats flanking the sequence. Enzymes encoded by the mobile elements themselves catalyze insertion of these sequences at new sites in genomic DNA.

    (a) Eukaryotic DNA transposons (orange) move via a DNA intermediate, which is excised from the donor site. (b) Retrotransposons (green) are first transcribed into an RNA molecule, which then is reverse-transcribed into double-stranded DNA. In both cases, the double-stranded DNA intermediate is integrated into the target-site DNA to complete movement. Thus DNA transposons move by a cut-and-paste mechanism, whereas retrotransposons move by a copy-and-paste mechanism.
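The contrast between the two mechanisms can be sketched in Python by treating a genome as an ordered list of labeled segments. This is a toy model under stated assumptions: the segment labels and both function names are hypothetical, and the RNA and reverse-transcription steps of retrotransposition are collapsed into a single copy operation.

```python
def cut_and_paste(genome, element, target_index):
    """DNA transposon: excise the element from its donor site, then insert it."""
    donor_index = genome.index(element)
    remaining = genome[:donor_index] + genome[donor_index + 1:]
    # Removing the element shifts every later position left by one.
    if target_index > donor_index:
        target_index -= 1
    return remaining[:target_index] + [element] + remaining[target_index:]


def copy_and_paste(genome, element, target_index):
    """Retrotransposon: the donor copy stays put and a new copy is inserted."""
    return genome[:target_index] + [element] + genome[target_index:]


genome = ["geneA", "TE", "geneB", "geneC"]

moved = cut_and_paste(genome, "TE", 3)    # insert just before geneC
copied = copy_and_paste(genome, "TE", 3)  # insert just before geneC

print(moved, moved.count("TE"))
print(copied, copied.count("TE"))
```

The copy number stays the same after cut-and-paste but rises by one after copy-and-paste.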

    Retrotransposons are much more abundant than DNA transposons in vertebrates. However, DNA transposons that are similar in structure to bacterial IS elements do occur in eukaryotes (e.g., the Drosophila P element). The relatively large central region of an IS element, which encodes one or two enzymes required for transposition, is flanked by an inverted repeat at each end. The sequences of the inverted repeats are nearly identical, but they are oriented in opposite directions. The inverted-repeat sequence is characteristic of a particular IS element. The 5′ and 3′ short direct (as opposed to inverted) repeats are not transposed with the insertion element; rather, they are insertion-site sequences that become duplicated, with one copy at each end, during insertion of a mobile element. The length of the direct repeats is constant for a given IS element, but their sequence depends on the site of insertion and therefore varies with each transposition of the IS element. Arrows indicate sequence orientation.
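The difference between the element's own inverted repeats and the direct repeats created at the insertion site can be sketched in Python. All sequences and function names below are invented for illustration, and the inverted repeats are required to match exactly, which is stricter than the "nearly identical" repeats of real IS elements.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_inverted_repeats(element, repeat_len):
    """True if the two ends of the element are exact inverted repeats."""
    return element[:repeat_len] == reverse_complement(element[-repeat_len:])


def insert_with_target_site_duplication(target, position, dup_len, element):
    """Insert the element so that dup_len bases of target DNA flank it on both sides."""
    site = target[position:position + dup_len]
    return target[:position] + site + element + site + target[position + dup_len:]


# Hypothetical IS-like element: 5-bp inverted repeats around a central region.
element = "GGTCA" + "ATGAAACCC" + "TGACC"

# The same element inserted into two different made-up target sites.
insertion_1 = insert_with_target_site_duplication("AAACCGTTT", 3, 3, element)
insertion_2 = insert_with_target_site_duplication("TTTGATCCC", 3, 3, element)

print(has_inverted_repeats(element, 5))
print(insertion_1)
print(insertion_2)
```

In both insertions the flanking direct repeats are 3 bp long, but their sequence (CCG versus GAT) comes from the target site, while the inverted repeats travel with the element.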

    LTR retrotransposons, or viral retrotransposons (8% of human genomic DNA), are flanked by long terminal repeats (LTRs) similar to those in retroviral DNA; they encode reverse transcriptase and integrase. They move in the genome by being transcribed into RNA, which then undergoes reverse transcription and integration into the host-cell chromosome.

    The central protein-coding region is flanked by two long terminal repeats (LTRs), which are element-specific direct repeats. Like other mobile elements, integrated retrotransposons have short target-site direct repeats at each end. The protein-coding region constitutes 80% or more of a retrotransposon and encodes reverse transcriptase, integrase, and other retroviral proteins.

    The left LTR directs cellular RNA polymerase II to initiate transcription at the

    first nucleotide of the left R region. The resulting primary transcript extends

    beyond the right LTR. The right LTR, now present in the RNA primary transcript, directs cellular enzymes to cleave the primary

    transcript at the last nucleotide of the right R region and to add a poly(A)

    tail, yielding a retroviral RNA genome. A similar mechanism generates the

    RNA intermediate during transposition of retrotransposons. The short

    direct-repeat sequences (black) of target-site DNA are generated during

    integration of the retroviral DNA into the host-cell genome.

    The genomic RNA is packaged in the virion with a retrovirus-specific cellular tRNA hybridized to a complementary sequence near its 5′ end called the primer-binding site (PBS). The retroviral RNA has a short direct-repeat terminal sequence (R) at each end. The overall reaction, conversion of the RNA genome into double-stranded DNA, is carried out by reverse transcriptase.

    Nonviral retrotransposons are the most abundant mobile elements in mammals. They form two classes in mammalian genomes: LINEs and SINEs (long and short interspersed elements).

    Both LINEs and SINEs lack LTRs and have an A/T-rich stretch at one end. They move by a nonviral retrotransposition mechanism, mediated by LINE-encoded proteins, that involves priming by chromosomal DNA.

    SINE sequences exhibit extensive homology with small cellular RNAs transcribed by RNA polymerase III. Alu elements, the most common SINEs in humans, are roughly 300-bp sequences found scattered throughout the human genome.

    The length of the target-site direct repeats varies among copies of the element at different sites in the genome. Although the full-length L1 sequence is about 6 kb long, variable amounts of the left end are absent at over 90% of the sites where this mobile element is found. The shorter open reading frame (ORF1), about 1 kb in length, encodes an RNA-binding protein. The longer ORF2, about 4 kb in length, encodes a bifunctional protein with reverse transcriptase and DNA endonuclease activity.

    Only ORF2 protein is represented. Newly synthesized LINE DNA is shown in black.

    Some moderately repeated DNA sequences are derived from cellular RNAs that were reverse-transcribed and inserted into genomic DNA at some time in evolutionary history.

    Processed pseudogenes are derived from mRNAs and therefore lack introns, a feature that distinguishes them from pseudogenes that arose by sequence drift of duplicated genes.
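The intron criterion can be sketched in Python as a comparison between a genomic copy and the parent gene's exons and introns. This is a deliberately crude illustration: the sequences are invented, the function name `classify_copy` is hypothetical, and exact string matching stands in for the sequence alignment that would be needed for real copies that have drifted.

```python
def classify_copy(copy_seq, exons, introns):
    """Label a genomic copy of a gene by whether it carries the gene's introns."""
    spliced = "".join(exons)
    has_intron = any(intron in copy_seq for intron in introns)
    if spliced in copy_seq and not has_intron:
        return "processed"  # matches the spliced mRNA, with no introns
    if has_intron and all(exon in copy_seq for exon in exons):
        return "intron-containing"  # looks like a copy of the genomic gene
    return "unclassified"


# Hypothetical parent gene: three exons separated by two introns.
exons = ["ATGGCC", "GATTAC", "TTGTAA"]
introns = ["GTAAGTCCCAG", "GTGAGTTTTAG"]

# A copy resembling a reverse-transcribed mRNA: exons joined directly.
retro_copy = "CC" + "".join(exons) + "AAAAAAAA" + "GG"

# A copy resembling a duplicated gene: exons still interrupted by introns.
duplicated_copy = exons[0] + introns[0] + exons[1] + introns[1] + exons[2]

print(classify_copy(retro_copy, exons, introns))
print(classify_copy(duplicated_copy, exons, introns))
```

The check says nothing about whether a copy is functional; it only separates intron-free copies from intron-containing ones.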

    The human globin gene cluster contains two pseudogenes (white); these regions are related to the functional globin-type genes but are not transcribed. Each red arrow indicates the location of an Alu sequence, a roughly 300-bp noncoding repeated sequence that is abundant in the human genome.

    Mobile DNA elements were earlier viewed as selfish molecular parasites. Today, they are viewed as contributors to the evolution of higher organisms by promoting:

    - the generation of gene families via gene duplication
    - the creation of new genes via shuffling of preexisting exons
    - the formation of more complex regulatory regions that provide multifaceted control of gene expression

    Mobile DNA elements most likely influenced evolution significantly by serving as recombination sites and by mobilizing adjacent DNA sequences. They have also been found in mutant alleles associated with several human genetic diseases.

    Recombination between interspersed repeats in the introns of separate

    genes produces transcription units with a new combination of exons.

    A double crossover between two sets of Alu repeats results in an exchange

    of exons between the two genes.
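The exon exchange produced by a double crossover can be sketched in Python by modeling each gene as an ordered list of exons and Alu repeats. This is a schematic under stated assumptions: the gene layouts and the function name `double_crossover` are hypothetical, and each crossover is taken to occur at the first two Alu copies of each gene.

```python
def double_crossover(gene_1, gene_2, repeat="Alu"):
    """Swap the segments lying between the first two repeat copies of each gene."""

    def repeat_bounds(gene):
        first = gene.index(repeat)
        second = gene.index(repeat, first + 1)
        return first, second

    a1, a2 = repeat_bounds(gene_1)
    b1, b2 = repeat_bounds(gene_2)
    new_1 = gene_1[:a1 + 1] + gene_2[b1 + 1:b2] + gene_1[a2:]
    new_2 = gene_2[:b1 + 1] + gene_1[a1 + 1:a2] + gene_2[b2:]
    return new_1, new_2


# Hypothetical genes with Alu repeats in the introns on both sides of exon 2.
gene_1 = ["g1-exon1", "Alu", "g1-exon2", "Alu", "g1-exon3"]
gene_2 = ["g2-exon1", "Alu", "g2-exon2", "Alu", "g2-exon3"]

recombinant_1, recombinant_2 = double_crossover(gene_1, gene_2)
print(recombinant_1)
print(recombinant_2)
```

Each recombinant keeps its own first and last exons but carries the other gene's middle exon, giving a transcription unit with a new combination of exons.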

    Transposase can recognize and cleave the DNA at the ends of the transposon inverted repeats. If two copies of a transposon flank an exon in gene 1, and the transposase cleaves at the left end of the left transposon and at the right end of the right transposon, it can transpose all the intervening DNA, including the exon from gene 1, to a new site in an intron of gene 2. The net result is an insertion of the exon from gene 1 into gene 2.

    Some LINEs have weak poly(A) signals. If such a LINE is in the 3′-most intron of gene 1, during transposition its transcription may continue beyond its own poly(A) signals and extend into the 3′ exon, transcribing the cleavage and polyadenylation signals of gene 1 itself. This RNA can then be reverse-transcribed and integrated by the LINE ORF2 protein into an intron of gene 2, introducing a new 3′ exon (from gene 1) into gene 2.
