lec 8 genetics
THE FLOW OF GENETIC INFORMATION
DNA → RNA → PROTEIN
1. REPLICATION (DNA SYNTHESIS): DNA → DNA
2. TRANSCRIPTION (RNA SYNTHESIS): DNA → RNA
3. TRANSLATION (PROTEIN SYNTHESIS): RNA → PROTEIN
DNA Structure and Chemistry
a). Evidence that DNA is the genetic information
i). DNA transformation – know this term
ii). Transgenic experiments – know this process
iii). Mutation alters phenotype – be able to define genotype and phenotype
b). Structure of DNA
i). Structure of the bases, nucleosides, and nucleotides
ii). Structure of the DNA double helix
iii). Complementarity of the DNA strands
c). Chemistry of DNA
i). Forces contributing to the stability of the double helix
ii). Denaturation of DNA
Structures of the bases
Purines: Adenine (A), Guanine (G)
Pyrimidines: Thymine (T), Cytosine (C), 5-Methylcytosine (5mC)
[structure of deoxyadenosine]
Nucleoside
Nucleotide
Nomenclature
Base + deoxyribose = Nucleoside
Nucleoside + phosphate = Nucleotide
Purines (base → nucleoside)
• adenine → adenosine
• guanine → guanosine
• hypoxanthine → inosine
Pyrimidines (base → nucleoside)
• thymine → thymidine
• cytosine → cytidine
• uracil → uridine (+ribose)
ii). Structure of the DNA double helix
Structure of the DNA polynucleotide chain
• polynucleotide chain
• 3’,5’-phosphodiester bond
[Figure: polynucleotide chain with the 5’ and 3’ ends labeled]
A-T base pair
G-C base pair
Chargaff’s rule: The content of A equals the content of T, and the content of G equals the content of C in double-stranded DNA from any species
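Chargaff's rule follows directly from base pairing, which a few lines of code can confirm. This is a minimal sketch; the sequence and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Watson-Crick partners: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def duplex_base_counts(strand):
    """Count bases over both strands of the duplex formed by `strand`."""
    return Counter(strand) + Counter(reverse_complement(strand))

top = "ATGCGGCTTAAC"  # hypothetical single strand
counts = duplex_base_counts(top)
print("A == T:", counts["A"] == counts["T"])
print("G == C:", counts["G"] == counts["C"])
```

The equality holds for the duplex as a whole; a single strand taken alone need not obey it.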
Hydrogen bonding of the bases
[Figure: double-stranded “B” DNA with antiparallel strands (5’ to 3’ paired with 3’ to 5’), showing the major groove and the minor groove]
Chemistry of DNA
Forces affecting the stability of the DNA double helix
• hydrophobic interactions - stabilize - hydrophobic inside and hydrophilic outside
• stacking interactions - stabilize - relatively weak but additive van der Waals forces
• hydrogen bonding - stabilize - relatively weak but additive and facilitates stacking
• electrostatic interactions - destabilize
  - contributed primarily by the (negative) phosphates
  - affect intrastrand and interstrand interactions
  - repulsion can be neutralized with positive charges (e.g., positively charged Na+ ions or proteins)
[Figure: model of double-stranded DNA showing three base pairs, with stacking interactions and charge repulsion labeled]
Denaturation of DNA
Double-stranded DNA exposed to extremes in pH or high temperature:
1. A-T rich regions denature first
2. Cooperative unwinding of the DNA strands
3. Strand separation and formation of single-stranded random coils
Electron micrograph of partially melted DNA
• A-T rich regions melt first, followed by G-C rich regions
• Double-stranded, G-C rich DNA has not yet melted
• A-T rich region of DNA has melted into a single-stranded bubble
Hyperchromicity
The absorbance at 260 nm of a DNA solution increases when the double helix is melted into single strands.
[Figure: absorbance spectra (220 to 300 nm) of single-stranded and double-stranded DNA, with the absorbance maximum at 260 nm marked for each]
DNA melting curve
[Figure: percent hyperchromicity (0 to 100) versus temperature (50 to 90 °C)]
• Tm is the temperature at the midpoint of the transition
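Reading Tm off a melting curve amounts to finding where the curve crosses 50% hyperchromicity. This is a minimal sketch using linear interpolation; the data points are invented for illustration and are not taken from the slide.

```python
def melting_temperature(temps, percent_hyperchromicity):
    """Interpolate the temperature at which hyperchromicity reaches 50%.

    temps must be increasing; percent_hyperchromicity runs from 0 to 100.
    """
    points = list(zip(temps, percent_hyperchromicity))
    for (t0, h0), (t1, h1) in zip(points, points[1:]):
        if h0 <= 50 <= h1 and h1 > h0:
            # Linear interpolation between the two bracketing points.
            return t0 + (50 - h0) * (t1 - t0) / (h1 - h0)
    raise ValueError("curve does not cross 50% hyperchromicity")

# Hypothetical melting data (temperature in degrees C, percent hyperchromicity).
temps = [50, 60, 65, 70, 75, 80, 90]
percent = [0, 2, 10, 40, 80, 98, 100]
tm = melting_temperature(temps, percent)
print(f"Tm = {tm:.2f} C")
```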
Average base composition (G-C content) can be determined from the melting temperature of DNA
Tm is dependent on the G-C content of the DNA
[Figure: DNA melting curves (percent hyperchromicity versus temperature, 50 to 80 °C) for DNAs of different G-C content]
E. coli DNA is 50% G-C
Genomic DNA, Genes, Chromatin
a). Complexity of chromosomal DNA
i). DNA reassociation
ii). Repetitive DNA and Alu sequences
iii). Genome size and complexity of genomic DNA
b). Gene structure
i). Introns and exons
ii). Properties of the human genome
iii). Mutations caused by Alu sequences
c). Chromosome structure - packaging of genomic DNA
i). Nucleosomes
ii). Histones
iii). Nucleofilament structure
iv). Telomeres, aging, and cancer
DNA reassociation (renaturation)
Double-stranded DNA → denatured, single-stranded DNA → double-stranded DNA
1. Slower, rate-limiting, second-order process (rate constant k2) of finding complementary sequences to nucleate base-pairing
2. Faster, zippering reaction to form long molecules of double-stranded DNA
DNA reassociation kinetics for human genomic DNA
Cot1/2 = 1 / k2
• k2 = second-order rate constant
• Co = DNA concentration (initial)
• t1/2 = time for half reaction of each component or fraction
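The relation Cot1/2 = 1/k2 follows from second-order kinetics. This is a minimal sketch, assuming the ideal second-order rate law in which the single-stranded fraction is 1 / (1 + k2·Cot); the rate law and the k2 value used here are assumptions, not values from the slide.

```python
def fraction_reassociated(cot, k2):
    """Fraction of DNA reassociated at a given Cot for an ideal
    second-order reaction: 1 - 1 / (1 + k2 * Cot)."""
    return k2 * cot / (1.0 + k2 * cot)

def cot_half(k2):
    """Cot value at which half of the component has reassociated."""
    return 1.0 / k2

k2 = 0.01  # hypothetical rate constant
print(cot_half(k2))
print(fraction_reassociated(cot_half(k2), k2))
```

A component with a high k2 (a highly repeated sequence) therefore has a low Cot1/2 and reassociates early on the log Cot axis.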
[Figure: Cot curve, % DNA reassociated (0 to 100) versus log Cot, with a separate Cot1/2 marked for each component]
Kinetic fractions: fast (repeated), intermediate (repeated), slow (single-copy)
• high k2: 10^6 copies per genome of a “low complexity” sequence of e.g. 300 base pairs
• low k2: 1 copy per genome of a “high complexity” sequence of e.g. 300 x 10^6 base pairs
Type of DNA | % of Genome | Features
Single-copy (unique) | ~75% | Includes most genes (1)
Repetitive, interspersed | ~15% | Interspersed throughout genome between and within genes; includes Alu sequences (2) and VNTRs or mini (micro) satellites
Repetitive, satellite (tandem) | ~10% | Highly repeated, low complexity sequences usually located in centromeres and telomeres
(1) Some genes are repeated a few times to thousands-fold and thus would be in the repetitive DNA fraction
(2) Alu sequences are about 300 bp in length and are repeated about 300,000 times in the genome. They can be found adjacent to or within genes in introns or nontranslated regions.
[Figure: Cot curve for human genomic DNA: fast ~10%, intermediate ~15%, slow (single-copy) ~75%]
Classes of repetitive DNA
Interspersed (dispersed) repeats (e.g., Alu sequences)
GCTGAGG ... GCTGAGG ... GCTGAGG
Tandem repeats (e.g., microsatellites)
TTAGGGTTAGGGTTAGGGTTAGGG
Genome sizes in nucleotide pairs (base-pairs)
[Figure: ranges of genome size on a log scale from 10^4 to 10^11 base pairs for viruses, plasmids, bacteria, fungi, plants, algae, insects, mollusks, bony fish, amphibians, reptiles, birds, and mammals]
The size of the human genome is ~3 X 10^9 bp; almost all of its complexity is in single-copy DNA.
The human genome is thought to contain ~30,000 to 40,000 genes.
Gene structure
[Figure: a gene drawn 5’ to 3’ with position +1 marked, showing the following regions]
• promoter region
• exons (filled and unfilled boxed regions)
• introns (between exons)
• transcribed region
• translated region
• mRNA structure
The (exon-intron-exon)n structure of various genes
Gene | Total length | Exon length
histone | 400 bp | 400 bp
β-globin | 1,660 bp | 990 bp
HGPRT (HPRT) | 42,830 bp | 1,263 bp
factor VIII | ~186,000 bp | ~9,000 bp
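The exon share of each gene can be worked out from these totals. This is a minimal sketch, assuming the totals pair with the genes in order of size (histone 400 bp, globin 1,660 bp, HGPRT 42,830 bp, factor VIII ~186,000 bp); that pairing is an assumption read from the slide layout.

```python
# (total length in bp, exon length in bp); the gene-to-value pairing is assumed.
GENES = {
    "histone": (400, 400),
    "globin": (1660, 990),
    "HGPRT (HPRT)": (42830, 1263),
    "factor VIII": (186000, 9000),
}

def exon_percent(total_bp, exon_bp):
    """Percentage of the gene length that is exon sequence."""
    return 100.0 * exon_bp / total_bp

for name, (total_bp, exon_bp) in GENES.items():
    print(f"{name}: {exon_percent(total_bp, exon_bp):.1f}% exon")
```

The larger the gene, the smaller the exon share tends to be, because most of the extra length is intron.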
Properties of the human genome
Nuclear genome
• the haploid human genome has ~3 X 10^9 bp of DNA
• single-copy DNA comprises ~75% of the human genome
• the human genome contains ~30,000 to 40,000 genes
• most genes are single-copy in the haploid genome
• genes are composed of from 1 to >75 exons
• genes vary in length from <100 to >2,300,000 bp
• Alu sequences are present throughout the genome
Mitochondrial genome
• circular genome of ~17,000 bp
• contains <40 genes
Familial hypercholesterolemia
• autosomal dominant
• LDL receptor deficiency
Alu sequences can be “mutagenic”
From Nussbaum, R.L. et al. "Thompson & Thompson Genetics in Medicine," 6th edition (Revised Reprint), Saunders, 2004.
LDL receptor gene
• Alu repeats present within introns
• Alu repeats in exons
[Figure: unequal crossing over (X) between Alu repeats on two misaligned copies of the gene (exons 4, 5, and 6 shown); one product has a deleted exon 5 (the other product is not shown)]
Chromatin structure
EM of chromatin shows presence of nucleosomes as “beads on a string”
Nucleosome structure
Nucleosome core (left)
• 146 bp DNA; 1 3/4 turns of DNA
• DNA is negatively supercoiled
• two each: H2A, H2B, H3, H4 (histone octamer)
Nucleosome (right)
• ~200 bp DNA; 2 turns of DNA plus spacer
• also includes H1 histone
Histones (H1, H2A, H2B, H3, H4)
• small proteins
• arginine or lysine rich: positively charged
• interact with negatively charged DNA
• can be extensively modified - modifications in general make them less positively charged
  - Phosphorylation
  - Poly(ADP) ribosylation
  - Methylation
  - Acetylation
• Hypoacetylation by histone deacetylase (facilitated by Rb): “tight” nucleosomes associated with transcriptional repression
• Hyperacetylation by histone acetylase (facilitated by TFs): “loose” nucleosomes associated with transcriptional activation
Nucleofilament structure
Condensation and decondensation of a chromosome in the cell cycle
Telomeres and aging
[Figure: metaphase chromosome with a centromere and a telomere at each end]
Telomere structure (<1 to >12 kb)
• young: (TTAGGG)many
• senescent: (TTAGGG)few
Telomeres are protective “caps” on chromosome ends consisting of short 5-8 bp tandemly repeated GC-rich DNA sequences that prevent chromosomes from fusing and causing karyotypic rearrangements.
• telomerase (an enzyme) is required to maintain telomere length in germline cells
• most differentiated somatic cells have decreased levels of telomerase and therefore their chromosomes shorten with each cell division
The mammalian cell cycle
• G1 phase: Rapid growth and preparation for DNA synthesis
• S phase: DNA synthesis and histone synthesis
• G2 phase: Growth and preparation for cell division
• M phase: Mitosis
• G0: Quiescent cells
DNA replication is semi-conservative
Parental DNA strands
Daughter DNA strands
Each of the parental strands serves as a template for a daughter strand
Origins of DNA replication on mammalian chromosomes
• origins of DNA replication (every ~150 kb)
• replication bubble
• bidirectional replication
• fusion of bubbles
• daughter chromosomes
Initiation of DNA synthesis at the E. coli origin (ori)
[Figure series: events at the origin; A = dnaA, B = dnaB, C = dnaC, G = dnaG]
1. origin DNA sequence
2. binding of dnaA proteins
3. dnaA proteins coalesce
4. DNA melting induced by the dnaA proteins
5. dnaB and dnaC proteins bind to the single-stranded DNA
6. dnaB further unwinds the helix and displaces dnaA proteins
7. dnaG (primase) binds...
8. ...and synthesizes an RNA primer (~5 nucleotides) on the template strand
Primosome: dnaB (helicase), dnaC, dnaG (primase)
[Figure: DNA polymerase adds newly synthesized DNA to the 3’ OH of the RNA primer]
Discontinuous synthesis of DNA
Because DNA is always synthesized in a 5’ to 3’ direction, synthesis of one of the strands has to be discontinuous. This is the lagging strand.
Each replication fork has a leading and a lagging strand
• leading strand (synthesized continuously)
• lagging strand (synthesized discontinuously)
• The leading and lagging strand arrows show the direction of DNA chain elongation in a 5’ to 3’ direction
• The small DNA pieces on the lagging strand are called Okazaki fragments (100-1000 bases in length)
[Figure: replication bubble with two replication forks, showing RNA primers and the directions of leading strand and lagging strand synthesis]
Strand separation at the replication fork causes positive supercoiling of the downstream double helix
• DNA gyrase is a topoisomerase II, which breaks and reseals the DNA to introduce negative supercoils ahead of the fork
• Fluoroquinolone antibiotics target DNA gyrases in many gram-negative bacteria: ciprofloxacin and levofloxacin (Levaquin)
[Figure: movement of the replication fork]
[Figure series: processing of Okazaki fragments on the lagging strand]
1. DNA polymerase III (pol III) initiates at the primer and elongates DNA up to the next RNA primer; the newly synthesized DNA (100-1000 bases) is an Okazaki fragment
2. DNA polymerase I (pol I) initiates at the end of the Okazaki fragment and further elongates the DNA chain while simultaneously removing the RNA primer with its 5’ to 3’ exonuclease activity
3. DNA ligase seals the gap by catalyzing the formation of a 3’, 5’-phosphodiester bond in an ATP-dependent reaction
Proteins at the replication fork in E. coli
• Rep protein (helicase)
• Single-strand binding protein (SSB)
• Primosome: dnaB (B), dnaC (C), dnaG (G)
• pol III
• pol I
• DNA ligase
• DNA gyrase - this is a topoisomerase II, which breaks and reseals double-stranded DNA to introduce negative supercoils ahead of the fork
Components of the replication apparatus
Component | Function
dnaA | binds to origin DNA sequence
Primosome: dnaB | helicase (unwinds DNA at origin)
Primosome: dnaC | binds dnaB
Primosome: dnaG | primase (synthesizes RNA primer)
DNA gyrase | introduces negative supercoils ahead of the replication fork
Rep protein | helicase (unwinds DNA at fork)
SSB | binds to single-stranded DNA
DNA pol III | primary replicating polymerase
DNA pol I | removes primer and fills gap
DNA ligase | seals gap by forming 3’, 5’-phosphodiester bond
Properties of DNA polymerases
DNA polymerases of E. coli
Activity | pol I | pol II | pol III (core)
Polymerization: 5’ to 3’ | yes | yes | yes
Proofreading exonuclease: 3’ to 5’ | yes | yes | yes
Repair exonuclease: 5’ to 3’ | yes | no | no
• DNA polymerase III is the main replicating enzyme
• DNA polymerase I has a role in replication to fill gaps and excise primers on the lagging strand, and it is also a repair enzyme and is used in making recombinant DNA molecules
• all DNA polymerases require a primer with a free 3’ OH group
• all DNA polymerases catalyze chain growth in a 5’ to 3’ direction
• some DNA polymerases have a 3’ to 5’ proofreading activity
Types and rates of mutation
Type | Mechanism | Frequency
Genome mutation | chromosome missegregation (e.g., aneuploidy) | 10^-2 per cell division
Chromosome mutation | chromosome rearrangement (e.g., translocation) | 6 X 10^-4 per cell division
Gene mutation | base pair mutation (e.g., point mutation, or small deletion or insertion) | 10^-10 per base pair per cell division, or 10^-5 to 10^-6 per locus per generation
Many polymorphisms exist in the genome
• the number of existing polymorphisms is ~1 per 500 bp
• there are ~5.8 million differences per haploid genome
• polymorphisms were caused by mutations over time
• polymorphisms called single nucleotide polymorphisms (or SNPs) are being catalogued by the Human Genome Project as an ongoing project
Types of base pair mutations
normal sequence
CATTCACCTGTACCA
GTAAGTGGACATGGT

base pair substitutions
• transition: pyrimidine to pyrimidine (or purine to purine)
• transversion: pyrimidine to purine (or purine to pyrimidine)

transition (T-A to C-G)
CATCCACCTGTACCA
GTAGGTGGACATGGT

transversion (T-A to G-C)
CATGCACCTGTACCA
GTACGTGGACATGGT

deletion
CATCACCTGTACCA
GTAGTGGACATGGT

insertion
CATGTCACCTGTACCA
GTACAGTGGACATGGT

deletions and insertions can involve one or more base pairs
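A substitution can be classified mechanically: it is a transition when both bases are purines or both are pyrimidines, and a transversion otherwise. This is a minimal sketch applied to the top strands of the example sequences; the function names are hypothetical and positions are counted from 0.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single base change."""
    if ref == alt:
        raise ValueError("no substitution")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def find_substitutions(normal, mutant):
    """List (position, ref, alt, kind) for two strands of equal length."""
    if len(normal) != len(mutant):
        raise ValueError("length differs: deletion or insertion, not substitution")
    return [
        (i, ref, alt, classify_substitution(ref, alt))
        for i, (ref, alt) in enumerate(zip(normal, mutant))
        if ref != alt
    ]

normal = "CATTCACCTGTACCA"
print(find_substitutions(normal, "CATCCACCTGTACCA"))
print(find_substitutions(normal, "CATGCACCTGTACCA"))
```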
Spontaneous mutations can be caused by tautomers
Tautomeric forms of the DNA bases
• Adenine, Cytosine: AMINO ⇌ IMINO
• Guanine, Thymine: KETO ⇌ ENOL
Mutation caused by tautomer of cytosine
• normal tautomeric form: cytosine pairs with guanine
• rare imino tautomeric form: cytosine mispairs with adenine, resulting in a transition mutation
Mutation is perpetuated by replication
• replication of C-G should give daughter strands each with C-G
• tautomer formation in C during replication will result in mispairing and insertion of an improper A in one of the daughter strands
• which could result in a C-G to T-A transition mutation in the next round of replication, or if improperly repaired
[Figure: C-G replicates to give C-G and a mispaired C-A; in the next round the A strand templates a T-A pair]
Chemical mutagens
Deamination by nitrous acid
Attack by oxygen free radicals leading to oxidative damage
[Figure: structures of guanine and 8-oxyguanine (8-oxyG)]
• many different oxidative modifications occur
• by smoking, etc.
• 8-oxyG causes G to T transversions
• the MTH1 protein degrades 8-oxy-dGTP, preventing misincorporation
• mutation of the MTH1 gene causes increased tumor formation in mice
Ames test for mutagen detection
• named for Bruce Ames
• reversion of histidine mutations by test compounds
• His- Salmonella typhimurium cannot grow without histidine
• if test compound is mutagenic, reversion to His+ may occur
• reversion is correlated with carcinogenicity
Thymine dimer formation by UV light
Summary of DNA lesions
Lesion | Cause
Missing base | Acid and heat depurination (~10^4 purines per day per cell in humans)
Altered base | Ionizing radiation; alkylating agents
Incorrect base | Spontaneous deaminations: cytosine to uracil, adenine to hypoxanthine
Deletion-insertion | Intercalating reagents (acridines)
Dimer formation | UV irradiation
Strand breaks | Ionizing radiation; chemicals (bleomycin)
Interstrand cross-links | Psoralen derivatives; mitomycin C
Tautomer formation | Spontaneous and transient
Mechanisms of Repair
• Mutations that occur during DNA replication are repaired when possible by proofreading by the DNA polymerases
• Mutations that are not repaired by proofreading are repaired by mismatch (post-replication) repair followed by excision repair
• Mutations that occur spontaneously any time are repaired by excision repair (base excision or nucleotide excision)
Deamination of cytosine can be repaired
• cytosine → uracil
Deamination of 5-methylcytosine cannot be repaired
• 5-methylcytosine → thymine
More than 30% of all single base changes that have been detected as a cause of genetic disease have occurred at 5’-mCpG-3’ sites
[Figure: life span (1 to 100, log scale) versus DNA repair activity for shrew, mouse, rat, hamster, cow, elephant, and human]
Correlation between DNA repair activity in fibroblast cells from various mammalian species and the life span of the organism
Defects in DNA repair or replication
All are associated with a high frequency of chromosome and gene (base pair) mutations; most are also associated with a predisposition to cancer, particularly leukemias
• Xeroderma pigmentosum
  - caused by mutations in genes involved in nucleotide excision repair
  - associated with a >1000-fold increase of sunlight-induced skin cancer and with other types of cancer such as melanoma
• Ataxia telangiectasia
  - caused by mutations in a gene that detects DNA damage
  - increased risk from X-rays
  - associated with increased breast cancer in carriers
• Fanconi anemia
  - caused by mutations in a gene involved in DNA repair
  - increased risk from X-rays and sensitivity to sunlight
• Bloom syndrome
  - caused by mutations in a DNA helicase gene
  - increased risk from X-rays
  - sensitivity to sunlight
• Cockayne syndrome
  - caused by a defect in transcription-linked DNA repair
  - sensitivity to sunlight
• Werner’s syndrome
  - caused by mutations in a DNA helicase gene
  - premature aging
3. RNA Structure and Transcription
a). Chemistry of RNA
i). Bases found in RNA
ii). Ribose sugar
iii). RNA polynucleotide chain
iv). Secondary and tertiary structure
b). Characteristics of prokaryotic RNA
i). Classes of prokaryotic RNA
ii). Structure of prokaryotic messenger RNA
c). Transcription initiation in prokaryotes
i). Transcription
ii). Promoter structure
iii). Prokaryotic RNA polymerase structure
iv). Initiation of transcription and the sigma cycle
d). Regulation of the lactose operon
i). Function of the lactose operon
ii). Negative control: Lac repressor and inducer
iii). Positive control: CAP and cAMP
The major bases found in DNA and RNA
DNA | RNA
Adenine | Adenine
Cytosine | Cytosine
Guanine | Guanine
Thymine | Uracil (U)
[Figure: thymine-adenine base pair and uracil-adenine base pair]
RNA polynucleotide chain
• 2’ -OH makes 3’, 5’ phosphodiester bond unstable
DNA polynucleotide chain
[Figure: RNA secondary structure and tertiary structure]
Classes of prokaryotic RNA
• ribosomal RNA (rRNA)
  - 16S (small ribosomal subunit)
  - 23S (large ribosomal subunit)
  - 5S (large ribosomal subunit)
• transfer RNA (tRNA)
• messenger RNA (mRNA)
Structure of prokaryotic messenger RNA
5’ - PuPuPuPuPuPuPuPu (Shine-Dalgarno sequence) - AUG (initiation) - translated region - UAA (termination) - 3’
The Shine-Dalgarno (SD) sequence base-pairs with a pyrimidine-rich sequence in 16S rRNA to facilitate the initiation of protein synthesis
Transcription
RNA polymerase
closed promoter complex
open promoter complex
initiation
elongation
termination
RNA product
Promoter structure in prokaryotes
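[Figure: promoter upstream of the transcription start site, with positions -36, -31, -12, -7, +1, and +20 marked; the mRNA starts at +1 and carries the 5’ PuPuPuPuPuPuPuPu sequence and the AUG]
-30 region
TTGACA
AACTGT
-10 region (Pribnow box)
TATAAT
ATATTA
consensus sequences (percent occurrence of each base)
• -30: T 82, T 84, G 79, A 64, C 53, A 45%
• -10: T 79, A 95, T 44, A 59, A 51, T 96%
transcription start site: +1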
Prokaryotic RNA polymerase structure
RNA polymerase of bacteria is a multisubunit protein
Subunit | Number | Role
α | 2 | uncertain
β (Rifampicin target) | 1 | forms phosphodiester bonds
β’ | 1 | binds DNA template
σ | 1 | recognizes promoter and facilitates initiation
α2ββ’σ (holoenzyme) = α2ββ’ (core polymerase) + σ (sigma factor)
RNA polymerase holoenzyme (+ σ factor)
The sigma cycle
• closed promoter complex (moderately stable)
  - the sigma subunit binds to the -10 region
• open promoter complex (highly stable)
  - the holoenzyme has very high affinity for promoter regions because of sigma factor
• once initiation takes place, RNA polymerase does not need very high affinity for the promoter
• sigma factor dissociates from the core polymerase after a few elongation reactions
• elongation takes place with the core RNA polymerase
• sigma can re-bind other core enzymes
Mechanism of RNA synthesis
• RNA synthesis is usually initiated with ATP or GTP (the first nucleotide)
• RNA chains are synthesized in a 5’ to 3’ direction
[Figure: growing RNA chains base-paired to the DNA template (A = T, U = A)]
The lactose operon in E. coli
lac I | P O (promoter - operator) | lac Z | lac Y | lac A
• lac I encodes the lac repressor
• lac Z, lac Y, and lac A encode β-galactosidase, permease, and acetylase
• LACTOSE → GLUCOSE + GALACTOSE (catalyzed by β-galactosidase)
• the function of the lactose (lac) operon is to produce the enzymes required to metabolize lactose for energy when it is required by the cell
• promoter binds CAP and RNA polymerase
• operator binds the lac repressor
Regulation of the lactose operon - negative control
• the repressor tetramer binds to the operator and prevents RNA polymerase from binding to the promoter
• RNA polymerase is blocked from the promoter: NO TRANSCRIPTION
Alleviation of negative control - action of the inducer of the lac operon
• when lactose becomes available, it is taken up by the cell
• allolactose (an intermediate in the hydrolysis of lactose) is produced
• one molecule of allolactose binds to each of the repressor subunits
• binding of allolactose results in a conformational change in the repressor
• the conformational change results in decreased affinity of the repressor for the operator and dissociation of the repressor from the DNA
• IPTG (isopropyl thiogalactoside) is also used as a (non-physiological) inducer
• repressor (with bound allolactose) dissociates from the operator
• negative control (repression) is alleviated, however...
• RNA polymerase cannot form a stable complex with the promoter: NO TRANSCRIPTION
Regulation of the lactose operon - positive control
• in the presence of both lactose and glucose it is not necessary for the cell to metabolize lactose for energy
• in the absence of glucose and in the presence of lactose it becomes advantageous to make use of the available lactose for energy
• in the absence of glucose cells synthesize cyclic AMP (cAMP)
• cAMP (1) serves as a positive regulator of catabolite operons (lac operon)
• cAMP binds the dimeric cAMP binding protein (CAP) (2)
• binding of cAMP increases the affinity of CAP for the promoter
• binding of CAP to the promoter facilitates the binding of RNA polymerase
(1) cAMP = 3’, 5’ cyclic adenosine monophosphate
(2) also termed catabolite activator protein
inactive CAP + cAMP → active CAP
Activation of lac operon transcription
[Figure: with the repressor inactive and active CAP at the promoter, RNA pol binds and TRANSCRIPTION AND TRANSLATION OCCUR, producing β-galactosidase, permease, and acetylase]
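The combined negative and positive control can be summarized as a truth table. This is a deliberately simplified boolean sketch (it ignores basal transcription and treats each signal as all-or-none); the function name is hypothetical.

```python
def lac_operon_transcribed(lactose, glucose):
    """Simplified on/off model of lac operon control."""
    # Negative control: allolactose (made from lactose) releases the repressor.
    repressor_on_operator = not lactose
    # Positive control: without glucose, cAMP rises and CAP binds the promoter.
    cap_on_promoter = not glucose
    return (not repressor_on_operator) and cap_on_promoter

for lactose in (False, True):
    for glucose in (False, True):
        state = "TRANSCRIPTION" if lac_operon_transcribed(lactose, glucose) else "NO TRANSCRIPTION"
        print(f"lactose={lactose}, glucose={glucose}: {state}")
```

Only the combination of lactose present and glucose absent gives transcription in this model, which matches the two conditions described above.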
6. RNA Processing
a). Steps in mRNA processing
i). Capping
ii). Cleavage and polyadenylation
iii). Splicing
b). Chemistry of mRNA splicing
c). Spliceosome assembly and splice site recognition
i). Donor and acceptor splice sites
ii). Small nuclear RNAs
d). Mutations that disrupt splicing
e). Alternative splicing
Steps in mRNA processing (hnRNA is the precursor of mRNA)
• capping (occurs co-transcriptionally)
• cleavage and polyadenylation (forms the 3’ end)
• splicing (occurs in the nucleus prior to transport)
[Figure: pre-mRNA with exon 1, intron 1, and exon 2 gaining a cap, then a poly(A) tail, then losing the intron]
1. Transcription of pre-mRNA and capping at the 5’ end
2. Cleavage of the 3’ end and polyadenylation
3. Splicing to remove intron sequences
4. Transport of mature mRNA to the cytoplasm
Capping occurs co-transcriptionally shortly after initiation
• guanylyltransferase (nuclear) transfers G residue to 5’ end
• methyltransferases (nuclear and cytoplasmic) add methyl groups to 5’ terminal G and at two 2’ ribose positions on the next two nucleotides
• capping involves formation of a 5’- 5’ triphosphate bond
• cap function
  - protects 5’ end of mRNA (increases mRNA stability)
  - required for initiation of protein synthesis
pppNpN → mGpppNmpNm
Polyadenylation
• cleavage of the primary transcript occurs approximately 10-30 nucleotides 3’-ward of the AAUAAA consensus site
• polyadenylation catalyzed by poly(A) polymerase
• approximately 200 adenylate residues are added
• poly(A) is associated with poly(A) binding protein (PBP)
• function of poly(A) tail is to stabilize mRNA
[Figure: mGpppNmpNm...AAUAAA transcript; cleavage downstream of AAUAAA, then polyadenylation adds the A residues at the 3’ end]
Chemistry of mRNA splicing
• two cleavage-ligation reactions
• transesterification reactions - exchange of one phosphodiester bond for another - not catalyzed by traditional enzymes
• branch site adenosine forms 2’, 5’ phosphodiester bond with guanosine at 5’ end of intron
Pre-mRNA
5’ exon 1 G-p-G-U ... 2’OH-A (branch site adenosine) ... A-G-p-G exon 2 3’
First cleavage-ligation (transesterification) reaction
Splicing intermediate
• exon 1 ending in G-OH 3’
• intron 1 joined to exon 2, with the 5’ end of the intron linked to the branch site (U-G-5’-p-2’-A) and the 3’ splice site A-G-p-G still intact
Second cleavage-ligation reaction
Spliced mRNA
• 5’ exon 1 G-p-G exon 2 3’
• Lariat: intron 1 with the U-G-5’-p-2’-A branch
• ligation of exons releases lariat RNA (intron)
Mutations that disrupt splicing
• β0-thalassemia - no β-chain synthesis
• β+-thalassemia - some β-chain synthesis
Normal splice pattern:
Exon 1 - Intron 1 - Exon 2 - Intron 2 - Exon 3
Donor site: /GU Acceptor site: AG/
Intron 2 acceptor site mutation: no use of mutant site; use of cryptic splice site in intron 2
• mutant site: GG/
• Intron 2 cryptic acceptor site: UUUCUUUCAG/G
Translation of the retained portion of intron 2 results in premature termination of translation due to a stop codon within the intron, 15 codons from the cryptic splice site
Patterns of alternative exon usage
• one gene can produce several (or numerous) different but related protein species (isoforms)
• Cassette
• Mutually exclusive
• Internal acceptor site
• Alternative promoters
The Troponin T (muscle protein) pre-mRNA is alternatively spliced to give rise to 64 different isoforms of the protein
• Constitutively spliced exons (exons 1-3, 9-15, and 18)
• Mutually exclusive exons (exons 16 and 17)
• Alternatively spliced exons (exons 4-8)
Exons 4-8 are spliced in every possible way, giving rise to 32 different possibilities. Exons 16 and 17, which are mutually exclusive, double the possibilities; hence 64 isoforms.
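The count of 64 can be checked by enumeration: each of the five alternatively spliced exons is either included or skipped (2^5 = 32), and exactly one of the two mutually exclusive exons is used (x 2). This is a minimal sketch that lists only the variable exons; the variable names are hypothetical.

```python
from itertools import product

ALTERNATIVE_EXONS = [4, 5, 6, 7, 8]   # each one independently included or skipped
MUTUALLY_EXCLUSIVE = [16, 17]         # exactly one of these is used

isoforms = []
for choices in product([False, True], repeat=len(ALTERNATIVE_EXONS)):
    included = [exon for exon, keep in zip(ALTERNATIVE_EXONS, choices) if keep]
    for exclusive_exon in MUTUALLY_EXCLUSIVE:
        isoforms.append(tuple(included + [exclusive_exon]))

print(len(isoforms))
```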
7. Protein Synthesis and the Genetic Code
a). Overview of translation
i). Requirements for protein synthesis
ii). messenger RNA
iii). Ribosomes and polysomes
iv). Polarity of protein synthesis
b). Transfer RNA
i). tRNA as an adaptor
ii). Amino acid activation
iii). Aminoacyl tRNA synthetases
iv). “Charged” tRNA
c). The genetic code
i). Codon-anticodon interactions
ii). Initiation codon in prokaryotes vs. eukaryotes
iii). Reading frame
d). Mutations affecting translation
i). Frameshift mutations
ii). Missense and nonsense mutations
Overview of translation
• last step in the flow of genetic information
• definition of translation
• requirements for protein synthesis
  - mRNA
  - ribosomes
  - initiation factors
  - elongation and termination factors
  - GTP
  - aminoacyl tRNAs
    - amino acids
    - aminoacyl tRNA synthetases
    - ATP
Messenger RNA (mRNA)
5’ m7Gppp (cap) - 5’ untranslated region - AUG (initiation codon) - translated (coding) region - UGA (termination codon) - 3’ untranslated region (contains AAUAAA) - (AAAA)n (poly(A) tail) 3’
Ribosomes
• prokaryotic ribosome: 70S ribosome
  - 50S subunit: 23S rRNA, 5S rRNA, 35 proteins
  - 30S subunit: 16S rRNA, 21 proteins
• eukaryotic ribosome: 80S ribosome
  - 60S subunit: 28S rRNA, 5S rRNA, 5.8S rRNA, 49 proteins
  - 40S subunit: 18S rRNA, 33 proteins
Polysomes
• direction of translation is 5’ to 3’ along the mRNA
• direction of protein synthesis is N terminus to C terminus
[Figure: polysome on an mRNA running 5’ from AUG to UGA; each ribosome (small and large ribosomal subunits) carries a nascent polypeptide with its N terminus (N) free; the subunits dissociate at the end of the message]
Transfer RNA
• tRNA is the “adaptor” molecule in protein synthesis
• acceptor stem
  - CCA-3’ terminus to which amino acid is coupled
  - carries amino acid on terminal adenosine
• anticodon stem and anticodon loop
Amino acid activation and aminoacyl tRNA synthetases
• aminoacyl tRNA synthetases are the enzymes that “charge” the tRNAs
• 20 amino acids
• one aminoacyl tRNA synthetase for each amino acid
• can be several different “isoacceptor” tRNAs for each amino acid
• all isoacceptor tRNAs for an amino acid use the same synthetase
• each aminoacyl tRNA synthetase binds
  - amino acid
  - ATP
  - isoacceptor tRNAs
Amino acid activation and tRNA charging
1. amino acid (H2N-CHR-CO-OH) + ATP → adenylated (activated) amino acid (H2N-CHR-CO-O-P-O-ribose-adenine) + PPi
2. adenylated amino acid + uncharged tRNA → aminoacyl (charged) tRNA + AMP, with the amino acid attached at the 3’ end of the tRNA
The genetic code
• consists of 64 triplet codons (A, G, C, U): 4^3 = 64
• all codons are used in protein synthesis
  - 20 amino acids
  - 3 termination (stop) codons: UAA, UAG, UGA
• AUG (methionine) is the start codon (also used internally)
• multiple codons for a single amino acid = degeneracy
• 5 amino acids are specified by the first two nucleotides only
• 3 additional amino acids (Arg, Leu, and Ser) are specified by six different codons
The Genetic Code
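UUU Phe | UCU Ser | UAU Tyr | UGU Cys
UUC Phe | UCC Ser | UAC Tyr | UGC Cys
UUA Leu | UCA Ser | UAA Stop | UGA Stop
UUG Leu | UCG Ser | UAG Stop | UGG Trp

CUU Leu | CCU Pro | CAU His | CGU Arg
CUC Leu | CCC Pro | CAC His | CGC Arg
CUA Leu | CCA Pro | CAA Gln | CGA Arg
CUG Leu | CCG Pro | CAG Gln | CGG Arg

AUU Ile | ACU Thr | AAU Asn | AGU Ser
AUC Ile | ACC Thr | AAC Asn | AGC Ser
AUA Ile | ACA Thr | AAA Lys | AGA Arg
AUG Met | ACG Thr | AAG Lys | AGG Arg

GUU Val | GCU Ala | GAU Asp | GGU Gly
GUC Val | GCC Ala | GAC Asp | GGC Gly
GUA Val | GCA Ala | GAA Glu | GGA Gly
GUG Val | GCG Ala | GAG Glu | GGG Gly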
Codon-anticodon interactions
• codon-anticodon base-pairing is antiparallel
• the third position in the codon is frequently degenerate
• one tRNA can interact with more than one codon (therefore 50 tRNAs)
• wobble rules
  - C with G or I (inosine)
  - A with U or I
  - G with C or U
  - U with A, G, or I
  - I with C, U, or A
mRNA 5’ A U G 3’
tRNAmet 3’ U A C 5’
mRNA 5’ C U A/G 3’ (third codon position is the wobble base)
tRNAleu 3’ G A U 5’
• one tRNAleu can read two of the leucine codons
Wobble Interactions
• Inosine = Cytidine
• Inosine = Adenosine
• Inosine = Uridine
• Guanosine = Uridine
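The wobble rules can be turned into a small lookup that lists the codons one anticodon can read. This is a minimal sketch, reading the rules as "first (5’) anticodon base → allowed third codon base"; that reading, the table names, and the function name are assumptions made for illustration.

```python
# Standard pairing for codon positions 1 and 2.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Wobble position: 5' base of the anticodon -> permitted 3' bases of the codon.
WOBBLE = {"C": "G", "A": "U", "G": "CU", "U": "AG", "I": "CUA"}

def codons_read(anticodon):
    """Codons (5' to 3') read by an anticodon written 5' to 3'."""
    wobble_base, middle, last = anticodon
    # Pairing is antiparallel: codon base 1 pairs with anticodon base 3.
    prefix = PAIR[last] + PAIR[middle]
    return [prefix + third for third in WOBBLE[wobble_base]]

# tRNAleu drawn as 3' G A U 5' is 5' U A G 3'.
print(codons_read("UAG"))
# tRNAmet drawn as 3' U A C 5' is 5' C A U 3'.
print(codons_read("CAU"))
```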
Initiation in prokaryotes and eukaryotes
• initiation can occur at internal AUG codons in prokaryotic mRNA
• initiation in eukaryotes occurs only at the first AUG codon
• lac operon in E. coli is transcribed as a polycistronic mRNA with multiple AUG codons
[Figure: lac operon (lac I, P O, lac Z, lac Y, lac A) with an AUG at the start of lac Z, lac Y, and lac A]
• polycistronic lac mRNA: 5’ - SD AUG - AUG - SD AUG
  - initiation codon with Shine-Dalgarno site
  - internal Met codon does not have Shine-Dalgarno site
• eukaryotic mRNA: 5’ cap - AUG - AUG - AUG
  - initiation can only occur at first AUG codon downstream of the 5’ cap
  - internal (downstream) Met codon cannot serve as an initiation site
Reading frame
• reading frame is determined by the AUG initiation codon
• every subsequent triplet is read as a codon until reaching a stop codon
...AGAGCGGA.AUG.GCA.GAG.UGG.CUA.AGC.AUG.UCG.UGA.UCGAAUAAA...
MET.ALA.GLU.TRP.LEU.SER.MET.SER
• a frameshift mutation
...AGAGCGGA.AUG.GCA.GA .UGG.CUA.AGC.AUG.UCG.UGA.UCGAAUAAA...
• the new reading frame results in the wrong amino acid sequence and the formation of a truncated protein
...AGAGCGGA.AUG.GCA.GAU.GGC.UAA.GCAUGUCGUGAUCGAAUAAA...
MET.ALA.ASP.GLY
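The effect of the frameshift can be reproduced by translating both sequences from the first AUG. This is a minimal sketch using the standard genetic code and one-letter amino acid symbols; the function name and the compact table construction are illustrative choices.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order, one letter per codon, '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna):
    """Translate from the first AUG until a stop codon (one-letter code)."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# The dots only mark codon boundaries in the slide and are stripped here.
normal = "AGAGCGGA.AUG.GCA.GAG.UGG.CUA.AGC.AUG.UCG.UGA.UCGAAUAAA".replace(".", "")
mutant = "AGAGCGGA.AUG.GCA.GA.UGG.CUA.AGC.AUG.UCG.UGA.UCGAAUAAA".replace(".", "")
print(translate(normal))  # M A E W L S M S in one-letter code
print(translate(mutant))  # M A D G in one-letter code
```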
Mutations affecting translation
• hemoglobin Wayne (3’ terminal frameshift mutation)
Normal α-globin
.ACG.UCU.AAA.UAC.CGU.UAA.GCU GGA GCC UCG GUA
.THR.SER.LYS.TYR.ARG
Wayne α-globin (mutated region)
.ACG.UCA.AAU.ACC.GUU.AAG.CUG.GAG.CCU.CGG.UAG
.THR.SER.ASN.THR.VAL.LYS.LEU.GLU.PRO.ARG
• missense mutations (e.g., AGC Ser to AGA Arg)
• nonsense mutations (e.g., UGG Trp to UGA Stop)
• read through, reverse terminator, or sense mutations (e.g., UAA Stop to CAA Gln) as in hemoglobin Constant Spring
• silent mutations (e.g., CUA Leu to CUG Leu) do not affect translation
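These four categories can be assigned by comparing what the original and mutated codons encode. This is a minimal sketch using the standard genetic code; the function name and category labels are chosen for illustration.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order, one letter per codon, '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_codon_change(ref_codon, alt_codon):
    """Classify a codon change by its effect on translation."""
    ref, alt = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref == alt:
        return "silent"
    if alt == "*":
        return "nonsense"
    if ref == "*":
        return "read-through"
    return "missense"

for ref, alt in [("AGC", "AGA"), ("UGG", "UGA"), ("UAA", "CAA"), ("CUA", "CUG")]:
    print(ref, "to", alt, ":", classify_codon_change(ref, alt))
```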
8. Protein Synthesis and Protein Processing
a). Ribosome structure
b). Protein synthesis
i). Initiation of protein synthesis
ii). Peptide bond formation; peptidyl transferase
iii). Elongation and termination
iv). Inhibitors of protein synthesis
Antiviral action of interferon
Induction of 2-5A synthase
Induction of eIF2 kinase
Antibiotics
c). Protein processing
i). Synthesis of secreted and integral membrane proteins
ii). Glycosylation and protein targeting
iii). Proteolytic processing
Learning Objectives for Lecture 8:
• Understand the structure of the ribosome in the context of the translation process
• Understand the steps in the initiation of protein synthesis
• Understand the mechanism of peptide bond formation, and that it is RNA catalyzed
• Understand the processes of elongation and termination
• Understand how interferon inhibits viral protein synthesis
• Understand the mechanisms by which antibiotics inhibit protein synthesis and how some organisms become resistant to antibiotics
• Understand how secreted and membrane-bound proteins are synthesized
• Understand how proteins are glycosylated and what the functions of the carbohydrates are
• Understand the role of proteolytic processing in protein maturation
Ribosome structure
[Figure: ribosome with bound tRNAs and mRNA. Labels: small subunit; large subunit; mRNA (5’ end); P-site (peptidyl tRNA site); A-site (aminoacyl tRNA site)]
[Figure: mRNA with 5’ cap and AUG codon; 40S subunit carrying the initiator tRNA (M) and eIF2]
Initiator tRNA bound to the small ribosomal subunit with the eukaryotic initiation factor-2 (eIF2)
Initiation of protein synthesis: mRNA binding
The small subunit finds the 5’ cap and scans down the mRNA to the first AUG codon
[Figure: 40S subunit with the initiator tRNA (M) and eIF2 positioned at the AUG codon of the mRNA]
• the initiation codon is recognized
• eIF2 dissociates from the complex
• the large ribosomal subunit binds
[Figure: 60S subunit joined to the complex; initiator tRNA (M) at the AUG codon of the mRNA]
• aminoacyl tRNA binds the A-site
• first peptide bond is formed
• initiation is complete
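[Figure: initiator tRNA carrying methionine (M) at the AUG codon and an aminoacyl tRNA carrying alanine (A) at the adjacent GCC codon; the structure of the methionyl group on its tRNA is shown]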
Peptide bond formation
• peptide bond formation is catalyzed by peptidyl transferase
• peptidyl transferase is contained within a sequence of 23S rRNA in the prokaryotic large ribosomal subunit; therefore, it is probably within the 28S rRNA in eukaryotes
• the energy for peptide bond formation comes from the ATP used in tRNA charging
• peptide bond formation results in a shift of the nascent peptide from the P-site to the A-site
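[Figure: before the reaction, methionyl tRNA occupies the P-site and alanyl tRNA occupies the A-site; after the reaction, the P-site tRNA is uncharged (free OH) and the Met-Ala dipeptide is attached to the A-site tRNA]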
Cech (2000) Science 289:878-879
Ban et al. (2000) Science 289:905-920
Nissen et al. (2000) Science 289:920-930
Large ribosomal subunit
Protein (purple) lies on the surface
23S RNA (orange and white) makes up the core of the subunit
• Structure shows only RNA in the active site
• Adenine 2451 carries out acid-base catalysis
Elongation
• following peptide bond formation the uncharged tRNA dissociates from the P-site
• the ribosome shifts one codon along the mRNA, moving peptidyl tRNA from the A-site to the P-site; this translocation requires the elongation factor EF2
• the next aminoacyl tRNA then binds within the A-site; this tRNA binding requires the elongation factor EF1
• energy for elongation is provided by the hydrolysis of two GTPs:
   • one for translocation
   • one for aminoacyl tRNA binding
[Figure: elongation cycle on the mRNA codons UCA GCA GGG UAG, with EF1 and EF2 labeled]
Termination
• when translation reaches the stop codon, a release factor (RF) binds within the A-site, recognizing the stop codon
• release factor catalyzes the hydrolysis of the completed polypeptide from the peptidyl tRNA, and the entire complex dissociates
[Figure: release factor (RF) bound at the UAG stop codon of the mRNA codons UCA GCA GGG UAG]
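The elongation and termination steps can be walked through with a small Python sketch that also tallies the GTP cost stated above (one GTP for aminoacyl tRNA binding, one for translocation). It is a simplification: the function name is illustrative, an AUG is prepended to the figure's codons as an assumption, only the codons in the example are decoded, and any GTP used during initiation or termination is not counted.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
# Only the codons needed for this example (standard genetic code).
CODE = {"AUG": "Met", "UCA": "Ser", "GCA": "Ala", "GGG": "Gly"}


def elongate_and_terminate(codons):
    """Return (peptide, GTPs hydrolyzed during elongation) for a list of codons.

    The first codon is assumed to be the initiation codon, already paired
    with the initiator tRNA in the P-site.
    """
    peptide = [CODE[codons[0]]]
    gtp_used = 0
    for codon in codons[1:]:
        if codon in STOP_CODONS:
            # Release factor binds the A-site; the polypeptide is released.
            return peptide, gtp_used
        gtp_used += 1                # EF1: aminoacyl tRNA binds the A-site
        peptide.append(CODE[codon])  # peptidyl transferase forms the bond
        gtp_used += 1                # EF2: translocation by one codon
    raise ValueError("no stop codon reached")


peptide, gtp = elongate_and_terminate(["AUG", "UCA", "GCA", "GGG", "UAG"])
print("-".join(peptide), gtp)
```

Each residue added after the initiator methionine costs two GTPs in this accounting, so a peptide of n residues uses 2(n - 1) GTPs during elongation.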
Induction and action of interferon
• virus invades cell
• cell makes interferon in response to viral RNA
• cell cannot protect itself: virus replicates, cell succumbs
• interferon binds to receptors on neighboring cells and activates the cells
• cell synthesizes antiviral proteins in response to interferon activation
• virus invades neighboring cell
• cell protected from viral infection by antiviral proteins
Functions of two antiviral proteins
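• interferon induces 2-5A synthase and eIF2 kinase
• 2-5A synthase, in the presence of viral dsRNA, converts ATP to oligo 2-5 adenylate (2-5A): [-A-2’-p-5’-A-2’-p-5’-A-]N
• 2-5A converts an inactive endonuclease to the active endonuclease: viral mRNA degraded
• eIF2 kinase, in the presence of viral dsRNA, phosphorylates eIF2 (active) to eIF2-P (inactive): viral protein synthesis cannot initiate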
Inhibitors of protein synthesis
Inhibitor          Process Affected           Site of Action
Kasugamycin        initiator tRNA binding     30S subunit
Streptomycin       initiation, elongation     30S subunit
Tetracycline       aminoacyl tRNA binding     A-site
Erythromycin       peptidyl transferase       50S subunit
Lincomycin         peptidyl transferase       50S subunit
Clindamycin        peptidyl transferase       50S subunit
Chloramphenicol    peptidyl transferase       50S subunit
Staphylococcus resistance to erythromycin
• certain strains of Staphylococcus can carry a plasmid that encodes an RNA methylase
• this RNA methylase converts a single adenosine residue in 23S rRNA to N6-dimethyladenosine
• this is the site of action of erythromycin, lincomycin, and clindamycin
• N6-dimethyladenosine blocks the action of these antibiotics
• the organism that produces erythromycin has its own RNA methylase and thus is resistant to the antibiotic it makes
Protein maturation: modification, secretion, targeting
[Figure: polysome for a secreted protein on an mRNA (5’, AUG) in the cytosol; SRP and SRP receptor at the ER membrane, with the ER lumen (c) on the other side]
1. translation initiates as usual on a cytosolic mRNA
2. the signal recognition particle (a) (SRP) binds the signal peptide (b) and halts translation
3. the SRP docks with the SRP receptor on the cytosolic side of the ER membrane and positions the signal peptide for insertion through a pore
(a) the signal recognition particle (SRP) consists of protein and RNA (7SL RNA); it binds to the signal peptide, to the ribosome, and to the SRP receptor on the ER membrane
(b) the signal peptide is a polypeptide extension of 10-40 residues, usually at the N-terminus of a protein, that consists mostly of hydrophobic amino acids
(c) ER = endoplasmic reticulum
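The description of the signal peptide (an N-terminal stretch of 10-40 residues, mostly hydrophobic) suggests a crude screen that can be sketched in Python. Everything specific here is an assumption made for illustration: the set of residues counted as hydrophobic, the 20-residue window, the 0.6 threshold, and both test sequences, which are made up. Real signal peptide prediction uses far more refined models.

```python
# Residues counted as hydrophobic for this sketch (an assumed, simplified set).
HYDROPHOBIC = set("AVLIMFW")


def hydrophobic_fraction(protein, window=20):
    """Fraction of hydrophobic residues in the N-terminal window."""
    n_terminal = protein[:window]
    if not n_terminal:
        return 0.0
    return sum(residue in HYDROPHOBIC for residue in n_terminal) / len(n_terminal)


def looks_like_signal_peptide(protein, window=20, threshold=0.6):
    """Crude screen: is the N-terminal window mostly hydrophobic?"""
    return hydrophobic_fraction(protein, window) >= threshold


# Hypothetical sequences, not real proteins.
secreted_like = "MALLLVLLAFAVWLAAGASA" + "DEKRSTNQ"
cytosolic_like = "MSDEKRKDSEEQTNSGKDEE" + "AVLIMFWA"

print(hydrophobic_fraction(secreted_like), looks_like_signal_peptide(secreted_like))
print(hydrophobic_fraction(cytosolic_like), looks_like_signal_peptide(cytosolic_like))
```

Only the N-terminal window is scored, so hydrophobic residues further along the chain do not affect the result.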
Translation of a secreted protein
[Figure: ribosomes on the mRNA (5’ end to the UGA stop codon) docked at the ER membrane, with the nascent polypeptide passing from the cytosol into the ER lumen]
4. translation resumes and the nascent polypeptide moves into the ER lumen
5. signal peptidase, which is in the ER lumen, cleaves off the signal peptide
6. the SRP is released and is recycled
7. the ribosomes dock onto the ER membrane; the rough ER is ER studded with polysomes
8. translation continues with the nascent polypeptide emerging into the ER lumen
9. at termination of translation, the completed protein is within the ER and is further processed prior to secretion
• Examples of secreted proteins:
   • polypeptide hormones (e.g., insulin)
   • albumin
   • collagen
   • immunoglobulins
• Integral membrane proteins are also synthesized by the same mechanisms; they may be considered “partially secreted”
• Examples of integral membrane proteins:
   • polypeptide hormone receptors (e.g., insulin receptor)
   • transport proteins
   • ion channels
   • cytoskeletal anchoring proteins (e.g., band 3)
Glycosylation of proteins
• most integral membrane proteins and secreted proteins are glycosylated
• during translation on the ER membrane the protein begins to be glycosylated
• various oligosaccharide modifications occur in the ER and in the Golgi complex
• O-linked (Ser, Thr linked) oligosaccharides (linked to hydroxyl group)
• N-linked (Asn linked) oligosaccharides (linked to amide group)
Biosynthesis of N-linked oligosaccharides (first 7 steps)
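[Figure: assembly of the oligosaccharide on dolichol phosphate (polyprenol lipid carrier) in the ER membrane. On the cytosolic side, (2) N-acetylglucosamine (GlcNAc) are added from UDP-GlcNAc, releasing (1) UMP and (1) UDP, followed by (5) mannose from GDP-mannose, releasing (5) GDP. After reorientation to the ER lumen, (4) mannose from dolichol-P-mannose and (3) glucose from dolichol-P-glucose are added. The legend symbols for GlcNAc, mannose, dolichol-P-mannose, and dolichol-P-glucose were lost in extraction.]
Monosaccharides are transferred by specific glycosyltransferases from nucleotide sugars
Dolichol-phosphates are the sugar donors in the ER lumen; they are synthesized in the cytosol prior to being translocated to the lumen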
Biosynthesis of N-linked oligosaccharides (second 7 steps)
[Figure: transfer of the oligosaccharide chain from dolichol pyrophosphate (PP) in the ER lumen to Asn-X-Ser (Thr) of the growing polypeptide]
Transfer of oligosaccharide chain to the growing polypeptide
Linkage is to the amide group of an asparagine followed by any (X) amino acid (except proline) followed by serine or threonine
Following synthesis, the protein is transferred to the Golgi complex, where trimming and further building of the oligosaccharides occur
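The linkage rule above (asparagine, then any residue except proline, then serine or threonine) is a simple sequence pattern, so candidate N-glycosylation sites can be located with a regular expression. The Python sketch below is illustrative: the function name and the test sequence are made up, and a match only marks a potential site, since not every such site is glycosylated in a real protein.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr (one-letter codes).
# The lookahead makes overlapping sites visible as separate matches.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glycosylation_sites(protein):
    """Return 0-based positions of Asn residues in Asn-X-Ser/Thr (X not Pro)."""
    return [match.start() for match in SEQUON.finditer(protein)]


# Hypothetical sequence: N-A-S matches, N-P-T is excluded, N-N-T and N-T-S overlap.
example = "MNASNPTANNTSQ"
print(find_n_glycosylation_sites(example))
```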
Transfer of oligosaccharide to protein
[Figure: oligosaccharide attached to Asn-X-Ser (Thr) in the Golgi lumen, before and after processing to a complex type oligosaccharide; the common core structure is marked. The legend symbols for fucose, galactose, and sialic acid were lost in extraction.]
Trimming by glycosidases; building by glycosyltransferases
Fucose, galactose, and sialic acid come from nucleotide sugars translocated across the Golgi membrane
The type of carbohydrate determines whether the protein is targeted to the membrane, to a vesicle, or is secreted
Formation of complex type oligosaccharides
Targeting of proteins to lysosomes (I-cell disease)
[Figure: phosphorylation (P) of mannose residues on an Asn-linked oligosaccharide, using UDP-GlcNAc]
• Proteins containing mannose-6-phosphate are targeted to lysosomes
• These proteins include the lysosomal hydrolases
• Phosphate groups are added to mannose by transfer of GlcNAc phosphate from UDP-GlcNAc
• Patients with I-cell (for inclusion body) disease have a deficiency in the enzyme that transfers GlcNAc phosphate to mannose residues in the Golgi
• As a result, the hydrolases cannot be targeted to the lysosomes
• The resulting deficiency in lysosomal hydrolases results in an accumulation (inclusions) of material in the lysosomes
Proteolytic processing
Processing of insulin (synthesized in the ER of pancreatic β-cells)
• Preproinsulin: cleavage of the N-terminal signal peptide by signal peptidase
• Proinsulin: disulfide bond formation (S-S)
• Cleavage by trypsin-like enzymes releases the C-peptide (C-chain)
• Further trimming by a carboxypeptidase B-like enzyme removes two basic residues from each of the new ends
• Insulin: A-chain and B-chain held together by the disulfide bonds
• The C-chain is packaged in the secretory vesicle and is secreted along with active insulin
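The two proteolytic steps after signal peptide removal can be mimicked on a toy sequence. In this Python sketch the endoprotease is modeled as cutting after any pair of basic residues (Lys or Arg) and the carboxypeptidase B-like step as trimming basic residues left at the C-terminal end of each fragment. Both rules, the function names, and the short segment strings are simplifying assumptions for illustration, not the real chains or enzyme specificities.

```python
BASIC = set("KR")  # lysine and arginine, one-letter codes


def cleave_after_dibasic(sequence):
    """Model of the trypsin-like step: cut after each pair of basic residues."""
    fragments, start, i = [], 0, 0
    while i < len(sequence) - 1:
        if sequence[i] in BASIC and sequence[i + 1] in BASIC:
            fragments.append(sequence[start:i + 2])
            start = i + 2
            i += 2
        else:
            i += 1
    fragments.append(sequence[start:])
    return [fragment for fragment in fragments if fragment]


def trim_basic_ends(fragments):
    """Model of the carboxypeptidase B-like step: remove trailing basic residues."""
    return [fragment.rstrip("KR") for fragment in fragments]


# Toy proinsulin layout: B-chain, basic pair, C-peptide, basic pair, A-chain.
# The segment strings are short placeholders.
toy_proinsulin = "FVNQ" + "RR" + "EAED" + "KR" + "GIVE"

after_cleavage = cleave_after_dibasic(toy_proinsulin)
mature = trim_basic_ends(after_cleavage)
print(after_cleavage)
print(mature)
```

The sketch leaves out the disulfide bonds that hold the A-chain and B-chain together once the C-peptide is released.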
Preproopiomelanocortin
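• multiple functional polypeptides from a single precursor
• processed in a cell-specific manner
[Figure: preproopiomelanocortin drawn N to C as segments of 26aa, 48aa, 12aa, 40aa, 14aa, 21aa, 40aa, 18aa, and 26aa, beginning with the signal peptide. Removal of the signal peptide gives proopiomelanocortin. Labeled products: corticotropin (ACTH), γ-MSH, β-lipotropin, α-MSH, β-MSH, γ-lipotropin, endorphin (31aa), enkephalin (5aa).]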