genome organization & protein synthesis and processing in plants
Post on 19-Dec-2015
![Page 1: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/1.jpg)
Genome Organization & Protein Synthesis and Processing in Plants
![Page 2: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/2.jpg)
Viral genomes
Viral genomes: ssRNA, dsRNA, ssDNA, dsDNA, linear or circular
Viruses with RNA genomes:
• Almost all plant viruses and some bacterial and animal viruses
• Genomes are rather small (a few thousand nucleotides)
Viruses with DNA genomes (e.g. lambda = 48,502 bp):
• Often a circular genome
Replicative form of viral genomes:
• All ssRNA viruses produce dsRNA molecules
• Many linear DNA molecules become circular
Molecular weight and contour length:
• Duplex length per nucleotide pair = 3.4 Å
• Molecular weight per base pair = ~660 Da
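The two conversion factors above (3.4 Å of duplex per base pair and roughly 660 Da per base pair) are enough to estimate the contour length and molecular weight of any double-stranded genome. A minimal Python sketch, using the lambda genome size quoted above; the function names are illustrative and not part of the slides:

```python
ANGSTROM_PER_BP = 3.4   # duplex rise per base pair, from the slide
DALTON_PER_BP = 660.0   # approximate mass of one base pair, from the slide

def contour_length_um(n_bp):
    """Contour length of a DNA duplex in micrometres (1 µm = 10,000 Å)."""
    return n_bp * ANGSTROM_PER_BP / 1e4

def molecular_weight_da(n_bp):
    """Approximate molecular weight of a DNA duplex in daltons."""
    return n_bp * DALTON_PER_BP

lambda_bp = 48_502  # lambda genome size quoted on the slide
print(f"contour length: {contour_length_um(lambda_bp):.1f} µm")
print(f"molecular weight: {molecular_weight_da(lambda_bp):.2e} Da")
```

For lambda this gives a contour length of about 16.5 µm and a molecular weight of about 3.2 × 10^7 Da.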
![Page 3: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/3.jpg)
Procaryotic genomes
• Generally 1 circular chromosome (dsDNA)
• Usually without introns
• Relatively high gene density (~2500 genes per mm of E. coli DNA)
• Contour length of E. coli genome: 1.7 mm
• Often indigenous plasmids are present
![Page 4: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/4.jpg)
Plasmids
Extrachromosomal circular DNAs
• Found in bacteria, yeast and other fungi
• Size varies from ~3,000 bp to 100,000 bp
• Replicate autonomously (origin of replication)
• May contain resistance genes
• May be transferred from one bacterium to another
• May be transferred across kingdoms
• Multicopy plasmids (up to ~400 plasmids per cell)
• Low copy plasmids (1-2 copies per cell)
• Plasmids may be incompatible with each other
• Are used as vectors that could carry a foreign gene of interest (e.g. insulin)
[Figure: plasmid vector map labelled with the β-lactamase gene, the origin of replication (ori) and the inserted foreign gene]
![Page 5: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/5.jpg)
Eukaryotic genome
• Moderately repetitive
– Functional (protein coding, tRNA coding)
– Unknown function
• SINEs (short interspersed elements)
– 200-300 bp
– 100,000 copies
• LINEs (long interspersed elements)
– 1-5 kb
– 10-10,000 copies
![Page 6: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/6.jpg)
Eukaryotic genome
• Highly repetitive
– Minisatellites
• Repeats of 14-500 bp
• 1-5 kb long
• Scattered throughout genome
– Microsatellites
• Repeats up to 13 bp
• 100s of kb long, 10^6 copies
• Around centromere
– Telomeres
• Short repeats (6 bp)
• 250-1,000 repeats at ends of chromosomes
![Page 7: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/7.jpg)
Eucaryotic genomes
• Located on several chromosomes
• Relatively low gene density (50 genes per mm of DNA in humans)
• Contour length of DNA from a single human cell = 2 meters
• Approximately 10^11 cells = total length 2 × 10^11 m (2 × 10^8 km)
• Distance between sun and earth (1.5 × 10^8 km)
• Human chromosomes vary in length over a 25-fold range
• Cells carry organelle genomes as well
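The length comparison above is plain arithmetic and can be checked directly. The sketch below takes the slide's per-cell contour length (2 m) and cell count (10^11) as given; with those inputs the total comes out at about 2 × 10^8 km, the same order of magnitude as the sun-earth distance. The variable names are illustrative.

```python
DNA_PER_CELL_M = 2.0    # contour length of DNA per human cell, from the slide
CELLS = 1e11            # cell count as stated on the slide
SUN_EARTH_KM = 1.5e8    # sun-earth distance, from the slide

total_km = DNA_PER_CELL_M * CELLS / 1000.0   # metres -> kilometres
ratio = total_km / SUN_EARTH_KM

print(f"total DNA length: {total_km:.1e} km")
print(f"in sun-earth distances: {ratio:.2f}")
```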
![Page 8: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/8.jpg)
Mitochondrial genome (mtDNA)
• Multiple identical circular chromosomes
• Size ~15 kb in animals
• Size ~200 kb to 2,500 kb in plants
• Over 95% of mitochondrial proteins are encoded in the nuclear genome
• Often A+T rich genomes
• mtDNA is replicated before or during mitosis
![Page 9: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/9.jpg)
Chloroplast genome (cpDNA)
• Multiple circular molecules
• Size ranges from 120 kb to 160 kb
• Similar to mtDNA
• Many chloroplast proteins are encoded in the nucleus (separate signal sequence)
![Page 10: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/10.jpg)
“Cellular” genomes
• Viruses: viral genome (packaged in a capsid)
• Procaryotes: bacterial chromosome, plasmids
• Eucaryotes: chromosomes (nuclear genome, in the nucleus), mitochondrial genome, chloroplast genome
Genome: all of an organism’s genes plus intergenic DNA
Intergenic DNA = DNA between genes
![Page 11: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/11.jpg)
Estimated genome sizes
[Figure: genome size ranges on a log scale from 1e1 to 1e12 nucleotides for viruses (1024), bacteria (>100), fungi, mitochondria (~100), plants and mammals]
Size in nucleotides. Number in ( ) = completely sequenced genomes
![Page 12: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/12.jpg)
Size of genomes
Epstein-Barr virus: 0.172 × 10^6
E. coli: 4.6 × 10^6
S. cerevisiae: 12.1 × 10^6
C. elegans: 95.5 × 10^6
A. thaliana: 117 × 10^6
D. melanogaster: 180 × 10^6
H. sapiens: 3200 × 10^6
![Page 13: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/13.jpg)
Chromosome organization
Eucaryotic chromosome
[Figure: chromosome with a telomere at each end and a centromere between the p-arm and the q-arm]
Centromere:
• DNA sequence that serves as an attachment site for proteins during mitosis
• In yeast these sequences (~130 nt) are very A+T rich
• In higher eucaryotes centromeres are much longer and contain “satellite DNA”
Telomeres:
• At the end of chromosomes; help stabilize the chromosome
• In yeast telomeres are ~100 bp long (imperfect repeats)
• Repeats are added by a specific telomerase
Telomere repeat structure:
5’ – (TxGy)n
3’ – (AxCy)n
x and y = 1 to 4; n = 20 to 100 (1500 in mammals)
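The (TxGy)n notation for telomere repeats can be made concrete by building the two strands for chosen values of x, y and n. This is a small illustrative sketch; the function name and the example values x = 2, y = 3, n = 3 are assumptions picked for readability, not values taken from the slides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def telomere_strands(x, y, n):
    """Return (top strand 5'->3', bottom strand written 3'->5') for a (TxGy)n repeat."""
    if not (1 <= x <= 4 and 1 <= y <= 4):
        raise ValueError("x and y are expected to lie between 1 and 4")
    top = ("T" * x + "G" * y) * n
    # Base-by-base complement, kept in the same left-to-right order as the top strand.
    bottom = top.translate(COMPLEMENT)
    return top, bottom

top, bottom = telomere_strands(x=2, y=3, n=3)
print("5'-" + top + "-3'")
print("3'-" + bottom + "-5'")
```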
![Page 14: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/14.jpg)
Gene classification
• Coding genes → messenger RNA → proteins (structural proteins, enzymes)
• Non-coding genes → structural RNA (transfer RNA, ribosomal RNA, other RNA)
[Figure: simplified chromosome showing genes separated by intergenic regions]
![Page 15: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/15.jpg)
What is a gene?
Definitions:
1. Classical definition: Portion of a DNA that determines a single character (phenotype)
2. One gene – one enzyme (Beadle & Tatum 1941): “Every gene encodes the information for one enzyme”
3. One gene – one protein: “One gene contains information for one protein” (structural proteins included); later refined to one gene – one polypeptide
4. Current definition: A piece of DNA (or in some cases RNA) that contains the primary sequence to produce a functional biological gene product (RNA, protein).
![Page 16: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/16.jpg)
Coding region
Nucleotides (open reading frame) encoding the amino acid sequence of a protein
The molecular definition of gene includes more than just the coding region
![Page 17: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/17.jpg)
Noncoding regions
• Regulatory regions
– RNA polymerase binding site
– Transcription factor binding sites
• Introns
• Polyadenylation [poly(A)] sites
![Page 18: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/18.jpg)
Gene
Molecular definition:
Entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA
![Page 19: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/19.jpg)
Anatomy of a gene
• ORF (open reading frame). From start (ATG) to stop (TGA, TAA, TAG)
• Upstream region with binding site (e.g. TATA box)
• Poly-A ‘tail’
• Splice sites. Introns are bounded by GT (5’ end) and AG (3’ end) splice signals.
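The ORF definition above (an in-frame run from ATG to the first TGA, TAA or TAG) translates directly into a scan over the three reading frames of one strand. The following is a minimal sketch under simplifying assumptions: only the given strand is searched, ORFs nested inside a reported ORF are skipped, and the function name and `min_codons` parameter are illustrative.

```python
STOP_CODONS = {"TGA", "TAA", "TAG"}

def find_orfs(seq, min_codons=2):
    """Return (start, end, orf) for each ATG...stop ORF on the given strand; end is exclusive."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        if (j + 3 - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, seq[i:j + 3]))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs

print(find_orfs("CCATGAAATGGTGACC"))
```

In the example sequence the only complete ORF is ATGAAATGGTGA, which starts at index 2 in the third reading frame.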
![Page 20: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/20.jpg)
Bacterial genes
• Most do not have introns
• Many are organized in operons: contiguous genes, transcribed as a single polycistronic mRNA, that encode proteins with related functions
Polycistronic mRNA encodes several proteins
![Page 21: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/21.jpg)
Bacterial operon
What would be the effect of a mutation in the control region (a) compared to a mutation in a structural gene (b)?
![Page 22: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/22.jpg)
Eucaryotic genes
Hemoglobin beta subunit gene:
Exon 1: 90 bp
Intron A: 131 bp
Exon 2: 222 bp
Intron B: 851 bp
Exon 3: 126 bp
Introns: intervening sequences within a gene that are not translated into a protein sequence. Collagen has 50 introns.
Exons: sequences within a gene that encode protein sequences.
Splicing: removal of introns from the mRNA molecule.
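Splicing can be illustrated with the segment lengths from the hemoglobin beta figure, read here as exon 1 = 90 bp, intron A = 131 bp, exon 2 = 222 bp, intron B = 851 bp and exon 3 = 126 bp, in that order. The sketch below uses a placeholder sequence ("E" for exon bases, "i" for intron bases) rather than the real gene sequence; the function names are illustrative.

```python
# Segment lengths in transcript order, as read from the figure labels.
SEGMENTS = [("exon 1", 90), ("intron A", 131), ("exon 2", 222),
            ("intron B", 851), ("exon 3", 126)]

def segment_coordinates(segments):
    """Turn ordered (name, length) pairs into (name, start, end), end exclusive."""
    coords, pos = [], 0
    for name, length in segments:
        coords.append((name, pos, pos + length))
        pos += length
    return coords

def splice(pre_mrna, coords):
    """Join the exon slices and drop every segment that is not an exon."""
    return "".join(pre_mrna[s:e] for name, s, e in coords if name.startswith("exon"))

coords = segment_coordinates(SEGMENTS)
# Placeholder transcript: 'E' marks exon positions, 'i' marks intron positions.
pre_mrna = "".join(("E" if n.startswith("exon") else "i") * (e - s) for n, s, e in coords)
mature = splice(pre_mrna, coords)
print(len(pre_mrna), len(mature))
```

With these lengths the unspliced region is 1,420 bases long and the spliced product 438 bases, so roughly two thirds of this stretch is intron.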
![Page 23: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/23.jpg)
Regulatory mechanisms
• ‘organize expression of genes’ (function calls)
• Promoter region (binding site), usually near coding region
• Binding can block (inhibit) expression
• Computational challenges
– Identify binding sites
– Correlate sequence to expression
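The first computational challenge, identifying binding sites, is often approached in its simplest form as a pattern search over the upstream region. The sketch below scans for a TATA-box-like motif with a regular expression. The consensus used here, TATA[AT]A[AT], is a commonly quoted approximation and is an assumption of this example, as are the function name and the test sequence; real binding-site prediction generally relies on position weight matrices rather than exact patterns.

```python
import re

# Assumed TATA-box-like consensus; the lookahead allows overlapping matches.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")

def find_sites(seq, pattern=TATA_PATTERN):
    """Return (position, matched_text) for every match, including overlapping ones."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

upstream = "GGCGCTATAAAAGGCTGCATGC"  # made-up upstream sequence
print(find_sites(upstream))
```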
![Page 24: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/24.jpg)
Eukaryotic genes
• Most have introns
• Produce monocistronic mRNA: only one encoded protein
• Large
![Page 25: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/25.jpg)
Alternative splicing
• Splicing is the removal of introns
• mRNA from some genes can be spliced into two or more different mRNAs
![Page 26: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/26.jpg)
“Nonfunctional” DNA
• Higher eukaryotes have a lot of noncoding DNA
• Some has no known structural or regulatory function (no genes)
80 kb
![Page 27: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/27.jpg)
Types of eukaryotic DNA
![Page 28: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/28.jpg)
Duplicated genes
• Encode closely related (homologous) proteins
• Clustered together in genome
• Formed by duplication of an ancestral gene followed by mutation
Five functional genes and two pseudogenes
![Page 29: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/29.jpg)
Pseudogenes
• Nonfunctional copies of genes
• Formed by duplication of ancestral gene, or reverse transcription (and integration)
• Not expressed due to mutations that produce a stop codon (nonsense or frameshift) or prevent mRNA processing, or due to lack of regulatory sequences
![Page 30: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/30.jpg)
Repetitive DNA
• Moderately repeated DNA
– Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
– Large duplicated gene families
– Mobile DNA
• Simple-sequence DNA
– Tandemly repeated short sequences
– Found in centromeres and telomeres (and others)
– Used in DNA fingerprinting to identify individuals
![Page 31: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/31.jpg)
Types of DNA repeats
Tandem repeats (e.g. satellite DNA)
5’-CATGTGCTGAAGGCTATGTGCTGCGACG-3’
3’-GTACACGACTTCCGATACACGACGCTGC-5’
Inverted repeats (e.g. in transposons)
5’-CATGTGCTGAAGGCTCAGCACATCGACG-3’
3’-GTACACGACTTCCGAGTCGTGTAGCTGC-5’
• Form stem-loop structures
Palindromes = adjacent inverted repeats (e.g. restriction sites)
• Form hairpin structures
Perfect repeats vs degenerate repeats
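An inverted repeat is a segment followed, after a short spacer, by its own reverse complement, and a palindrome is the special case with no spacer. Both can be checked with a reverse-complement function. The sketch below is a brute-force search with illustrative names and parameters (`arm`, `max_loop`), applied to the inverted-repeat example sequence from the slide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def is_palindrome(site):
    """A DNA palindrome reads the same 5'->3' on both strands."""
    return site == reverse_complement(site)

def find_inverted_repeats(seq, arm=8, max_loop=10):
    """Return (left_start, right_start, left_arm) wherever the right arm is the
    reverse complement of the left arm, separated by at most max_loop bases."""
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + arm + loop
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits

inverted = "CATGTGCTGAAGGCTCAGCACATCGACG"
print(find_inverted_repeats(inverted))
print(is_palindrome("GAATTC"))  # EcoRI recognition site
```

The arm ATGTGCTG at index 1 pairs with CAGCACAT at index 15, which is the stem of the stem-loop drawn on the slide.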
![Page 32: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/32.jpg)
Repetitive sequences
[Figure: chromosomal DNA and satellite DNA separated on a caesium chloride density gradient]
Repeats in the mouse genome

| Type | No. of repeats | Size | Percent of genome |
| --- | --- | --- | --- |
| Highly repetitive | > 1 million | < 10 bp | 10 % |
| Moderately repetitive | > 1000 | ~150 to ~300 bp | 20 % |
![Page 33: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/33.jpg)
DNA repeats and forensics
[Figure: gels with lanes M, F and Suspect for the X-Y homologous-region markers AluSTXa and AluSTYa; bands labelled 878 bp and 556 bp in one panel, 528 bp and 199 bp in the other]
Gender determination
1) Standard technique: PCR amplification of the amelogenin locus (males = XY => 103 + 109 bp)
2) AluSTXa: Alu insertion on X
3) AluSTYa: Alu insertion on Y
![Page 34: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/34.jpg)
Mobile DNA
• Move within genomes
• Most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes
– L1 LINE is ~5% of human DNA (~50,000 copies)
– Alu is ~5% of human DNA (>500,000 copies)
• Some encode enzymes that catalyze movement
![Page 35: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/35.jpg)
Transposition
• Movement of mobile DNA
• Involves copying of mobile DNA element and insertion into new site in genome
![Page 36: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/36.jpg)
Why?
• Molecular parasite: “selfish DNA”
• Probably have significant effect on evolution by facilitating gene duplication, which provides the fuel for evolution, and exon shuffling
![Page 37: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/37.jpg)
RNA or DNA intermediate
• Transposon moves using DNA intermediate
• Retrotransposon moves using RNA intermediate
![Page 38: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/38.jpg)
Types of mobile DNA elements
![Page 39: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/39.jpg)
LTR (long terminal repeat)
• Flank viral retrotransposons and retroviruses
• Contain regulatory sequences: transcription start site and poly(A) site
![Page 40: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/40.jpg)
![Page 41: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/41.jpg)
LINES and SINES
• Non-viral retrotransposons
– RNA intermediate
– Lack LTR
• LINES (long interspersed elements)
– ~6000 to 7000 base pairs
– L1 LINE (~5% of human DNA)
– Encode enzymes that catalyze movement
• SINES (short interspersed elements)
– ~300 base pairs
– Alu (~5% of human DNA)
![Page 42: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/42.jpg)
Proteins
• Most protein sequences (today) are inferred
• What’s wrong with this?
• Proteins (and nucleic acids) are modified
• ‘Mature’ RNA
• Computational challenges
– Identify (possible) aspects of molecular life cycle
– Identify protein-protein and protein-nucleic acid interactions
![Page 43: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/43.jpg)
Genetic variation
• Variable number tandem repeats (minisatellites). 10-100 bp. Forensic applications.
• Short tandem repeat polymorphisms (microsatellites). 2-5 bp, 10-30 consecutive copies.
• Single nucleotide polymorphisms
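Tandem-repeat polymorphisms are typed by counting how many consecutive copies of the repeat unit an allele carries. The sketch below counts the longest uninterrupted run of a given unit. The GATA unit, the flanking bases and the two allele sequences are made up for illustration, and the function name is hypothetical.

```python
import re

def longest_tandem_run(seq, unit):
    """Return (start, copies) of the longest run of consecutive copies of unit, or (None, 0)."""
    best_start, best_copies = None, 0
    for m in re.finditer(f"(?:{re.escape(unit)})+", seq):
        copies = len(m.group(0)) // len(unit)
        if copies > best_copies:
            best_start, best_copies = m.start(), copies
    return best_start, best_copies

# Two made-up alleles that differ only in repeat number.
allele_a = "GGTC" + "GATA" * 10 + "CCTA"
allele_b = "GGTC" + "GATA" * 13 + "CCTA"
print(longest_tandem_run(allele_a, "GATA"))
print(longest_tandem_run(allele_b, "GATA"))
```

Two individuals carrying these alleles would be distinguished by the 10-copy versus 13-copy repeat length, which is the basis of the forensic use mentioned above.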
![Page 44: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/44.jpg)
Single nucleotide polymorphisms
• 1/2000 bp.
• Types
– Silent
– Truncating
– Shifting
• Significance: much of individual variation.
• Challenge: correlation to disease
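The effect of a single-base substitution in a coding region can be worked out by translating the affected codon before and after the change. The sketch below uses the standard genetic code and labels each substitution as silent, truncating, or missense. The "missense" label (a changed amino acid) is added here for completeness and is not on the slide; frame-shifting changes come from insertions or deletions and are not covered. The function name and example sequence are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def classify_substitution(cds, pos, new_base):
    """Classify a substitution at 0-based position pos of an in-frame coding sequence."""
    start = pos - pos % 3                 # first base of the affected codon
    old_codon = cds[start:start + 3]
    offset = pos - start
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == old_aa:
        return "silent"
    if new_aa == "*":
        return "truncating"
    return "missense"

cds = "ATGGAACTGTGGTAA"  # made-up coding sequence: ATG GAA CTG TGG TAA
print(classify_substitution(cds, 5, "G"))  # GAA -> GAG
print(classify_substitution(cds, 3, "T"))  # GAA -> TAA
print(classify_substitution(cds, 4, "T"))  # GAA -> GTA
```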
![Page 45: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/45.jpg)
E. coli genome
• 4.6 × 10^6 bp. One chromosome. Published 1997.
• 4,285 protein-coding genes
• 122 structural RNA genes
• Repeats. Regulatory elements. Transposons.
• Lateral transfers.
![Page 46: Genome Organization & Protein Synthesis and Processing in Plants](https://reader030.vdocuments.net/reader030/viewer/2022032703/56649d2b5503460f94a00878/html5/thumbnails/46.jpg)
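E. coli protein functions

| Function | Number of genes | Percent of total |
| --- | --- | --- |
| Regulatory | 45 | 1.05 |
| Cell structure | 182 | 4.24 |
| Transposons, etc. | 87 | 2.03 |
| Transport & binding | 281 | 6.55 |
| Putative transport | 146 | 3.40 |
| Replication, repair | 115 | 2.68 |
| Transcription | 55 | 1.28 |
| Translation | 182 | 4.24 |
| Enzymes | 251 | 5.85 |
| Unknown | 1632 | 38.06 |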