Post on 22-Dec-2015
What have we learned from Unicellular Genomes?
Propionibacterium acnes
• Responsible for acne; its genome was sequenced in 2004.
• It lives on human skin in sebaceous follicles and feeds on sebum, which stimulates an inflammatory immune response.
• Can we understand pimples?
Anatomy of acne
Propionibacterium acnes genome
• Sequenced by three different groups.
– 32 190 sequencing reactions
– 8.7-fold coverage of the 2 560 265 bp genome
– Error rate of 0.0001
– Genome contains a single circular chromosome and no additional plasmids.
– Annotation of 2333 putative genes allowed reconstruction of the metabolism.
Propionibacterium acnes genome
• 12% encodes RNA products (rRNA and tRNA).
• 1578 genes (68%) have orthologs in other organisms and 20% do not match anything.
GC skewing
• A non-uniform distribution of guanine and cytosine bases on the two strands of DNA.
– The origin of replication has the lowest GC skew (even distribution).
– The terminus of replication has higher GC skew.
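A minimal sketch of how GC skew can be computed, assuming the common definition (G - C) / (G + C) per window on one strand. The toy sequence, the window size and the function names are illustrative assumptions, not taken from the P. acnes analysis. The running sum of the windowed skew is often plotted as well, and its turning points are read as candidate positions for the origin and terminus.

```python
from itertools import accumulate


def gc_skew(window: str) -> float:
    """Return (G - C) / (G + C) for one window; 0.0 if it has no G or C."""
    window = window.upper()
    g, c = window.count("G"), window.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def windowed_skew(seq: str, window: int) -> list[float]:
    """GC skew of consecutive, non-overlapping windows along one strand."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


def cumulative_skew(skews: list[float]) -> list[float]:
    """Running sum of the windowed skews."""
    return list(accumulate(skews))


# Toy sequence (made up): a C-rich half followed by a G-rich half.
toy = "ACCC" * 5 + "AGGG" * 5
skews = windowed_skew(toy, window=4)
cumulative = cumulative_skew(skews)

# Index of the window where the cumulative skew bottoms out.
turning_point = cumulative.index(min(cumulative))
print(skews)          # five windows at -1.0, then five at 1.0
print(turning_point)  # 4
```

On a real genome the window would be thousands of bases wide, and the sequence would be read from a FASTA file rather than typed in.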
Horizontal Transfer
• Genes that appeared in the genome through an unknown mechanism.
• To find alien genes, scan the genome with a sliding window for segments that have an abnormal GC content (either higher or lower than the species average) and evaluate the codon bias.
– Codon bias: which codon is used more often than the other codons for a particular amino acid.
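The sliding-window scan described above can be sketched as follows. The window size, step and cutoff are arbitrary illustrative values, and the function names are assumptions rather than part of any published pipeline. The codon counter only tallies in-frame codons; comparing those tallies with the genome-wide usage is left out.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def atypical_windows(genome: str, window: int, step: int, cutoff: float):
    """Yield (start, gc) for windows whose GC content differs from the
    whole-genome average by more than `cutoff` (an arbitrary threshold)."""
    average = gc_content(genome)
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - average) > cutoff:
            yield start, gc


def codon_usage(orf: str) -> Counter:
    """Count in-frame codons of an ORF, ignoring a trailing partial codon."""
    orf = orf.upper()
    return Counter(orf[i:i + 3] for i in range(0, len(orf) - 2, 3))


# Toy genome (made up): a GC-rich island inside 50% GC sequence.
toy_genome = "ATGC" * 10 + "GGCC" * 5 + "ATGC" * 10
islands = list(atypical_windows(toy_genome, window=20, step=20, cutoff=0.2))
print(islands)                      # [(40, 1.0)]
print(codon_usage("ATGGCTGCTTAA"))  # GCT counted twice, ATG and TAA once
```

A flagged window is only a candidate: a gene family with unusual composition can also stand out without having been transferred.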
Transcriptional Phase Variation
• Variation in the number of Gs is used to produce transcriptional variation.
• Initiation of transcription depends on the number of consecutive guanines on a particular strand at a critical location upstream of the coding region.
• Regions of repeating bases are difficult to replicate accurately, which affects transcriptional efficiency.
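Runs of consecutive guanines like those described above can be located with a regular expression. This is a sketch under stated assumptions: the minimum run length of 5 and the toy upstream sequence are illustrative choices, not values from the P. acnes genome.

```python
import re


def g_tracts(seq: str, min_len: int = 5):
    """Return (start, length) for every run of at least min_len consecutive Gs."""
    pattern = re.compile(rf"G{{{min_len},}}")
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# Toy upstream region (made up): one long G tract and one short one.
upstream = "TTA" + "G" * 9 + "CATTGGGAC"
print(g_tracts(upstream))  # [(3, 9)]
```

Comparing the tract length at the same position across isolates would show whether the number of Gs varies, which is the signature of phase variation.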
Which genes cause pimples?
• Metabolic reconstruction:
• Can grow anaerobically and aerobically.
• Has many enzymes to degrade lipids, esters and amino acids.
• P. acnes digestive enzymes have an LPXTG motif that targets proteins to the cell wall; these enzymes chew away on your cells.
– The cell-wall sorting signal LPXTG is responsible for covalently anchoring proteins to the cell-wall peptidoglycan.
– LPXTG is the target for cleavage and covalent coupling to the peptidoglycan by enzymes called sortases.
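A first-pass search for the LPXTG motif in a protein sequence can be sketched with a regular expression, where X stands for any residue. The toy protein below is made up. Real sortase substrates also carry a hydrophobic stretch and a charged tail after the motif, so a bare pattern match like this one will over-predict.

```python
import re

# LPXTG: Leu, Pro, any residue, Thr, Gly.
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(protein: str):
    """Return (start, matched_motif) for each LPXTG occurrence in a protein."""
    return [(m.start(), m.group()) for m in LPXTG.finditer(protein.upper())]


# Toy protein sequence (made up) with one motif near the C-terminus.
toy_protein = "MKKT" + "LPETG" + "GAAVL"
print(find_lpxtg(toy_protein))  # [(4, 'LPETG')]
```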
Which genes cause pimples?
• The cell's exterior is decorated with hyaluronate lyase, which destroys the extracellular matrix binding your skin cells together and thus facilitates further tissue invasion and digestion.
LPxTG Database: Sortase substrates
http://bamics3.cmbi.kun.nl/cgi-bin/jos/sortase_substrates/index.py
Stimulation of immune response
• Genome encodes five CAMP (Christie, Atkins, Munch-Petersen) factors. CAMP factors are secreted proteins that bind to antibodies (IgG and IgM) and can form pores in eukaryotic cell membranes.
• Lysis of our cells triggers an immune response.
CAMP factors
• Proteins from bacteria and fungi that are soluble enough to be secreted to target erythrocytes and insert into the membrane to form beta-barrel pores. Biosynthesis may be regulated by hemolysin factors.
Quorum Sensing
• Many bacteria have evolved the ability to condition culture medium by secreting low-molecular-weight signaling pheromones in association with growth phase to control expression of specific genes, a process termed quorum sensing.
– Bioluminescence
– Antibiotic biosynthesis
– Pathogenicity
– Plasmid conjugal transfer
Quorum Sensing
• LuxS produces the precursor of autoinducer-2 (AI-2), 4,5-dihydroxy-2,3-pentanedione (DPD), while converting S-ribosylhomocysteine to homocysteine.
Are All Bacteria Living in Us Bad for Us?
• An average adult body is composed of about 10 trillion human cells.
• Every milliliter of your large intestine’s content is estimated to contain 10 billion microbes, and our intestines contain about 1 L.
• There are about 500 to 1000 different species living in an adult’s intestines.
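The numbers above can be combined in a quick back-of-the-envelope calculation. It uses only the figures quoted on this slide, all of which are rough estimates; the variable names are illustrative.

```python
MICROBES_PER_ML = 10 * 10**9     # 10 billion microbes per milliliter
INTESTINAL_CONTENT_ML = 1_000    # about 1 L of intestinal content
HUMAN_CELLS = 10 * 10**12        # about 10 trillion human cells

gut_microbes = MICROBES_PER_ML * INTESTINAL_CONTENT_ML
ratio = gut_microbes / HUMAN_CELLS
print(f"{gut_microbes:.0e} gut microbes, {ratio:.0f} per human cell")
# prints "1e+13 gut microbes, 1 per human cell"
```

With these estimates, the microbes in the gut alone are about as numerous as the body's own cells.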
Bacteroides thetaiotaomicron
• 31 million bases
• Assembly of 867 contigs with many gaps.
• Finished assembly by PCR
• 67 938 sequencing runs assembled into a single 6 260 361 bp circular contig.
• Annotated 4779 predicted ORFs: 58% are orthologs of known function, 18% are orthologs of proteins with no known function and 24% have no recognizable sequence similarity.
COGs
• Clusters of Orthologous Groups (COGs) are functional categories of genes.
• They are a phylogenetic classification of proteins encoded in complete genomes.
– Transcription
– Energy production, etc.
http://www.ncbi.nlm.nih.gov/COG/
Eukaryotic Clusters
ADH
CDH
Bacteroides thetaiotaomicron
• It can metabolize sugars.
• 170 genes for polysaccharide metabolism; paralogs of 23 genes.
• E. coli has only 8 of them.
• It can also import sugars into its own cytoplasm.
– Has two genes, SusC and SusD, represented by 163 paralogs.
Transposable Elements
• 63 TEs contain ORFs (open reading frames) that help spread tetracycline and erythromycin resistance between individual cells and between species in the microbiota of the gut.
Coding Capacity
• Gene density for B. thetaiotaomicron is 89%.
• Average size of a gene is 1170 bp, the largest among bacteria.
– M. genitalium 1100 bp
– H. pylori 1000 bp
– E. coli 950 bp
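The 89% figure can be reproduced approximately from numbers already given in these slides. This sketch assumes coding capacity is estimated as gene count times average gene length divided by genome length, which ignores overlapping genes and may differ from how the authors measured it; the function name is illustrative.

```python
def coding_capacity(n_genes: int, mean_gene_bp: float, genome_bp: int) -> float:
    """Estimate the fraction of a genome covered by genes."""
    return n_genes * mean_gene_bp / genome_bp


# B. thetaiotaomicron: 4779 predicted ORFs, 1170 bp average, 6 260 361 bp genome.
b_theta = coding_capacity(4779, 1170, 6_260_361)
print(f"{b_theta:.1%}")  # 89.3%
```

The same estimate applied to the Plasmodium nuclear genome figures quoted later (5268 ORFs, 2283 bp average, 22 853 764 bp) gives about 52.6%, matching the coding capacity stated there.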
Can Microbial Genomes Become Dependent upon Human Genes?
• Mycoplasma genitalium has the second smallest bacterial genome of a self-replicating species (580 070 bp).
• A team at TIGR (The Institute for Genomic Research)
– 5 people in 8 weeks assembled 8472 high-quality sequencing reactions.
– Overall GC content is 32%.
– GC skew places the origin of replication near the dnaA and dnaN genes.
• Genes to the right of the origin are transcribed from the plus strand.
• Genes to the left of the origin are transcribed from the minus strand.
• tRNA and rRNA genes have higher GC content, 52% and 44%.
Genome Map
• 470 ORFs; 88% coding capacity; average gene is 1040 bp.
• Retained genes for energy metabolism, fatty acid and phospholipid metabolism, replication, transcription, and protein transport.
• Lost DNA it no longer needed:
– Amino acid synthesis
– Cofactors
– Cell envelope
– Regulatory factors
Synteny
• When a series of genes is conserved in order and orientation between two or more species, the genes are described as syntenic.
– M. genitalium and H. influenzae have similar gene orders with respect to two clusters of ribosomal proteins.
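The definition above (same order, same orientation) can be turned into a small check. This is a simplified sketch: it assumes each gene occurs once per genome, treats genes as (name, strand) pairs, and does not detect blocks that are conserved but inverted. The gene names are placeholders, not real M. genitalium or H. influenzae genes.

```python
def syntenic_blocks(genome_a, genome_b, min_genes=2):
    """Return runs of at least min_genes genes that are adjacent, in the same
    order and on the same strand, in both genomes."""
    position_b = {gene: i for i, gene in enumerate(genome_b)}
    blocks, current = [], []
    for gene in genome_a:
        j = position_b.get(gene)
        if j is not None and current and position_b[current[-1]] + 1 == j:
            current.append(gene)          # gene extends the running block
        else:
            if len(current) >= min_genes:
                blocks.append(current)    # close the previous block
            current = [gene] if j is not None else []
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Toy gene orders (made up). geneE is shared but on opposite strands.
species_1 = [("geneA", "+"), ("geneB", "+"), ("geneC", "+"),
             ("geneX", "-"), ("geneE", "+")]
species_2 = [("geneY", "+"), ("geneA", "+"), ("geneB", "+"),
             ("geneC", "+"), ("geneE", "-")]
print(syntenic_blocks(species_1, species_2))
# [[('geneA', '+'), ('geneB', '+'), ('geneC', '+')]]
```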
Minimum Number of Genes
• Synthetic biology: to synthesize de novo (from scratch) a functioning genome with as few genes as possible.
– Bacillus subtilis: 190 genes
– M. genitalium: 260 genes
Bacteria vs. Viruses
• The smallest genome is that of an archaeon, N. equitans (490 kb).
• HIV: 9200 nt
• SARS: 29 797 nt
• Lambda: 48 502 nt
• Acanthamoeba polyphaga mimivirus: infects amoebae
– dsDNA, 1 181 404 bp linear chromosome with 1262 ORFs
Mimivirus Genome
• 28% GC
• 90% coding capacity
• Uses biased codons lacking G or C; the codons least common in the amoeba are the ones it uses least.
• It has proteins used for translation, posttranslational modification and DNA repair, which sounds more like a eukaryote.
– Encodes topoisomerases
– Has a self-splicing intron
Is Mimivirus Alive?
• Mimivirus is most closely associated with Eukaryota.
• Infectious after 1 year of incubation at 4 °C.
• Survived 48 hours of desiccation, and 1% survived 55 °C.
• Mimivirus can participate in all major steps of translation.
– A life form?
– A highly modified virus?
Malaria
• 3 billion people in tropical and subtropical climates are affected.
• Malaria is caused by eukaryotic parasites of the genus Plasmodium.
• 2.7 million people die each year.
Plasmodium
• Plasmodium falciparum is the most lethal form, transmitted to humans by the Anopheles mosquito.
– When an infected mosquito bites, parasites leave its salivary glands, move to the liver and infect hepatocytes. They mature in hepatocytes and hatch out into RBCs.
– New parasites emerge from an RBC by bursting it, releasing progeny and metabolic waste that cause fever followed by chills.
– A few cells differentiate into gametes that move through the blood and can be ingested by a new mosquito, where the gametes form zygotes, which undergo meiosis and move to the salivary glands.
Infection of RBCs
• RBC: 6 µm; Plasmodium: 1.2 µm
• Plasmodium sticks to RBCs and enters them, which lets it evade the immune system.
• The apicoplast is an organelle made up of the remnant of an internalized alga; it retains a small genome that Plasmodium needs to survive.
Plasmodium Genome
• Three genomes
– Nuclear: chromosomes separated by pulsed-field gel electrophoresis before random fragmentation and cloning; 22 853 764 bp with 5268 ORFs; 19.4% GC; 52.6% coding capacity; average gene length 2283 bp.
– Mitochondrial: 5967 bp, encodes 3 proteins
– Apicoplast: 29 422 bp, encodes 30 proteins
Plasmodium is a eukaryote
• 54% of its genes contain one or more introns, which average 13.5% GC (exons have a higher GC%).
• 60% of ORFs have no known function
rRNA genes
• In many species rRNA genes appear in linear clusters
• In Plasmodium, the distribution of rRNA genes varies and their expression is host specific: some are expressed in humans; the other set is active in the mosquito.
Centromeres and telomeres
• Centromeres are AT rich (97%) and contain short tandem repeats.
• Telomeres have repeated sequences that vary in length; some genes located near telomeres have been duplicated many times, so these genes have paralogs.
– The highly variable (polymorphic) gene families var, rif and stevor may add variation to the extracellular surface of Plasmodium.
Hydropathy plot
http://expasy.org/cgi-bin/protscale.pl
Hydropathy plot
Plasmodium
• 31% of the encoded polypeptides are predicted to be integral membrane proteins.
– 1% cell-to-cell adhesion
– 4% evasion of immune system
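Predictions of integral membrane proteins usually start from a hydropathy plot like the one produced by the ProtScale tool linked above. The sketch below averages Kyte-Doolittle hydropathy values over a sliding window. The window of 19 residues and the cutoff of 1.6 follow the commonly quoted Kyte-Doolittle rule of thumb for membrane-spanning segments and are used here as assumptions. The two toy sequences are made up, and this is not the method used in the Plasmodium annotation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the protein.
    Raises KeyError on residues outside the 20 standard amino acids."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def has_candidate_tm_segment(protein: str, window: int = 19,
                             cutoff: float = 1.6) -> bool:
    """True if any window is hydrophobic enough to suggest a membrane span."""
    return any(score > cutoff for score in hydropathy_profile(protein, window))


# Toy sequences (made up): one with a 20-residue hydrophobic stretch, one without.
membrane_like = "MKT" + "L" * 10 + "I" * 10 + "KDE"
soluble_like = "MKTDEKRNQS" * 3
print(has_candidate_tm_segment(membrane_like))  # True
print(has_candidate_tm_segment(soluble_like))   # False
```

Plotting the profile against residue position gives the hydropathy plot itself; peaks above the cutoff mark candidate membrane-spanning segments.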
Apicoplast
• Derived from a chloroplast
• Synthesizes fatty acids, isoprenoids, and heme groups
• 10% of all proteins help apicoplast DNA replication and repair, transcription, translation, posttranslational glycosylation etc.
Food
• Plasmodium feeds on hemoglobin, which it digests in its food vacuole.
• It has no genes for amino acid synthesis and stores neither trehalose (the storage sugar in yeast) nor glycogen; it ‘lives in the moment’.
Is there a model eukaryote genome?
• yeast
Yeast Genome
• Published in October 1996
• 12 068 kb genome of 16 chromosomes
• 6272 ORFs
– 38.3% GC with a coding capacity of 70.3%
– GC content for eukaryotes is generally higher in the coding portions.
– Coding capacity is much lower than in bacteria.
• Yeast has a gene every 2 kb
• Worm has a gene every 6 kb
• Humans have a gene every 30 kb
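The "gene every 2 kb" figure for yeast follows directly from the genome size and ORF count on this slide. A minimal check, with an illustrative function name:

```python
def kb_per_gene(genome_kb: float, n_genes: int) -> float:
    """Average spacing between genes, in kilobases."""
    return genome_kb / n_genes


# Yeast: 12 068 kb genome, 6272 ORFs.
yeast_spacing = kb_per_gene(12_068, 6_272)
print(f"{yeast_spacing:.2f} kb per gene")  # 1.92 kb per gene
```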
Genome structure
• S. cerevisiae experienced genome duplication events.
• Chromosomes V and X, IV and II, and III and XIV have paralogous regions.
– The duplicated region on chr III contains four genes, one of which is citrate synthase (cit2).
• Cit2 (chr III) is targeted to the peroxisome and Cit1 (chr XIV) to the mitochondrion.