statistics for microarrays
DESCRIPTION
Statistics for Microarrays. Biological background: Molecular Biology. Class web site: http://statwww.epfl.ch/davison/teaching/Microarrays/ETHZ/. Two types of organisms *. * Every biological ‘rule’ has exceptions!. Mendelian Genetics. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/1.jpg)
Biological background: Molecular Biology
Class web site:
http://statwww.epfl.ch/davison/teaching/Microarrays/ETHZ/
Statistics for Microarrays
![Page 2: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/2.jpg)
Two types of organisms*
* Every biological ‘rule’ has exceptions!
![Page 3: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/3.jpg)
http://www.stg.brown.edu/webs/MendelWeb/MWtoc.html
Mendelian Genetics
![Page 4: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/4.jpg)
Human Chromosomes
![Page 5: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/5.jpg)
Human Chromosome Banding Patterns
![Page 6: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/6.jpg)
Chromosomes and DNA
![Page 7: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/7.jpg)
Mitosis and Meiosis Compared
![Page 8: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/8.jpg)
Nature (1953), 171:737
“We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.”
DNA Structure Discovery
![Page 9: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/9.jpg)
DNA• A deoxyribonucleic acid or DNA molecule
is a double-stranded linear polymer composed of four molecular subunits called nucleotides
• Each nucleotide comprises a phosphate group, a deoxyribose sugar, and one of four nitrogen bases: adenine (A), guanine (G), cytosine (C), or thymine (T)
• The two strands are held together by weak hydrogen bonds between complementary bases
• Base-pairing occurs according to the rule: G pairs with C, and A pairs with T
![Page 10: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/10.jpg)
DNA A-type (140D)(low water content)
DNA B-type (7BNA)(Watson-Crick form)
DNA Z-type (2ZNA)(high salt concentration)
Polymorphic DNA Tertiary Structures
![Page 11: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/11.jpg)
Genes are linearly arranged along chromosomes
![Page 12: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/12.jpg)
DNA Structure (overview)
![Page 13: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/13.jpg)
A nucleotide is a phospate, a sugar, and a purine (A, G) or a pyramidine (T, C) base.
The monomeric units of nucleic acids are called nucleotides.
DNA Structure
![Page 14: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/14.jpg)
Proteins
• Proteins: macromolecules composed of one or more chains of amino acids
• Amino acids: class of 20 different organic compounds containing a basic amino group (-NH2) and an acidic carboxyl group (-COOH)
• The order of amino acids is determined by the base sequence of nucleotides in the gene coding for the protein
• Proteins function as enzymes, antibodies, structures, etc.
![Page 15: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/15.jpg)
Amino acid codes
AlaArgAsnAspCysGlnGluGlyHisIleLeuLysMetPheProSerThrTrpTyrVa lAsxGlxSecUnk
ARNDCQEGHILKMFPSTWYVBZUX
AlanineArginineAsparagineAspartic acidCysteineGlutamineGlutamic acidGlycineHistidineIsoleucineLeucineLysineMethioninePhenylalanineProlineSerineThreonineTryptophanTyrosineVa lineAsn or AspGln or GluSelenocysteineUnknown
![Page 16: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/16.jpg)
Primary Protein Structure
![Page 17: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/17.jpg)
Multiple Levels of Protein Strucure
( Protein folding)
![Page 18: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/18.jpg)
Tertiary Structure ofSperm whale myoglobin (1MBN)
![Page 19: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/19.jpg)
(RT)
![Page 20: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/20.jpg)
Nature (1953), 171:737
“It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”
DNA Replication
![Page 21: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/21.jpg)
![Page 22: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/22.jpg)
DNA Replication
• The DNA strand that is copied to form a new strand is called a template
• In the replication of a double-stranded or duplex DNA molecule, both original (parental) DNA strands are copied
• When copying is finished, the two new duplexes, each consisting of one of the original strands plus its copy, separate from each other (semiconservative replication)
![Page 23: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/23.jpg)
Semiconservative Replication
![Page 24: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/24.jpg)
DNA Replication, ctd• Synthesis occurs in the chemical direction 5’3’• Nucleic acid chains are assembled from 5’
triphosphates of deoxyribonucleosides (the triphosphates supply energy)
• DNA polymerases are enzymes that copy DNA• DNA polymerases require a short preexisting DNA
strand (primer) to begin chain growth. With a primer base-paired to the template strand, a DNA polymerase adds nucleotides to the free hydroxyl group at the 3’ end of the primer.
• DNA replication requires assembly of many proteins (at least 30) at a growing replication fork: helicases to unwind, primases to prime, ligases to ligate (join), topisomerases to remove supercoils, RNA polymerase, etc.
![Page 25: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/25.jpg)
![Page 26: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/26.jpg)
DNA Replication Fork
![Page 27: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/27.jpg)
DNA is unwinding
DNA Synthesis
![Page 28: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/28.jpg)
![Page 29: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/29.jpg)
![Page 30: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/30.jpg)
RNA• RNA, or ribonucleic acid, is similar to
DNA, but-- RNA is (usually) single-stranded-- the sugar is ribose rather than deoxyribose
-- uracil (U) is used instead of thymine• RNA is important for protein synthesis
and other cell activities• There are several classes of RNA
molecules, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), and other small RNAs
![Page 31: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/31.jpg)
The Genetic Code
• DNA: sequence of four different nucleotides
• Protein: sequence of twenty different amino acids
• The correspondence between the four-letter DNA alphabet and the twenty-letter protein alphabet is specified by the genetic code, which relates nucleotide triplets, or codons, to amino acids
![Page 32: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/32.jpg)
Standard Genetic Code
![Page 33: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/33.jpg)
Variation of genetic codesT1 T2 T3 T4 T5 T6 T9 T10 T12 T13 T14 T15
CUUCUCCUACUG
LeuLeuLeuLeu
----
ThrThrThrThr
----
----
----
----
----
---Ser
----
----
----
AUUAUCAUAAUG
IleIleIleMet
--Met-
--Met-
----
--Met-
----
----
----
----
--Met-
----
----
UAUUACUAAUAG
TyrTyrStopStop
----
----
----
----
--GlnGln
----
----
----
----
--Tyr-
---Gln
AAUAACAAAAAG
AsnAsnLysLys
----
----
----
----
----
--Asn-
----
----
----
--Asn-
----
UGUUCGUGAUGG
CysCysStopTrp
--Trp-
--Trp-
--Trp-
--Trp-
----
--Trp-
--Cys-
----
--Trp-
--Trp-
----
AGUAGCAGAAGG
SerSerArgArg
--StopStop
----
----
--SerSer
----
--SerSer
----
----
--GlyGly
--SerSer
----
T1: standardT2: vert mtT3: yeast mtT4: other mtT5: invert. mtT6: cil. etc nuc.T9: ech. mtT10: eup. nuc.T12:alt yeast nucT13: asc. mtT14: flat. mtT15: bleph. nuc.
![Page 34: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/34.jpg)
Protein Synthesis
![Page 35: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/35.jpg)
Transcription
• Transcription is a complex process involving several steps and many proteins (enzymes)
• RNA polymerase synthesizes a single strand of RNA against the DNA template strand (anti-sense strand), adding nucleotides to the 3’ end of the RNA chain
• Initiation is regulated by transcription factors, including promoters, usually an initiator element and TATA box, usually lying just upstream (at the 5’ end) of the coding region
• 3’ end cleaved at AAUAAA, poly-A tail added
![Page 36: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/36.jpg)
Exons and Introns• Most of the genome consists of non-coding
regions• Some non-coding regions (centromeres and
telomeres) may have specific chomosomal functions
• Other non-coding regions have regulatory purposes
• Non-coding, non-functional DNA often called junk DNA, but may have some effect on biological functions
• The terms exon and intron refer to coding and non-coding DNA, respectively
![Page 37: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/37.jpg)
Intron Splicing
![Page 38: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/38.jpg)
Translation
• The AUG start codon is recognized by methionyl-tRNAi
Met
• Once the start codon has been identified, the ribosome incorporates amino acids into a polypeptide chain
• RNA is decoded by tRNA (transfer RNA) molecules, which each transport specific amino acids to the growing chain
• Translation ends when a stop codon (UAA, UAG, UGA) is reached
![Page 39: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/39.jpg)
Translation Illustrated
![Page 40: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/40.jpg)
From Primary Transcript to Protein
![Page 41: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/41.jpg)
Alternative Splicing (of Exons)
• How is it possible that there are millions of human antibodies when there are only about 30,000 genes?
• Alternative splicing refers to the different ways the exons of a gene may be combined, producing different forms of proteins within the same gene-coding region
• Alternative pre-mRNA splicing is an important mechanism for regulating gene expression in higher eukaryotes
![Page 42: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/42.jpg)
![Page 43: Statistics for Microarrays](https://reader036.vdocuments.net/reader036/viewer/2022081603/56813e40550346895da825d7/html5/thumbnails/43.jpg)
Acknowledgements
• http://www.accessexcellence.org/AB/GG
•http://www.oup.co.uk/best.textbooks/biochemistry/genesvii
• Sandrine Dudoit, UC Berkeley Biostatistics
• Yee Hwa Yang, UC Berkeley Statistics
• Terry Speed, UC Berkeley Statistics and WEHI, Melbourne, Australia