TRANSCRIPT
Bioinformatics: Why Can’t It Tell Us Everything?
Bioinformatics: What are our Data Sets?
• Interested in information flow within cells
• Currently, the key information is mostly a matter of biological macromolecules
• Eventually, information of interest will also include flow of nutrients, energy, and impact of small molecules on macromolecular function
Bioinformatics: What are our Questions?
• What is in there?
• What does it do?
• How similar is it to something else?
• How does it fold?
• Where does it go in a cell?
• What does it interact with?
• How is it regulated?
• Level of confidence?
Bioinformatics: Logical Reasoning Behind Data Sets
• Function of organism is determined by function of its cells
• Function of cells is determined by chemical reactions that take place within them
• Chemical reactions occur or not according to presence and activity of enzymes
• Enzymes are proteins
• Proteins are determined by genes
• Therefore, genes determine organismal function
Genomics
Proteomics
Central Dogma: Flow of Information
Central Dogma: DNA as the Blueprint for Life?
Central Dogma
DNA → RNA → Protein
Genes & proteins are different molecular languages,
but they are colinear
DNA
Basic Unit (alphabet): Nucleotide (base) Only 4: A, T, G, and C
Double-stranded: A<>T and G<>C
5’..AGCTGCATGCTAGCTGACGTCA….3’
3’..TCGACGTACGATCGACTGCAGT….5’
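The pairing rule on this slide (A with T, G with C) is enough to derive one strand from the other. A minimal Python sketch, using a hypothetical helper named `complement`, checks that the two strands shown really are complementary:

```python
# Watson-Crick pairing rules from the slide: A<>T and G<>C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


top = "AGCTGCATGCTAGCTGACGTCA"     # strand written 5'->3' on the slide
bottom = "TCGACGTACGATCGACTGCAGT"  # strand written 3'->5' on the slide

# Each base in the top strand should pair with the base directly below it.
print(complement(top) == bottom)
```

Because the bottom strand is written 3'->5', no reversal is needed here; reading it 5'->3' would mean reversing the result.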
“Words” (genes) to encode proteins, RNA
Double helical
DNA Tower in Perth, AUS
DNA: Structure Connected to Information
DNA: Replication & Transcription as Algorithms
• With rare exceptions, all DNA is replicated
• Crucial tool is ability to go from one strand to another
• Transcription uses same base-pairing rules with U instead of T, but occurs in packets
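Treating transcription as an algorithm, the bullet above amounts to a base-by-base lookup with U in place of T. The sketch below is only an illustration under that simplification: the function name `transcribe` is hypothetical, and the template is the 3'->5' strand from the earlier DNA slide.

```python
# Template DNA base -> RNA base: the same pairing rules, with U instead of T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Build the RNA complementary to a template strand written 3'->5'.

    The returned RNA reads 5'->3'.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


template = "TCGACGTACGATCGACTGCAGT"  # 3'->5' strand from the DNA slide
rna = transcribe(template)
print(rna)

# "Packets": only a stretch of the DNA is transcribed at a time, which a
# slice of the template can stand in for (the coordinates are arbitrary).
print(transcribe(template[6:15]))
```

The RNA comes out identical to the other DNA strand with every T replaced by U, which is why that strand is often called the coding strand.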
Transcription = DNA to RNA: Where to Start is a Big Question
Protein
Alphabet: amino acids
There are 20 amino acids
Met Cys Ser Leu Ala Ala Val
Proteins: Number of Possible 100-mer Peptides?
20 possible residues at each position
For 2-mers, 20 possible at position 1 and 20 possible at position 2, so 20 x 20 = 20^2 = 400
Same logic for 100-mers: 20^100 = 2^100 x 10^100 = (2^10)^10 x 10^100 ~ (10^3)^10 x 10^100 = 10^130
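The estimate of roughly 10^130 possible 100-mers can be checked exactly, since Python integers have arbitrary precision. This is a small sanity check rather than part of the original slides; the variable names are illustrative.

```python
import math

residues = 20   # amino acids available at each position
length = 100    # peptide length

total = residues ** length  # exact count of possible 100-mers

# The slide approximates 2^10 = 1024 as 10^3, giving about 10^130.
print(len(str(total)))               # number of decimal digits in the exact count
print(length * math.log10(residues)) # exponent of the exact count, base 10
```

The approximation slightly undercounts because 1024 is a little more than 1000, but the order of magnitude is unchanged.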
Proteins: Folding Starts Local
alpha-helix
beta-pleated sheet
Proteins: Folding Goes Global
Proteins: Predictive Protein Folding as Holy Grail
Protein
Alphabet: amino acids
There are 20 amino acids
Encoded by codons (triplets of nucleotides)
ATG TGC AGC CTA GCT GCC GTC
Met Cys Ser Leu Ala Ala Val
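Reading the slide's DNA sequence three bases at a time reproduces the peptide shown. The sketch below uses a deliberately partial lookup table containing only the codons in this example (the assignments are from the standard genetic code); `split_codons` is a hypothetical helper name.

```python
# Only the codons that appear in the slide's example, not the full genetic code.
CODONS = {
    "ATG": "Met", "TGC": "Cys", "AGC": "Ser", "CTA": "Leu",
    "GCT": "Ala", "GCC": "Ala", "GTC": "Val",
}


def split_codons(dna: str) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping triplets.

    Any trailing one or two bases that do not fill a codon are dropped.
    """
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


dna = "ATGTGCAGCCTAGCTGCCGTC"
peptide = [CODONS[codon] for codon in split_codons(dna)]
print(" ".join(peptide))  # Met Cys Ser Leu Ala Ala Val
```

Note that GCT and GCC both map to Ala, a small instance of the code's redundancy.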
Genetic Code Found on Earth: How Does It Work?
5’-UCGACCAUGGUUGACCAUUGAUUACCACG-3’
Genetic Code
• Triplet
• Nonoverlapping
• Comma-less
• Redundant
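The four properties above can be seen by translating the mRNA from the previous slide. The sketch below assumes the standard genetic code and, as a simplification, starts at the first AUG and stops at the first in-frame stop codon; real start-site selection is more involved, as the transcript itself notes. The names `CODE` and `translate` are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional U, C, A, G ordering; '*' marks stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Triplet, nonoverlapping, comma-less: step by 3 with no gaps or separators.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


mrna = "UCGACCAUGGUUGACCAUUGAUUACCACG"
print(translate(mrna))  # MVDH (Met-Val-Asp-His, then UGA stop)

# Redundant: 64 codons map onto 20 amino acids plus stop.
codons_per_aa = Counter(CODE.values())
print(len(CODE), len(codons_per_aa))
```

Leucine, serine, and arginine each have six codons, while methionine and tryptophan have one each.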
Bioinformatics: Mining a Mountain of Data
Where are the putative genes?