Proteomics Session 1: Introduction. Some basic concepts in biology and biochemistry


Upload: branden-sanders

Post on 13-Dec-2015


Proteomics

Session 1

Introduction

Some basic concepts in biology and biochemistry

The hierarchy of biological organization

From “molecule” to “organism”

The micro environment: Cell

DNA vs. chromosome

DNA

Chromosome

Central dogma: the story of life

DNA → RNA → Protein

DNA structure

Atomic structure Double helix

The basic unit in DNA

Base pairs: A-T, G-C
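The pairing rule above (A with T, G with C) is enough to derive one DNA strand from the other. A minimal Python sketch follows; the function names are illustrative and not part of the original slides.

```python
# Watson-Crick pairing rules from the slide: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'->3' direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

The reverse complement is the form usually reported, because the two strands of the double helix run in opposite directions.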

From DNA to Protein

1. Transcription

2. Translation

Step 1: Transcription, generation of mRNA

Amino acid carrier: tRNA

Step 2: Translation, protein assembly

Peptide bond formation

Peptide Chain
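The two steps above can be sketched in a few lines of Python. This is a simplified illustration, not a model of the cellular machinery: it assumes the input is the coding (sense) strand, so transcription reduces to replacing T with U, and it uses the standard genetic code with one-letter amino acid symbols. The example sequence is made up.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Step 1 (simplified): the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Step 2 (simplified): read codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = transcribe("ATGGCCAAGTAA")  # hypothetical example sequence
print(mrna)             # AUGGCCAAGUAA
print(translate(mrna))  # MAK
```

Each letter in the result stands for one amino acid joined to the next by a peptide bond, so the output string is the primary structure of the peptide chain.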

Protein structure

Primary

Secondary

Tertiary

Quaternary

The bonds that contribute to protein structure

1. Hydrogen bond

2. Hydrophobic interaction

3. Ionic bond

4. Disulfide bond

Proteins are the molecular tools for most cellular functions

TYPE                  FUNCTION                                      EXAMPLE
Structural proteins   Support                                       Collagen, Elastin, Keratin
Storage proteins      Storage of amino acids                        Ovalbumin, Casein
Transport proteins    Transport of other substances                 Hemoglobin
Hormonal proteins     Coordination of an organism's activities      Insulin
Receptor proteins     Response of cell to chemical stimuli          Receptors in nerve signal transmission
Contractile proteins  Movement                                      Actin, Myosin
Defensive proteins    Protection against disease                    Antibodies
Enzymatic proteins    Selective acceleration of chemical reactions  Trypsin, ATPase, GAPDH

What is “bioinformatics”?

Let’s take a few minutes to look at a hot topic: bioinformatics

What is “bioinformatics”?

(Molecular) Bio – informatics

One idea for a definition?

Bioinformatics is conceptualizing biology in terms of molecules (in the sense of physical chemistry) and then applying “informatics” techniques (derived from disciplines such as applied math and statistics) to understand and organize the information associated with these molecules, on a large scale.

Bioinformatics is “MIS” for Molecular Biology Information. It is a practical discipline with many applications.

Bioinformatics - History

Timeline, 1980 to 2005 (in 5-year steps):

– Single structures: modeling & geometry, forces & simulation, docking

– Sequences, sequence-structure relationships: alignment, structure prediction, fold recognition

– Genomics: dealing with many sequences, gene finding & genome annotation, databases

– Integrative analysis: expression & proteomics data, data mining, simulation again...

Growth of biological databases

GenBank base-pair growth, 1982 to 1999 (millions of base pairs). Source: GenBank

Year  Base pairs (millions)
1982  1
1983  2
1984  3
1985  5
1986  10
1987  16
1988  24
1989  35
1990  49
1991  72
1992  101
1993  157
1994  217
1995  385
1996  652
1997  1,160
1998  2,009
1999  3,841

3D structures growth (figure). Source: http://www.rcsb.org/pdb/holdings.html
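The GenBank chart values can be turned into a rough growth rate. The Python sketch below assumes the 18 bar values (1 to 3,841, in millions of base pairs) correspond in order to the years 1982 to 1999 on the axis. It also assumes steady exponential growth, which is only an approximation.

```python
import math

# Assumption: the 18 chart values map in order onto the years 1982-1999.
years = list(range(1982, 2000))
base_pairs_millions = [1, 2, 3, 5, 10, 16, 24, 35, 49, 72,
                       101, 157, 217, 385, 652, 1160, 2009, 3841]


def doubling_time(first: float, last: float, elapsed_years: float) -> float:
    """Average doubling time in years, assuming steady exponential growth."""
    return elapsed_years * math.log(2) / math.log(last / first)


overall = doubling_time(base_pairs_millions[0], base_pairs_millions[-1],
                        years[-1] - years[0])
print(f"Average doubling time 1982-1999: {overall:.2f} years")
```

Under these assumptions the database roughly doubled every year and a half, which is the point of the chart: the data grew far faster than manual analysis could follow.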

What can bioinformatics do for us?

Example: Drug Discovery

Target Identification
– Which protein to inhibit?

Lead discovery & optimization
– What sort of molecule will bind to this protein?

Toxicology
– Side effects, target specificity

Pharmacokinetics
– Metabolization and transport

Drug Development Life Cycle

Time axis: 0 to 16 years

Discovery (2 to 10 years)

Preclinical Testing (lab and animal testing)

Phase I (20-30 healthy volunteers used to check for safety and dosage)

Phase II (100-300 patient volunteers used to check for efficacy and side effects)

Phase III (1000-5000 patient volunteers used to monitor reactions to long-term drug use)

FDA Review & Approval

Post-Marketing Testing

$600-700 Million!

With the aid of bioinformatics

7-15 years

Drug lead screening

5,000 to 10,000 compounds screened

250 lead candidates in preclinical testing

5 drug candidates enter clinical testing

80% pass Phase I

30% pass Phase II

80% pass Phase III

One drug approved by the FDA
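The funnel figures are consistent with each other, as a quick calculation shows. The Python sketch below uses only the numbers from the slide and treats each pass rate as an average applied to the candidates that survived the previous phase, which is an assumption about how the percentages are meant.

```python
# Numbers taken from the slide; pass rates are applied phase by phase.
compounds_screened = (5_000, 10_000)
drug_candidates = 5
pass_rates = {"Phase I": 0.80, "Phase II": 0.30, "Phase III": 0.80}

remaining = float(drug_candidates)
for phase, rate in pass_rates.items():
    remaining *= rate
    print(f"After {phase}: {remaining:.2f} candidates expected")

low, high = (remaining / n for n in reversed(compounds_screened))
print(f"Overall yield: {low:.5%} to {high:.5%} of screened compounds")
```

Five candidates times 0.8, 0.3 and 0.8 gives 0.96, which is about one approved drug per 5,000 to 10,000 compounds screened.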

Complementarity
– Shape
– Chemical
– Electrostatic

Drug Lead Screening & Docking

Introduction to proteomics

What is “proteomics”?

"The analysis of the entire protein complement expressed by a genome, or by a cell or tissue type."

Wasinger VC et al. Progress with gene-product mapping of the Mollicutes: Mycoplasma genitalium. Electrophoresis 16 (1995) 1090-1094

The two most widely applied technologies:

1. 2-D electrophoresis: separation of complex protein mixtures

2. Mass spectrometry: Identification and structure analysis
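Identification by mass spectrometry rests on comparing measured masses with masses calculated from candidate sequences. The Python sketch below shows only the calculation side: it sums standard monoisotopic residue masses (rounded to five decimals) and adds one water for the free termini. The function name and example peptide are illustrative, and real identification software does considerably more (charge states, fragment ions, database search).

```python
# Monoisotopic residue masses in daltons, rounded to five decimals.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O per peptide: H on the N-terminus, OH on the C-terminus


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged linear peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Hypothetical example peptide (Met-Ala-Lys).
print(f"MAK: {peptide_mass('MAK'):.4f} Da")
```

Leucine and isoleucine have the same mass in the table, which is one reason mass alone cannot always determine a sequence.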

Why proteomics has become an important discipline

Significant DNA sequencing results:
– 45 microorganism genomes have been sequenced and 170 more are in progress
– 5 eukaryotes have been completed

Saccharomyces cerevisiae
Schizosaccharomyces pombe
Arabidopsis thaliana
Caenorhabditis elegans
Drosophila melanogaster
Rice, mouse and human are nearly done

However, 2/3 of all genes “identified” have no known function

DNA sequence alone is not enough

Structure

Regulation

Information

Computers cannot determine which of these 3 roles DNA plays based solely on sequence (although we would all like to believe they can)

These are the things we need to know about proteins

Introduction to Proteomics

Definitions
– 1. Classical - restricted to large-scale analysis of gene products involving only proteins (small view)
– 2. Inclusive - combination of protein studies with analyses that have genetic components such as mRNA, genomics, and yeast two-hybrid (bigger view)

Don’t forget that the proteome is dynamic, changing to reflect the environment that the cell is in.

1 gene = 1 protein?

1 gene is no longer equal to one protein

The definition of a gene is debatable (ORF, promoter, pseudogene, gene product, etc.)

1 gene = how many proteins? (never known)

Why Proteomics?

Differential protein expression

Scenario 1: can be analyzed by microarray technology

[Figure: three panels of DNA → RNA → Protein (transcription, then translation). The first panel has no stimulus and carries the labels x1 and x4; the second and third panels each show a stimulus and a x3 label.]

Scenario 2: can be solved by proteomics technology
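The distinction between the two scenarios can be written as a comparison of fold changes. The Python sketch below rests on an interpretation of the figure: in scenario 1 the stimulus changes the mRNA level (so a microarray sees it), and in scenario 2 the change appears only at the protein level. All abundance values, the threshold of 2, and the function names are hypothetical.

```python
def fold_change(before: float, after: float) -> float:
    """Ratio of abundance after the stimulus to abundance before it."""
    return after / before


def changed(fc: float, threshold: float = 2.0) -> bool:
    """True if the level rose or fell by at least the threshold factor."""
    return fc >= threshold or fc <= 1 / threshold


def classify(mrna_fc: float, protein_fc: float) -> str:
    """Decide which technology could detect the response (illustrative only)."""
    if changed(mrna_fc) and changed(protein_fc):
        return "scenario 1: visible at the mRNA level (microarray)"
    if changed(protein_fc):
        return "scenario 2: visible only at the protein level (proteomics)"
    if changed(mrna_fc):
        return "mRNA change without a protein change"
    return "no change detected"


# Hypothetical measurements: (before, after) in arbitrary units.
gene_a = classify(fold_change(10, 30), fold_change(40, 120))
gene_b = classify(fold_change(10, 10), fold_change(40, 120))
print("Gene A:", gene_a)
print("Gene B:", gene_b)
```

Gene B is the case that motivates proteomics: its protein level triples while its mRNA level stays flat, so an expression microarray alone would miss the response.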

Co- and Post-translational modification

Co-translationally modified / Post-translationally modified

What proteomics can answer

Protein identification

Protein Expression Studies

Protein Function

Protein Post-Translational Modification

Protein Localization and Compartmentalization

Protein-Protein Interactions

General classification for Proteomics

Protein Expression comparison (beginning)
– Quantitative study of protein expression between samples that differ by some variable

Structural Proteomics (simulation)
– Goal is to map out the 3-D structure of proteins and protein complexes

Functional Proteomics (everything)
– To study protein-protein interactions, 3-D structures, cellular localization and PTMs in order to understand the physiological function of the whole proteome.