Biochemistry, part 2

Upload: roger-gray

Post on 29-Dec-2015

1 Introduction

2 Theoretical background: biochemistry/molecular biology

3 Theoretical background: computer science

4 History of the field

5 Splicing systems

6 P systems

7 Hairpins

8 Microtechnology introduction: microreactors / chips

9 Microchips and fluidics

10 Self assembly

11 Regulatory networks

12 Molecular motors

13 DNA nanowires

14 Protein computers

15 DNA computing - summary

Course outline

DNA folding

Many DNA molecules are circular (e.g., bacterial chromosomes and most plasmid DNA).

Circular DNA can form supercoils.

The human genome contains about 3 × 10^9 base pairs. The DNA is wrapped around proteins to form nucleosomes.

Nucleosomes are packed tightly to form a helical filament, a structure called chromatin.

RNA molecules are much shorter but more diverse. They can form various three-dimensional structures.

DNA folding

Supercoiling refers to the DNA structure in which double-stranded circular DNA twists around itself. Supercoiled DNA contrasts with relaxed DNA;

In DNA replication, the two strands of DNA have to be separated, which leads either to overwinding of the surrounding regions of DNA or to supercoiling;

A specialized set of enzymes (topoisomerases, such as gyrase) is present to introduce supercoils that favor strand separation;

The degree of supercoiling can be described quantitatively.

Tertiary structure in DNA

Varieties of supercoiled DNA

The linking number L of DNA, a topological property, determines the degree of supercoiling;

The linking number is the number of times one strand of DNA winds in the right-handed direction around the helix axis when the axis is constrained to lie in a plane;

If both strands are covalently intact, the linking number cannot change;

For instance, in a relaxed circular DNA of 5400 base pairs, the linking number is 5400/10 = 540, where 10 is the number of base pairs per turn for B-form DNA.

Linking number

Twist T is a measure of the helical winding of the DNA strands around each other. Given that DNA prefers to form a B-type helix, the preferred twist = (number of base pairs)/10;

Writhe W is a measure of the coiling of the axis of the double helix. A right-handed coil is assigned a negative number (negative supercoiling) and a left-handed coil is assigned a positive number (positive supercoiling).

Topology tells us that the sum of T and W equals the linking number: L = T + W

For example, in the circular DNA of 5400 base pairs, the linking number is 5400/10 = 540

If there is no supercoiling, then W = 0 and T = L = 540;

If there is positive supercoiling with W = +20, then T = L - W = 520;

The twist and writhe

The relation between L, T and W

Positive supercoiling

The relation between L, T and W

Negative supercoiling

A relaxed circular, double-stranded DNA (1600 bp) is in a solution where conditions favor 10 bp per turn. What are L, T, and W?

During replication, part of this DNA (200 bp) unwinds while the rest of the DNA still favors 10 bp per turn. What are the new L, T, and W?

Relaxed: L = 1600/10 = 160, W = 0 (relaxed), T = L - W = 160

After unwinding: L = 160 (unchanged), T = (1600 - 200)/10 = 140, W = L - T = +20

Figure labels: 1600 bp (relaxed circle); 1400 bp wound and 200 bp unwound

L, T and W calculation
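
The L, T and W arithmetic from the slides can be reproduced in a few lines of Python. This is only a sketch under the slides' own simplification of 10 bp per turn; the function names (`relaxed_state`, `twist_from_writhe`, `after_unwinding`) are made up for this illustration and do not come from the course material.

```python
def relaxed_state(n_bp, bp_per_turn=10):
    """Return (L, T, W) for a relaxed, covalently closed circular DNA."""
    linking = n_bp / bp_per_turn   # L = number of base pairs / bp per turn
    writhe = 0.0                   # relaxed: no supercoiling
    twist = linking - writhe       # L = T + W
    return linking, twist, writhe


def twist_from_writhe(linking, writhe):
    """Solve L = T + W for T."""
    return linking - writhe


def after_unwinding(n_bp, unwound_bp, bp_per_turn=10):
    """Return (L, T, W) after unwound_bp base pairs lose their helical twist.

    L stays fixed because both strands remain covalently intact;
    only the still-wound part contributes to T.
    """
    linking, _, _ = relaxed_state(n_bp, bp_per_turn)
    twist = (n_bp - unwound_bp) / bp_per_turn
    writhe = linking - twist
    return linking, twist, writhe


if __name__ == "__main__":
    print(relaxed_state(1600))          # relaxed 1600 bp circle
    print(after_unwinding(1600, 200))   # 200 bp unwound during replication
    print(twist_from_writhe(540, 20))   # 5400 bp example with W = +20
```

Keeping L as the fixed quantity and deriving W from it mirrors the rule that the linking number cannot change while both strands are intact.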

Nucleosomes look like “beads on a string” under the microscope. Each bead contains two copies each of four histone proteins, H2A, H2B, H3, and H4 (a histone octamer). The string is double-stranded DNA;

The surface of the octamer contains features that guide the course of the DNA such that it wraps 1.65 turns around the octamer in a left-handed conformation. H1 protein serves to seal the ends of the DNA and connects consecutive nucleosomes.

Nucleosomes

Organisation of chromosomes

Level                                 Size       Base pairs per turn   Packing ratio
DNA double helix                      2 nm       10                    1
‘Beads on a string’ chromatin form    11 nm      80                    6-7
Solenoid (6 nucleosomes per turn)     30 nm      1200                  ~40
Loops (50 turns per loop)             0.25 μm    60,000                680
Miniband (18 loops)                   0.84 μm    1.1 × 10^6            1.2 × 10^4
Chromosome (stacked minibands)        0.84 μm

Organisation of chromosomes
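
The base-pairs-per-turn figures quoted for the higher packing levels follow from one another by multiplication. The sketch below checks that arithmetic, using only the counts given on the slides (6 nucleosomes per solenoid turn, 1200 bp per solenoid turn, 50 turns per loop, 18 loops per miniband). The constant and function names are invented for this example, and the roughly 200 bp per nucleosome is a value implied by those counts rather than one stated in the slides.

```python
# Counts taken from the slides
NUCLEOSOMES_PER_SOLENOID_TURN = 6
SOLENOID_BP_PER_TURN = 1200
TURNS_PER_LOOP = 50
LOOPS_PER_MINIBAND = 18


def bp_per_nucleosome():
    """DNA per nucleosome repeat implied by the solenoid figures (assumption)."""
    return SOLENOID_BP_PER_TURN / NUCLEOSOMES_PER_SOLENOID_TURN


def bp_per_loop():
    """Each loop consists of 50 solenoid turns."""
    return TURNS_PER_LOOP * SOLENOID_BP_PER_TURN


def bp_per_miniband():
    """Each miniband consists of 18 loops."""
    return LOOPS_PER_MINIBAND * bp_per_loop()


if __name__ == "__main__":
    print(bp_per_nucleosome())   # implied bp per nucleosome repeat
    print(bp_per_loop())         # compare with 60,000 on the slide
    print(bp_per_miniband())     # compare with 1.1 x 10^6 on the slide
```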

proteins

4 possible bases (A, C, G, U)

3 bases in a codon

4 x 4 x 4 = 64 possible codon sequences

Start codon: AUG

Stop codons: UAA, UAG, UGA

61 codons code for amino acids (including AUG)

20 amino acids – redundancy in the genetic code

Genetic code
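
The codon counts above can be checked by enumerating all triplets. The sketch below builds the standard genetic code from a compact 64-letter string (codons ordered with bases U, C, A, G; `*` marks a stop codon) and adds a deliberately simplified `translate` helper. The names `BASES`, `AMINO_ACIDS`, `CODON_TABLE` and `translate` are choices made for this illustration, and the helper ignores everything a real cell does beyond reading from the first AUG to the first stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 4 x 4 x 4 codons to its one-letter amino acid (or "*")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    stops = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
    print(len(CODON_TABLE), stops)                       # total codons, stop codons
    print(len(CODON_TABLE) - len(stops))                 # codons that code for amino acids
    print(len(set(CODON_TABLE.values()) - {"*"}))        # distinct amino acids
    print(translate("GGAUGGCUUAA"))                      # AUG GCU UAA
```

Because 61 codons map onto 20 amino acids, most amino acids are encoded by more than one codon, which is the redundancy the slide refers to.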

Amino acids are the building blocks of proteins (20 different kinds); they vary by their side chain groups

Hydrophilic amino acids are water soluble; hydrophobic ones are not

Linked via a single chemical bond (the peptide bond)

Peptide: short linear chain of amino acids (< 30)

Polypeptide: long chain of amino acids (which can be upwards of 4000 residues long)

Amino acids

Glycine (G, GLY), Alanine (A, ALA), Valine (V, VAL), Leucine (L, LEU), Isoleucine (I, ILE), Phenylalanine (F, PHE), Proline (P, PRO), Serine (S, SER), Threonine (T, THR), Cysteine (C, CYS), Methionine (M, MET), Tryptophan (W, TRP), Tyrosine (Y, TYR), Asparagine (N, ASN), Glutamine (Q, GLN), Aspartic acid (D, ASP), Glutamic acid (E, GLU), Lysine (K, LYS), Arginine (R, ARG), Histidine (H, HIS)

START: AUG

STOP: UAA, UAG, UGA

20 amino acids

The basic amino acid

Peptide bond

Two amino acids

Removal of water molecule

Formation of CO-NH

Amino end Carboxyl end

Peptide bond

Polypeptide

There are four basic levels of structure in protein architecture

Protein structure

Primary – sequence of amino acids constituting the polypeptide chain

Secondary – local organization into secondary structures such as helices and sheets

Tertiary – three-dimensional arrangement of the amino acids as they interact with one another due to the polarity of their side chains and the resulting interactions between them

Quaternary – number and relative positions of the protein subunits

Protein structure

Primary structure: amino acid sequence

Secondary structure: α-helix and β-sheet (figure labels: amino end, carboxyl end; antiparallel and parallel β-sheets, side views)

Tertiary structure: spatial arrangement of amino acid residues

Quaternary structure: spatial arrangement of subunits

Summary figure: primary, secondary, tertiary, quaternary

Protein structure

Every function in the living cell depends on proteins.

Motion and locomotion of cells and organisms depends on

contractile proteins. [Example: Muscles]

The catalysis of nearly all biochemical reactions is done by enzymes, most of which are proteins.

The structure of cells, and the extracellular matrix in which they

are embedded, is largely made of protein. [Example: Collagens]

Defence by antibodies.

The receptors for hormones and other signalling molecules are

proteins.

The transcription factors that turn genes on and off to guide the

differentiation of the cell and its later responsiveness to

signals reaching it are proteins.

and many more - proteins are truly the physical basis of life.

Protein function

Protein function: antibody

Protein function: enzyme

Gene expression

Bacteria express only a subset of their genes at

any given time.

Expression of all genes constitutively in

bacteria would be energetically inefficient.

The genes that are expressed are essential

for dealing with the current environmental

conditions, such as the type of available

food source.

Gene regulation mechanism

Regulation of gene expression can occur at

several levels:

Transcriptional regulation: control of whether an mRNA is made.

Translational regulation: control of whether

or how fast an mRNA is translated.

Post-translational regulation: a protein is

made in an inactive form and later is

activated.

Gene regulation mechanism

Figure: levels of control on the path from DNA to mRNA to protein.

Transcriptional control: onset of transcription (RNA polymerase)

Translational control: translation rate and lifespan of mRNA (ribosome)

Post-translational control: protein activation (by chemical modification)

Feedback inhibition (protein inhibits transcription of its own gene)

Gene regulation mechanism

Escherichia coli

Operon

A controllable unit of transcription

consisting of a number of structural

genes transcribed together. Contains at

least two distinct regions: the operator

and the promoter.

Gene regulation mechanism

Case study of the regulation of the lactose

operon in E. coli

E. coli utilizes glucose if it is available,

but can metabolize other sugars if glucose is

absent.

Gene regulation mechanism

Figure: relative density of cells versus time (hours) for three food sources with glucose : lactose ratios of 1:3, 1:1 and 3:1. Each curve shows an initial period of rapid growth with glucose as food source, followed by a second period of rapid growth with lactose as food source.

Gene regulation mechanism

Case study of the regulation of the lactose

operon in E. coli

Genes that encode enzymes needed to break

other sugars down are negatively regulated.

Example: enzymes required to metabolize

lactose are only synthesized if glucose is

depleted and lactose is available.

In the absence of lactose, transcription

of the genes that encode these enzymes is

repressed. How does this occur?

Gene regulation mechanism

Case study of the regulation of the lactose operon in E. coli

All the loci required for lactose metabolism are grouped together into an operon.

The lacZ locus encodes the enzyme β-galactosidase, which breaks down lactose.

The lacY locus encodes galactoside permease, a transport protein for lactose.

The lacA locus encodes a transacetylase whose role in lactose metabolism is unclear.

The lacI locus encodes a repressor that blocks transcription of the lac operon.

Gene regulation mechanism

Section of E. coli chromosome

Gene   Protein                       Function
lacI   LacI (regulatory protein)     Regulatory function
lacZ   LacZ (β-galactosidase)        Cleaves lactose to glucose and galactose
lacY   LacY (galactoside permease)   Membrane transport protein; imports lactose

Observations about regulation of lacZ and lacY:

(1) LacI protein and glucose shut down transcription of lacZ and lacY

(2) Lactose induces transcription of lacZ and lacY

Gene regulation mechanism

Gene order: lacI promoter, lacI, then the lac operon (promoter, operator, lacZ, lacY, lacA)

Gene regulation Lac operon

Repression and induction of the lactose operon.

The lac operon is under negative regulation, i.e., transcription is normally repressed.

Glucose represses transcription of the lac

operon.

Glucose inhibits cAMP synthesis in the

cells.

At low cAMP levels, no cAMP is available

to bind CAP.

Unless CAP is bound to the CAP site in

the promoter, no transcription occurs.

Gene regulation mechanism

Figure labels: lacI; functional repressor; operator (binding site for repressor); RNA polymerase blocked; lacZ, lacY; NO TRANSCRIPTION

When no lactose is present, the repressor binds to DNA and blocks transcription.

Gene regulation mechanism

Figure labels: lactose; lacI; repressor; lacZ, lacY; mRNA; β-galactosidase; permease; TRANSCRIPTION BEGINS

Repressor plus lactose (an inducer) present. Transcription proceeds.
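
The two controls described on these slides (the repressor released by lactose, and CAP, which needs cAMP that glucose suppresses) can be combined into a small truth table. The following is a simplified Boolean sketch, not a quantitative model: it treats every signal as on or off and ignores basal transcription, and the function name `lac_operon_state` and its dictionary keys are hypothetical names chosen for this example.

```python
def lac_operon_state(glucose_present, lactose_present):
    """Boolean sketch of lac operon control as described on the slides."""
    # No lactose: the repressor binds the operator and blocks RNA polymerase.
    repressor_bound = not lactose_present
    # Glucose inhibits cAMP synthesis, so cAMP is only available without glucose.
    camp_available = not glucose_present
    # CAP binds its site in the promoter only when cAMP is available.
    cap_bound = camp_available
    # Transcription needs CAP bound and the operator free of repressor.
    transcription = cap_bound and not repressor_bound
    return {
        "repressor_bound": repressor_bound,
        "cap_bound": cap_bound,
        "transcription": transcription,
    }


if __name__ == "__main__":
    for glucose in (True, False):
        for lactose in (True, False):
            state = lac_operon_state(glucose, lactose)
            print(f"glucose={glucose!s:5} lactose={lactose!s:5} "
                  f"transcription={state['transcription']}")
```

In this sketch, transcription is on only when glucose is absent and lactose is present, which matches the statement that the enzymes are synthesized only if glucose is depleted and lactose is available.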

Gene regulation mechanism

lacI promoter, lacI, promoter, operator, lacZ, lacY, lacA

RNA polymerase binds to the promoter

"Polycistronic" mRNA: lacZ message, lacY message, lacA message

Operons produce mRNAs that code for functionally related proteins.

Gene regulation mechanism

DNA binding sites

Proteins that bind to DNA share similarity in

the structure of their DNA-binding regions.

Many DNA binding proteins, such as lac

repressor, have a helix-turn-helix motif which

fits into the major groove of a DNA molecule.

DNA binding proteins

DNA binding proteins

Binding of an inducer to the lac repressor

causes it to release the operator DNA because it

alters the conformation of the helix-turn-helix

motif.

DNA binding proteins

Information about regulation of the expression

of genetic loci may help to combat diseases.

Virulent bacterial strains have genes that

encode the ability to infect and produce

disease.

Knowledge of how the expression of these

genes is controlled and regulated may

provide insights into blocking the

development of the disease.

DNA binding proteins

When tryptophan is absent, transcription occurs.

Figure labels: RNA polymerase; promoter; leader; 5 coding loci

When tryptophan is present, transcription is blocked.

Figure labels: tryptophan; repressor

DNA binding proteins, negative regulation

Ribosomes translate mRNA rapidly when tryptophan is abundant,…

…leading to formation of a stem-and-loop structure that inhibits RNA polymerase and terminates transcription.

DNA binding proteins
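
The two tryptophan-dependent controls on the last slides (the repressor that blocks transcription, and the stem-and-loop structure that terminates it) can be written as a minimal Boolean sketch. Treating tryptophan as simply present or absent is an assumption made here for illustration, and the function name `trp_operon_state` and its dictionary keys are hypothetical.

```python
def trp_operon_state(tryptophan_present):
    """Boolean sketch of the two controls on the tryptophan operon."""
    # With tryptophan present, the repressor blocks transcription.
    repressor_blocks = tryptophan_present
    # With abundant tryptophan, ribosomes translate the leader rapidly, and the
    # resulting stem-and-loop structure terminates transcription early.
    stem_loop_terminates = tryptophan_present
    full_transcription = not repressor_blocks and not stem_loop_terminates
    return {
        "repressor_blocks": repressor_blocks,
        "stem_loop_terminates": stem_loop_terminates,
        "full_transcription": full_transcription,
    }


if __name__ == "__main__":
    for present in (False, True):
        print(present, trp_operon_state(present))
```

Both controls respond to the same signal and point in the same direction, so the coding loci are fully transcribed only when tryptophan is absent.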