Lec 1: Introduction to molecular biology

Subjects overview

• The aim of these lectures is to investigate how cells organize their DNA within the cell nucleus and replicate it during cell division to produce two new copies of the genome. Cellular processes to repair damaged DNA will also be covered.

• The mechanism of DNA replication will be discussed, covering the structure of the replication fork, how cells select sites of replication initiation, and how they control whether and when to replicate DNA.

• The reaction mechanism catalyzed by DNA polymerases causes difficulty in replicating the ends of linear DNA molecules. Various methods have evolved to solve this ‘end-replication problem’. The most common involves the use of an unusual reverse transcriptase, called telomerase.

• We will discuss genome organization: introns, exons, satellites, repetitive DNA, etc.

• How is the huge amount of genomic DNA packaged to fit within the cell nucleus while keeping specific sequences accessible for transcription? We will discuss the structure of the nucleosome and higher levels of chromatin organization and packaging.

• DNA is often damaged under normal environmental conditions. How can cells repair their genome and what are the consequences if they cannot?

Molecular biology

Molecular biology is the branch of biology that deals with the molecular basis of biological activity.

This field overlaps with other areas of biology and chemistry, particularly genetics and biochemistry.

Molecular biology chiefly concerns itself with understanding the interactions between the various systems of a cell, including the interactions between DNA, RNA and protein biosynthesis, as well as learning how these interactions are regulated.

The field of molecular biology studies macromolecules and

the macromolecular mechanisms found in living things, such

as the molecular nature of the gene and its mechanisms of

gene replication, mutation, and expression.

The genome

The genome is the totality of an organism's genetic information. It is encoded in DNA, or in RNA for some viruses.

All living things are grouped into

three domains:

Eukaryotes (Eukarya)

Bacteria

Archaea

Eukaryotic cell

Eukaryotic cells are generally more complex than prokaryotic cells.

A eukaryotic cell has a nucleus, which is separated from the rest of the cell by a membrane. The nucleus contains chromosomes, which are the carriers of the genetic material.

The genetic material is distributed among multiple chromosomes.

Eukaryotic DNA is linear and complexed with proteins called histones.

Prokaryotic cell

Prokaryotes are single-celled organisms without a nucleus or nuclear membrane.

DNA is naked, without histones.

Archaea are also prokaryotes (without a nucleus), but in some aspects they are similar to eukaryotes.

Deoxyribonucleic acid (DNA)

DNA carries the genetic instructions used

in the development and

functioning of all known living

organisms and some viruses.

The main role of DNA

molecules is the long-term

storage of information.

DNA is often compared to a set of blueprints or a recipe,

or a code, since it contains the instructions needed to

construct other components of cells, such as proteins and

RNA molecules.

The chromosome

The chromosome is the storage place of all genetic information.

The number of chromosomes varies from one species to another.

The genes

The DNA segments that carry this genetic information are called genes.

• General structure of nucleic acids:

• DNA is a long polymer made from repeating units called nucleotides.

• The DNA chain is 22 to 26 Å wide (2.2 to 2.6 nm), and one nucleotide unit is 3.3 Å (0.33 nm) long. Although each individual repeating unit is very small, DNA polymers can be very large molecules containing millions of nucleotides.

• Human chromosome number 1 is approximately 249 million base pairs long.
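As a rough check on these dimensions, the stretched-out length of a DNA molecule can be estimated by multiplying its number of base pairs by the 0.33 nm length of one nucleotide unit quoted above. The following is a minimal sketch; the function names and the 100-million-base-pair input are illustrative assumptions, not values from the lecture.

```python
# Length of one nucleotide unit along the chain, as quoted in the text (nm).
NM_PER_NUCLEOTIDE = 0.33


def contour_length_nm(n_base_pairs: int) -> float:
    """Approximate length of a fully stretched DNA molecule, in nanometres."""
    return n_base_pairs * NM_PER_NUCLEOTIDE


def contour_length_cm(n_base_pairs: int) -> float:
    """Same estimate converted to centimetres (1 cm = 1e7 nm)."""
    return contour_length_nm(n_base_pairs) / 1e7


# Hypothetical molecule of 100 million base pairs (illustrative input only).
print(f"{contour_length_cm(100_000_000):.1f} cm")  # prints: 3.3 cm
```

Even though each unit is a fraction of a nanometre long, a molecule of this size would be several centimetres long if stretched out, which is why packaging matters.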

Building Blocks - Nucleotides

A nucleotide is composed of three parts: a sugar (ribose in RNA and deoxyribose in DNA), a base and a phosphate group. If all phosphate groups are removed, a nucleotide becomes a nucleoside.

The four bases found in DNA are:

Adenine (A),

Cytosine (C),

Guanine (G) and

Thymine (T).

A fifth pyrimidine base, called uracil (U), usually takes the

place of thymine in RNA and differs from thymine by

lacking a methyl group on its ring.

These bases are classified

into two types; adenine and

guanine are fused five- and

six-membered heterocyclic

compounds called purines,

while cytosine and thymine

are six-membered rings

called pyrimidines.

• In living organisms, DNA does not usually exist as a single molecule, but instead as a pair of molecules that are held tightly together. These two long strands entwine like vines, in the shape of a double helix.

• In a double helix the direction of the nucleotides in one strand is opposite to their direction in the other strand: the strands are antiparallel.

• The asymmetric ends of DNA strands are called the 5′ (five prime) and 3′ (three prime) ends, with the 5' end having a terminal phosphate group and the 3' end a terminal hydroxyl group.

Base pairing

Each type of base on one strand forms a bond with

just one type of base on the other strand. This is called

complementary base pairing. Here, purines form

hydrogen bonds to pyrimidines, with A bonding only

to T, and C bonding only to G.

This arrangement of two nucleotides binding together

across the double helix is called a base pair. As

hydrogen bonds are not covalent, they can be broken

and rejoined relatively easily.

http://en.wikipedia.org/wiki/File:DNA_chemical_structure.svg

• Due to the specific base pairing, DNA's two strands are complementary to each other. Hence, the nucleotide sequence of one strand determines the sequence of the other strand. For example, the sequence of the two strands can be written as

• 5' -ACT- 3'

• 3' -TGA- 5'

• Note that they obey the (A:T) and (C:G) pairing rule. If we know the sequence of one strand, we can deduce the sequence of the other strand. For this reason, a DNA database needs to store only the sequence of one strand. By convention, the sequence in a DNA database is written in the 5' to 3' direction (left to right).
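The pairing rule makes it straightforward to derive one strand from the other. The sketch below is illustrative only; the function names are assumptions chosen for this example. `complement` returns the partner strand base by base (so it reads 3' to 5' when the input reads 5' to 3'), and `reverse_complement` rewrites that partner in its own 5' to 3' direction, as a database would store it.

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Partner strand, base by base, in the same left-to-right order as the input."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand_5_to_3: str) -> str:
    """Partner strand rewritten in its own 5' to 3' direction."""
    return complement(strand_5_to_3)[::-1]


print(complement("ACT"))          # TGA (reads 3' to 5', as in the example above)
print(reverse_complement("ACT"))  # AGT (the same strand read 5' to 3')
```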

Grooves

Twin helical strands form the DNA backbone. Another

double helix may be found by tracing the spaces, or

grooves, between the strands. As the strands are not

directly opposite each other, the grooves are unequally

sized. One groove, the major groove, is 22 Å wide and the

other, the minor groove, is 12 Å wide.

Sense and antisense

A DNA sequence is called "sense" if its sequence is the

same as that of a messenger RNA copy that is translated

into protein. The sequence on the opposite strand is called

the "antisense" sequence. Both sense and antisense

sequences can exist on different parts of the same strand

of DNA (i.e. both strands contain both sense and antisense

sequences). In both prokaryotes and eukaryotes, antisense

RNA sequences are produced, but the functions of these

RNAs are not entirely clear.

Supercoiling

DNA can be twisted like a rope in a process called DNA

supercoiling. With DNA in its "relaxed" state, a strand

usually circles the axis of the double helix once every 10.4

base pairs.

If the DNA is twisted in the direction of the helix, this is

positive supercoiling, and the bases are held more tightly

together.

If they are twisted in the opposite direction, this is negative

supercoiling, and the bases come apart more easily.

In nature, most DNA has slight negative supercoiling that is

introduced by enzymes called topoisomerases.

Alternate DNA structures

DNA exists in many possible conformations that include

A-DNA, B-DNA, and Z-DNA forms, although only B-DNA

and Z-DNA have been directly observed in

functional organisms.

From left to right, the structures of A, B and Z DNA

http://en.wikipedia.org/wiki/File:A-DNA,_B-DNA_and_Z-DNA.png

Lec 2: DNA binding proteins

The aim of this lecture is to investigate how cells organize their DNA within the cell nucleus: how is the huge amount of genomic DNA packaged to fit within the cell nucleus while keeping specific sequences accessible for transcription?

• We will discuss genome organization, satellites, repetitive DNA, etc.

• We will discuss the structure of the nucleosome and higher levels of chromatin organization and packaging.

Interactions with proteins

All the functions of DNA depend on interactions with

proteins. These protein interactions can be non-specific, or

the protein can bind specifically to a single DNA

sequence. Enzymes can also bind to DNA, for example the

polymerases that copy the DNA sequence in transcription

and DNA replication.

• DNA-binding proteins

Within chromosomes, DNA is held in complexes with

structural proteins. These proteins organize the DNA into

a compact structure called chromatin. In eukaryotes this

structure involves DNA binding to a complex of small

basic proteins called histones, while in prokaryotes

multiple types of proteins are involved. The histones form

a disk-shaped complex called a nucleosome. These non-

specific interactions are formed through basic residues in

the histones making ionic bonds to the acidic sugar-

phosphate backbone of the DNA.

http://en.wikipedia.org/wiki/File:Lambda_repressor_1LMB.png

Chromatin

Chromatin is the complex combination of DNA and

protein that makes up chromosomes. It is found inside the

nuclei of eukaryotic cells. The major components of

chromatin are DNA and histone proteins. The functions of

chromatin are to package DNA into a smaller volume to

fit in the cell.

• Chromatin is the substance that condenses into visible chromosomes during cell division. Its basic unit is the nucleosome, composed of 146 bp of DNA and eight histone proteins. The structure of chromatin changes dynamically, at least in part depending on the needs of transcription. In the metaphase of cell division, the chromatin is condensed into the visible chromosome. At other times, the chromatin is less condensed, with some regions in a "beads-on-a-string" conformation.

• Histones are the proteins closely associated with DNA

molecules. They are responsible for the structure of

chromatin and play important roles in the regulation of

gene expression. Five types of histones have been

identified: H1 (or H5), H2A, H2B, H3 and H4. H1 and its

homologous protein H5 are involved in higher-order

structures of chromatin. The other four types of histones

associate with DNA to form nucleosomes.

Histones (H1, H2A, H2B, H3, H4, and H5) are organized into two superclasses: core histones (H2A, H2B, H3 and H4) and linker histones (H1 and H5).

Histones contain a high proportion of basic amino acids

(arginine and lysine) that facilitate binding to the

negatively charged DNA molecule.

Two of each of the core histones (H2A, H2B, H3 and H4) assemble to form one nucleosome core particle by wrapping 146 base pairs of DNA around the protein spool in 1.65 left-handed super-helical turns. The linker histone H1 binds the nucleosome at the entry and exit sites of the DNA, thus locking the DNA into place and allowing the formation of higher-order structure.

• Each nucleosome is associated with an H1 (or H5) to form a solenoid structure. H1 and H5 are called linker histones.

Chromosomes

A chromosome is an organized structure of DNA and protein that is found in cells. It is a single piece of coiled DNA containing many genes, regulatory elements and other nucleotide sequences. Chromosomes also contain DNA-bound proteins, which serve to package the DNA and control its functions.

• Chromosomes in prokaryotes

The prokaryotes – bacteria and archaea – typically have a

single circular chromosome, but many variations do exist.

Most bacteria have a single circular chromosome that can

range in size from only 160,000 base pairs in the

endosymbiotic bacterium Candidatus Carsonella ruddii, to

12,200,000 base pairs in the soil-dwelling bacterium

Sorangium cellulosum. Spirochaetes of the genus Borrelia

are a notable exception to this arrangement, with bacteria

such as Borrelia burgdorferi containing a single linear

chromosome.

Repetitive DNA Sequences

A stretch of DNA sequence often repeats several times in

the total DNA of a cell. For example, the following DNA

sequence is just a small part of telomere located at the

ends of each human chromosome:

An entire telomere, about 15 kb, is constituted by thousands of the repeated sequence "GGGTTA".
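The figures above can be checked with simple arithmetic: a 15 kb telomere built from a 6 bp unit holds about 2,500 copies of the repeat. The sketch below also shows one way to count consecutive copies of a repeat unit in a sequence. The function name and the toy sequence are assumptions made up for illustration, not real telomere data.

```python
import re


def count_tandem_repeats(sequence: str, unit: str) -> int:
    """Number of repeat units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Toy sequence (made up): five copies of the repeat followed by unrelated bases.
toy_sequence = "GGGTTA" * 5 + "ACGT"
print(count_tandem_repeats(toy_sequence, "GGGTTA"))  # 5

# Approximate number of repeats in a 15 kb telomere with a 6 bp repeat unit.
print(15_000 // len("GGGTTA"))  # 2500
```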

DNA sequences are divided into three classes:

• Highly repetitive: About 10-15% of mammalian DNA

fragments reassociate very rapidly. This class includes

tandem repeats.

• Moderately repetitive: Roughly 25-40% of mammalian

DNA fragments reassociate at an intermediate rate. This

class includes interspersed repeats (also known as

mobile elements or transposable elements).

• Single copy (or very low copy number): This class

accounts for 50-60% of mammalian DNA.

Tandem repeats are an array of consecutive repeats. They

include three subclasses: satellites, minisatellites and

microsatellites. The name "satellites" comes from the

separate "satellite" bands that these sequences form, apart from the bulk DNA, in density-gradient centrifugation.

Satellites

• The size of a satellite DNA ranges from 100 kb to over 1

Mb. In humans, a well known example is the alphoid

DNA located at the centromere of all chromosomes. Its

repeat unit is 171 bp and the repetitive region accounts for

3-5% of the DNA in each chromosome. Other satellites

have a shorter repeat unit. Most satellites in humans or in

other organisms are located at the centromere.

Minisatellites

• The size of a minisatellite ranges from 1 kb to 20 kb. One

type of minisatellite is called variable number of

tandem repeats (VNTR). Its repeat unit ranges from 9

bp to 80 bp. They are located in non-coding regions. The

number of repeats for a given minisatellite may differ

between individuals. This feature is the basis of DNA

fingerprinting.

• Another type of minisatellite is the telomere. In a human

germ cell, the size of a telomere is about 15 kb. In an

aging somatic cell, the telomere is shorter. The telomere

contains tandemly repeated sequence GGGTTA.

Microsatellites

• Microsatellites are also known as short tandem repeats

(STR), because a repeat unit consists of only 1 to 6 bp and

the whole repetitive region spans less than 150

bp. Similar to minisatellites, the number of repeats for a

given microsatellite may differ between

individuals. Therefore, microsatellites can also be used

for DNA fingerprinting. In addition, both microsatellite

and minisatellite patterns can provide information about

paternity.

Interspersed Repeats

• Interspersed repeats are repeated DNA sequences located

at dispersed regions in a genome. They are also known as

mobile elements or transposable elements. A stretch of

DNA sequence may be copied to a different location

through DNA recombination. After many generations,

such sequence (the repeat unit) could spread over various

regions. Mobile elements are found in all kinds of

organisms. In mammals, the most common mobile

elements are LINEs and SINEs.


DNA Replication

Aims

To understand the DNA replication mechanism in eukaryotes and prokaryotes.

To identify the DNA polymerases.

DNA replication:

DNA replication, the basis for biological inheritance, is a fundamental process occurring in all living organisms to copy their DNA.

Replication is "semiconservative": each strand of the original double-stranded DNA molecule serves as the template for the synthesis of the complementary strand.
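The semiconservative scheme can be sketched in a few lines: each daughter duplex keeps one parental strand and gains one newly built complementary strand. This is a toy model with hypothetical function names; both strands are written left to right in alignment (top strand 5' to 3', bottom strand 3' to 5'), and the enzymes involved are ignored.

```python
# Pairing rule: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize_partner(template: str) -> str:
    """Build the complementary strand, base by base, for one template strand."""
    return "".join(COMPLEMENT[base] for base in template)


def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Return two daughter duplexes, each with one parental and one new strand."""
    top, bottom = duplex
    daughter_1 = (top, synthesize_partner(top))        # keeps the parental top strand
    daughter_2 = (synthesize_partner(bottom), bottom)  # keeps the parental bottom strand
    return [daughter_1, daughter_2]


parent = ("ACT", "TGA")
print(replicate(parent))  # [('ACT', 'TGA'), ('ACT', 'TGA')]
```

Both daughters carry the same sequence as the parent, which is what makes the copy faithful.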

Replisome

The replisome is a complex

molecular machine that carries out

replication of DNA. It is made up of a

number of subcomponents that each

provides a specific function during

the process of replication.

Major components of the replisome:

Helicase

Gyrase (topoisomerase)

Primase

DNA pol. III

DNA pol. I

Ligase

SSB (single-strand binding protein)

Exonuclease

• DNA polymerases are a family of enzymes that carry out

all forms of DNA replication.

• DNA polymerase can only extend an existing DNA strand

paired with a template strand; it cannot begin the synthesis

of a new strand.

• To begin synthesis of a new strand, a short fragment of

DNA or RNA, called a primer, must be created and paired

with the template strand before DNA polymerase can

synthesize new DNA.

DNA polymerases

DNA pol III has two key limitations:

1- It can only add nucleotides to the 3' end of a strand.

2- It cannot start a new strand; it can only extend an existing strand (because it can only add to the 3' end of an existing strand).

Three types of DNA polymerase are classified in prokaryotes:

Type I, used to fill the gaps between DNA fragments of the lagging strand.

Type II, involved in the SOS response to DNA damage.

Type III, which carries out most DNA replication.

In Eukaryotes

• There are five types of DNA polymerases in mammalian cells: α, β, γ, δ, and ε. Pol γ is located in the mitochondria and is responsible for the replication of mtDNA. The other polymerases are located in the nucleus. Their major roles are:

• α: primer synthesis (together with primase) to start new strands.

• β: DNA repair.

• δ: synthesis of the lagging strand.

• ε: synthesis of the leading strand.

• The prokaryotic DNA polymerase III consists of several subunits, with a total molecular weight exceeding 600 kDa. Among them, the α, ε, and θ subunits constitute the core polymerase.

• The major role of the β subunits is to keep the enzyme from falling off the template strand. Two β subunits can form a donut-shaped structure to clamp a DNA molecule in its center, and slide with the core polymerase along the DNA molecule. This allows continuous polymerization of up to 5 × 10^5 nucleotides. In the absence of β subunits, the core polymerase would fall off the template strand after synthesizing 10-50 nucleotides.

DNA replication within the cell

Origins of replication

• For a cell to divide, it must first replicate its DNA. This

process is initiated at particular points within the DNA,

known as "origins", which are targeted by proteins

that separate the two strands and initiate DNA synthesis.

Origins contain DNA sequences recognized by replication

initiator proteins (e.g. DnaA in E. coli and the Origin

Recognition Complex in yeast). These initiator proteins

separate the two strands and initiate replication forks.

Origins tend to be "AT-rich" (rich in adenine and thymine bases) to assist this process. A-T base pairs have two hydrogen bonds (rather than the three formed in a C-G pair), so strands rich in these nucleotides are generally easier to separate: fewer hydrogen bonds require less energy to break.
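The hydrogen-bond argument can be made concrete by counting bonds along a stretch of duplex: two per A-T pair and three per C-G pair. This is a minimal sketch; the function names and the six-base test sequences are illustrative assumptions, and real origin recognition depends on specific sequences and proteins, not on this count alone.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this stretch of duplex together."""
    return sum(H_BONDS[base] for base in strand.upper())


def at_fraction(strand: str) -> float:
    """Fraction of positions occupied by A or T."""
    strand = strand.upper()
    return (strand.count("A") + strand.count("T")) / len(strand)


print(hydrogen_bonds("ATATAT"))  # 12: AT-rich, fewer bonds to break
print(hydrogen_bonds("GCGCGC"))  # 18: GC-rich, more bonds to break
print(at_fraction("ATGC"))       # 0.5
```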

The replication fork

When replicating, the original DNA splits in two, forming

two "prongs" which resemble a fork (hence the name

"replication fork"). Because DNA polymerase can only

synthesize a new DNA strand in a 5' to 3' manner, the

process of replication goes differently for the two strands

comprising the DNA double helix.

Mechanism of DNA Replication

Once the strands are separated, RNA primers are created on the template strands. DNA polymerase extends the leading strand in one continuous motion and the lagging strand in a discontinuous motion. RNase removes the RNA fragments used to initiate replication by DNA polymerase, and DNA polymerase I enters to fill the gaps. When this is complete, a single nick on the leading strand and several nicks on the lagging strand can be found. Ligase seals these nicks, thus completing the newly replicated DNA molecule.

The leading strand receives one RNA primer per active origin of replication, while the lagging strand receives several. The short DNA fragments synthesized from these primers on the lagging strand are called Okazaki fragments, named after their discoverer.

Leading strand

The leading strand is the template strand of the DNA

double helix so that the replication fork moves along

it in the 3' to 5' direction. This allows the newly

synthesized strand complementary to the original

strand to be synthesized 5' to 3' in the same direction

as the movement of the replication fork.

On the leading strand, a polymerase "reads" the DNA

and adds nucleotides to it continuously. This

polymerase is DNA polymerase III (DNA Pol III)

in prokaryotes and presumably Pol ε in eukaryotes.

Lagging strand

The lagging strand is the strand of the template DNA double helix that is oriented so that the replication fork moves along it in a 5' to 3' manner. Because of its orientation, opposite to the working orientation of DNA polymerase III, which moves on a template in a 3' to 5' manner, replication of the lagging strand is more complicated than that of the leading strand.

On the lagging strand, primase "reads" the DNA and

adds RNA to it in short, separated segments. DNA

polymerase III or Pol δ lengthens the primed

segments, forming Okazaki fragments. Primer

removal in eukaryotes is also performed by Pol δ. In

prokaryotes, DNA polymerase I "reads" the

fragments, removes the RNA by 5'-3' exonuclease

activity of polymerase I, and replaces the RNA

nucleotides with DNA nucleotides (this is necessary

because RNA and DNA use slightly different kinds of

nucleotides). DNA ligase joins the fragments together.

In bacteria, which have a single origin of

replication on their circular chromosome,

this process eventually creates a "theta

structure" (resembling the Greek letter

theta: θ). In contrast, eukaryotes have

longer linear chromosomes and initiate

replication at multiple origins within

these.

Telomerase and Aging

• Synthesis of the lagging strand requires a short primer which will be removed. At the extreme end of a chromosome, there is no way to synthesize this region when the last primer is removed. Therefore, the newly synthesized lagging strand is always shorter than its template by at least the length of the primer. This is the so-called "end-replication problem".

• Bacteria do not have the end-replication problem, because their DNA is circular.

• In eukaryotes, the chromosome ends are called telomeres which have at least two functions:

• to protect chromosomes from fusing with each other.

• to solve the end-replication problem.

• In a human chromosome, the telomere is about 10 to 15 kb in length, composed of the tandem repeat sequence TTAGGG. Telomerase contains an essential RNA component which is complementary to the telomere repeat sequence. Hence, the internal RNA can serve as the template for synthesizing DNA. Through telomerase translocation, a telomere may be extended by many repeats.

Aging

• In the absence of telomerase, the telomere will become shorter after each cell division. When it reaches a certain length, the cell may cease to divide and die. Therefore, telomerase plays a critical role in the aging process.
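The shortening described above can be pictured as a simple countdown. The sketch below is a toy model only: the 15 kb starting length comes from the text, but the loss per division and the critical length are hypothetical parameters chosen for illustration, not measured values.

```python
def divisions_until_arrest(start_bp: int, loss_per_division_bp: int, critical_bp: int) -> int:
    """Count cell divisions until the telomere is no longer above the critical length."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    divisions = 0
    length = start_bp
    while length > critical_bp:
        length -= loss_per_division_bp  # no telomerase: every division shortens the end
        divisions += 1
    return divisions


# 15 kb start (from the text); 100 bp loss and 5 kb threshold are hypothetical.
print(divisions_until_arrest(15_000, 100, 5_000))  # 100
```

In this picture, telomerase activity would correspond to adding repeats back after each division, which delays or prevents the arrest.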

Rolling circle replication

Another method of copying DNA, sometimes used in vivo

by bacteria and viruses, is the process of rolling circle

replication. In this form of replication, a single replication

fork progresses around a circular molecule to form

multiple linear copies of the DNA sequence. In cells, this

process can be used to rapidly synthesize multiple copies

of plasmids or viral genomes.

Topoisomerases

During replication, the unwinding of DNA may cause the

formation of tangling structures, such as supercoils or

catenanes. The major role of topoisomerases is to prevent

DNA tangling.

There are two types of topoisomerases:

• type I produces transient single-strand breaks in DNA .

• type II produces transient double-strand breaks.

As a result, the type I enzyme removes supercoils from

DNA one at a time, whereas the type II enzyme removes

supercoils two at a time.

• In eukaryotes, the topo I and topo II can remove both

positive and negative supercoils.

• In bacteria, the topo I can remove only negative

supercoils. The bacterial topo II is also called the gyrase,

which has two functions:

(a) to remove the positive supercoils during DNA

replication,

(b) to introduce negative supercoils (one supercoil for 15-20

turns of the DNA helix) so that the DNA molecule can

be packed into the cell. During replication, these

negative supercoils are removed by topo I.

Without topoisomerases, the DNA cannot

replicate normally. Therefore, the

inhibitors of topoisomerases have been

used as anti-cancer drugs to stop the

proliferation of malignant

cells. However, these inhibitors may also

stop the division of normal cells. Cells that need to divide continuously (e.g., hair follicle cells) will be most affected. This explains a noticeable side effect: hair loss.

Lecture 4: DNA Mutation

Lecture overview

One look around a room tells you that each person has

slight differences in their physical make up — and

therefore in their DNA. These subtle variations in

DNA are called polymorphisms (literally "many

forms"). Many of these gene polymorphisms account

for slight differences between people such as hair and

eye color. But some gene variations may result in

disease or an increased risk for disease. Although all

polymorphisms are the result of a mutation in the

gene, geneticists only refer to a change as a mutation

when it is not part of the normal variations between

people.

Aims

To understand mutation, mutagens and mutants.

Classification of mutations.

Types of mutagens.

Mutagens and carcinogens.

In biology, a mutation is a randomly derived change to the nucleotide sequence of the genetic material of an organism.

• Mutations can be caused by copying errors in the genetic material during cell division (DNA replication), or by exposure to mutagens (chemical, physical or viral).

• In multi-cellular organisms with dedicated reproductive cells, mutations can be subdivided into germ line mutations, which can be passed on to descendants, and somatic mutations, which are not usually transmitted to descendants.

Classification of mutation

*By effect on structure

1- Small-scale mutations: those affecting a gene in one or a few nucleotides, including:

A- Point mutations: exchange of a single nucleotide for another. There are different types of point mutation:

*Transitions: exchange a purine for a purine (A ↔ G) or a pyrimidine for a pyrimidine (C ↔ T). (Most common)

*Transversions: exchange a purine for a pyrimidine or a pyrimidine for a purine (C/T ↔ A/G). (Less common)

*Insertions add one or more extra nucleotides into the DNA. They are usually caused by transposable elements, or errors during replication of repeating elements (e.g. AT repeats).

*Deletions remove one or more nucleotides from the DNA.

Ex:

Original The fat cat ate the wee rat.

Point mutation The fat hat ate the wee rat.
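The transition/transversion distinction described above depends only on whether the two bases belong to the same chemical class. The following is a minimal sketch of that rule; the function name is an assumption chosen for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_point_mutation(original: str, mutated: str) -> str:
    """Label a single-base substitution as a transition or a transversion."""
    pair = {original.upper(), mutated.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Same class (purine to purine, or pyrimidine to pyrimidine) is a transition.
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_point_mutation("A", "G"))  # transition
print(classify_point_mutation("C", "T"))  # transition
print(classify_point_mutation("A", "C"))  # transversion
```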

B- Frame-shift mutation

In a frame-shift mutation, one or more bases (a number not divisible by three) are inserted or deleted. This type of mutation disrupts the reading frame, making the downstream sequence meaningless, and often results in a shortened protein. Frame-shift mutations can be classified into:

Deletion

Original The fat cat ate the wee rat.

Deletion The fat ate the wee rat.

Insertion

Insertion The fat cat xlw ate the wee rat.

Inversion

In an inversion mutation, an entire section of DNA is reversed.

Inversion The fat tar eew eht eta tac.
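The sentence analogy works because the cell reads a coding sequence three bases at a time, just as the sentence is read in three-letter words. The sketch below (illustrative only; the helper name is an assumption) re-reads the sentence in triplets after a single letter is deleted, showing how every later "word" is scrambled.

```python
def read_in_triplets(text: str) -> list[str]:
    """Split text into consecutive three-letter groups, ignoring spaces and periods."""
    letters = text.replace(" ", "").replace(".", "").lower()
    return [letters[i:i + 3] for i in range(0, len(letters), 3)]


original = "The fat cat ate the wee rat."
# Delete a single letter (the first "c"), which shifts the reading frame.
shifted = original.replace("c", "", 1)

print(read_in_triplets(original))  # ['the', 'fat', 'cat', 'ate', 'the', 'wee', 'rat']
print(read_in_triplets(shifted))   # ['the', 'fat', 'ata', 'tet', 'hew', 'eer', 'at']
```

Removing or adding exactly three letters, by contrast, leaves the later words readable, which is why the number of bases gained or lost matters.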

2- Large-scale mutations in chromosomal structure,

including:

A- Deletion of large chromosomal regions, leading to

loss of the genes within those regions.

B- Translocation: interchange of genetic parts from

non-homologous chromosomes.

C-Inversion: reversing the orientation of a

chromosomal segment.

D- Amplifications (or gene duplications) leading to multiple

copies of chromosomal regions, increasing the dosage

of the genes located within them.

[Figure: chromosomal deletion, duplication and inversion]

*By effect on function

Loss-of-function mutations result in a gene product having less or no function.

Gain-of-function mutations change the gene product such that it gains a new and abnormal function.

Lethal mutations are mutations that lead to the death of the organisms which carry them.

*By effect on fitness

In applied genetics it is usual to speak of mutations as either harmful or beneficial.

A harmful mutation is a mutation that decreases the fitness of the organism.

A beneficial mutation is a mutation that increases the fitness of the organism, or which promotes traits that are desirable.

*Special classes

A conditional mutation is a mutation that has a wild-type (or less severe) phenotype under certain "permissive" environmental conditions and a mutant phenotype under certain "restrictive" conditions. For example, a temperature-sensitive mutation can cause cell death at high temperature (restrictive condition), but might have no deleterious consequences at a lower temperature (permissive condition).

Causes of mutation

Mutations may occur spontaneously (spontaneous mutations) or be induced by mutagens (induced mutations).

*Spontaneous mutations can arise as a result of:

1- DNA replication errors and polymerase accuracy.

A- Base alterations

Tautomerism – A base is changed by the repositioning of a hydrogen atom, altering the hydrogen bonding pattern of that base and resulting in incorrect base pairing during replication.

Deamination - Hydrolysis changes a normal base to

an atypical base containing a keto group in place of the original amine group. Examples include C → U and A → HX (hypoxanthine), and 5MeC (5-methylcytosine) → T.

B- Base damage

Depurination – Loss of a purine base (A or G) to form an apurinic site (AP site). Alkylation can occur through reaction of compounds such as S-adenosyl methionine with DNA. Alkylated bases may be subject to spontaneous breakdown or mispairing.

** Alkylation, the addition of alkyl (methyl, ethyl, occasionally propyl) groups to the bases or backbone of DNA.

2- Spontaneous genetic rearrangement mutations

Deletion, duplication, etc.

Induced mutations on the molecular level can be caused by either Chemical or Physical mutagens.

1- Chemical mutagens

The first report of mutagenic action of a chemical was in 1942 by Charlotte Auerbach, who showed that nitrogen mustard (component of poisonous mustard gas used in World Wars I and II) could cause mutations in cells.

A- Base analogs

These chemicals structurally resemble purines and pyrimidines and may be incorporated into DNA in place of the normal bases during DNA replication: examples are

*bromouracil (BU), resembles thymine (has Br atom instead of methyl group) and will be incorporated into DNA and pair with A like thymine.

*aminopurine --adenine analog which can pair with T or with C; causes A:T to G:C or G:C to A:T transitions.

B- Chemicals which alter the structure and pairing properties of bases (base modifiers). Example …

*nitrous acid-- formed by digestion of nitrites (preservatives) in foods. It causes C to U, meC to T, and A to hypoxanthine deaminations.

*nitrosoguanidine, *methyl methanesulfonate, *ethyl methanesulfonate--chemical mutagens that react with bases and add methyl or ethyl groups. Depending on the affected atom, the alkylated base may then degrade to yield a baseless site, or mispair to result in mutations upon DNA replication.

C- Intercalating agents

Acridine orange, proflavin and ethidium bromide (used in labs as dyes and mutagens) are all flat, multiple-ring molecules which interact with the bases of DNA and insert between them.

This insertion causes a "stretching" of the DNA duplex and the DNA polymerase is "fooled" into inserting an extra base opposite an intercalated molecule. The result is that intercalating agents cause frameshifts.

D- Agents altering DNA structure

Includes a variety of different kinds of agents. These may be:

*Large molecules which bind to bases in DNA and cause them to be noncoding "bulky" lesions (e.g. NAAAF).

*Agents causing intra- and inter-strand crosslinks (e.g. psoralens, found in some vegetables and used in treatments of some skin conditions).

*Chemicals causing DNA strand breaks (e.g. peroxides).

2- Physical mutagens (Radiation)

Sources of radiation

Natural sources: natural sources of radiation produce so-called background radiation. These include cosmic rays from the sun and outer space, radioactive elements in soil and terrestrial products (wood, stone), and radon in the atmosphere. One's exposure to background radiation varies with geographic location.

Artificial sources: humans have created artificial sources of radiation which contribute to our radiation exposure. Among these are medical testing (diagnostic X-rays and other procedures), nuclear testing and various other products (TVs, smoke detectors, airport X-rays).

Types of radiation

Ionizing radiation

- Alpha, beta, neutron, X-ray and gamma

Non-ionizing radiation (electromagnetic radiation)

- Visible light, infrared, microwave, radio waves, very low frequency (VLF), extremely low frequency (ELF), thermal radiation (heat) and black-body radiation.

1. Non-ionizing radiation: the EM spectrum

Visible light and other forms of radiation are all types of electromagnetic radiation (which consists of electric and magnetic waves). The length of EM waves (wavelength) varies widely and is inversely proportional to the energy they contain: this is the basis of the so-called EM spectrum.
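The inverse relation between wavelength and energy is E = hc/λ. A small sketch of the calculation follows, using the standard values of Planck's constant and the speed of light; the function name is hypothetical.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant, J*s
LIGHT_M_S = 2.99792458e8         # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19  # joules in one electronvolt

def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon in electronvolts, from E = h*c / wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_M_S / wavelength_m / JOULES_PER_EV

for nm in (260, 400, 700):
    print(nm, "nm:", round(photon_energy_ev(nm), 2), "eV")
```

Halving the wavelength doubles the photon energy, which is why short-wavelength radiation is the more damaging end of the spectrum.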

• UV (ultraviolet): UV radiation is less energetic than X-rays and gamma rays, and therefore non-ionizing, but its wavelengths are preferentially absorbed by the bases of DNA and by the aromatic amino acids of proteins, so it has important biological and genetic effects.

UV is normally classified in terms of its wavelength:

UV-C (180-290 nm): "germicidal"; the most energetic and lethal. It is not found in sunlight at the Earth's surface because it is absorbed by the ozone layer.

UV-B (290-320 nm): the major lethal/mutagenic fraction of sunlight.

UV-A (320 nm to visible): "near UV"; it also has deleterious effects (because it creates oxygen radicals) but it produces very few pyrimidine dimers.

The major lethal lesions are pyrimidine dimers in DNA (produced by UV-B and UV-C). These are the result of a covalent attachment between adjacent pyrimidines in one strand. These dimers, like bulky lesions from chemicals, block transcription and DNA replication and are lethal if unrepaired. They can stimulate mutation and chromosome rearrangement as well.
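Because dimers form only between adjacent pyrimidines on the same strand, the positions at risk can be listed by scanning one strand for neighbouring C/T bases. This is a minimal sketch with a hypothetical function name; it does not model how likely each site is to react.

```python
def dipyrimidine_sites(strand: str) -> list[int]:
    """0-based start positions of adjacent pyrimidine pairs (TT, TC, CT, CC) in one strand."""
    s = strand.upper()
    return [i for i in range(len(s) - 1) if s[i] in "CT" and s[i + 1] in "CT"]

print(dipyrimidine_sites("GATTCAG"))  # positions of the TT and TC pairs
```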

2. Ionizing radiation

X- and gamma-rays are energetic enough that they produce reactive ions (charged atoms or molecules) when they react with biological molecules; thus they are referred to as ionizing radiation.

Intense exposure (high dose rate) causes burns and skin damage, whereas a long-term weak exposure (low dose rate) only increases the risk of mutation and cancer.

• Biological effects of radiation

Ionizing radiation produces a range of damage to cells and organisms primarily due to the production of free radicals of water (the hydroxyl or OH radical). Free radicals possess unpaired electrons and are chemically very reactive and will interact with DNA, proteins, lipids in cell membranes, etc. Thus X-rays can cause DNA and protein damage which may result in organelle failure, block cell division, or cause cell death. The rapidly dividing cell types (blood cell-forming areas of bone marrow, gastrointestinal tract lining) are the most affected by ionizing radiation and the severity of the effects depends upon the dose received.

Genetic effects of radiation

Ionizing radiation produces a range of effects on DNA both through free radical effects and direct action:

• Breaks in one or both strands (these can lead to rearrangements, deletions, chromosome loss, or death if unrepaired; the rearrangements arise from stimulation of recombination).

• Damage to or loss of bases (mutations).

• Cross-linking of DNA to itself or to proteins.


Introduction

DNA repair refers to a collection of processes by which a cell identifies and corrects damage to the DNA molecules that encode its genome. In human cells, both normal metabolic activities and environmental factors such as UV light and radiation can cause DNA damage, resulting in as many as 1 million individual molecular lesions per cell per day. Many of these lesions cause structural damage to the DNA molecule and can alter or eliminate the cell's ability to transcribe the gene that the affected DNA encodes. Other lesions induce potentially harmful mutations in the cell's genome, which affect the survival of its daughter cells after it undergoes mitosis. As a consequence, the DNA repair process is constantly active as it responds to damage in the DNA structure. When normal repair processes fail, and when cellular apoptosis does not occur, irreparable DNA damage may occur, including double-strand breaks and DNA crosslinkages.

The rate of DNA repair is dependent on many factors, including the cell type, the age of the cell, and the extracellular environment. A cell that has accumulated a large amount of DNA damage, or one that no longer effectively repairs damage incurred to its DNA, can enter one of three possible states:

an irreversible state of dormancy, known as senescence

cell suicide, also known as apoptosis or programmed cell death

unregulated cell division, which can lead to the formation of a tumor that is cancerous

Since many mutations are deleterious, DNA repair systems are vital to the survival of all organisms.

– Living cells contain several DNA repair systems that can fix different types of DNA alterations.

DNA Repair

DNA repair mechanisms are placed into different categories on the basis of the way they operate:

– Direct correction or direct reversal: reversing the damage.

– Excision repair: excising the damaged area and then repairing the gap by new DNA synthesis.

Basic mechanism of repairing DNA

In most cases, DNA repair is a multi-step process:

1. An irregularity in DNA structure is detected and removed.

2. Normal DNA is synthesized.

3. Ligation.

Direct Reversal of DNA Damage

• Mismatch repair by DNA polymerase proofreading.

• Repair of UV-induced pyrimidine dimers: reversed by exposure to near-UV or visible (blue) light, which activates photolyase (not found in humans). The enzyme splits the dimers, restoring the DNA to its original condition.

• Repair of alkylation damage: carried out by O6-methylguanine methyltransferase, encoded by the ada gene. It transfers the methyl or ethyl group from the base to a cysteine side chain within the alkyltransferase protein.

Base Excision Repair and Repair Involving Excision of Nucleotides

There are three major DNA repair mechanisms:

1- Base excision

2- Nucleotide excision

3- Mismatch repair

• Base excision

DNA bases may be modified by deamination or alkylation. In E. coli, a DNA glycosylase recognizes a single damaged base and cleaves the bond between it and the sugar in the DNA, removing the base. The baseless position that is left behind is called the "abasic site" or "AP site". Then, the AP endonuclease removes the AP site and neighboring nucleotides. The gap is filled by DNA polymerase I and DNA ligase.

In summary, the system removes one base, excises several nucleotides around it, and replaces them with several new nucleotides: the polymerase adds to the 3’ end, then ligase joins the new DNA to the 5’ end of the strand downstream.

Depending on the species, this repair system can eliminate abnormal bases such as:

– Uracil; thymine dimers

– 3-methyladenine; 7-methylguanine

• Nucleotide excision

An important general process for DNA repair is nucleotide excision repair (NER). It nicks the DNA around the damaged base and removes the region. The polymerase then fills in the gap by adding to the 3’ end, and ligase attaches the new DNA to the 5’ end of the strand downstream.

In E. coli, the proteins UvrA, UvrB, and UvrC are involved in removing the damaged nucleotides (e.g., the dimer induced by UV light). The gap is then filled by DNA polymerase I and DNA ligase.

In yeast, the proteins similar to the Uvr proteins are named RADxx ("RAD" for "radiation"), such as RAD3, RAD10, etc.

This type of system can repair many types of DNA damage, including:

– Thymine dimers and chemically modified bases

NER is found in all eukaryotes and prokaryotes (however, its molecular mechanism is better understood in prokaryotes).

Nucleotide excision repair (NER) of pyrimidine dimers and other damage-induced distortions of DNA.

Several human diseases have been shown to involve inherited defects in genes involved in NER.

– These include xeroderma pigmentosum (XP) and Cockayne syndrome (CS).

– A common characteristic of both syndromes is an increased sensitivity to sunlight.

– Xeroderma pigmentosum can be caused by defects in seven different NER genes.

Skin lesions of xeroderma pigmentosum, one example of a DNA-repair genetic disease. It is caused by homozygosity for a recessive mutation in a repair gene.

Mismatch Repair System

If proofreading fails, the methyl-directed mismatch repair system comes to the rescue.

--Mismatch repair systems are found in nearly all species; the methyl-directed form described here is that of E. coli.

--In humans, mutations in the system are associated with particular types of cancer.

Methyl-directed mismatch repair recognizes mismatched base pairs, excises the incorrect bases, and then carries out repair synthesis.

• Mismatch repair

To repair mismatched bases, the system has to know which base is the correct one. In E. coli, this is achieved by a special methylase called the "Dam methylase", which methylates all adenines that occur within (5')GATC sequences. Immediately after DNA replication, the template strand is already methylated, but the newly synthesized strand is not methylated yet. Thus, the template strand and the new strand can be distinguished.

--The repair process begins with the protein MutS, which binds to mismatched base pairs. Then, MutL activates MutH, which binds to GATC sequences.

--Activated MutH cleaves the unmethylated strand at the GATC site. Subsequently, the segment from the cleavage site to the mismatch is removed by an exonuclease (with assistance from helicase II and SSB proteins).

* If the cleavage occurs on the 3' side of the mismatch, this step is carried out by exonuclease I (which degrades a single strand only in the 3' to 5' direction).

* If the cleavage occurs on the 5' side of the mismatch, exonuclease VII or RecJ is used to degrade the single-stranded DNA.

The gap is filled by DNA polymerase III and DNA ligase.
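The choice of exonuclease described above can be sketched as a small decision procedure. The sketch assumes the unmethylated (new) strand is written 5' to 3' with 0-based positions, and, purely as a simplifying assumption, that the GATC site nearest to the mismatch is the one that gets cleaved; the function names are hypothetical.

```python
def find_gatc_sites(new_strand: str) -> list[int]:
    """0-based start positions of every GATC in the strand (written 5'->3')."""
    s = new_strand.upper()
    sites, start = [], s.find("GATC")
    while start != -1:
        sites.append(start)
        start = s.find("GATC", start + 1)
    return sites

def plan_excision(new_strand: str, mismatch_index: int):
    """Return (gatc_index, side, exonuclease) for the nearest GATC, or None if there is none.

    Assumption for illustration only: the nearest GATC site is the one cleaved.
    """
    sites = find_gatc_sites(new_strand)
    if not sites:
        return None
    site = min(sites, key=lambda s: abs(s - mismatch_index))
    if site > mismatch_index:
        # Cleavage lies 3' of the mismatch: degrade 3'->5' back toward it.
        return site, "3' side", "exonuclease I"
    return site, "5' side", "exonuclease VII or RecJ"

print(plan_excision("AAGATCTTTTTTGATCAA", 9))
```

Since GATC reads the same on both strands, the same site is present on the methylated template strand, which is what allows the two strands to be compared.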

Mechanism of mismatch repair. The mismatch correction enzyme recognizes which strand the base mismatch is on by reading the methylation state of a nearby GATC sequence. If the sequence is unmethylated, a segment of that DNA strand containing the mismatch is excised and new DNA is inserted.

Mismatch Repair in Eukaryotes

Eukaryotes also have mismatch repair, but it is not clear how old and new DNA strands are identified.

– Four genes are involved in humans: hMSH2, hMLH1, hPMS1, and hPMS2.

– All of these are mutator genes.

In humans, a mutation in any one of the four human mismatch repair genes confers a hereditary predisposition to a form of colon cancer called hereditary nonpolyposis colon cancer.

Proteins involved in DNA repairing of E. coli


Gene Expression

Overview of Gene Expression

An organism may contain many types of somatic cells,

each with distinct shape and function. However, they all

have the same genome. The genes in a genome do not

have any effect on cellular functions until they are

"expressed". Different types of cells express different

sets of genes, thereby exhibiting various shapes and

functions.

Essential steps involved in the expression of the genes.

Gene expression

Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. These products are often proteins, but in non-protein-coding genes such as ribosomal RNA (rRNA), transfer RNA (tRNA) or small nuclear RNA (snRNA) genes, the product is a functional RNA.

Steps of gene expression

Several steps in the gene expression process may be modulated, including:

• Transcription

• RNA splicing

• Translation

• Post-translational modification of a protein.

Transcription

A DNA strand is used as a template to synthesize a complementary RNA strand, which is called the primary transcript.

Schematic illustration of transcription. (a) DNA before transcription. (b) During transcription, the DNA unwinds so that one of its strands can be used as a template to synthesize a complementary RNA.
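The template-to-transcript relationship can be sketched in a few lines. The sketch assumes the template strand is written 3' to 5', so the RNA comes out 5' to 3' in the same left-to-right order; the function name is hypothetical.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """RNA (5'->3') complementary to a DNA template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())

print(transcribe("TACGGT"))
```

The RNA has the same sequence as the non-template (coding) DNA strand, except that U replaces T.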

The function of RNA polymerases

Both RNA and DNA polymerases can add

nucleotides to an existing strand,

extending its length. However, there is a

major difference between the two classes

of enzymes: RNA polymerases can

initiate a new strand but DNA

polymerases cannot.

RNA Polymerases

In prokaryotes, RNA polymerase is composed of five subunits:

• two α subunits

• one β subunit and one β´ subunit

• one σ subunit.

Several different forms of the σ subunit have been identified, with molecular weights ranging from 28 kD to 70 kD.

The σ subunit is also known as the sigma factor (σ factor). The σ factor plays an important role in recognizing the transcriptional initiation site, and also takes part in unwinding the DNA double helix.

Core RNA polymerase

RNA polymerase without the sigma factor (two α subunits, β and β´). The core enzyme carries out RNA synthesis.

Holoenzyme

Refers to a complete and fully functional RNA polymerase. The holoenzyme includes the core polymerase and the σ factor.

In Eukaryotes

• There are three classes of eukaryotic RNA polymerases: I, II and III, each comprising two large subunits and 12-15 smaller subunits.

• The two large subunits resemble the bacterial β and β' subunits.

• Two of the smaller subunits resemble the bacterial α subunit.

• The eukaryotic RNA polymerases do not contain any sigma factor.

• Therefore, in eukaryotes, transcriptional initiation must be mediated by other proteins.

RNA polymerase II is involved in the transcription of all protein-coding genes and most snRNA genes.

The other two classes transcribe only RNA genes. RNA polymerase I is located in the nucleolus and transcribes the rRNA genes except 5S rRNA.

RNA polymerase III is located outside the nucleolus and transcribes 5S rRNA, tRNA, U6 snRNA and some other small RNA genes.

In prokaryotes, binding of the polymerase's σ factor to the promoter can catalyze unwinding of the DNA double helix. The most important σ factor is sigma 70 (σ70).

Promoter: a short nucleotide sequence that is recognized

by an RNA polymerase enzyme as a point at which to bind

to DNA in order to start transcription. Promoters occur

upstream of the gene.

Transcription Mechanisms in

Prokaryotes

Identifiable steps during transcription:

1- Promoter recognition.

2- Chain initiation.

3- Chain elongation.

4- Chain termination

1- Promoter recognition:

The σ factor directs RNA polymerase to specific sequences in the DNA called promoters so that transcription initiates at the proper place. Prokaryotic polymerases can recognize the promoter and bind to it directly.

Promoters contain two distinct sequence motifs that reside ~10 bases and ~35 bases upstream of the transcriptional start site (the first base of the RNA).

• The transcriptional start site is known as the +1 site. All of the bases following the +1 site are transcribed into RNA and are numbered with positive numbers.

• The bases prior to the +1 site are numbered with negative numbers.

• The promoter sequence consists of two motifs: the one ~10 bases upstream of the +1 site is called the -10 box (Pribnow box) and the one ~35 bases upstream of the +1 site is called the -35 box.

• σ70 recognizes promoters with the consensus sequence TATAAT at the -10 region and TTGACA at the -35 region.
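Real promoters rarely match the consensus exactly, so a search for the two boxes usually tolerates mismatches. The sketch below looks for the window most similar to each of the commonly cited σ70 consensus hexamers, TTGACA (-35) and TATAAT (-10); the function names and the example promoter are hypothetical, and the sketch does not check the spacing between the two boxes.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_match(seq: str, motif: str):
    """Return (start, mismatches) of the window most similar to motif, or None.

    The earliest window wins ties.
    """
    seq, motif = seq.upper(), motif.upper()
    best = None
    for i in range(len(seq) - len(motif) + 1):
        distance = hamming(seq[i:i + len(motif)], motif)
        if best is None or distance < best[1]:
            best = (i, distance)
    return best

# Hypothetical promoter: -35 box, 17-base spacer, -10 box.
promoter = "GG" + "TTGACA" + "CGCGCGCGCGCGCGCGC" + "TATAAT" + "GCGCGCA"
print(best_match(promoter, "TTGACA"), best_match(promoter, "TATAAT"))
```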

The following steps occur during promoter binding and initiation:

• RNA polymerase recognizes and specifically binds to the promoter region on DNA. At this stage, the DNA is double-stranded ("closed"). This wound-DNA structure is referred to as the closed complex.

• The DNA is unwound and becomes single-stranded ("open") in the vicinity of the initiation site (defined as +1). This unwound-DNA structure is called the open complex.

• RNA polymerase incorporates the first few nucleotides, starting at the +1 position.

• Sigma dissociates as the polymerase leaves the promoter.

2- Chain initiation: unwinding (melting) of the DNA double helix. An enzyme which can unwind the double helix is called a helicase. Prokaryotic RNA polymerases have this unwinding activity themselves.

3- Chain elongation: synthesis of RNA based on the sequence of the DNA template strand. RNA polymerases use nucleoside triphosphates (NTPs) to construct an RNA strand.

4- Chain termination: prokaryotes and eukaryotes use different signals to terminate transcription.

Transcription in eukaryotes is much more complicated than in prokaryotes, partly because eukaryotic DNA is associated with histones, which can hinder the access of polymerases to the promoter.

Transcriptional Termination in Prokaryotes

In prokaryotes, transcription is terminated by two major mechanisms:

• Rho-independent

• Rho-dependent

• Rho-independent mechanism

The Rho-independent termination signal is a 30-40 bp sequence (the terminator sequence), consisting of many GC residues followed by a series of T residues ("U" in the transcribed RNA). The resulting RNA transcript forms a stem-loop structure (hairpin) that terminates transcription.

The stem-loop structure of the RNA transcript as a termination signal for the transcription of the trp operon.
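The hairpin-plus-U-run pattern can be illustrated with a deliberately crude scan of an RNA sequence. The sketch assumes a fixed stem length, a fixed loop length and perfect base pairing in the stem, none of which holds for real terminators, and it does not score GC content; all names and default values are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement_rna(rna: str) -> str:
    """Reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))

def looks_like_intrinsic_terminator(rna: str, stem: int = 6, loop: int = 4, min_u: int = 4) -> bool:
    """True if some window is stem + loop + matching stem, directly followed by a run of U."""
    rna = rna.upper()
    span = 2 * stem + loop
    for i in range(len(rna) - span - min_u + 1):
        left = rna[i:i + stem]
        right = rna[i + stem + loop:i + span]
        tail = rna[i + span:i + span + min_u]
        # The right arm must pair with the left arm to form the stem.
        if right == reverse_complement_rna(left) and tail == "U" * min_u:
            return True
    return False

print(looks_like_intrinsic_terminator("AAGCCGCCUUCGGGCGGCUUUUUUA"))
```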

• Rho-dependent mechanism

• Rho is a ~50 kD protein involved in transcription termination. Six Rho proteins form a hexamer to terminate transcription.

• The Rho protein binds to the RNA transcript at an upstream site which is 70-80 nucleotides long and rich in C residues. Upon binding, Rho moves along the RNA in the 3' direction. If movement of the polymerase is slow, Rho will catch up and terminate transcription at the downstream termination site. Rho has ATPase activity, which can induce release of the polymerase from the DNA.

Reverse transcription

Scheme of reverse transcription

• Some viruses (such as HIV, the cause of AIDS) have the ability to transcribe RNA into DNA so that their genetic information can be inserted into a host cell's genome.

• The main enzyme responsible for this type of transcription is called reverse transcriptase. In the case of HIV, reverse transcriptase is responsible for synthesizing a complementary DNA strand (cDNA) to the viral RNA genome.

• An associated enzyme, ribonuclease H, digests the RNA strand, and reverse transcriptase synthesizes a complementary strand of DNA to form a double-helix DNA structure.

• This cDNA is integrated into the host cell's genome via another enzyme (integrase), causing the host cell to generate viral proteins which reassemble into new viral particles. Subsequently, the host cell undergoes programmed cell death (apoptosis).
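The first step, copying the RNA genome into a complementary DNA strand, can be sketched as a reverse complement with T in place of U. The function name is hypothetical, and both sequences are written 5' to 3'.

```python
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna_5to3: str) -> str:
    """First-strand cDNA (written 5'->3') complementary to an RNA written 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna_5to3.upper()))

print(reverse_transcribe("AUGGCU"))
```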

Eukaryotic RNA polymerases

1. The mechanism of eukaryotic transcription is similar to that in prokaryotes.

2. Many more proteins are associated with the eukaryotic transcription machinery, which makes transcription much more complicated.

3. Three eukaryotic polymerases transcribe different sets of genes.

4. In addition, eukaryotic cells contain additional RNA Pols in mitochondria and chloroplasts.

Main Features of eukaryotic transcription

Three eukaryotic polymerases

Type          Location      Genes transcribed
RNA Pol I     Nucleoli      Most rRNA genes
RNA Pol II    Nucleoplasm   All protein-coding genes and some snRNA genes
RNA Pol III   Nucleoplasm   tRNAs, 5S rRNA, U6 snRNA and other small RNAs

RNA polymerase subunits

Each eukaryotic polymerase contains 12 or more subunits.

– The two largest subunits are similar to each other and to the β’ and β subunits of E. coli RNA Pol.

– There is one other subunit in all three RNA Pols that is homologous to the α subunit of E. coli RNA Pol.

– Five additional subunits are common to all three polymerases.

– Each RNA Pol contains an additional four to seven specific subunits.

RNA polymerase activities

1. The transcription mechanism is similar to that of E. coli polymerase (How?)

2. Unlike the bacterial polymerase, they require accessory factors for DNA binding.

The CTD of RNA pol II

1. The C-terminus of RNA Pol II contains a stretch of seven amino acids that is repeated 52 times in mouse and 26 times in yeast RNA pol II.

2. The heptapeptide sequence ( Seven amino acids) is: Tyr-Ser-Pro-Thr-Ser-Pro-Ser

3. This repeated sequence is known as carboxyl terminal domain (CTD)

4. The CTD sequence may be phosphorylated at the serines and some tyrosines.

5. The CTD is unphosphorylated at transcription initiation, and phosphorylation occurs during transcription elongation as the RNA Pol II leaves the promoter.

6. Because it transcribes all eukaryotic protein-coding genes, RNA Pol II is the most important RNA polymerase for the study of differential gene expression. The CTD is an important target for differential activation of transcription elongation.
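A quick calculation shows how many potential phosphorylation sites the CTD offers. The sketch builds an idealized CTD made only of perfect consensus heptads, which is an assumption: real CTDs contain repeats that deviate from the consensus. The names are hypothetical.

```python
HEPTAD = ["Tyr", "Ser", "Pro", "Thr", "Ser", "Pro", "Ser"]

def idealized_ctd(repeats: int) -> list[str]:
    """Residue list for a CTD made of perfect consensus heptads (an idealization)."""
    return HEPTAD * repeats

def count_residue(ctd: list[str], residue: str) -> int:
    """Number of occurrences of one residue type in the CTD."""
    return ctd.count(residue)

for organism, repeats in (("mouse", 52), ("yeast", 26)):
    ctd = idealized_ctd(repeats)
    print(organism, len(ctd), "residues,", count_residue(ctd, "Ser"), "serines")
```

With three serines per heptad, 52 idealized repeats would give 156 serines and 26 repeats would give 78.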

RNA Pol II

1. Located in the nucleoplasm.

2. Catalyzes the synthesis of the mRNA precursors for all protein-coding genes.

3. RNA Pol II-transcribed pre-mRNAs are processed through cap addition, poly(A) tail addition and splicing.

Promoters

• Eukaryotic genes, like their prokaryotic counterparts, require promoters for transcription initiation. Each of the three types of polymerase has distinct promoters.

• RNA polymerase I transcribes from a single type of promoter, present only in rRNA genes, that encompasses the initiation site.

• In some genes, RNA polymerase III responds to promoters located in the normal, upstream position; in other genes, it responds to promoters embedded in the genes, downstream of the initiation site.



Promoters for RNA polymerase II can be simple or complex. As is the case for prokaryotes, promoters are always on the same molecule of DNA as the gene they regulate.

Most promoters contain a sequence called the TATA box around 25-35 bp upstream from the start site of transcription. It has a 7 bp consensus sequence 5’-TATA(A/T)A(A/T)-3’.


•TATA box acts in a similar way to an E. coli promoter –10 sequence to position the RNA Pol II for correct transcription initiation.

Some eukaryotic genes contain an initiator element instead of a TATA box. The initiator element is located around the transcription start site. Other genes have neither a TATA box nor an initiator element, and usually are transcribed at very low rates.
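The TATA box consensus 5’-TATA(A/T)A(A/T)-3’ translates directly into a regular expression. The sketch below only reports where the pattern occurs; deciding whether a hit sits at the right distance upstream of a start site is left out. The function name is hypothetical.

```python
import re

# TATA(A/T)A(A/T): fixed TATA, then A or T, then A, then A or T.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")

def find_tata_boxes(dna: str) -> list[int]:
    """0-based start positions of non-overlapping TATA box matches on the given strand."""
    return [match.start() for match in TATA_BOX.finditer(dna.upper())]

print(find_tata_boxes("GGCTATAAAAGGC"))
```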

Enhancers

Sequence elements which can activate transcription from thousands of base pairs upstream or downstream.

General characteristics of enhancers

• Exert strong activation of transcription of a linked gene from the correct start site.

• Activate transcription when placed in either orientation with respect to linked genes.

• Able to function over long distances of more than 1 kb, whether from an upstream or downstream position relative to the start site.

• Exert preferential stimulation of the closest of two tandem promoters.

The TATA-Box-Binding Protein Initiates the Assembly of the Active Transcription Complex

Promoter elements constitute only part of eukaryotic gene expression. Transcription factors that bind to these elements are also required. For example, RNA polymerase II is guided to the start site by a set of transcription factors known collectively as TFII (TF stands for transcription factor, and II refers to RNA polymerase II). Individual TFII factors are called TFIIA, TFIIB, and so on. Initiation begins with the binding of TFIID to the TATA box.


1. TFIID: a multiprotein complex that includes TBP; the other proteins are known as TAFIIs. TBP is the only protein that binds to the TATA box.

TBP:

1. a general transcription factor bound to DNA at the TATA box.

2. a general transcription factor required by all 3 RNA pols.

2. TFIIA

• binds to TFIID

• stabilizes the TFIID-DNA complex

3. TFIIB & RNA Pol binding

• TFIIB binds to TFIID

• binds to RNA Pol with TFIIF

4-1. TFIIE binding

• necessary for transcription

4-2. TFIIJ, TFIIH binding

• necessary for transcription

5. Phosphorylation of the polymerase CTD by TFIIH

• leads to formation of a processive RNA polymerase complex and allows the RNA Pol to leave the promoter region.

Initiation of RNA synthesis

For RNAP II (protein-coding genes), initiation requires several transcription factors that assist binding to promoter sites. The promoter sites recognized by RNAP II (and associated protein factors) comprise several conserved elements that are located upstream from the transcription start point (the +1 base).

Elongation of RNA via RNAP II

Elongation of the RNA chain is similar to that in prokaryotes except that a 7-methyl guanosine (7-MG) cap is added to the 5’ end when the growing RNA chain is fairly short (20-30 bases in length).

The 7-MG cap is “attached” by an unusual 5’-5’ triphosphate linkage and serves to protect the growing RNA from degradation by nucleases. This “capping” is part of RNA processing in eukaryotes.

Termination of RNA synthesis

1- Transcription by RNAP II (for protein-coding genes) is not really terminated, in the sense that transcription continues for 1,000 - 2,000 bases after or downstream from the site that ultimately will become the 3’ end of the mature transcript.

2- Termination of transcription by RNAP I and RNAP III occurs in response to discrete termination signals.

In general:

(a) The “functional” transcript actually results from endonucleolytic cleavage of the primary transcript.

(b) Cleavage occurs 10-30 bases downstream from the conserved sequence AAUAAA.

(c) After cleavage, an enzyme [poly(A) polymerase] adds about 200 adenosine (A) residues to the 3’ end. This is called polyadenylation or the addition of poly-A tails.

**The function of poly-A tails is to increase the stability of the transcript and to assist in transport of the mRNA from the nucleus to the cytoplasm. This is another part of RNA processing in eukaryotes.
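Steps (a) to (c) can be sketched as a string operation on the primary transcript: find AAUAAA, cut a fixed distance downstream, and append the tail. The cut offset of 20 bases is an arbitrary value inside the 10-30 base window, and the function name is hypothetical.

```python
def cleave_and_polyadenylate(primary_transcript: str, offset: int = 20, tail_length: int = 200):
    """Cut `offset` bases downstream of the first AAUAAA and add a poly(A) tail.

    Returns None if there is no signal or the transcript ends before the cut site.
    """
    if not 10 <= offset <= 30:
        raise ValueError("cleavage is expected 10-30 bases downstream of AAUAAA")
    rna = primary_transcript.upper()
    signal = rna.find("AAUAAA")
    if signal == -1:
        return None
    cut = signal + len("AAUAAA") + offset
    if cut > len(rna):
        return None
    # Everything downstream of the cut is discarded; the tail is not template-encoded.
    return rna[:cut] + "A" * tail_length

mature = cleave_and_polyadenylate("GGGCCC" + "AAUAAA" + "C" * 40)
print(len(mature))
```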


Introduction

In molecular biology and genetics, splicing is a modification of the nascent pre-mRNA taking place after or concurrently with its transcription, in which introns are removed and exons are joined. This is needed for the typical eukaryotic messenger RNA before it can be used to produce a correct protein through translation. For many eukaryotic introns, splicing is done in a series of reactions which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs), but there are also self-splicing introns.

Overview

The protein-coding genes of eukaryotes typically contain regions of DNA that serve no coding function. These non-coding regions, called introns, interrupt the coding regions, called exons.

When a gene is transcribed into RNA, both the coding and non-coding regions are copied. However, eukaryotic cells have a mechanism for removing introns from RNA. In this process, called RNA splicing, a newly transcribed RNA molecule is cut at the intron-exon boundaries, its introns are discarded, and its exons are joined together. RNA splicing occurs within the nucleus before the RNA migrates to the cytoplasm. In the cytoplasm, ribosomes translate the RNA, now containing uninterrupted coding information, into protein.
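The cut-and-join logic can be sketched with plain string slicing. The sketch takes the intron coordinates as given (0-based, end-exclusive) instead of finding splice sites itself; the sequence and names are hypothetical.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove the given intron intervals and join the remaining exons in order."""
    exons, position = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # last exon
    return "".join(exons)

pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUAAA"   # exon 1 + intron + exon 2
print(splice(pre_mrna, [(6, 18)]))
```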

mRNA processing and splicing

pre-mRNA –The nuclear transcript that is processed by

modification and splicing to give an mRNA.

RNA splicing – The process of excising introns from

RNA and connecting the exons into a continuous mRNA.

Eukaryotic mRNA is modified, processed, and transported

The 5′ End of Eukaryotic mRNA Is Capped

A 5′ cap is formed by adding a G to the terminal base of the transcript via a 5′–5′ link.

The capping process takes place during transcription, which may be important for transcription reinitiation.

Eukaryotic mRNA has a methylated 5’ cap

The 5′ cap of most mRNAs is monomethylated, but the caps of some small noncoding RNAs are trimethylated.

The cap structure is recognized by protein factors to influence mRNA stability, splicing, export, and translation.

The 3′ Ends of mRNAs Are Generated by Cleavage and Polyadenylation

• The sequence AAUAAA is a signal for cleavage to generate a 3′ end of mRNA that is polyadenylated.

• The reaction requires a protein complex that contains a specificity factor, an endonuclease, and poly(A) polymerase.

• The specificity factor and endonuclease cleave RNA downstream of AAUAAA.

• The specificity factor and poly(A) polymerase add ~200 A residues processively to the 3′ end.

• The poly(A) tail controls mRNA stability and influences translation.

The 3’ end of mRNA is generated by cleavage

There is a single 3’ end-processing complex

Pre-mRNA Splicing Proceeds through a Lariat

Splicing requires the 5′ and 3′ splice sites and a branch site just upstream of the 3′ splice site.

A lariat is formed when the intron is cleaved at the 5′ splice site, and the 5′ end is joined to a 2′ position at an A at the branch site in the intron.

The intron is released as a lariat when it is cleaved at the 3′ splice site, and the left and right exons are then ligated together.

Splicing proceeds through a lariat

snRNAs Are Required for Splicing

small cytoplasmic RNAs (scRNA) – RNAs that are present in the cytoplasm (and sometimes are also found in the nucleus).

small nuclear RNA (snRNA) – One of many small RNA species confined to the nucleus; several of them are involved in splicing or other RNA processing reactions.

small nucleolar RNA (snoRNA) – A small nuclear RNA that is localized in the nucleolus.

The five snRNPs involved in splicing are U1, U2, U4, U5, and U6.

Together with some additional proteins, the snRNPs form the spliceosome.

tRNA Splicing Involves Cutting and Rejoining in Separate Reactions

RNA polymerase III terminates transcription in a poly(U)4 sequence embedded in a GC-rich sequence.

tRNA splicing occurs by successive cleavage and ligation reactions.

tRNA splicing recognizes a specific structure

An endonuclease cleaves the tRNA precursors at both ends of the intron.

Release of the intron generates two half-tRNAs with unusual ends that contain 5′ hydroxyl and 2′–3′ cyclic phosphate.

tRNA splicing has separate cleavage and ligation stages

Production of rRNA Requires Cleavage Events and Involves Small RNAs

RNA polymerase I terminates transcription at an 18-base terminator sequence.

The large and small rRNAs are released by cleavage from a common precursor rRNA; the 5S rRNA is separately transcribed.

Generation of mature eukaryotic rRNAs

Protein synthesis is based on the sequence of mRNA, which is made up of nucleotides, while proteins are made up of amino acids. There must be a specific relationship between the nucleotide sequence and the amino acid sequence. This relationship is the so-called genetic code, which was deciphered by Marshall Nirenberg and his colleagues in the early 1960s. It turns out that three nucleotides (a codon) code for one amino acid, as shown in the following figure.

The Genetic Code

The standard genetic code. Synthesis of a peptide always starts from methionine (Met), coded by AUG. The stop codon (UAA, UAG or UGA) signals the end of a peptide. This table applies to mRNA sequences. For DNA, U (uracil) should be replaced by T (thymine). In a DNA molecule, the sequence from an initiating codon (ATG) to a stop codon (TAA, TAG or TGA) is called an open reading frame (ORF), which is likely (but not always) to encode a protein or polypeptide.
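Reading an open reading frame can be sketched as a lookup in the standard codon table: start at the first AUG, translate triplet by triplet, and stop at the first stop codon. The table below is built from the usual compact U, C, A, G ordering; the function name is hypothetical, and the sketch reads only one frame (the one set by the first AUG).

```python
BASES = "UCAG"
# One letter per codon in UCAG order (UUU, UUC, UUA, UUG, UCU, ...); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(sequence: str) -> str:
    """Translate from the first AUG (or ATG in DNA) up to the first in-frame stop codon."""
    mrna = sequence.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("GGAUGGCCUUUUAAGG"))
```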

Order in the Genetic Code

The genetic code is not randomly assigned. If an amino acid is coded by several codons, they often share the same sequence in the first two positions and differ in the third position. This pattern is accommodated by the wobble position, the third base of the codon.
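The shared-prefix pattern can be checked directly against the standard codon table by collecting, for each amino acid, the distinct first-two-base prefixes of its codons. The helper name `codon_prefixes` is hypothetical; amino acids are given as one-letter codes.

```python
BASES = "UCAG"
# One letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def codon_prefixes(amino_acid: str) -> set[str]:
    """Distinct first-two-base prefixes among the codons for one amino acid."""
    return {codon[:2] for codon, aa in CODON_TABLE.items() if aa == amino_acid}

print(codon_prefixes("A"), codon_prefixes("L"))
```

Alanine's four codons all begin with GC, while leucine, which has six codons, needs two prefixes; this is why the text says "often" and not "always".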

The Ribosome

• Ribosomes are the sites of protein synthesis in both prokaryotic and eukaryotic cells.

• Bacterial ribosomes are 70S and eukaryotic ribosomes are 80S.

• Both prokaryotic and eukaryotic ribosomes are composed of two distinct subunits, each containing characteristic proteins and rRNAs.

Lec 9 molecular Biology Dr. Dlnya Asad

This lecture describes in detail the process of protein synthesis, whereby a messenger RNA is translated by the ribosome in the cytoplasmic compartment. The parallels and differences between eukaryotic and prokaryotic translation will be considered. Translation is also a central component of the machinery of the cell.

Overview

• Proteins are synthesized from mRNA templates by a

process that has been highly conserved throughout

evolution.

• All mRNAs are read in the 5´ to 3´ direction, and polypeptide chains are synthesized from the amino to the carboxy terminus.

• Each amino acid is specified by three bases (a codon ) in

the mRNA, according to a nearly universal genetic code.

• The basic mechanics of protein synthesis are also the

same in all cells.

http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=cooper&part=A2886&rendertype=def-item&id=A3297

• Translation is carried out on ribosomes, with tRNAs

serving as adaptors between the mRNA template and the

amino acids being incorporated into protein.

• Protein synthesis thus involves interactions between three

types of RNA molecules (mRNA templates, tRNAs, and

rRNAs), as well as various proteins that are required for

translation.


tRNA

• Each tRNA consists of approximately 70 to 80 nucleotides.

• tRNAs fold into cloverleaf structures.

• All tRNAs have the sequence CCA at their 3´ terminus, and amino acids are covalently attached to the ribose of the terminal adenosine.

• The mRNA template is then recognized by the anticodon loop, located at the other end of the folded tRNA, which binds to the appropriate codon by complementary base pairing.

The incorporation of the correctly encoded amino acids into proteins depends on the attachment of each amino acid to an appropriate tRNA by the action of enzymes called aminoacyl tRNA synthetases.

The reaction proceeds in two steps. First, the amino acid is activated by reaction with ATP to form an aminoacyl AMP synthetase intermediate. The activated amino acid is then joined to the 3´ terminus of the tRNA.

• After being attached to tRNA, an amino acid is aligned on the mRNA template by complementary base pairing between the mRNA codon and the anticodon of the tRNA.

• Codon-anticodon base pairing at the third codon position is somewhat less stringent than standard A-U and G-C base pairing. This nonstandard base pairing (wobble) allows a single tRNA to recognize more than one codon, which accommodates the redundancy of the genetic code. For example, inosine located in the tRNA anticodon loop can base-pair with C, U, or A in the third position of the mRNA codon.
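Wobble pairing can be illustrated with a small lookup. In the sketch below, the inosine rule (I pairs with C, U or A) comes from the text; the other rows of WOBBLE are the commonly cited wobble rules and are included as an assumption. Both anticodon and codon are written 5´ to 3´, and all names are illustrative.

```python
# Bases that the first (5') anticodon base can pair with at the third codon
# position. The inosine row is from the lecture text; the remaining rows are
# the commonly cited wobble rules (an assumption of this sketch).
WOBBLE = {"I": "CUA", "G": "CU", "U": "AG", "C": "G", "A": "U"}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read_by(anticodon):
    """List the codons (5'->3') that a three-base anticodon (5'->3') can read."""
    first, second, third = anticodon.upper()
    # Pairing is antiparallel: anticodon base 3 pairs with codon base 1 and
    # anticodon base 2 with codon base 2 (both strict Watson-Crick pairs),
    # while anticodon base 1 pairs with codon base 3 (wobble allowed).
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return [prefix + base for base in WOBBLE[first]]
```

For example, codons_read_by("IGC") gives GCC, GCU and GCA, three alanine codons read by one tRNA, whereas codons_read_by("CAU") gives only AUG.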






The Organization of mRNAs and the Initiation of Translation

Both prokaryotic and eukaryotic

mRNAs contain untranslated regions

(UTRs) at their 5´ and 3´ ends.

Eukaryotic mRNAs also contain 5´

7-methylguanosine (m7G) caps and 3´

poly-A tails.

• Prokaryotic mRNAs are frequently polycistronic: They encode multiple proteins, each of which is translated from an independent start site. Eukaryotic mRNAs are usually monocistronic, encoding only a single protein.

• In both prokaryotic and eukaryotic cells, translation always initiates with the amino acid methionine, usually encoded by AUG. Alternative initiation codons, such as GUG, are used occasionally in bacteria (GUG normally encodes valine).

• In most bacteria, protein synthesis is initiated with a modified methionine residue (N-formylmethionine), whereas unmodified methionines initiate protein synthesis in eukaryotes (except in mitochondria and chloroplasts, whose ribosomes resemble those of bacteria).

The signals that identify initiation codons are different in prokaryotic and eukaryotic cells

• Initiation codons in bacterial mRNAs are preceded by a specific sequence called a Shine-Dalgarno sequence, which aligns the mRNA on the ribosome for translation by base-pairing with a complementary sequence near the 3´ terminus of 16S rRNA.

• Ribosomes recognize most eukaryotic mRNAs by binding to the 7-methylguanosine cap at their 5´ terminus.

• The ribosomes then scan downstream of the 5´ cap until they encounter an AUG initiation codon.
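The two ways of locating the initiation codon can be contrasted in a short sketch. It is a deliberate simplification: the Shine-Dalgarno motif is taken to be the consensus AGGAGG, and the allowed gap of 5 to 10 nucleotides between the motif and the AUG is an assumed, adjustable parameter. The eukaryotic function simply returns the first AUG downstream of the 5´ end. Function and parameter names are illustrative.

```python
import re


def bacterial_start(mrna, sd="AGGAGG", min_gap=5, max_gap=10):
    """Index of the first AUG lying min_gap..max_gap nt after an SD motif.

    Returns -1 if no AUG is preceded by the motif at an allowed distance.
    """
    for match in re.finditer(re.escape(sd), mrna):
        for gap in range(min_gap, max_gap + 1):
            pos = match.end() + gap
            if mrna[pos:pos + 3] == "AUG":
                return pos
    return -1


def eukaryotic_start(mrna):
    """Scanning model: index of the first AUG reached from the 5' end."""
    return mrna.find("AUG")
```

For the hypothetical sequence GGAUGCCAGGAGGUUUACCAUGGCUUAA, the scanning model picks the AUG at index 2, while the Shine-Dalgarno model skips it and picks the AUG at index 19, which lies 6 nucleotides after the AGGAGG motif.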

Translation is generally divided into three stages:

Initiation

Elongation

Termination.

• In both prokaryotes and eukaryotes the first step of the

initiation stage is the binding of a specific initiator

methionyl tRNA and the mRNA to the small

ribosomal subunit.

• The large ribosomal subunit then joins the complex,

forming a functional ribosome on which elongation of

the polypeptide chain proceeds.

• A number of specific non-ribosomal proteins are also

required for the various stages of the translation

process.

http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=cooper&part=A1167&rendertype=figure&id=A1179

• The first translation step in bacteria is the binding of three initiation factors (IF-1, IF-2, and IF-3) to the 30S ribosomal subunit.

• The mRNA and initiator N-formylmethionyl tRNA then join the complex.

• A 50S ribosomal subunit then associates with the complex.

• The result is the formation of a 70S initiation complex (with mRNA and initiator tRNA bound to the ribosome) that is ready to begin peptide bond formation during the elongation stage of translation.

• Initiation in eukaryotes is more complicated and requires at least ten proteins, which are designated eIFs (eukaryotic initiation factors). The factors bind to the 40S ribosomal subunit and associate with the initiator methionyl tRNA.

• The mRNA is recognized by initiation factors via the 5´ cap and the poly-A tail at the 3´ end, and is then brought to the 40S ribosomal subunit.

• The 40S ribosomal subunit, in association with the bound methionyl tRNA and eIFs, then scans the mRNA to identify the AUG initiation codon. When the AUG codon is reached, a 60S subunit binds to the 40S subunit to form the 80S initiation complex of eukaryotic cells.

Elongation of the polypeptide chain.

The ribosome has three sites for tRNA binding,

designated the P (peptidyl), A (aminoacyl), and E (exit)

sites. The initiator methionyl tRNA is bound at the P

site. The first step in elongation is the binding of the

next aminoacyl tRNA to the A site by pairing with the

second codon of the mRNA. The aminoacyl tRNA is

escorted to the ribosome by an elongation factor.

Once the elongation factor has left the ribosome, a peptide bond can be formed

between the initiator methionyl tRNA at the P site and the second aminoacyl

tRNA at the A site. This reaction is catalyzed by the large ribosomal subunit,

with the rRNA playing a critical role. The result is the transfer of methionine

to the aminoacyl tRNA at the A site of the ribosome, forming a peptidyl

tRNA at this position and leaving the uncharged initiator tRNA at the P site.

The next step in elongation is translocation: the ribosome moves three

nucleotides along the mRNA, positioning the next codon in an empty A site.

This step translocates the peptidyl tRNA from the A site to the P site, and the

uncharged tRNA from the P site to the E site. The ribosome is then left with

a peptidyl tRNA bound at the P site, and an empty A site. The binding of a

new aminoacyl tRNA to the A site then induces the release of the uncharged

tRNA from the E site, leaving the ribosome ready for insertion of the next

amino acid in the growing polypeptide chain.

• Elongation of the polypeptide chain continues until a stop codon (UAA, UAG, or UGA) is translocated into the A site of the ribosome. Cells do not contain tRNAs with anticodons complementary to these termination signals; instead, they have release factors that recognize the signals and terminate protein synthesis. The release factors bind to a termination codon at the A site and stimulate hydrolysis of the bond between the tRNA and the polypeptide chain at the P site, resulting in release of the completed polypeptide from the ribosome. The tRNA is then released, and the ribosomal subunits and the mRNA template dissociate.
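The elongation cycle and termination can be followed step by step in a toy simulation. This is a sketch under simplifying assumptions: the input is a reading frame that already starts at the AUG, elongation and release factors are not modelled, and each tRNA is represented only by the codon it reads. The name run_elongation and the log format are illustrative.

```python
# Standard genetic code with bases enumerated in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def run_elongation(orf):
    """Walk a ribosome along an mRNA reading frame that starts with AUG.

    Returns (peptide, log). Each log entry records the codons whose tRNAs
    occupy the E and P sites after a translocation, or the stop codon read.
    """
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    if not codons or codons[0] != "AUG":
        raise ValueError("reading frame must start with AUG")
    e_site, p_site = None, "AUG"  # the initiator tRNA starts in the P site
    peptide, log = ["M"], []
    for a_site in codons[1:]:  # the codon now exposed in the A site
        residue = CODON_TABLE[a_site]
        if residue == "*":
            # No tRNA matches a stop codon; a release factor frees the chain.
            log.append(("release", a_site))
            return "".join(peptide), log
        # 1. Codon recognition: a new aminoacyl tRNA binds the A site,
        #    which releases the uncharged tRNA from the E site.
        e_site = None
        # 2. Peptide bond formation: the chain is transferred onto the
        #    A-site tRNA and grows by one residue.
        peptide.append(residue)
        # 3. Translocation: A -> P and P -> E; the A site is empty again.
        e_site, p_site = p_site, a_site
        log.append(("translocated", e_site, p_site))
    return "".join(peptide), log  # reached the 3' end without a stop codon
```

Running run_elongation("AUGGCUUGGUAA") yields the peptide "MAW" after two translocations, followed by a release event at UAA.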

[Figure: the elongation cycle. Steps: (1) codon recognition, (2) peptide bond formation, (3) translocation, ending at the stop codon. Labels: amino acid, anticodon, A site, P site, polypeptide, new peptide bond, mRNA movement, mRNA.]

• Messenger RNAs can be translated simultaneously by several ribosomes in both prokaryotic and eukaryotic cells.

• Thus, mRNAs are usually translated by a series of ribosomes, spaced at intervals of about 100 to 200 nucleotides.

• The group of ribosomes bound to an mRNA molecule is called a polyribosome, or polysome.
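The spacing figure gives a rough upper bound on polysome size. The sketch below is plain arithmetic; the 1,500-nucleotide length in the usage note is a hypothetical example, not a value from the lecture.

```python
def max_ribosomes(mrna_length_nt, spacing_nt):
    """Upper bound on ribosomes loaded at one per spacing_nt nucleotides."""
    if spacing_nt <= 0:
        raise ValueError("spacing must be positive")
    return mrna_length_nt // spacing_nt
```

For a hypothetical 1,500-nucleotide coding region, max_ribosomes(1500, 200) gives 7 and max_ribosomes(1500, 100) gives 15, so such an mRNA could carry roughly 7 to 15 ribosomes at once.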



Why regulate gene expression?

It takes a lot of energy to make RNA and protein. Therefore, only some genes are active all the time, because their products are in constant demand. Others are turned off most of the time and are switched on only when their products are needed.

Gene Control in Prokaryotes

One way in which prokaryotes control gene expression

is to group functionally related genes together so that

they can be regulated together.

This grouping is called an operon (The clustered genes

are transcribed together from one promoter giving a

polycistronic messenger RNA).

Prokaryotic genes are organized into operons.

An operon can be defined as a cluster of genes that encode the proteins necessary to perform a coordinated function. Genes of the same operon have related functions within the cell and are turned on (expressed) and off (repressed) together.

The first operon discovered was the lac operon so

named because its products are involved in lactose

breakdown.

An operon consists of:

a promoter (binding site for RNA polymerase)

a repressor binding site called an operator that overlaps the promoter.

structural genes

Operator

Repressor proteins, encoded by repressor genes, are synthesized to regulate gene expression. They bind to the operator site to block transcription by RNA polymerase.

Promoter

The promoter sequences are recognized by RNA polymerase. When RNA polymerase binds to the promoter, transcription occurs.

Activators

The activity of RNA polymerase is also regulated by interaction with accessory proteins called activators. The presence of an activator stimulates RNA polymerase, and transcription increases.

Two major modes of transcriptional regulation function in bacteria

(E. coli) to control the expression of operons:

Repression

Induction

Both mechanisms involve repressor proteins.

Induction happens in operons that produce gene products needed for the utilization of energy.

Repression regulates operons that produce gene products necessary for the synthesis of small biomolecules such as amino acids.

Inducible system

Because it is mediated by a repressor, this is a form of negative control; the operon is switched on by an inducer.

The effector molecule (inducer) interacts with the repressor protein such that it cannot bind to the operator.

With inducible systems, the binding of the effector molecule to the repressor greatly reduces the affinity of the repressor for the operator. As a result, the repressor is released and transcription proceeds.

A classic example of an inducible (catabolite-mediated) operon is the lac operon, responsible for obtaining energy from galactosides such as lactose.

Repressible system

This is also a form of negative control, since it is mediated by a repressor.

The effector molecule (co-repressor) interacts with the repressor protein such that it can bind to the operator.

With repressible systems, the binding of the effector molecule to the repressor greatly increases the affinity of the repressor for the operator; the repressor binds and stops transcription.

For the trp operon, the addition of tryptophan (the effector molecule) to the E. coli environment shuts off the system because the repressor binds at the operator.

In addition to negative control mediated by a repressor, expression from a repressible operon can be attenuated by sequences within the transcribed RNA.

A classic example of a repressible (and attenuated) operon is the trp operon, responsible for the biosynthesis of tryptophan.
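The difference between the two systems comes down to how the effector changes repressor binding. The sketch below expresses this as Boolean logic; it ignores graded responses, activators and attenuation, and the function names are illustrative.

```python
def repressor_binds_operator(system, effector_present):
    """Whether the repressor occupies the operator in each kind of system."""
    if system == "inducible":
        # The effector (inducer) lowers the repressor's affinity for the operator.
        return not effector_present
    if system == "repressible":
        # The effector (co-repressor) raises the repressor's affinity.
        return effector_present
    raise ValueError("system must be 'inducible' or 'repressible'")


def operon_transcribed(system, effector_present):
    """Transcription proceeds only when the operator is free of repressor."""
    return not repressor_binds_operator(system, effector_present)
```

With lactose (via allolactose) as the effector, operon_transcribed("inducible", True) is True; with tryptophan as the effector, operon_transcribed("repressible", True) is False.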

Structure of the lac Operon

The lac operon has three structural genes: Z, Y and A.

The z gene codes for β-galactosidase, responsible for the hydrolysis of the disaccharide lactose into its monomeric units, galactose and glucose.

The y gene codes for permease, which increases permeability of the cell to galactosides.

The a gene encodes a transacetylase.

In addition to the structural genes, the lac operon also has regulatory sequences:

Promoter: Binding site for RNA polymerase

Operator: Binding site of repressor

Control of lac operon expression

The control of the lac operon occurs by both positive and negative

control mechanisms.

Negative control of the lac operon

What happens to lac operon when glucose is present and lactose is

absent?

During normal growth on a glucose-based medium (lacking lactose),

the lac repressor is bound to the operator region of the lac operon,

preventing transcription.

What happens when glucose is absent and lactose is present?

The few molecules of lac operon enzymes present will

produce a few molecules of allolactose from lactose.

Allolactose is the inducer of the lac operon.

The inducer binds to the repressor causing a

conformational shift that causes the repressor to release the

operator.

With the repressor removed, the RNA polymerase can

now bind the promoter and transcribe the operon.

Positive Control of the lac operon

What happens when both glucose and lactose levels are high?

Since the inducer is present, the lac operon will be transcribed

but the rate of transcription is very slow (almost repressed)

because glucose levels are high and therefore cAMP levels are

low.

The repression of the lac operon under these conditions is termed catabolite repression and is a result of the low levels of cAMP that result from an adequate glucose supply.

This repression is maintained until the glucose supply is

exhausted.

What happens when glucose levels start dropping in the

presence of lactose?

As the level of glucose in the medium falls, the level of cAMP

increases.

Simultaneously, the inducer (allolactose) also binds to the lac repressor (since lactose is present).

The net result is an increase in transcription from the operon.

The ability of cAMP to activate (increase) expression from the lac

operon results from an interaction of cAMP with a protein termed

CRP (for cAMP receptor protein).

The protein is also called CAP (for catabolite activator protein).

The cAMP-CAP complex binds to a region of the lac operon

just upstream of the promoter. This binding stimulates RNA

polymerase activity 20-to-50-fold.

(Repression of the lac operon is relieved in the presence

of glucose if excess cAMP is added.)

cAMP, acting through CAP, is therefore an activator of the lac operon.

This type of regulation by an activator is positive in contrast

to the negative control exerted by repressors.
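The combined negative and positive control of the lac operon can be summarized as a small decision function. This is a Boolean sketch of the cases discussed above, with "off", "low" and "high" as qualitative labels; the case with neither sugar present is not discussed in the text and is inferred here from the repressor staying bound. The function name is illustrative.

```python
def lac_operon_state(glucose, lactose):
    """Qualitative transcription level of the lac operon: off, low or high."""
    # Negative control: without lactose there is no allolactose, so the
    # repressor stays on the operator.
    repressor_bound = not lactose
    # Positive control: cAMP is high only when glucose is scarce, and the
    # cAMP-CAP complex then binds upstream of the promoter.
    cap_bound = not glucose
    if repressor_bound:
        return "off"
    return "high" if cap_bound else "low"
```

For example, lac_operon_state(glucose=True, lactose=True) returns "low" (catabolite repression), and lac_operon_state(glucose=False, lactose=True) returns "high".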

trp operon

The trp operon encodes the genes for the synthesis of tryptophan.

As with all operons, the trp operon consists of the promoter, operator and the structural genes.

It is also subject to negative control by a repressor

In this system, unlike the lac operon, the gene for the repressor is not adjacent to the promoter, but rather is located in another part of the E. coli genome.

Another difference is that the operator resides entirely within the promoter

Unlike an inducible system, the repressible operon is usually turned on.

Structure of the trp operon

The operon consists of:

Five structural genes that code for the three enzymes required

to convert chorismic acid into tryptophan.

A gene (trpL) which functions in attenuation.

Operator

promoter

Gene: Function

P/O: Promoter; the operator sequence is found within the promoter

trpL: Leader sequence, containing the attenuator (A) sequence

trpE: Gene for anthranilate synthetase subunit

trpD: Gene for anthranilate synthetase subunit

trpC: Gene for indole glycerolphosphate synthetase

trpB: Gene for tryptophan synthetase subunit

trpA: Gene for tryptophan synthetase subunit

Negative control of trp operon

The affinity of the trp repressor for the operator region is enhanced when it binds tryptophan. The bound repressor blocks further transcription of the operon and, as a result, synthesis of the three enzymes declines. Tryptophan is therefore a co-repressor: when tryptophan is absent, the trp operon is expressed.

The rate of expression of the trp operon is graded in response to the level of tryptophan in the cell.

Attenuation of the trp operon

Expression of the trp operon is reduced by the addition of tryptophan. Tryptophan synthesis is also controlled by two other components:

1. tRNA, specifically tryptophanyl-tRNA (tRNATrp charged with tryptophan).

2. the trpL gene

The trpL gene is found between the operator and the trpE gene, so the attenuator region is composed of sequences found within the transcribed RNA of the operon.

It is involved in controlling transcription from the operon after RNA polymerase has initiated RNA synthesis.

The leader sequence (trpL) encodes a leader peptide of 14 amino acids, including two tandem tryptophan codons.

How does the leader sequence affect transcription of the trp operon?

It contains two consecutive trp codons and therefore serves to measure the tryptophan supply in the cell.

If the supply is inadequate, only a small amount of tRNATrp will be charged, and translation will stall at the trp codons.

If the supply is good, a large amount of tRNATrp will be charged, and the leader peptide will be translated without problem. How?

The trpL mRNA consists of four regions and can adopt a number of different conformations. It contains several self-complementary regions which can form a variety of stem-loop structures.

Different stem-loops can form depending on the level of tryptophan in the cell: the level of charged tRNATrp determines the rate of translation of the leader peptide and hence the position of the ribosome on the leader mRNA.

In the case of the trpL mRNA, when the cellular levels of tryptophan are high, the levels of charged tryptophan tRNA are also high.

Translation begins as soon as the leader is transcribed, and the ribosome follows right behind RNA polymerase until it is halted by the stop codon just before region 2. The ribosome then covers region 2, which prevents stem-loop formation between regions 2 and 3 and permits formation of the terminator stem-loop (regions 3 and 4). This causes RNA polymerase to dissociate when it reaches the U-rich region at the end of region 4, terminating transcription.

How is the terminator stem-loop formed?

Because of the quick translation of domain 1, domain 2

becomes associated with the ribosome complex. Then domain 3 binds with domain 4, and transcription is

attenuated because of this stem loop formation. The stem loop formed by binding of domains 3 and 4 is found

near a region rich in uracil and acts as the transcriptional terminator loop.

Consequently, RNA polymerase is dislodged from the template.

trp Operon Transcription Under Low

Levels of Tryptophan

Under low cellular levels of tryptophan, the translation of the short peptide on domain 1 is slow.

As a result domain 2 does not become associated with the ribosome.

Rather domain 2 of the leader mRNA associates with domain 3 of the leader mRNA.

This stem loop structure is the anti-terminator. Its formation prevents formation of the terminator.

This structure permits the continued transcription of the

operon. Then the trpE-A genes are translated, and the

biosynthesis of tryptophan occurs

Domain 4 is called the attenuator because its presence is

required to reduce (attenuate) mRNA transcription in the

presence of high levels of tryptophan.
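The attenuation decision can be condensed into a lookup keyed on the supply of charged tRNATrp. This is a qualitative sketch: real attenuation is graded rather than all-or-none, and the function name and the returned labels are illustrative.

```python
def trp_leader_outcome(charged_trp_trna_high):
    """Which leader stem-loop forms, and what happens to transcription."""
    if charged_trp_trna_high:
        # Domain 1 is translated quickly, so the ribosome covers domain 2
        # and domains 3 and 4 pair to form the terminator.
        return {
            "ribosome": "covers domain 2",
            "stem_loop": ("3", "4"),
            "transcription": "terminated",
        }
    # The ribosome stalls at the tandem Trp codons in domain 1, leaving
    # domain 2 free to pair with domain 3 (the anti-terminator).
    return {
        "ribosome": "stalled in domain 1",
        "stem_loop": ("2", "3"),
        "transcription": "continues",
    }
```

Calling trp_leader_outcome(True) reports the 3-4 terminator and terminated transcription; trp_leader_outcome(False) reports the 2-3 anti-terminator and continued transcription into trpE-A.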

• Review Articles:

• Regulation of RNA polymerase I transcription in the nucleolus - Genes and Development, 2003. http://www.genesdev.org/cgi/content/full/17/14/1691

• Roles of the heat shock transcription factors in regulation of the heat shock response and beyond - FASEB J., 2001. http://www.fasebj.org/cgi/content/full/15/7/1118

• Translational Control of Viral Gene Expression in Eukaryotes - Microbiology and Molecular Biology Reviews, 2000. http://mmbr.asm.org/cgi/content/full/64/2/239

• References:

• Molecular Biology of the Cell, Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, Peter Walter, Garland Science, 2007.

• Molecular Cell Biology, Darnell, Lodish, and Baltimore, 1986.
