Activators - protein-DNA interaction
MBV4230
Odd S. Gabrielsen
The sequence-specific activators: transcription factors
Modular design with a minimum of two functional domains
  1. DBD - DNA-binding domain
  2. TAD - transactivation domain
DBD: several structural motifs; basis for classification into TF families
TAD: a few different types
  Three classical categories
    Acidic domains (Gal4p, steroid receptor)
    Glutamine-rich domains (Sp1)
    Proline-rich domains (CTF/NF1)
  Mutational analyses: bulky hydrophobic residues are more important than acidic ones
  Unstructured in the free state; 3D structure in contact with the target?
Most TFs are more complex: regulatory domains, ligand-binding domains, etc.
[Figure: domain schematic of a TF from N- to C-terminus, with TAD and DBD marked]
TF classification based on structure of DBD
[Figure: example DBD structures]
  Basic helix-loop-helix (Max)
  Zinc finger
  Leucine zipper (Gcn4p)
  p53 DBD
  NF-κB
  STAT dimer
Two levels of recognition
  1. Shape recognition
     An α-helix fits into the major groove in B-DNA. This is used in most interactions
  2. Chemical recognition
     Negatively charged sugar-phosphate chain involved in electrostatic interactions
     Hydrogen bonding is crucial for sequence recognition
Alternative classification of TFs on the basis of their regulatory role
Classification questions
  Is the factor constitutively active, or does it require a signal for activation?
  Does the factor, once synthesized, automatically enter the nucleus to act in transcription?
  If the factor requires a signal to become active in transcriptional regulation, what is the nature of that signal?
Classification system
  I. Constitutively active nuclear factors
  II. Regulatory transcription factors
    Developmental TFs
    Signal-dependent TFs
      Steroid receptors
      Internal signals
      Cell surface receptor controlled
        Nuclear
        Cytoplasmic
Classification - regulatory function
Brivanlou and Darnell (2002) Science 295, 813 -
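The classification questions can be read as a small decision procedure. The sketch below is only an illustration of that reading: the `Factor` fields, the string labels and the example assignments (Sp1, glucocorticoid receptor, p53, STAT) are assumptions made for this example, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Factor:
    """Hypothetical record holding the answers to the classification questions."""
    name: str
    needs_signal: bool                       # is a signal required for activity?
    developmental: bool = False              # controlled by restricted, developmental expression
    signal: Optional[str] = None             # "steroid", "internal" or "cell_surface"
    resident_nuclear: Optional[bool] = None  # already in the nucleus before the signal arrives?


def classify(f: Factor) -> str:
    """Walk the classification tree and return the class label."""
    if f.developmental:
        return "II. Regulatory TF / developmental"
    if not f.needs_signal:
        return "I. Constitutively active nuclear factor"
    if f.signal == "steroid":
        return "II. Regulatory TF / signal dependent / steroid receptor"
    if f.signal == "internal":
        return "II. Regulatory TF / signal dependent / internal signal"
    if f.signal == "cell_surface":
        where = "nuclear" if f.resident_nuclear else "cytoplasmic"
        return f"II. Regulatory TF / signal dependent / cell surface receptor controlled / {where}"
    raise ValueError(f"unknown signal type for {f.name}: {f.signal!r}")


# Illustrative assignments (assumed for the example)
examples = [
    Factor("Sp1", needs_signal=False),
    Factor("glucocorticoid receptor", needs_signal=True, signal="steroid"),
    Factor("p53", needs_signal=True, signal="internal"),
    Factor("STAT", needs_signal=True, signal="cell_surface", resident_nuclear=False),
]
for factor in examples:
    print(f"{factor.name}: {classify(factor)}")
```

The order of the tests mirrors the tree: the constitutive/regulatory split comes first, and the nuclear/cytoplasmic question only matters for factors controlled by cell surface receptors.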
Sequence-specific DNA binding is essential for activators
  TFs create nucleation sites in promoters for activation complexes
  Sequence-specific DNA binding plays a crucial role
Principles of sequence-specific DNA binding
How is a sequence (cis-element) recognized from the outside?
Shape recognition
  Form/geometry
Chemical recognition
  Electrostatic interaction
  Hydrophobic interaction
  Hydrogen bonds
Complementary forms
The dimensions of an α-helix fit the dimensions of the major groove in B-DNA
Side chains point outwards and are ideally positioned to engage in hydrogen bonds
Direct reading of DNA sequence: recognition of form
  The dimensions of an α-helix fit the dimensions of the major groove in B-DNA
    Most common type of interaction
  Usually multiple domains participate in recognition
    Dimers of the same motif
    Tandemly repeated motifs
    Interaction of two different motifs
  Recognition: detailed fit of complementary surfaces
    Hydration: water participates
    Sequence-specific variation of DNA structure
Example
Steroid receptor
Next level: chemical recognition - reading of sequence information
  Negatively charged sugar-phosphate chain = basis for electrostatic interaction
    Equal everywhere - no sequence recognition
    Still a main contributor to the strength of binding
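Recognition by hydrogen bonding
  Hydrogen bonding is a key element in sequence-specific recognition
    10-20 hydrogen bonds in the contact surface
  Base pairing does not exhaust the hydrogen-bonding capacity of the bases in duplex DNA; free positions point outwards in the major groove
[Figure: base pair with hydrogen-bond acceptor (A) and donor (D) positions marked]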
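The point about free hydrogen-bonding positions can be made concrete by tabulating what each base pair exposes in the two grooves. The sketch below uses the standard textbook donor/acceptor patterns, which are not spelled out on the slide; the dictionary names and the helper `distinguishable` are hypothetical. It shows why the major groove is the better place to read sequence: all four base-pair orientations look different there, whereas the minor groove only separates A/T pairs from G/C pairs.

```python
# Groups exposed in each groove, read across the base pair from the first-named base.
# A = H-bond acceptor, D = H-bond donor, M = methyl group, H = non-polar hydrogen
MAJOR_GROOVE = {"A:T": "ADAM", "T:A": "MADA", "G:C": "AADH", "C:G": "HDAA"}
MINOR_GROOVE = {"A:T": "AHA", "T:A": "AHA", "G:C": "ADA", "C:G": "ADA"}


def distinguishable(groove):
    """Group the base pairs that present an identical pattern in the given groove."""
    groups = {}
    for base_pair, pattern in groove.items():
        groups.setdefault(pattern, []).append(base_pair)
    return sorted(groups.values())


print("major groove:", distinguishable(MAJOR_GROOVE))
print("minor groove:", distinguishable(MINOR_GROOVE))
```

In the major groove each base pair ends up in its own group; in the minor groove A:T and T:A fall together, as do G:C and C:G.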
Docked protein side chains exploit the H-bonding possibilities for interaction
  Hydrogen bonding is essential for sequence-specific recognition
    10-20 hydrogen bonds in the contact interface
  Most contacts in the major groove
  Purines most important
A Zif example
Interaction: protein side chain - DNA base pair (close-up)
  Amino acid side chains point outwards from the α-helix and are optimally positioned for base interaction
Hydrophobic contact points
[Figure: hydrophobic contact point, Ile side chain labelled]
Homeodomains
The homeodomain family: common DBD structure
Homeotic genes - biology
  Regulation of Drosophila development
  Striking phenotypes of mutants: body parts move
  Control the genetic developmental program
Homeobox / homeodomain
  Conserved DNA sequence, the “homeobox”, in a large number of genes
  Encodes a 60 aa “homeodomain”
  A stably folded structure that binds DNA
  Similarity with the prokaryotic helix-turn-helix
3D structure determined for several HDs
  Drosophila Antennapedia HD (NMR)
  Drosophila Engrailed HD-DNA complex (crystal)
  Yeast MATα2
Homeodomain family: common DBD structure
Major groove contact via a structure of three α-helices
  Helix 3 enters the major groove (“recognition helix”)
  Helices 1 and 2 lie antiparallel across helix 3
  16 α-helical aa conserved
    9 in the hydrophobic core
    some in the DNA-contact interface (common docking mechanism?)
Engrailed
Homeodomain family: common DBD structure
Minor groove contacted via the N-terminal flexible arm
  R3 and R5 in Engrailed and R7 in MATα2 contact AT in the minor groove
  R5 conserved in 97% of HDs
  Deletions and mutations impair DNA binding
Loop between helix 1 and 2 determines Ubx versus Antp function
  Close to DNA
  Exposed for protein-protein interaction
HD paradox: what determines sequence specificity?
  Drosophila Ultrabithorax (Ubx), Antennapedia (Antp), Deformed (Dfd) and Sex combs reduced (Scr): closely similar HDs, very different biological roles
  Minor differences in DNA binding in vitro
    TAAT motif bound by most HD factors
    Contrast between promiscuity in vitro and specific effects in vivo
  Swaps reveal that surprisingly much of the specificity is determined by the N-terminal arm, which contacts the minor groove
    Swap: Antp with Scr-type N-terminal arm shows Scr-type specificity in vivo
    Swap: Dfd with Ubx-type N-terminal arm shows Ubx-type specificity in vivo
  N-terminal arm more divergent than the rest of the HD
    R5 and R7 (contacting DNA) are present in all of Ubx, Antp, Dfd and Scr
    Other tail aa diverge much more
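To get a feel for why a four-base core such as TAAT cannot on its own explain specific effects in vivo, one can scan a sequence for the core on both strands. This is a minimal sketch: the helper names and the test sequence are made up for illustration. With equal base frequencies, a given non-palindromic 4-mer is expected on one strand or the other roughly once every 128 bp (2 strands / 4^4), so almost any regulatory region contains candidate sites.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_core(seq: str, core: str = "TAAT"):
    """Return (position, strand) for every occurrence of the core on either strand.

    Positions are 0-based on the top strand; "-" marks a core on the bottom
    strand, which appears as the reverse complement on the top strand.
    """
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", core), ("-", revcomp(core))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical 16-bp fragment carrying one core on each strand
fragment = "GCTAATGGCCATTAGC"
print(find_core(fragment))
```

A scan like this only finds candidate cores; as the swap experiments indicate, the N-terminal arm and partner proteins decide which of them matter in vivo.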
Solutions to the paradox
  Conformational effects mediated by the N-terminal arm
    Even if the α-helical HDs are very similar, a much larger diversity is found in the N-terminal arms that contact the minor groove
  Protein-protein interaction with other TFs through the N-terminal arm: enhanced affinity/specificity, the basis of combinatorial control
    MATα2 interaction with MCM1: cooperative interactions
    Ultrabithorax-Extradenticle in Drosophila
    Hox-Pbx1 in mammals
Combinatorial TFs give enhanced specificity
TFs encoded by the homeotic (Hox) genes govern the choice between alternative developmental pathways along the anterior–posterior axis.
Hox proteins, such as Drosophila Ultrabithorax, have low DNA-binding specificity by themselves but gain affinity and specificity when they bind together with the homeoprotein Extradenticle (or Pbx1 in mammals).
N-terminal tail in protein-protein interaction: adopts different conformations
[Figure: HDs in the MATα2/Mcm1 complex; conformation of the tail determined by protein-protein interaction]
It works impressively well
Hox genes
POU family
POU family: common DBD structure
The POU name
  Pit-1: pituitary-specific TF
  Oct-1 (ubiquitous) and Oct-2 (lymphoid): octamer-binding TFs
  Unc-86: TF that regulates neuronal development in C. elegans
A bipartite 160 aa homeodomain-related DBD
  a POU-type HD subdomain (C-terminally located)
  a POU-specific subdomain (N-terminally located)
  coupled by a variable linker (15-30 aa)
POU is a structurally bipartite motif that arose by the fusion of genes encoding two different types of DNA-binding domain.
POU: two independent subdomains
POU-HD subdomain
  60 aa, closely similar to the classical HD
  Only weakly DNA-binding by itself (weaker than a classical HD)
  Contacts the 3´ half-site (AAAT in the Oct-1 octamer ATGCAAAT)
  Docking similar to Engrailed, Antp etc.
  Main contribution to non-specific backbone contacts
POU-specific subdomain
  75 aa POU-specific domain
  Enhances DNA affinity 1000x
  Contacts the 5´ half-site (ATGC in the Oct-1 octamer ATGCAAAT)
  Contacts the opposite side of the DNA relative to the HD
  Structure similar to the prokaryotic λ and 434 repressors
The two-part DNA-binding domain partially encircles the DNA.
Flexible DNA recognition
POU domains have intrinsic conformational flexibility, and this feature appears to confer functional diversity in DNA recognition
The subdomains are able to assume a variety of conformations, dependent on the DNA element.
ZNFs: zinc finger families
Zinc finger proteins
Zinc finger proteins were first discovered as transcription factors.
Zinc finger proteins are among the most abundant proteins in eukaryotic genomes.
Their functions are extraordinarily diverse and include DNA recognition, RNA packaging, transcriptional activation, regulation of apoptosis, protein folding and assembly, and lipid binding.
Zinc finger structures are as diverse as their functions.
Examples
  C2H2-type Zif from Sp1 (third finger)
  C4-type Zif from GATA-1
  LIM-domain-type Zif from ACRP
  PKC-type Zif
[Figure: example zinc finger structures, bound Zn++ marked]
The C2H2 subfamily
[Figure: row of tandem minidomains, each with a bound Zn++]
Classical TFIIIA-related zinc fingers: n x [Zn-C2H2]
History: Xenopus TFIIIA was the first eukaryotic TF to be isolated and cloned
  Function: activation of 5S RNA transcription (RNAPIII)
  Rich source: accumulates in immature Xenopus oocytes as “storage particles” = TFIIIA + 5S RNA (≈ 15% of total soluble protein)
  Purified in 1980, cloned in 1984
  Mr = 38 600, 344 aa
Primary structure of TFIIIA
  Composed of repeats: 9 x 30 aa minidomains + 70 aa unique C-terminal region
  Each minidomain has a conserved pattern of 2 Cys + 2 His
  Hypothesis: each minidomain is structured around a coordinated zinc ion (confirmed later)
Zinc finger proteins
Finger-like in 2D, not in 3D
Common features of TFIIIA-related zinc fingers
  Consensus for each finger: FXCX2-5CX3FX5FX2HX2-5H
  Number of fingers in related factors varies: 2-37
  Number of members exceptionally high
    S. cerevisiae genome: 34 C2H2 zinc fingers
    C. elegans genome: 68 C2H2 zinc fingers
    Drosophila genome: 234 C2H2 zinc fingers
    Human genome: 564 C2H2 zinc fingers (135 C3HC4 zinc fingers)
We now recognize the classical C2H2 zinc finger as the first member of a rapidly expanding family of zinc-binding modules.
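A consensus written in this shorthand (a residue letter, X for any residue, and an optional repeat count or range) translates directly into a regular expression, which is how such motifs are typically counted in genomes. The sketch below is an assumption-laden illustration: `consensus_to_regex` is a hypothetical helper, the letters are taken literally as printed on the slide (if a letter was meant as a residue class, a character class would be needed instead), and the test peptide is invented so that it fits the pattern.

```python
import re

# One token = a residue letter, optionally followed by "n" or "n-m"
TOKEN = re.compile(r"([A-Z])(\d+)?(?:-(\d+))?")


def consensus_to_regex(consensus: str) -> str:
    """Translate shorthand such as 'CX2-5C' into a regex such as 'C.{2,5}C'."""
    parts = []
    for residue, low, high in TOKEN.findall(consensus):
        unit = "." if residue == "X" else residue  # X matches any residue
        if low and high:
            unit += "{%s,%s}" % (low, high)         # range, e.g. X2-5
        elif low:
            unit += "{%s}" % low                    # fixed count, e.g. X3
        parts.append(unit)
    return "".join(parts)


finger = consensus_to_regex("FXCX2-5CX3FX5FX2HX2-5H")
print(finger)

# Invented peptide built to satisfy the pattern (not a real protein sequence)
toy_peptide = "MK" + "FACAACAAAFAAAAAFAAHAAAH" + "GE"
match = re.search(finger, toy_peptide)
print(match.span() if match else "no finger found")
```

Counting non-overlapping matches along a protein with `re.finditer` would give a rough finger count, subject to how strictly the consensus is read.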
3D structure of the classical C2H2-type zinc fingers
Each finger = a minidomain with ββα structure
  Each finger is an independent module
  Several fingers are linked together by flexible linkers
First 3D structure: the 3-finger Zif268 (mouse)
DNA interaction in Zif268
  Major groove contact through the α-helix in recognition of base triplets
  aa in three positions are responsible for sequence recognition: -1, 3 and 6 (relative to the α-helix)
  Simple one-to-one pattern (contact aa - bases): can a recognition code be defined?
DNA interaction in GLI and TTK differs
  Different phosphate contacts
  Distortion of DNA
  Finger 1 without DNA contact
The Zif268 prototype
Finger 2 from Zif268, including the two cysteine side chains and two histidine side chains that coordinate the zinc ion
DNA-recognition residues are indicated by numbers identifying their position relative to the start of the recognition helix
Three fingers in Zif268
Zif268 - first multi-finger structure
Recognition of base triplets
[Figure: Zif268-DNA complex with finger 1, finger 2, finger 3, linker and DNA labelled]
Recognition code?
The DNA sequence of the Zif268 site is color coded to indicate base contacts made by each finger.
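The one-finger-per-triplet idea can be sketched in a few lines. Two things here are assumptions brought in from the published Zif268 structure rather than from the slide text: the 9-bp site GCGTGGGCG, and the antiparallel arrangement in which finger 1 sits on the 3´ triplet, with helix position -1 facing the 3´ base of its triplet, position 3 the middle base and position 6 the 5´ base. The function names are hypothetical.

```python
SITE = "GCGTGGGCG"  # assumed Zif268 site, written 5' to 3'


def triplets_per_finger(site: str, n_fingers: int = 3) -> dict:
    """Assign one base triplet to each finger, finger 1 at the 3' end of the site."""
    if len(site) != 3 * n_fingers:
        raise ValueError("site length must be three bases per finger")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    fingers = range(n_fingers, 0, -1)  # the 5' triplet goes to the last finger
    return {f"finger {k}": triplet for k, triplet in zip(fingers, triplets)}


def facing_bases(triplet: str) -> dict:
    """Pair helix positions -1, 3 and 6 with the 3', middle and 5' base of a triplet."""
    five_prime, middle, three_prime = triplet
    return {-1: three_prime, 3: middle, 6: five_prime}


for finger, triplet in sorted(triplets_per_finger(SITE).items()):
    print(finger, triplet, facing_bases(triplet))
```

This only lays out which helix position faces which base. Whether a simple amino acid to base code follows from it is exactly the open question the slide raises.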
Structure of the six-finger TFIIIA–DNA complex
In a multi-finger protein, some fingers contact base pairs while others do not and instead function as bridges
Fingers 1–2–3, separated by typical linkers, wrap smoothly around the major groove like those of Zif268
In contrast, fingers 4–5–6 form an open, extended structure running along one side of the DNA. Of these, only finger 5 makes contacts with bases in the major groove. The flanking fingers, 4 and 6, appear to serve primarily as spacer elements.
Nuclear receptors 2xC4
Nuclear receptors: 2 x [Zn-C4]
Large family where the DBD binds two Zn++ through a tetrahedral pattern of Cys
  Conserved DBD of 70-80 aa
Protein structure
  Two “zinc fingers” constitute one separate domain
  Two α-helices with C3-Zn-C4 at the N-terminal end
  These lie perpendicular on top of each other, held by hydrophobic interactions
Mediate the transcriptional response to complex extracellular signals
Evolutionarily coupled to multicellular organisms
  Yeast = 0, but C. elegans has 233, or 1.5% of its genes
  Sequence prediction: 90% of proteins with a nuclear receptor DBD have a potential ligand-binding domain
  Implies that lipophilic signal molecules have been important in establishing communication between cells
DNA-binding by nuclear receptors
Nuclear receptors - DNA interaction
3D protein-DNA structures: glucocorticoid receptor + estrogen receptor
  Dimer in complex (monomer in solution)
DNA interaction
  First “finger” binds DNA
  Second “finger” involved in dimerization
  Binds to neighboring major grooves on the same side of the DNA
  Extensive phosphate contact, with the recognition helix docked into the groove
  Specificity determined by 3 aa (E2, G3, A6) in the recognition helix
  Structured dimer interface formed upon DNA binding
GATA factors
GATA factors: 1 x [Zn-C4]
Small family
Prototype erythroid TF: GATA-1 (2 fingers)
  C-terminal finger: DNA binding
  N-terminal finger: protein interactions?
From fungi to humans
Structure ≈ first finger in nuclear receptors
Hydrophobic DNA interface
Evolutionary implications
  Early duplication of a primitive finger
  Divergent functions developed in NRs
Gal4p factors
GAL4-related factors: 1 x [Zn2-C6]
GAL4 DBD = 28 aa Cys-rich domain that binds 2 Zn++ + 26 aa C-terminal domain involved in dimerization
Cys-rich domain
  Consensus: CX2CX6CX6CX2CX6C
  A Zn-Cys cluster with shared Cys (the 1st and 4th)
  Two short α-helices with C-Zn-C at the N-terminal end
GAL4-related factors: 1 x [Zn2-C6]
Dimerization domain
  Monomer in solution, dimer in the DNA complex
  In solution, only the Cys-rich motif is structured
  In the complex, forms two extended helix-strand motifs
  Amphipathic helices form a dimer interface in the complex
DNA interaction
  Contacts CGG triplets in the major groove
  C-terminal end of the first α-helix contacts bases
  Phosphate contact via the helix-strand motif
  Coiled-coil dimer interface at a right angle to the DNA (≈ bZIP)
  Linker determines the spacing of the CGG triplets: 11 bp in GAL4, 6 bp in PPR1
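The spacing rule lends itself to a simple site search. The sketch below assumes that the two CGG triplets form an inverted repeat, so that the second one reads CCG on the top strand (a site of the form CGG-N11-CCG for GAL4 and CGG-N6-CCG for PPR1); this orientation is not stated on the slide. The function name and both test sequences are made up for illustration.

```python
import re


def find_cgg_pairs(seq: str, spacer: int):
    """Find CGG ... CCG pairs separated by exactly `spacer` bases.

    A lookahead is used so that overlapping candidate sites are all reported.
    Returns (0-based start, matched site) tuples.
    """
    pattern = re.compile(r"(?=(CGG[ACGT]{%d}CCG))" % spacer)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical test fragments, one per spacing class
gal4_like = "TTCGGAGGACTGTCCTCCGAA"  # CGG, 11 bases, CCG
ppr1_like = "TTCGGCAATTGCCGAA"       # CGG, 6 bases, CCG

print(find_cgg_pairs(gal4_like, 11))
print(find_cgg_pairs(gal4_like, 6))
print(find_cgg_pairs(ppr1_like, 6))
```

Because both factors read the same triplets, the spacer length is the only parameter that changes between the two searches, which mirrors the role the slide gives to the linker.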