Distal Enhancer – Gene Interactions at the Lmo2 locus in Mouse Erythroid cells
by
Anandi Bhattacharya
A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of Cell and Systems Biology University of Toronto
© Copyright by Anandi Bhattacharya 2012
Abstract
Distal regulatory elements (DREs) have been identified upstream of the hematopoietic
regulator LIM domain only 2 (Lmo2) gene in the human and mouse genomes. In this thesis I
have investigated how these elements regulate Lmo2 transcription in erythroid cells. My
results show that strong chromatin-chromatin interactions exist between the DREs and the
Lmo2 gene promoter in erythroid cells. These interactions are absent from kidney cells, which
do not express Lmo2. Within the distal chromatin interaction cluster encompassing three of
the DREs, increased DNase I sensitivity, high levels of H3K4me1, and binding of
multiple transcription factors, p300, cohesin (RAD21) and CTCF are observed. CTCF-bound
regions are located between the farthest DRE and the neighboring Caprin1 promoter,
suggesting that CTCF insulates Caprin1 from the DREs. Hence, my data suggest that these
DREs function through a chromatin looping mechanism supported by cohesin associated
with CTCF- and transcription factor-bound regions.
“The purpose of learning is growth, and our minds, unlike our bodies,
can continue growing as we continue to live.”
Mortimer Adler
Acknowledgments
I would like to take this opportunity to thank all those people without whom this thesis would
not have been possible. At the very beginning I would like to thank my supervisor Prof.
Jennifer Mitchell, for her guidance, unwavering support and inspiration through the last 2.5
years. In spite of her extremely busy schedule, she always had time for me, and each and
every discussion with her was illuminating, to say the least. I have learnt a lot from her and I
am deeply indebted to her. Because of her I was introduced not only to the world of
genomics but also to this project, which has been a life-changing experience. I
would also like to express my sincere gratitude to Prof. Dorothea Godt and Prof. Vincent
Tropepe for their invaluable advice and encouragement and all the help that they have
provided me for completing my graduate study. I would especially like to thank Prof. Godt for
her invaluable support throughout the entire graduate program. The last two and a half years
have been a very humbling experience for me, and I consider myself privileged to have had
this opportunity to study at the University of Toronto.
I would also like to thank all those people who have played a very important role in bringing
me to the place where I stand today: my parents Sankarlal and Indrani Bhattacharya, for
believing in me; Prof. Sudeep Banerjee, for inspiring me to be a researcher; my friends Conny
Bartholmes, Julie Chen, Kamelia Miri, Mike Schwartz and Huda Abdel-Aleem, for their
invaluable companionship; and Ian Buglass, for answering the innumerable questions that I had
throughout the entire duration of my study here at the Department of Cell and Systems
Biology. I would also like to thank Dr. Neil Macpherson for providing me with invaluable
inputs both on my project as well as on my presentations thereby helping me immensely and
also Dr. Scott Davidson for the discussions on biological and bioinformatics research. I
would also like to thank my labmate Harry Zhou for helping me a lot during my gel running
sessions and Pooya Dibajnia for helping me in my experiments. I would especially like to
thank Julie Chen for working along with me on my project. Without Julie's expertise in
bioinformatic analyses this thesis would not have been possible. Another person who has
contributed a lot to my project is Sara Ho; I would really like to thank her as well. Finally, I
would like to thank my husband and best friend Kunal Dasgupta for his understanding and
unwavering support; if it had not been for him I would not have been a part of the University of
Toronto.
At the very end I would also like to thank the Department of Economics and the Department of
Cell and Systems Biology for funding my education here at the University of Toronto.
Table of contents
Acknowledgments
Table of contents
List of Tables
List of Figures
List of Appendices
Chapter 1: Introduction
1.1 General Introduction
1.2 Mechanisms of regulating gene transcription
1.2.1 First Level of regulation of gene transcription: DNA sequences
1.2.1.1 Core Promoter and Proximal Promoter
1.2.1.2 Cis regulatory elements
1.2.1.2.1 Distal Regulatory Elements: Enhancers and Insulators
1.2.2 Second Level of regulation of gene transcription: Epigenetic modifications
1.2.2.1 Epigenetic signatures, tissue specificity and enhancers
1.2.2.1.1 Low nucleosome occupancy and DNaseI sensitivity
1.2.2.1.2 p300 binding
1.2.2.1.3 Cobound by multiple TFs
1.2.3 Third Level of regulation of gene transcription: Spatial organization of the chromatin
1.2.3.1 Relation between transcription and organization of chromatin structure
1.2.3.2 Chromatin looping and regulation of gene transcription
1.2.3.2.1 Enhancers and chromatin looping
1.2.3.2.2 Active chromatin hub
1.2.3.2.3 Insulators and chromatin looping
1.2.3.3 Role of protein complexes in maintaining higher-order chromatin conformation
1.3 Lmo2: A candidate for investigating tissue specific chromatin conformation
Chapter 2: Methods
2.1 Cell Isolation
2.2 Chromosome Conformation Capture (3C)
2.3 RNA isolation and real-time RT-qPCR
2.4 Statistical analysis
2.5 Genome Mapping and Peak Identification of ChIP-seq datasets in erythroid cells
Chapter 3: Distal regulatory elements located upstream of Lmo2 are associated with tissue-specific chromatin features
3.1 Introduction
3.2 Results
3.2.1 Identification and mapping of the enhancer elements on the mouse genome
3.2.2 TFs bind to the distal regulatory elements
3.2.2.1 Multiple TFs binding to DREs in mouse erythroid cells
3.2.3 The DREs have erythroid cell-specific epigenetic features
3.2.4 Intergenic transcription occurs at the distal regulatory elements
3.2.5 CTCF and RAD21 bind to multiple regions across the Lmo2/Caprin1 locus
3.3 Discussion
3.3.1 TFs bind to the distal regulatory elements
3.3.2 The DREs have erythroid cell-specific epigenetic features
3.3.3 Intergenic transcription occurs at the distal regulatory elements
3.3.4 CTCF and RAD21 bind to multiple regions across the Lmo2/Caprin1 locus
3.3.5 The big picture
3.4 Future work
Chapter 4: Chromatin-Chromatin interactions at the Lmo2 locus
4.1 Introduction
4.2 Results
4.2.1 The 75 distal regulatory element contacts the Lmo2 proximal promoter
4.2.2 Several upstream distal regulatory elements contact the Lmo2 promoter
4.2.3 The Caprin1 promoter does not interact with the identified distal regulatory elements
4.3 Discussion
4.3.1 The 75 distal regulatory element and several other upstream distal regulatory elements contact the Lmo2 proximal promoter
4.4 Future work
Chapter 5: General Discussion
5.1 Discussion
5.2 Summary
References
Appendices
List of Tables
Table 2-1. Chromatin immunoprecipitation sequencing data
List of Figures
Figure 2-1. Chromosome Conformation Capture (3C)
Figure 3-1. Genomic map of the Lmo2/Caprin1 locus showing the identified DREs
Figure 3-2. Distal regulatory elements upstream of Lmo2 overlap transcription factor bound regions in erythroid cells
Figure 3-3. Distal regulatory elements upstream of Lmo2 overlap transcription factor bound regions in HPC7 hematopoietic progenitor cells
Figure 3-4. Distal regulatory elements upstream of Lmo2 overlap transcription factor bound regions and have different epigenetic marks in erythroid cells
Figure 3-5. Intergenic transcription occurs in anaemic spleen erythroid cells
Figure 3-6. CTCF and RAD21 are bound within the Lmo2-Caprin1 region
Figure 3-7. CTCF bound upstream region of Lmo2 in different cell types
Figure 4-1. Lmo2 primary transcripts are abundant in anaemic spleen erythroid cells
Figure 4-2. The Lmo2/Caprin1 region on mouse chromosome 2
Figure 4-3. The 75DRE interacts with the Lmo2 proximal promoter in anaemic spleen erythroid cells
Figure 4-4. Distal regulatory elements interact with the Lmo2 proximal promoter in anaemic spleen erythroid cells
Figure 4-5. The distal regulatory elements upstream of Lmo2 do not interact with the Caprin1 promoter
List of Appendices
Appendix 1. Coordinates of distal regulatory elements located upstream of the Lmo2 promoter in the mouse genome
Appendix 2. Coordinates of the Lmo2 proximal and distal promoters in the mouse genome
Appendix 3. Primers used in quantitative chromosome conformation capture and RT-qPCR
Appendix 4. Restriction digestion efficiency in chromosome conformation capture
Appendix 5. Transcription factor bound regions at the 75 and 12DREs
Appendix 6. QPCR products of gene expression profile run on agarose gel
Appendix 7. QPCR products of intergenic transcripts at 90, 75, and 12 DREs run on agarose gel
Appendix 8. QPCR products of the intergenic transcripts at 25, 35, 40, 43, 47, 58, 64, and 70 DREs run on agarose gel
Appendix 9. QPCR products of the intergenic transcripts located between the 47 and 58DREs, 64 and 58DREs, and between the 90 and 75DREs run on agarose gel
Appendix 10. Chromosome conformation capture (3C) products run on agarose gel (75E as the anchor fragment)
Appendix 11. Chromosome conformation capture (3C) products with Lmo2 proximal promoter as anchor fragment run on agarose gel
Appendix 12. Chromosome conformation capture (3C) products with Caprin1 promoter as anchor fragment run on agarose gel
Appendix 13. Primary intergenic transcript levels in adult mouse anaemic spleen and kidney cells at the Lmo2/Caprin1 locus
Chapter 1: Introduction
1.1 General Introduction
The completion of the human genome project in 2003 launched the genomics revolution in
molecular biology research (Lander et al., 2001; Venter et al., 2001). Even though science has
since experienced a succession of rapid advances, there are several fundamental gaps in our
current knowledge. For example, we know that every cell in our body contains the same genetic
material; however, the genes expressed in the neurons of our brain are different from the genes
expressed in our blood cells. So how is this tissue-specific gene expression pattern generated?
Classical factors that allow cells to exhibit differential gene expression patterns without
changing genome sequences are chemical modifications to DNA sequences and histones and the
presence of various cell type specific proteins that interact with DNA (Jaenisch and Bird, 2003).
Recently the manner in which the chromatin is organized at a specific gene locus has emerged
as a new paradigm in gene expression regulation and new techniques have allowed the
investigation of the chromatin organization at a gene locus within a cell type in fine detail
(Dekker et al., 2002; Ethier et al., 2012; Palstra et al., 2003; Tolhuis et al., 2002). It has become
clear that to understand the tissue-specific regulation of genes, all of the factors that contribute
to the cell type-specific epigenome need to be considered. For example, a study has shown that a
2 Mb region encompassing the human β-globin locus, along with the flanking olfactory receptor
genes, is organized in a cell type-specific manner in human erythroid K562 cells, and that the
formation of this cell type-specific conformation is mediated by two proteins, namely CTCF and
cohesin (Hou et al., 2010). Furthermore, the results also suggest that there is a correlation
between β-globin gene transcription and the histone methylation pattern of the locus (Hou et al.,
2010). Therefore, in order to understand tissue-specific regulation of gene transcription, a
combination of different biological data is required to explain how different biological factors
work synergistically to execute a specific transcriptional program.
1.2 Mechanisms of regulating gene transcription
Regulation of gene transcription within the nuclear space is a multistep process. Because multiple
events occur before and after the actual initiation of transcript (RNA) synthesis, gene
transcription can be regulated not only at different steps, but these steps can also be part of different
levels of regulation (Fuda et al., 2009). The first step that heralds the onset of transcription is the
decondensation of the locus, followed by remodeling of the nucleosomes so that RNA
Polymerase II (RNAPII) can gain access to the gene promoter (Fuda et al., 2009; Smale and
Kadonaga, 2003). In the next step the pre-initiation complex is assembled at the gene promoter,
following which the DNA is unwound and RNAPII initiates transcription (Fuda et al., 2009).
After the actual initiation of RNA synthesis, RNAPII pauses so as to undergo promoter
clearance before engaging in productive elongation (Sims et al., 2004). For promoter clearance
the C-terminal domain (CTD) of RPB1, the largest subunit of RNAPII, has to be phosphorylated
(Komarnitsky et al., 2000; Sims et al., 2004). In this context it should be mentioned that the
CTD of RNAPII comprises multiple repeats of the heptapeptide sequence YSPTSPS, which
can be phosphorylated at the serine 2, 5 and 7 positions in each of the repeats (Chapman et al.,
2007; Corden, 1993; Sims et al., 2004). Whereas phosphorylation of serine 2 of the RNAPII
CTD occurs at the transition to productive elongation, phosphorylation of serine 5 of the RNAPII
CTD is required for promoter clearance of the paused RNAPII (Komarnitsky et al., 2000;
Marshall et al., 1996; Sims et al., 2004). It is only after the release of the paused polymerase that
the gene is transcribed throughout its entire length (Fuda et al., 2009; Sims et al., 2004). The
last step of transcription performed by RNAPII is termination, following
which RNAPII can initiate a new round of transcription (Fuda et al., 2009). Therefore gene
transcription can be regulated during different stages of transcription, namely before, during or
after the initiation of RNA synthesis (Fuda et al., 2009). Although there are multiple
mechanisms by which gene transcription can be regulated after the initiation of RNA synthesis,
in this thesis I will focus on the regulatory events that occur before the initiation of
RNA synthesis.
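The residue numbering used for the CTD heptapeptide can be made concrete with a small sketch. The following Python snippet is only an illustration: it takes the consensus repeat YSPTSPS from the text, lists the positions of its serine residues, and enumerates the candidate phosphorylation sites across a run of repeats. The function names and the repeat count used in the example are hypothetical and are not taken from this thesis.

```python
# Consensus CTD heptapeptide repeat, as given in the text.
HEPTAD = "YSPTSPS"


def serine_positions(repeat: str = HEPTAD) -> list[int]:
    """Return the 1-based positions of serine (S) within one repeat."""
    return [i for i, residue in enumerate(repeat, start=1) if residue == "S"]


def candidate_phosphosites(n_repeats: int) -> list[tuple[int, int]]:
    """List (repeat index, serine position) pairs for n consensus repeats.

    n_repeats is an illustrative parameter; real CTDs differ in length
    between species and contain non-consensus repeats.
    """
    positions = serine_positions()
    return [(r, p) for r in range(1, n_repeats + 1) for p in positions]


print(serine_positions())              # [2, 5, 7]
print(len(candidate_phosphosites(3)))  # 9
```

The sketch only counts positions in an idealised consensus sequence; it says nothing about which sites are actually phosphorylated at a given stage of transcription.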
In eukaryotes, mechanisms that are involved in regulating gene transcription before the process
of initiation of RNA synthesis can be arranged into a series of different levels (Dillon, 2006).
The most basic level of regulation lies at the DNA sequence level (Dillon, 2006). Studies have
shown that binding of transcription factors (TFs) to specific sequences in the gene promoter and
more distally located DNA elements helps in the assembly of the transcription pre-initiation
complex at the promoter, the first step in productive transcription (Dillon, 2006). The next level
of regulation of gene transcription involves interactions between chromatin-associated proteins
and the DNA sequences so as to cause chemical modifications of DNA sequences as well as the
histone proteins (Li et al., 2007). These modifications are essential because, although the DNA
sequences contain all the information required for regulating gene transcription, these
sequences are often not accessible to the regulatory proteins in different cell types due to the
packaging of the DNA around nucleosomes (Li et al., 2007). Therefore epigenetic
modifications of the DNA sequences and associated proteins that impact regulation of gene
transcription constitute the second level of regulation of gene transcription. Finally a
combination of the presence of cell type-specific regulatory proteins and the epigenetic
modifications along with chromatin de-condensation and looping helps in organizing the
chromatin in a tissue specific manner (Chambeyron and Bickmore, 2004; de Laat and Grosveld,
2003; Dillon, 2006). Formation of this tissue-specific conformation of the chromatin in turn
plays a vital role in the regulation of gene transcription by bringing distal regulatory sequences
into physical proximity to gene promoters (de Laat and Grosveld, 2003; Palstra et al., 2003;
Spilianakis and Flavell, 2004; Tolhuis et al., 2002). This kind of regulation of gene transcription
constitutes the third level of regulation of gene transcription. All three levels of regulation of
gene transcription together coordinate tissue specific transcriptional programs thereby
conferring a tissue specific gene expression pattern.
1.2.1 First Level of regulation of gene transcription: DNA sequences
The most basic level of regulation is at the DNA sequence level (Dillon, 2006). Studies have
shown that sequences both proximal and distal to a gene can regulate the transcription of that
gene (Lettice et al., 2003; Maston et al., 2006; Van der Ploeg et al., 1980). As all the regulatory
information required by a cell is encoded within its genome in the form of the DNA sequence,
regulatory elements such as promoters, silencers, insulators, enhancers and other elements (such
as locus control regions) play an important role in the regulation of gene transcription (Maston
et al., 2006). Not only do the DNA sequences serve as the docking sites for the different kinds
of regulatory and chromatin-associated proteins, but they are also the target of different kinds of
modification, all executed in order to regulate gene transcription within the cell (Maston et al.,
2006). Therefore we can consider the DNA sequence to be the first basic level of regulation of
gene transcription in any living organism.
1.2.1.1 Core Promoter and Proximal Promoter
The transcription start site (TSS) along with the DNA sequences located immediately around the
TSS (~ 35bp upstream and/or downstream of the TSS) is defined as the core promoter of a gene
(Smale and Kadonaga, 2003). The core promoter consists of different functional DNA motifs
known as core promoter elements that help in the assembly and initiation of the RNAPII
transcription machinery and also directly interact with the components of the basal transcription
machinery (Smale and Kadonaga, 2003). The first core promoter element to be described was
the TATA box, which serves as the binding site of the TATA-binding protein (TBP) subunit of TFIID
(Maston et al., 2006; Smale and Kadonaga, 2003). In addition to the TATA box, other types of
core promoter elements have been identified in metazoan core promoters, such as the initiator
element, downstream core element, downstream promoter element, motif ten element and
TFIIB-recognition element (Maston et al., 2006; Smale and Kadonaga, 2003). Furthermore,
statistical sequence analysis of about 10,000 human RNAPII core promoters has
shown that whereas the downstream promoter element and the TFIIB-recognition element were
each present in roughly 25% of the promoters, only 12.5% of the core promoters
contained the TATA box motif (Gershenzon and Ioshikhes, 2005). Interestingly, whereas
TATA box motifs have only been found in strong tissue-specific core promoters, the core
promoters of housekeeping genes have been found to be associated with CpG islands (Maston et
al., 2006). For example, studies have shown that a correlation exists between the presence of
CpG islands and the presence of some core promoter elements (Gershenzon and Ioshikhes,
2005). TATA boxes are usually present in core promoters that do not have any CpG islands
located nearby, whereas TFIIB-recognition elements are found in core promoters that have a
CpG island located nearby (Gershenzon and Ioshikhes, 2005). In this context it should be
mentioned that CpG islands are regions of the genome that have a higher concentration of CpG
(Cytosine-phosphate-Guanine) sites; methylation of these CpG sites within the promoters of
genes can lead to gene silencing by blocking the binding of TFs to their recognition sequences
(Gardiner-Garden and Frommer, 1987; Gonzalez-Zulueta et al., 1995; Jones and Baylin, 2002;
Maston et al., 2006).
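As a rough illustration of what a higher concentration of CpG sites means in practice, the Python sketch below computes the G+C content and the observed/expected CpG ratio of a sequence. The thresholds used (length of at least 200 bp, G+C content of at least 50%, observed/expected ratio above 0.6) are the commonly quoted criteria associated with Gardiner-Garden and Frommer (1987) and are treated here as assumptions. The function names and the toy sequences are hypothetical and are not taken from this thesis.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence.

    Observed/expected is computed as (number of CpG * N) / (number of C * number of G),
    where N is the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides read 5' to 3'
    gc_fraction = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, G+C and observed/expected thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp > min_ratio


# Toy sequences for illustration only (not real genomic sequence).
print(looks_like_cpg_island("CG" * 100))  # CpG-rich toy sequence
print(looks_like_cpg_island("AT" * 100))  # sequence without any C or G
```

In practice such a test would be run over sliding windows of a genomic region rather than on a single fixed sequence; the sketch leaves the windowing out to keep the calculation itself visible.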
In addition to the core promoter, proximal gene promoters are also present in higher eukaryotic
genomes (Maston et al., 2006). The proximal promoter (pP) of a gene denotes the region located
immediately upstream of the core promoter; it can range from fifty to a few hundred base pairs
and contains multiple binding sites for proteins that act synergistically to either activate or
repress transcription of the linked gene (Maston et al., 2006).
1.2.1.2 Cis regulatory elements
Transcription is not regulated only by the gene promoters but in many cases also by other DNA
elements (Maston et al., 2006). For many years promoters were considered to be the only gene
regulatory elements, as a result of which contiguous stretches of DNA sequence located upstream
of the TSS were cloned and analysed for regulatory activity. However, in 1980 it was shown
for the first time that a deletion outside the intact human β-globin gene had caused thalassaemia,
indicating that sequences apart from the gene promoter can play a vital role in the regulation of gene
transcription (Van der Ploeg et al., 1980). Moreover, studies over the past three decades have
indicated that transcription is regulated not only by the sequences located immediately upstream
of the gene TSS but in many cases by DNA elements that can be located upstream or downstream of
the genes they regulate (Mills et al., 1997; Tuan et al., 1989). A classical example is that
alterations in DNA sequences located as far as 1 Mb from the Sonic hedgehog (SHH) gene
cause malformation of limbs in patients with a condition known as preaxial polydactyly
(Lettice et al., 2003). Interestingly, in some cases these DNA regulatory elements have also been
found to be located on a different chromosome altogether (Lomvardas et al., 2006). For example,
the enhancer element H for the olfactory receptor genes is able to regulate transcription of
olfactory receptor genes found on multiple chromosomes (Lettice et al., 2003; Lomvardas et al.,
2006). These distal regulatory DNA sequences in many cases are bound by specific proteins that
regulate transcription (Drissen et al., 2004; Sawado et al., 2001; Song et al., 2007). Chromatin
immunoprecipitation sequencing (ChIP-Seq) studies can be used to investigate the binding of
TFs and the presence of distinct epigenetic marks at specific locations of the genome
(Kharchenko et al., 2008). Such a genome wide ChIP-Seq study for several TFs has revealed
that a significant proportion (40-60%) of transcription factor bound regions are located in the
intergenic regions of the genome ≥ 10 kb from a gene TSS (Chen et al., 2008; Fullwood et al.,
2009; Yu et al., 2009). Taken together, these data suggest that in addition to the gene
promoters, DNA sequences located > 10 kb from gene promoters play a vital role in regulating
transcription on a genome-wide scale. In this context it should be mentioned that
these elements could regulate transcription not only of protein-coding target genes but also of
different types of non-coding RNAs, which remain largely unannotated in mammalian genomes
and whose roles are not completely defined, though many are themselves involved in regulating
gene expression.
1.2.1.2.1 Distal Regulatory Elements: Enhancers and Insulators
Distal regulatory elements (DREs) regulate transcription by enhancing (enhancers), repressing
(repressors) or by insulating (insulators) transcription of their target genes (Gowri et al., 2003;
Maston et al., 2006; Tuan et al., 1989). In fact in some cases these DREs can act as an enhancer
in one context and a repressor in another (Murayama et al., 2004; Noonan and McCallion, 2010;
Perissi et al., 2004). Hence whether or not a specific DRE will enhance or repress transcription
depends upon its genomic environment and the protein complex that associates with the DRE
(Murayama et al., 2004; Noonan and McCallion, 2010; Perissi et al., 2004). We have defined
DREs as DNA sequences located either upstream or downstream at > 10 kb from a potential
target gene (Chen et al., 2012).
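The distance-based definitions used in this chapter can be summarised in a short sketch. The Python snippet below labels an element by its distance from a transcription start site, using the approximate 35 bp window for the core promoter and the > 10 kb cut-off for DREs stated in the text. The function name, the intermediate label and the coordinates in the example are hypothetical; real annotation would also need strand, genome build and the functional evidence discussed in later chapters.

```python
# Distance thresholds taken from the definitions in the text.
CORE_HALF_WIDTH_BP = 35   # ~35 bp upstream and/or downstream of the TSS
DISTAL_MIN_BP = 10_000    # DREs: more than 10 kb from a potential target gene


def classify_by_distance(element_pos: int, tss_pos: int) -> str:
    """Label an element by its absolute distance (in bp) from a TSS.

    Both positions are assumed to lie on the same chromosome; the labels
    describe position only and say nothing about regulatory activity.
    """
    distance = abs(element_pos - tss_pos)
    if distance <= CORE_HALF_WIDTH_BP:
        return "core promoter region"
    if distance > DISTAL_MIN_BP:
        return "candidate distal regulatory element"
    return "intermediate (proximal promoter or other cis element)"


# Hypothetical coordinates, for illustration only.
tss = 1_000_000
for position in (1_000_020, 999_700, 925_000):
    print(position, classify_by_distance(position, tss))
```

The intermediate category is deliberately coarse, because the text gives only an approximate size range for the proximal promoter.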
Enhancers
Enhancers are DNA sequences that are capable of activating gene transcription irrespective of
their orientation, distance and location relative to the target gene (Banerji et al., 1981;
Blackwood and Kadonaga, 1998; Dillon and Sabbattini, 2000; Jin et al., 2011; Ptashne, 1986).
Enhancers comprise multiple protein binding sites, as a result of which
different TFs can bind to them, thereby regulating gene transcription (Jin et al., 2011; Maston et
al., 2006). A tandem repeat of two identical 72 bp elements in SV40, located 200 bp upstream of
the TSS, was the first enhancer to be characterised (Moreau et al., 1981).
Insulators
Insulators are DNA elements that prevent a gene from being inappropriately
activated or silenced by surrounding influences (Wallace and Felsenfeld,
2007). There are two functional classes of insulators: enhancer blocking
(EB) insulators and barrier insulators (Wallace and Felsenfeld, 2007). Barrier
insulators prevent inappropriate silencing of a gene by heterochromatin spreading through
the gene, whereas EB insulators, when placed between an enhancer and its target gene promoter,
prevent the enhancer from activating that promoter (Wallace and Felsenfeld, 2007). Classical
examples of EB insulators are the insulator elements bound by the CTCF protein
at the mouse Igf2/H19 imprinted locus and at the β-globin locus (Bell and Felsenfeld, 2000;
Chung et al., 1993).
1.2.2 Second Level of regulation of gene transcription: Epigenetic
modifications
Although the DNA template contains all the information required for the execution of a
transcriptional program, an important factor that can regulate the process of gene transcription is
the accessibility of the DNA sequences that can be modulated by epigenetically modifying the
sequence itself or modifying nucleosomes that associate with DNA (Ong and Corces, 2011; Xi
et al., 2007). Packaging of DNA sequences around nucleosomes restricts access of regulatory
proteins to the DNA sequence and in turn regulates all DNA-based processes including gene
transcription (Ong and Corces, 2011). Modulation of DNA accessibility
involves multiple mechanisms, such as temporary removal of
the core histone octamer from the DNA, epigenetic modifications to the histone proteins
(histone acetylation, methylation, phosphorylation, ubiquitination) and chemical
modifications to DNA sequences such as DNA methylation (Xi et al., 2007). For example,
studies by Cui et al. have shown that modification of histone proteins plays a crucial role in the
maintenance and differentiation of hematopoietic stem cells (HSCs) (Cui et al., 2009). The
authors have shown that as the HSCs undergo differentiation to give rise to the different
lineages of blood cells there is a change in the gene expression pattern, and their results suggest
that there is a correlation between the histone marks at the HSC stage and the subsequent
changes in the gene expression pattern (Cui et al., 2009). Furthermore, there is a correlation
between epigenetic modifications at distal enhancers and tissue specific transcription of nearby
genes (Chen et al., 2012; Heintzman et al., 2009).
1.2.2.1 Epigenetic signatures, tissue specificity and enhancers
Genome wide studies have shown that enhancers have a distinct epigenetic signature (Blow et
al., 2010; Heintzman et al., 2009; Heintzman et al., 2007; Visel et al., 2009). Enhancers in many
cases are marked with increased levels of histone H3 lysine 4 monomethylation (H3K4me1) and
with reduced levels of histone H3 lysine 4 trimethylation (H3K4me3) (Chen et al., 2008;
Gross and Garrard, 1988; Ren et al., 2007; Visel et al., 2009; Visel et al., 2010; Wu, 1980).
Furthermore, studies conducted by Heintzman et al. have shown that enhancers are
marked by highly cell type-specific histone modification patterns which correspond to their
functional status within the specific cell types (Heintzman et al., 2009). In contrast, gene
promoters have been shown to have a more uniform histone modification pattern across different
cell types, suggesting that distal enhancers are more functionally relevant in regulating tissue
specific gene expression (Heintzman et al., 2009; Visel et al., 2009). Tissue specific chromatin
signatures at enhancers include an increased sensitivity to DNaseI, binding of multiple TFs, the
histone acetyl transferase protein p300, H3K4me1 and H3K27ac (Blow et al., 2010; Heintzman
et al., 2009; Heintzman et al., 2007; Visel et al., 2009; Xi et al., 2007). For example, the
regulatory elements of the β-globin (Hbb) locus control region (LCR) consist of a series of
transcription factor bound DNaseI hypersensitive sites 50 kb upstream of the Hbb-b1 gene
(Forrester et al., 1990; Tuan et al., 1989). The distinct chromatin features of DREs/Enhancers
combined with ChIP-Seq experiments have begun to allow for regulatory element identification
on a genome-wide scale increasing our understanding of gene regulation for mammalian
genomes (Blow et al., 2010; Heintzman et al., 2009; Visel et al., 2009).
1.2.2.1.1 Low nucleosome occupancy and DNaseI sensitivity
Genome wide studies have shown that nucleosome occupancy tends to be low at transcription
start sites and at the boundaries of the cis-regulatory elements indicating that the changes in the
nucleosome dynamics can play a critical role in regulating gene transcription (Mito et al., 2007;
Schones et al., 2008). As a result of low nucleosome occupancy and the presence of
"open" chromatin, genomic regions that are rich in DREs tend to exhibit increased sensitivity to
DNaseI treatment in a tissue specific manner (Gross and Garrard, 1988). Moreover, studies have
also shown that TFs can induce displacement of nucleosomes thereby inducing changes in the
chromatin structure (He et al., 2010; Li et al., 2007). Studies in prostate cancer cells where
androgen receptor binds to enhancers have shown that in response to stimulation with androgen,
the central H2A.Z containing nucleosome present at androgen receptor binding sites that is
flanked by a pair of marked nucleosomes disappears (He et al., 2010). Hence, this study
suggests that TFs have the ability to displace nucleosomes at their binding sites thereby
changing the chromatin structure.
1.2.2.1.2 p300 binding
A study conducted in the human genome over a region of 30Mb has shown that there is a
correlation between enhancer function and binding of the co-activator protein p300 (Birney et
al., 2007; Heintzman et al., 2007). Furthermore ChIP-Seq studies performed to map thousands
of p300 binding sites across the mouse genome have been able to accurately predict the
presence of novel enhancers in mouse brain, limbs and heart suggesting that there is a
correlation between p300 binding and enhancer function (Blow et al., 2010; Visel et al., 2009).
1.2.2.1.3 Cobound by multiple TFs
DREs are bound by multiple TFs (Chen et al., 2008; Heintzman and Ren, 2009; Jin et al.,
2011). The well characterized regulatory elements of the β-globin gene are bound by multiple
TFs in adult erythroid cells (Cho et al., 2008; Song et al., 2007; Wijgerde et al., 1996). TFs
such as GATA1 and EKLF have been found to bind to the regulatory elements of the β-globin
gene in a tissue specific manner and have been shown to play an active role in regulating
β-globin gene transcription (Cho et al., 2008; Drissen et al., 2004; Kim et al., 2007; Vakoc et al.,
2005; Wijgerde et al., 1996). As mentioned above, TFs have the ability to change the chromatin
structure by displacing nucleosomes at their binding sites thereby regulating gene expression
(He et al., 2010).
1.2.3 Third Level of regulation of gene transcription: Spatial organization of
the chromatin
The organization of chromatin in the three dimensional nuclear space plays a vital role in the
regulation of transcription of genes (de Laat and Grosveld, 2003; Palstra et al., 2003). To
understand how chromatin regulates transcription of genes it is important to understand the role
of both chromatin decondensation and chromatin looping in regulating gene transcription.
Studies with the murine β-globin locus and the cytokine gene cluster have suggested that
formation of higher order chromatin structure is not only an important parameter in the
regulation of gene transcription but also correlates with transcription of genes (Palstra et al.,
2003; Spilianakis and Flavell, 2004). Furthermore, different proteins such as CTCF, RAD21, and
SATB1 appear to mediate the formation of these higher order structures thereby playing an
active role in regulating gene transcription (Cai et al., 2006; Handoko et al., 2011; Kagey et al.,
2010). Moreover various techniques have been developed over the last decade that are routinely
used to investigate the three dimensional organization of different gene loci to understand how
chromatin conformation can correlate with specific transcriptional states of genes (Dekker et al.,
2002; Dostie et al., 2006; Ethier et al., 2012; Tolhuis et al., 2002; Vassetzky et al., 2009).
Hence, the third and the highest level of regulation of gene transcription lies within the three
dimensional spatial organization of different genetic loci.
1.2.3.1 Relation between transcription and organization of chromatin structure
Studies have shown that there is a strong correlation between transcription and the organization
of chromosomes (Mahy et al., 2002). In higher eukaryotes individual chromosomes occupy a
discrete location within the nucleus known as a chromosome territory (CT) (Cremer and
Cremer, 2001). Within a CT the DNA is further organized, wherein the gene rich regions are
mostly kept separate from gene poor regions (Cremer and Cremer, 2001; Croft et al., 1999;
Lieberman-Aiden et al., 2009). Studies also show that active genes are mostly located at the
periphery of the CT, whereas the inactive genes are concentrated towards the interior of the CT
(Harnicarova et al., 2006). Furthermore, regions with high density of active genes can be located
in loops that extend outside the CTs only in the cells that express those genes but not otherwise
(Volpi et al., 2000; Williams et al., 2002). Studies by Mahy et al. have also shown that genomic
regions comprising a large number of expressed genes localize outside their CTs (Mahy et al.,
2002). Furthermore, it has been shown that chromatin decondensation correlates with the
transcriptionally active state of a genomic region (Chambeyron and Bickmore, 2004). For
example in response to induction by retinoic acid the HoxB gene cluster de-condenses and loops
out of its chromosome territory in order to express the HoxB gene (Chambeyron and Bickmore,
2004). This looping out of the HoxB genes was also accompanied by an overall change in the
histone modification state of the entire locus (Chambeyron and Bickmore, 2004). Together,
these studies suggest not only that high levels of transcriptional activity correlate with
localization of genes outside their CTs but also that chromatin structure may undergo
a complete reorganization in a tissue specific manner in order to execute the transcriptional
program of a set of genes.
1.2.3.2 Chromatin looping and regulation of gene transcription
Several studies have shown that there is a correlation between formation of tissue specific
chromatin loops and the regulation of gene transcription (Palstra, 2009; Palstra et al., 2003;
Spilianakis and Flavell, 2004; Tolhuis et al., 2002). Chromatin looping can not only bring
DREs close to their target genes but also sequester regulatory elements together,
thereby forming a tissue specific spatial organization of the chromatin that regulates gene
transcription (de Laat and Grosveld, 2003; Palstra et al., 2003).
1.2.3.2.1 Enhancers and chromatin looping
Enhancers have been shown to regulate transcription of their target genes through the chromatin
looping mechanism (Palstra, 2009; Tolhuis et al., 2002). As enhancers can be located far from
their target genes in the DNA sequence they have been shown to come in physical contact with
the target gene promoter in order to regulate gene transcription while the intervening DNA
sequence is looped out (Carter et al., 2002; Palstra et al., 2003; Tolhuis et al., 2002). This
mechanism is known as chromatin looping, and the process as a whole is known
as long-range regulation of gene transcription. For example, in adult erythroid cells the
regulatory elements of the Hbb LCR, located 50 kb from the Hbb-b1 gene, are
found in close proximity to the active Hbb genes while the intervening 50 kb of DNA sequence
containing the embryonic erythroid cell expressed genes is looped out (Carter et al., 2002;
Palstra et al., 2003; Tolhuis et al., 2002). As a result of this chromatin looping, the β-globin
locus adopts a tissue specific conformation in adult erythroid cells wherein it is highly
transcribed (Palstra et al., 2003; Tolhuis et al., 2002). On the other hand the locus adopts a more
random conformation in other cell types such as in erythroid progenitor cells and brain cells
where Hbb-b1 is not expressed (Tolhuis et al., 2002). Hence it has been suggested that the
conformation of the locus is associated with its active state as well as its gene expression pattern
(Palstra et al., 2003).
Currently, interactions between different genomic regions are detected using the
Chromosome Conformation Capture (3C) technique (Dekker et al., 2002). The technique was
initially developed to investigate the three dimensional organization of yeast chromosome 3,
but was quickly adapted to study the regulatory relationships between various
mammalian genes and their DREs (Dekker et al., 2002; Ethier et al., 2012; Spilianakis and
Flavell, 2004; Tolhuis et al., 2002). The 3C technique showed for the first time that long-range
regulation of the β-globin genes is mediated by physical interactions between genomic
fragments containing the regulatory elements and the gene promoters (Tolhuis et al., 2002). 3C
has since been used to detect chromatin-chromatin interactions between DREs and several other
genes including Hba, Shh, TH2, HoxB1 and olfactory receptor genes (Amano et al., 2009;
Lomvardas et al., 2006; Spilianakis and Flavell, 2004; Vernimmen et al., 2007; Wurtele and
Chartrand, 2006). Based on the basic principle of the 3C technique various other techniques
have been developed such as 4C, e4C, 5C, Hi-C that can identify genome wide chromatin-
chromatin interactions and can also detect the conformation of a genome as a whole (Dostie et
al., 2006; Dostie et al., 2007; Ethier et al., 2012; Lieberman-Aiden et al., 2009; Schoenfelder et
al., 2010; Zhao et al., 2006). For example recent Hi-C studies have been able to describe the
conformation of the whole Drosophila melanogaster genome (Sexton et al., 2012).
1.2.3.2.2 Active chromatin hub
At some genetic loci the spatial organization of the chromatin is such that all the regulatory
elements are sequestered to form a tissue specific chromatin structure involved in activating
target gene transcription (de Laat and Grosveld, 2003). For example, at the β-globin locus all the
regulatory elements are sequestered together to form the active chromatin hub (ACH)
(de Laat and Grosveld, 2003). ACH formation and spatial clustering of enhancer elements have
been suggested to maintain a high local concentration of tissue specific TFs required for
efficient transcription of the target genes. Studies have shown that the formation of the ACH
requires the presence of protein factors bound to the regulatory DNA elements (de Laat and
Grosveld, 2003; Drissen et al., 2004; Vakoc et al., 2005). For example, the TFs EKLF and
GATA-1 are required for the formation of the β-globin ACH (Drissen et al., 2004; Vakoc et al.,
2005).
1.2.3.2.3 Insulators and chromatin looping
The insulator protein CTCF has been shown to participate in intra- and inter-chromosomal
looping from individual gene loci including Hbb, Igf2/H19 and HoxA (Ferraiuolo et al., 2010;
Kooren et al., 2007; Kurukuti et al., 2006; Yang and Corces, 2011). A genome-wide analysis of
chromatin-chromatin interactions at CTCF bound regions identified four distinct classes of
interactions occurring between CTCF bound regions, based on histone modifications: active
domain interactions, repressive domain interactions, enhancer-gene promoter interactions and
loops at the border of opposite chromatin states where CTCF acts as a boundary element
(Handoko et al., 2011).
1.2.3.3 Role of protein complexes in maintaining higher-order chromatin
conformation
CTCF and other proteins like cohesin play an important role in maintaining higher-order
chromatin conformation (Degner et al., 2011). Genome wide studies have shown that CTCF and
cohesin, a protein complex that mediates sister chromatid cohesion, localise to the same regions
of the genome (Parelho et al., 2008; Rubio et al., 2008). Furthermore, at the imprinted IGF2-
H19 locus both cohesin and CTCF are required for maintaining higher-order chromatin
conformation (Merkenschlager et al., 2009; Nativio et al., 2009). Studies with the human
Hep3B cells have shown that the genomic region containing the apolipoprotein gene cluster
APO A1/C3/A4/A5 has overlapping CTCF and cohesin (Rad21) binding sites (Mishiro et al.,
2009). 3C studies of the locus showed the presence of two chromatin loops: one
loop contains the APOA1 promoter, while the other contains the APO C3/A4/A5 promoters and the C3
enhancer (Mishiro et al., 2009). Furthermore, reduction in the levels of either CTCF
or Rad21 disrupts the chromatin loops, thereby causing not only a significant change in the
expression of the APO genes but also a reduction in the localization of the transcription factor
HNF-4a and RNAPII-Ser5P specifically at the gene promoter of APO3 (Mishiro et al., 2009).
Interestingly, whereas the CTCF bound regions of the genome show limited differences between
cell types, CTCF/cohesin bound regions form tissue specific chromatin loops (Cuddapah et al.,
2009; Hou et al., 2010). Cohesin mediated chromatin loops have also been found at other
genetic loci such as the IGF2/H19 locus and the β-globin locus (Hou et al., 2010). Together,
these studies suggest that CTCF-cohesin mediated chromatin looping plays a critical
role in mediating tissue specific regulation of gene transcription.
Members of the cohesin complex also interact with mediator, a complex recruited by TFs which
acts as a bridge to the RNA polymerase II preinitiation complex (Conaway and Conaway, 2011;
Kagey et al., 2010). ChIP-Seq analyses in mouse embryonic stem cells (ESC) showed that
mediator, cohesin and the cohesin loading factor NIPBL proteins co-localise at thousands of
sites across the ESC genome (Kagey et al., 2010). Furthermore knockdown of mediator, NIPBL
or cohesin changed the expression pattern of multiple genes whose enhancers were co-bound by
mediator and cohesin proteins (Kagey et al., 2010). The study also showed that both mediator
and cohesin proteins are bound at enhancers which form ES-cell specific chromatin loops with a
nearby gene promoter thereby regulating gene transcription in a tissue specific manner (Kagey
et al., 2010). Hence, this study suggests that cohesin is capable of stabilizing higher-order
chromatin conformation across the genome and is necessary for mediating enhancer-gene
interactions.
1.3 Lmo2: A candidate for investigating tissue specific chromatin conformation
Lim domain only 2 (LMO2) is a critical transcriptional regulator of hematopoiesis. Gene
targeting experiments introducing null mutations in the mouse Lmo2 gene have
shown that Lmo2 is necessary for embryonic yolk sac erythropoiesis (Warren et al., 1994).
During differentiation of hematopoietic progenitor cells, Lmo2 expression is maintained in
erythroid cells but down regulated in the T-cell lineage (Boehm et al., 1991; Foroni et al., 1992;
Royer-Pokora et al., 1991; Warren et al., 1994). Aberrant expression of Lmo2 results in the
development of various T-cell related diseases; indeed Lmo2 is located at a recurrent site of T-
cell acute lymphoblastic leukemia (T-ALL) specific translocation (Fisch et al., 1992; Fitzgerald
et al., 1992; Foroni et al., 1992; Garcia et al., 1991; Larson et al., 1994; Royer-Pokora et al.,
1991). In addition, patients undergoing gene therapy for X-linked severe combined
immunodeficiency developed clonal T-cell proliferation as a result of aberrant transcriptional
activation of Lmo2 when the gene therapy vector integrated near Lmo2 (Hacein-Bey-Abina et
al., 2003; McCormack and Rabbitts, 2004).
Previous studies have shown that in erythroid cells LMO2 is usually present as part of a
complex with the transcriptional regulators TAL1, E47, LDB1, and GATA1 (Osada et al., 1995;
Valge-Archer et al., 1994; Wadman et al., 1997). This protein complex binds DNA by
recognizing a bipartite DNA sequence comprising an E box and a GATA site (Osada et al.,
1995; Wadman et al., 1997). These LMO2 containing oligomeric complexes along with other
factors in hematopoietic cells have been found on the regulatory regions of various other genes
including, β-globin (Hbb), α-globin (Hba), retinaldehyde dehydrogenase 2, c-kit and erythroid
Kruppel-like factor (Eklf) (Anderson et al., 1998; Anguita et al., 2004; Lecuyer et al., 2002; Ono
et al., 1998; Song et al., 2007; Song et al., 2010).
The proximal promoter of the Lmo2 gene has multiple functional motifs that affect its promoter
activity (Landry et al., 2005). Lmo2 was initially believed to have two transcriptional promoters;
however, recent studies have indicated that the gene actually has three promoters, termed the
distal promoter, proximal promoter and intermediate promoter (Oram et al., 2010). Studies have
shown that the transcriptional activity of the proximal promoter (pP) is largely dependent on the
three Ets sites that are present within the conserved region of the pP (Landry et al., 2005).
Mutation of the first Ets site only marginally affected promoter activity of pP; however, any
alterations within the second or third site, or both, strongly hampered the promoter activity of pP
(Landry et al., 2005). Motif analysis has also shown that the proximal promoter (pP) contains an
E-box, which strongly influences the activity of the Lmo2 pP (Landry et al., 2005). Mutation of
the E-box element results in a small decrease in promoter activity of pP (Landry et al., 2005).
Furthermore, studies have also shown that the TFs FLI1, ELF1, and ETS1 regulate the activity
of the pP in the endothelial cells (Landry et al., 2005). In this context it should be mentioned
that though the pP of Lmo2 is sufficient to drive expression of a reporter gene in endothelial
cells in vivo, the expression levels are weak, and no expression in any other tissue has been
observed (Landry et al., 2005).
Robust expression of Lmo2 in hematopoietic cells requires the presence of multiple regulatory
elements (Landry et al., 2009). As mentioned above, the proximal promoter of Lmo2 drives only
weak expression in endothelial cells in vivo and no detectable expression in any other tissue,
which prompted researchers to look for
other regulatory elements that modulate the expression of Lmo2 in erythroid cells.
In fact, recent studies have identified eight distal regulatory elements (DREs) located upstream
of the LMO2 gene in the human genome (Landry et al., 2009). These DREs are capable of
enhancing reporter gene expression in erythroid tissues (Landry et al., 2009). Transgenic
analysis suggests that strong expression of Lmo2 in hematopoietic cells requires the combined
action of these cell-type specific distal regulatory elements (DREs) and the Lmo2 pP (Landry et
al., 2009). Although these DREs have been identified and
Lmo2 is known to be highly transcribed in erythroid cells, no studies have investigated
how these multiple DREs regulate transcription of Lmo2 in erythroid cells or whether
they function cooperatively in the endogenous context.
2 Chapter 2: Methods
Analyses of ChIP-Seq data from erythroid cells mentioned in this chapter were
performed by Julie-Chih-Yu Chen.
2.1 Cell Isolation
C57BL/6 mice were used as the model system in this study. Adult erythroid cells were isolated
in large numbers (>1 x 10^8) from the spleens of mice treated with phenylhydrazine (PHZ). This
treatment induces haemolytic anaemia in the mice, as a result of which the spleen becomes the
major site of red blood cell production (Dickerman et al., 1976). A 1%
phenylhydrazine solution was prepared (1 ml 10% PHZ in DMSO + 9 ml 1x PBS) and
administered as three I.P. injections at twelve-hourly intervals, with 0.1 ml PHZ per 25 g body
weight. Mice were anaemic four days after the first injection.
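The dilution and dosing arithmetic above can be checked with a short calculation. This is a minimal sketch, assuming that the 0.1 ml per 25 g dose refers to the 1% working solution and that the volume scales linearly with body weight; the function names are hypothetical.

```python
def diluted_percent(stock_percent: float, stock_ml: float, diluent_ml: float) -> float:
    """Concentration (%) after mixing a stock solution with diluent."""
    return stock_percent * stock_ml / (stock_ml + diluent_ml)


def injection_volume_ml(body_weight_g: float, ml_per_25_g: float = 0.1) -> float:
    """Injection volume for one dose, scaled linearly with body weight.

    Assumption: 0.1 ml per 25 g refers to the 1% working solution.
    """
    return ml_per_25_g * body_weight_g / 25.0


# 1 ml of 10% PHZ in DMSO plus 9 ml of 1x PBS gives the 1% working solution.
working_percent = diluted_percent(10.0, 1.0, 9.0)
# Hypothetical 20 g mouse, used only to illustrate the scaling.
volume_for_20_g = injection_volume_ml(20.0)
```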
2.2 Chromosome Conformation Capture (3C)
Chromosome Conformation Capture (3C) experiments were performed as developed by Dekker
et al. with some minor modifications (Dekker et al., 2002).
3C fixation and digestion of nuclei: Anaemic spleen and kidney dissected from mice were
strained through a 70 µm strainer into chilled Petri dishes in cold D-MEM + 10% FBS. Cells
(suspended in room temperature medium) were next fixed for 10 min with 2% formaldehyde.
The reaction was quenched with cold 1 M glycine, followed by centrifugation at 1300 rpm,
4°C. The cell pellets were next washed with cold PBS and centrifuged at 1300 rpm, 4°C. In the
next step the cells were suspended in cold lysis buffer (10 mM Tris-HCl, pH 8, 10 mM NaCl,
0.2% NP-40, and complete protease inhibitors), and incubated on ice for 30 min with occasional
mixing followed by centrifugation at 1800 rpm. Next, the nuclei were resuspended in a trace of
liquid and counted using a haemocytometer. Aliquots of 1 x 10^7 nuclei were next resuspended in
500 µl of 1.2x NEB2 Buffer containing 20% SDS and incubated for 1 hr at 37°C. Next, 20%
Triton-X100 was added to the reaction and incubated for 1 hr at 37°C. Finally, 1500U of
HindIII was added to the tube and incubated overnight at 37°C.
3C ligation: 20% SDS was added to the tubes, which were incubated for 25 min at 65°C to inactivate
the HindIII enzyme. The contents of each tube were then added to 7 ml of 1.1x ligation
buffer. 20% Triton-X100 was next added to neutralize the SDS and the reaction was incubated for
1 hr at 37°C, mixing occasionally. Next, 800U of T4 DNA ligase was added and the reaction was
incubated overnight in a 16°C water bath, then for 30 min at room temperature. Finally, 900 µg
Proteinase K was added to the ligation reaction, which was incubated overnight at 65°C.
3C DNA purification: The reactions were cooled to room temperature before adding 300 µg of
RNase A and incubating at 37°C for 1 hr, following which the DNA was purified by
phenol-chloroform extraction and the pellet was resuspended in water.
3C validation and control template preparation: The DNA was quantified using the PicoGreen
assay. The 3C control template was prepared by mixing equimolar amounts of the BAC clone of
the entire Lmo2-Caprin1 locus (RP23-76D2) with the Alpha Aortic Actin 2 BAC clone (RP23-
2N15) followed by digestion with HindIII. The digested DNA was then ligated, and purified
using phenol extraction and ethanol precipitation. HindIII restriction enzyme digestion
efficiency was confirmed to be between 85 and 95% at several genomic fragments in
anaemic spleen and kidney cells (Appendix 4). The linear range of amplification was
determined for erythroid and kidney samples by serial dilution. An appropriate amount of the
DNA within the linear range (typically 40 ng of DNA) was subsequently used for quantification.
PCR products of the ligated fragments were quantified using real-time quantitative PCR (qPCR)
on the Bio-Rad CFX-384 cycler. All data points were generated from an average of between
three and five independent 3C experiments with the real-time quantitative PCR performed in
triplicate. Standard curves were generated by 5-fold serial dilution of the 3C control template
and run in parallel with 3C experimental samples. The primers used for real-time quantitative
PCR are listed in Appendix 3. In each individual experiment 3C data were normalized to
neighbouring fragments at the Alpha aortic actin (α-A2) locus.
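The quantification described above can be sketched as follows. This is an illustrative reconstruction and not the analysis code used for the thesis: it assumes a log-linear standard curve (Cq against log10 of relative template amount) fitted per primer pair from the dilution series, with each test ligation product then expressed relative to the α-A2 control product. All function names are hypothetical.

```python
import math


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, curve):
    """Read a relative template amount off a fitted standard curve."""
    slope, intercept = curve
    return 10 ** ((cq - intercept) / slope)


def normalized_interaction(test_cq, test_curve, control_cq, control_curve):
    """Test ligation product relative to the control-locus ligation product."""
    return quantity_from_cq(test_cq, test_curve) / quantity_from_cq(control_cq, control_curve)


# Relative amounts in a 5-fold serial dilution of the control template.
dilution_series = [5.0 ** -i for i in range(5)]
```

Fitting a separate curve for each primer pair is one way to account for differences in primer efficiency, because each product is then compared with the same control template.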
Figure 2-1. Chromosome Conformation Capture (3C)
Diagrammatic representation of the steps involved in the 3C technique. First the cells are fixed
with formaldehyde, which forms crosslinks between proteins attached to DNA segments that are
close together in the nuclear space. The cross-linked chromatin is next digested with a suitable
restriction enzyme. Next the DNA ends are ligated under dilute conditions that favour
intramolecular ligation events between cross-linked DNA fragments. Finally the cross-links are
reversed and the ligation events between selected pairs of restriction fragments are quantified by
real-time quantitative PCR, using primers specific for the given hybrid fragment.
2.3 RNA isolation and real-time RT-qPCR
RNA from anaemic spleen and kidney was isolated using TRIzol, according to the
manufacturer's instructions (Invitrogen). The isolated RNA was next treated with DNaseI
(Fermentas) followed by phenol-chloroform extraction in order to remove any DNA and DNaseI
contamination from the RNA. The RNA was next used for cDNA synthesis. The iScript First
strand synthesis cDNA kit from Bio-Rad was used for preparation of random-hexamer primed
cDNA. Real-time qPCR was performed on the Bio-Rad CFX-384 cycler. The reaction mixture
contained 2X Bio-Rad iTaq SYBR green mastermix with ROX, 0.3pM of each primer, and 1 µL
cDNA (10 times diluted from a 20 µL reverse transcription reaction). The conditions for
real-time PCR were as follows: 94°C for 3 min followed by 40 cycles at 94°C for 30 s, 62°C for 30 s.
Expression levels of Gapdh or Epn1 were used for normalization of expression levels. The
primers used for RT-qPCR are listed in Appendix 3.
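The text does not state which calculation was used to normalize expression to the reference gene. One common choice is the 2^-ΔCq method, sketched below as an illustration only. It assumes equal amplification efficiency for the target and the reference gene, with perfect doubling per cycle by default; the function name and the Cq values in the note are hypothetical.

```python
def relative_expression(cq_target: float, cq_reference: float,
                        efficiency: float = 2.0) -> float:
    """Target expression relative to a reference gene (e.g. Gapdh).

    Assumption: both amplicons amplify with the same per-cycle efficiency,
    so the ratio is efficiency ** (Cq_reference - Cq_target).
    """
    return efficiency ** (cq_reference - cq_target)
```

Under these assumptions, a target that crosses threshold two cycles later than the reference is estimated at one quarter of the reference level.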
2.4 Statistical analysis
The 3C data were analyzed by two-way ANOVA using SigmaPlot 12. Post tests (Holm-Sidak
method) were performed to assess significant differences between anaemic spleen and kidney at
specific genomic locations.
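For readers unfamiliar with the Holm-Sidak correction, the step-down adjustment of a set of pairwise p-values can be sketched as follows. This illustrates the standard procedure under the assumption that raw p-values for each genomic location are already available. It does not reproduce SigmaPlot's implementation, and the function name is hypothetical.

```python
def holm_sidak_adjust(p_values):
    """Return Holm-Sidak step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Process hypotheses from the smallest to the largest raw p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak correction over the (m - rank) hypotheses still under test.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted
```

A comparison is then called significant when its adjusted p-value falls below the chosen significance level.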
2.5 Genome Mapping and Peak Identification of ChIP-seq datasets in
erythroid cells
ChIP-seq raw data for GATA1, KLF1, LDB1, TAL1, and MTGR1 listed in Table 2-1 were
downloaded from Gene Expression Omnibus (GEO) (Barrett et al., 2011; Cheng et al., 2009;
Soler et al., 2010). ChIP-seq data were aligned to the NCBI m37 mouse assembly (mm9) using
Bowtie (Langmead et al., 2009), restricting output to the 1 best reportable
alignment with a maximum of 2 mismatches within a seed length of 28 nucleotides at the
high quality end. The SISSRs algorithm (Jothi et al., 2008) was subsequently used to identify
significant transcription factor peaks compared to the input DNA with p < 0.001. To
remove amplification bias, multiple reads aligning to the same genomic coordinate were counted as
one. Parameters for the corresponding transcription factor data were set according to the original
publications using applicable input data sets. Significant transcription factor peaks were
uploaded to the UCSC genome browser for visualization (Rhead et al., 2010). The HPC7 ChIP-Seq
data analysis was performed using published peaks (Wilson et al., 2010). ChIP-Seq data
for CTCF, p300, RAD21, DNaseI hypersensitivity and H3K4me1 were obtained from the
mouse ENCODE project (Table 2-1) (Birney et al., 2007).
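The duplicate-collapsing step described above (reads aligning to the same genomic coordinate counted as one) can be illustrated with a short sketch. It assumes each aligned read is represented by a (chromosome, start, strand) tuple; whether strand was part of the coordinate in the original analysis is not stated, and the function name is hypothetical.

```python
def collapse_duplicate_reads(alignments):
    """Keep one read per (chromosome, start, strand), in first-seen order."""
    seen = set()
    unique = []
    for alignment in alignments:
        if alignment not in seen:
            seen.add(alignment)
            unique.append(alignment)
    return unique


# Hypothetical alignments: the first two share a coordinate and collapse to one.
example_reads = [("chr2", 1000, "+"), ("chr2", 1000, "+"), ("chr2", 1000, "-"), ("chr2", 1050, "+")]
unique_reads = collapse_duplicate_reads(example_reads)
```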
Table 2-1. Chromatin immunoprecipitation sequencing data
Transcription factor binding sites have been obtained from three different cell types:
differentiated murine erythroleukemia cells (MEL), hematopoietic progenitor cells (HPC7), and
G1E-ER4, a GATA1-null erythroblast cell line in which GATA1 activity was restored. CTCF,
DNaseI hypersensitivity, H3K4me1, p300 and RAD21 data have been obtained from the mouse
ENCODE project; sources are listed (Principal investigator, Institution).
Protein or Chromatin Feature | Cell type | References
LDB1, TAL1, MTGR1, GATA1 | Differentiated MEL | (Soler et al., 2010)
GATA1 | G1E-ER4 cells | (Cheng et al., 2009)
H3K4me1 | G1E-ER4 cells | ENCODE (R Hardison, Penn State University)
KLF1 | e14.5 fetal livers | (Tallack et al., 2010)
ERG, FLI1, RUNX1, LYL1, MEIS1, PU1, GATA1, GATA2, GFI1b, TAL1, LMO2 | HPC7 Hematopoietic Progenitor Cells | (Wilson et al., 2010)
p300 | MEL | ENCODE (M Snyder, Stanford University)
CTCF, RAD21 | MEL (2% DMSO) | ENCODE (M Snyder, Stanford University)
DNaseI hypersensitivity | MEL, Kidney | ENCODE (JA Stamatoyannopoulos, University of Washington)
CTCF | Various | ENCODE (B Ren, Ludwig Inst. for Cancer Research)
3 Chapter 3: Distal regulatory elements located upstream of Lmo2 are associated with tissue-specific chromatin features
Chapters 3 and 4 have been prepared as a manuscript for submission to a peer-reviewed journal
3.1 Introduction
Identification of distal regulatory elements is the key to understanding tissue specific regulation
of gene transcription. Genome wide studies have shown that some of the characteristic features
of the enhancers are their ability to function in a tissue specific manner as well as to exhibit cell
type-specific chromatin features (Heintzman et al., 2009). For example, genome-wide studies to
identify p300 binding sites in three different embryonic tissues have shown that the majority of
the binding sites are occupied by the protein in only one of the tissues (Visel et al., 2009). Further
experiments showed that these p300 bound regions exhibited tissue specific enhancer activities
in vivo (Visel et al., 2009). Moreover, enhancers identified in different cell types have been
found to be associated with tissue specific histone modification patterns (Heintzman et al.,
2009). Hence, all these studies together suggest that enhancers play a key role in driving tissue
specific transcription of genes. However, identification of the enhancer elements required for the
tissue-specific expression of specific genes is difficult because these regulatory sequences lie
embedded within the non-coding part of the mammalian genome (Xi et al., 2007). At present, a
combination of molecular biology-based techniques and bioinformatic tools is used to identify
these regulatory elements (Blow et al., 2010; Chen et al., 2012; Chen et al., 2008; Heintzman et
al., 2009; Heintzman et al., 2007; Visel et al., 2009; Xi et al., 2007). Features that are used in
many cases to identify DREs are DNase I hypersensitivity, histone methylation patterns, binding
of multiple TFs as well as binding of the histone acetyl transferase p300 to specific DNA
sequences (Blow et al., 2010; Chen et al., 2008; Heintzman et al., 2009; Heintzman et al., 2007;
Visel et al., 2009; Xi et al., 2007). Therefore, in order to understand the tissue specific
regulation of Lmo2, I have used a combination of bioinformatic tools and ChIP-Seq data to
identify the location and the tissue specific epigenetic features of the DREs located near the
Lmo2 gene in the mouse genome. In addition, as previous studies have shown that DREs in
some cases are transcribed at very low levels in a tissue specific manner, I have also
investigated intergenic transcription to determine if DREs and other regions within the
Lmo2/Caprin1 locus are transcribed in erythroid cells (Gribnau et al., 2000; Miles et al., 2007).
In this chapter, an in-depth analysis of the chromatin features of the Lmo2/Caprin1 locus has
been conducted to identify the factors that might play a role in regulating Lmo2 transcription in
adult erythroid cells.
3.2 Results
3.2.1 Identification and mapping of the enhancer elements on the mouse
genome
Multiple DREs upstream of Lmo2 have been identified in the human genome and confirmed to
have enhancer activity in transgenic mice (Landry et al., 2009). I mapped the proximal and
distal promoters and enhancer sequences within a region of 90kb upstream and 7kb downstream
of Lmo2 in the mouse genome (Figure 3-1). To maintain the naming used by Landry et al. (2009),
the mapped DREs were named according to their distances upstream or downstream from the
annotated Lmo2 TSS overlapping the proximal promoter (pP). Altogether,
11 enhancer elements were identified upstream of Lmo2 in the mouse genome, whereas
two enhancer elements (+1 and +7) were located downstream of the Lmo2 pP. Many of the
enhancer elements were found to have multiple subparts, each with homology to the corresponding
human enhancer element. For example, six regions in the mouse genome were found to have
sequence homology with the human enhancer element 75 (75 DRE); these regions are separated
from one another by non-homologous sequence of varying length. The first two
homologous regions are separated by only 60 base pairs, whereas the distance between the third,
fourth, fifth, and sixth regions is about one kb. As these sequences were located close to one
another and had sequence homology to a single human Lmo2 enhancer element, all six regions
were considered to be a subpart of the single enhancer element unit 75DRE. Similarly the
70DRE has ten subparts and all the elements together are spread over a distance of about 3 kb.
On the other hand, the 35DRE has two subparts that are separated by only five bp. In addition
to the DREs, I also mapped the promoter elements for the gene. The Lmo2 pP comprises 12
subparts spread over a distance of 760 bp, whereas the distal promoter (dP) of the gene has nine
subparts spanning 560 bp. A recent study of the human LMO2 gene identified a third
promoter, known as the intermediate promoter; however, because its coordinates were
unavailable, it could not be mapped onto the mouse genome (Oram et al., 2010).
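
The grouping of neighbouring homology blocks into a single enhancer unit can be sketched as interval clustering. In the analysis above, blocks were grouped because they lay close together and shared homology with the same human element; the sketch below groups by proximity alone, and the `max_gap` threshold and all coordinates are hypothetical values chosen only to mimic the spacing described for the 75DRE.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) on one chromosome


def cluster_by_gap(blocks: List[Interval], max_gap: int) -> List[List[Interval]]:
    """Group intervals whose gap to the running cluster end is at most max_gap."""
    clusters: List[List[Interval]] = []
    cluster_end = 0
    for start, end in sorted(blocks):
        if clusters and start - cluster_end <= max_gap:
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


# Hypothetical homology blocks: gaps of 60 bp, about 1 kb, about 1 kb, then a distant block
blocks = [(1000, 1100), (1160, 1250), (2250, 2300), (3300, 3400), (50000, 50100)]
clusters = cluster_by_gap(blocks, max_gap=1500)
for members in clusters:
    span_start = members[0][0]
    span_end = max(end for _, end in members)
    print(span_start, span_end, len(members))
```

With these placeholder values the first four blocks form one unit with four subparts, and the distant block remains a separate element.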
Figure 3-1. Genomic map of the Lmo2/Caprin1 locus showing the identified DREs
The mouse Lmo2-Caprin1 region on chromosome 2 is depicted with chromosome coordinates
shown at the top. The two Lmo2 promoters are indicated by red boxes. Distal regulatory
element (DRE) homology regions are indicated by black boxes joined by a line to delineate the
human enhancer construct used in the generation of transgenic mice. Proximal promoter (pP),
distal promoter (dP).
3.2.2 TFs bind to the distal regulatory elements
DNA sequences with regulatory roles that are located in the intergenic regions of the genome in
many cases are bound by TFs in the tissues in which they are active (Chen et al., 2008;
Fullwood et al., 2009; Yu et al., 2009). For example, the regulatory elements of the well
characterized β-globin LCR are bound by multiple TFs in erythroid cells (Johnson et al., 2002;
Kim et al., 2007; Sawado et al., 2001; Song et al., 2007). These TFs play crucial roles in the
regulation of the β-globin genes in adult erythroid cells (Cho et al., 2008; Johnson et al., 2002;
Sawado et al., 2001; Song et al., 2007; Wijgerde et al., 1996). As the DREs in the human
genome had been confirmed to have enhancer activity in transgenic mice, I hypothesized that
some of the mapped DREs should have TFs bound in mature erythroid cells (Landry et al.,
2009). Furthermore, several TFs (LMO2, TAL1, GATA2, FLI1, and SFPI1) had already been
found to be associated with these human DREs identified by Landry et al. using a ChIP-chip
approach (Landry et al., 2009). Hence, I wanted to investigate whether these mouse DREs
overlap any TF bound regions in erythroid cells.
3.2.2.1 Multiple TFs binding to DREs in mouse erythroid cells
I retrieved available ChIP-Seq data for definitive mouse erythroid cells (KLF1, MTGR1,
GATA1, TAL1, LDB1) to more finely map the transcription factor-bound regions within each
DRE (Cheng et al., 2009; Soler et al., 2010). My analysis revealed that several TFs were bound at
three of the identified enhancer elements. Four TFs (MTGR1, GATA1, TAL1, LDB1) were bound at
the 75 DRE, whereas only one TF (GATA1) was bound at the 40 DRE and two TFs
(GATA1, LDB1) were bound at the 12 DRE (Figure 3-2, Appendix 5). The transcription factor
KLF1 was not found to bind to any of the DREs; however, it was bound to a region located
immediately upstream of the 75DRE (Figure 3-2). Interestingly, of the four TFs that were
bound to the DREs, GATA1 was the only one bound at all three DREs, whereas
LDB1 was bound at two of the three, suggesting that these TFs might play a crucial
role in regulating expression of the Lmo2 gene (Figure 3-2).
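
The comparison of TF peaks with DRE positions amounts to an interval-overlap test, which can be sketched as follows. The overlap criterion (any shared base pair) is an assumption, and all coordinates are placeholders arranged only to reproduce the binding pattern reported above; they are not genomic positions from the data.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def tfs_bound_per_dre(
    dres: Dict[str, Interval], peaks: Dict[str, List[Interval]]
) -> Dict[str, List[str]]:
    """For each DRE, list the TFs with at least one overlapping peak."""
    return {
        name: sorted(
            tf for tf, tf_peaks in peaks.items()
            if any(overlaps(region, peak) for peak in tf_peaks)
        )
        for name, region in dres.items()
    }


# Placeholder coordinates (hypothetical)
dres = {"75": (1000, 2000), "40": (36000, 36500), "12": (64000, 64400)}
peaks = {
    "GATA1": [(1200, 1500), (36100, 36300), (64100, 64200)],
    "TAL1": [(1300, 1600)],
    "LDB1": [(1250, 1550), (64150, 64350)],
    "MTGR1": [(1400, 1700)],
    "KLF1": [(500, 800)],  # upstream of the 75 DRE, no overlap
}
bound = tfs_bound_per_dre(dres, peaks)
for name, tfs in bound.items():
    print(name, tfs)
```

A stricter criterion, such as requiring the peak summit to fall inside the DRE, would be a reasonable alternative and could change which elements are called bound.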
Figure 3-2. Distal regulatory elements upstream of Lmo2 overlap transcription factor
bound regions in erythroid cells
The mouse Lmo2-Caprin1 region on chromosome 2 is depicted with chromosome coordinates
shown at the top. The two Lmo2 promoters are indicated by red boxes. Distal regulatory
element (DRE) homology regions are indicated by black boxes joined by a line to delineate the
human enhancer construct used in the generation of transgenic mice. Coloured boxes represent
peaks identified from transcription factor ChIP-Seq data for erythroid (MEL and GIE-ER4)
cells. Overlapping transcription factor peaks were identified at the 75 and 12 DREs. Proximal
promoter (pP), distal promoter (dP), transcription factor (TF).
3.2.2.2 Multiple TFs binding to the DREs in HPC7 cells
My initial analysis of transcription factor binding at the intergenic region between Caprin1 and
Lmo2 highlighted only three DREs (75, 40, and 12) bound by TFs. To investigate additional
transcription factor bound regions I analysed ChIP-Seq data available for the HPC7
hematopoietic progenitor cell line (Pinto do et al., 1998; Wilson et al., 2010) (Table 2-1). The
HPC7 cells are a model hematopoietic stem/progenitor cell line (an early fetal multipotent
hematopoietic progenitor cell line) with the potential to form any of the hematopoietic cell
types; addition of the necessary growth factors, such as Steel factor and erythropoietin, can
efficiently induce differentiation of HPC7 cells into both primitive and definitive erythroid cells
(Pinto do et al., 1998). As this was the only available ChIP-Seq data set collected from
a cell type close to erythroid cells (HPC7 cells are progenitors of erythroid cells), I
used it for my analysis (Pinto do et al., 1998; Wilson et al., 2010). This HPC7 data
set also contains GATA1 and TAL1, as did the earlier analysis of differentiated erythroid cell
ChIP-Seq data, along with several additional TFs (ERG, FLI1, GATA2, GFI1B, LMO2, MEIS1,
PU1, RUNX1) (Wilson et al., 2010). These data revealed that, in HPC7 cells, multiple TFs are
associated with many of the other DREs in the intergenic region upstream of the Lmo2 gene, in
addition to the 75, 40 and 12 DREs (Figure 3-3). Multiple TFs were bound at several DREs,
including 9, 6, and 4 TFs bound at the 75, 40 and 12 DREs, respectively, and 3 TFs bound at each
of the 90, 70, 64 and 3 DREs (Figure 3-3). Interestingly, apart from the 58, 47 and 43 DREs, all
the other DREs are bound by ERG (Figure 3-3). Furthermore, in addition to the DREs, intergenic
regions upstream of the 90DRE were also bound by ERG (Figure 3-3). One of these intergenic
regions was co-bound by ERG and FLI1, and this co-binding was also observed at the 75, 70, 64,
40, 35, and 25 DREs (Figure 3-3). Notably, many of the human Lmo2 enhancer elements
were bound by the transcription factor FLI1, as detected using the ChIP-chip approach,
suggesting that FLI1 and ERG could be playing an important role in the regulation of the Lmo2
gene in both humans and mice (Landry et al., 2005).
Figure 3-3. Distal regulatory elements upstream of Lmo2 overlap transcription factor
bound regions in HPC7 hematopoietic progenitor cells
The mouse Lmo2-Caprin1 region. Distal regulatory element (DRE) homology regions are
indicated by black boxes joined by a line to delineate the human enhancer construct used in the
generation of transgenic mice. Coloured boxes represent peaks identified from transcription
factor ChIP-Seq data from HPC7 hematopoietic progenitor cells (Pinto do et al., 1998; Wilson et
al., 2010). Proximal promoter (pP), distal promoter (dP).
3.2.3 The DREs have erythroid cell-specific epigenetic features
Enhancers are not only co-bound by multiple TFs but are also associated with increased
sensitivity to DNaseI and are bound by the histone acetyltransferase p300 (also known
as EP300) (Blow et al., 2010; Visel et al., 2009; Xi et al., 2007). For example, the well-studied
β-globin (Hbb) locus control region (LCR) consists of a series of
transcription factor-bound DNaseI hypersensitive sites 50 kb upstream of the Hbb-b1 gene
(Tuan et al., 1989). I next investigated whether the DREs in the mouse genome were associated
with any of the enhancer features mentioned above. For this purpose, I used available data
from the mouse ENCODE project to identify p300 binding and DNaseI hypersensitivity across
the entire Lmo2/Caprin1 locus (Table 2-1) (Birney et al., 2007). My analysis showed that
many of the DREs are not only bound by the transcriptional co-activator protein p300 but are
also associated with increased sensitivity to DNaseI (Figure 3-4). Whereas binding of p300 was
identified at the 75, 70, 40, 25 and 12 DREs, DNaseI hypersensitivity was identified only at the
75, 25 and 12 DREs (Figure 3-4). Furthermore, comparison of the DNaseI hypersensitivity data
between erythroid cells and kidney cells showed that DNaseI hypersensitivity at the
75, 25 and 12 DREs was erythroid cell-specific, as peaks were absent from kidney cells (Figure 3-4)
(Birney et al., 2007).
In addition to DNaseI hypersensitivity and p300 binding I also investigated the histone
modification pattern of the entire Lmo2-Caprin1 locus in erythroid cells (Birney et al., 2007).
My analyses showed that the entire region is marked with high levels of histone H3 lysine 4
monomethylation (H3K4me1), which is believed to be an epigenetic mark for enhancers (Figure
3-4) (Chen et al., 2008; Gross and Garrard, 1988; Ren et al., 2007; Visel et al., 2009; Visel et al.,
2010; Wu, 1980). In summary, individual DREs as well as some intergenic regions within the
Lmo2/Caprin1 locus showed increased DNaseI hypersensitivity and high levels of the
H3K4me1 mark, and were bound by p300 in erythroid cells, suggesting that the DREs, as well as
some other intergenic regions in the locus, might be involved in regulating the Lmo2 gene in
erythroid cells.
Figure 3-4. Distal regulatory elements upstream of Lmo2 overlap transcription factor
bound regions and have different epigenetic marks in erythroid cells
The mouse Lmo2-Caprin1 region on chromosome 2 is depicted with chromosome coordinates
shown at the top. The two Lmo2 promoters are indicated by red boxes. Distal regulatory
element (DRE) homology regions are indicated by black boxes joined by a line to delineate the
human enhancer construct used in the generation of transgenic mice. Mouse ENCODE ChIP-
Seq data for p300 and DNaseI hypersensitivity are shown below the DREs (Birney et al., 2007).
Coloured boxes represent peaks identified from transcription factor ChIP-Seq data for erythroid
(MEL and GIE-ER4) cells. Overlapping transcription factor peaks were identified at the 75 and
12 DREs. These regions were also occupied by p300 and showed increased sensitivity to
DNaseI. The entire locus was marked with a high level of histone H3 lysine 4 monomethylation
(H3K4me1). Proximal promoter (pP), distal promoter (dP), murine erythroleukemia cells
(MEL), Transcription factors (TF).
3.2.4 Intergenic transcription occurs at the distal regulatory elements
Previous studies have identified intergenic transcripts at the human Hbb LCR (Gribnau et al.,
2000; Miles et al., 2007; Tuan et al., 1992). In addition enhancer RNA (eRNA) has been
identified at several neuronal enhancers (Kim et al., 2010). Furthermore, a recent study has also
shown that long non-coding RNAs can play a role in regulating the transcription of
neighbouring genes (Morris, 2009). Therefore, to investigate whether intergenic transcription
occurs at the DREs upstream of Lmo2, I performed RT-qPCR using primers overlapping the DREs and
primers located both upstream and downstream of several DREs. This RT-qPCR analysis identified
measurable levels of intergenic transcription at all identified DREs in erythroid cells (Figure 3-5).
My analysis also showed that transcripts were quite abundant at the 12, 58, 64, 70 and 75 DREs and
immediately upstream of the 58, 64, and 70 DREs (Figure 3-5). Interestingly, regions between
the DREs were also transcribed at relatively high levels in erythroid cells but not in kidney cells
(Figure 3-5). Intergenic transcription was also detected at the distal promoter (dP) of the gene
and at a distance of 150bp downstream of the dP (Figure 3-5). However in this case transcripts
were detected in both kidney and erythroid cells, though the level of transcription was
marginally higher in erythroid cells when compared to kidney (Figure 3-5). Transcripts in
erythroid cells but not in kidney cells were also detected in the region located immediately
upstream of the 90DRE and at regions located about 8kb, 15kb, and 30kb upstream of the
90DRE (Figure 3-5). As I was able to amplify intergenic transcripts at several locations both
upstream and downstream of the DREs, as well as between the individual DREs, it seems that
transcription extends throughout a broad region located upstream of the Lmo2 gene (Figure 3-5).
My analysis suggests that the entire region encompassing the 90, 75, 70, 64, and 58 DREs is
transcribed at moderate levels (Figure 3-5). In summary, all identified DREs, as well as the
regions between them, are transcribed at low to high levels in erythroid cells but not in kidney
cells.
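
Transcript levels expressed relative to a reference gene such as Gapdh are commonly computed from threshold cycle (Ct) values. The sketch below uses the delta-Ct form and assumes perfect doubling per cycle; the quantification method actually used for Figure 3-5 is not stated in this chapter, so both the formula and the Ct values are illustrative assumptions only.

```python
def relative_to_reference(
    ct_target: float, ct_reference: float, efficiency: float = 2.0
) -> float:
    """Relative level of a target amplicon versus a reference amplicon.

    Uses efficiency ** (Ct_reference - Ct_target); efficiency = 2.0 assumes
    that the product doubles exactly once per cycle for both amplicons.
    """
    return efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values for one intergenic amplicon and for Gapdh
ct_dre, ct_gapdh = 28.0, 18.0
level = relative_to_reference(ct_dre, ct_gapdh)
print(f"{level:.6f}")
```

A target detected ten cycles later than the reference is therefore present at roughly one thousandth of the reference level under these assumptions, which is the order of magnitude expected for low-abundance primary transcripts.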
Figure 3-5. Intergenic transcription occurs in anaemic spleen erythroid cells.
Primary transcript levels in adult mouse anaemic spleen and kidney (blue) for: distal regulatory
elements (DRE), regions upstream and downstream of the DREs, distal promoter (dP), proximal
promoter (pP), and Lmo2 (exon2-intron2), Caprin1 (exon3-intron2). Levels were quantitatively
assessed by RT-qPCR and expressed relative to Gapdh. Gapdh is a ubiquitously expressed
reference gene.
3.2.5 CTCF and RAD21 bind to multiple regions across the Lmo2/Caprin1
locus
Investigating ChIP-Seq data released by the mouse ENCODE project (Table 2-1) (Birney et al.,
2007) I identified several CTCF and cohesin (RAD21) bound regions across the entire
Lmo2/Caprin1 locus (Figures 3-6 and 3-7). My ChIP-Seq data analysis showed that
some regions were co-bound by CTCF and RAD21, whereas other regions, such as the 25 DRE and
a region located downstream of the 58 DRE, were bound only by RAD21 in erythroid cells
(Figure 3-6). Furthermore, I identified a region located
further upstream of the 90 DRE that was co-bound by CTCF and RAD21 (Figure 3-6);
this region was bound by CTCF in several cell
types (Figure 3-7). Additional CTCF and cohesin bound regions were located just downstream
of the 75 DRE and at the Lmo2 proximal promoter (Figure 3-6). Like the CTCF bound region
upstream of the 90 DRE, the Lmo2 proximal promoter was bound by CTCF in several cell types
(Figure 3-7). In contrast, CTCF was bound 400 bp downstream of the 75 DRE predominantly in
erythroid cells, suggesting that CTCF bound near the 75 DRE probably plays a role in
the transcription of Lmo2 in erythroid cells.
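
The distinction drawn above between CTCF sites occupied in several cell types and a site occupied predominantly in erythroid cells can be sketched as a per-site occupancy count. The cell type labels follow Figure 3-7, but the peak coordinates are placeholders, and the toy data simplify "predominantly erythroid" to "erythroid only".

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def cell_types_with_peak(
    site: Interval, peaks_by_cell_type: Dict[str, List[Interval]]
) -> List[str]:
    """Return the cell types in which at least one peak overlaps the site."""
    return sorted(
        cell_type
        for cell_type, peaks in peaks_by_cell_type.items()
        if any(overlaps(site, peak) for peak in peaks)
    )


# Placeholder CTCF peaks per cell type (hypothetical coordinates)
ctcf_peaks = {
    "MEL": [(100, 300), (5000, 5200), (9000, 9200)],
    "BM": [(120, 280), (9050, 9150)],
    "ES-Bruce4": [(90, 310), (9010, 9190)],
    "MEF": [(110, 290), (9020, 9180)],
}
sites = {
    "upstream of 90 DRE": (100, 300),
    "downstream of 75 DRE": (5000, 5200),
    "Lmo2 pP": (9000, 9200),
}
occupancy = {
    name: cell_types_with_peak(region, ctcf_peaks) for name, region in sites.items()
}
for name, cell_types in occupancy.items():
    print(name, cell_types)
```

Sites occupied in every cell type are candidates for constitutive CTCF binding, whereas a site found in only one lineage is a candidate for tissue-specific regulation.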
Figure 3-6. CTCF and RAD21 are bound within the Lmo2-Caprin1 region
A) The mouse Lmo2-Caprin1 region on chromosome 2 is depicted with chromosome
coordinates shown at the top. B) A zoomed-in view of the region upstream of Lmo2 showing
the locus-wide binding of RAD21 (pink) and CTCF (blue). In both panels, the two
Lmo2 promoters are indicated by red boxes. Distal regulatory element (DRE) homology regions
are indicated by black boxes joined by a line to delineate the human enhancer construct used in
the generation of transgenic mice. Mouse ENCODE ChIP-Seq data for the cohesin complex
member RAD21 and CTCF are shown below DREs (Birney et al., 2007). Proximal promoter
(pP), distal promoter (dP), murine erythroleukemia cells (MEL differentiated with 2% DMSO).
Figure 3-7. CTCF bound upstream region of Lmo2 in different cell types.
The mouse Lmo2 upstream region on chromosome 2 is depicted with chromosome coordinates
shown at the top. HindIII restriction sites are indicated by blue lines. The two Lmo2 promoters
are indicated by red boxes. Distal regulatory element (DRE) homology regions are indicated by
black boxes joined by a line to delineate the human enhancer construct used in the generation of
transgenic mice. Mouse ENCODE ChIP-Seq data for CTCF are shown below the DREs in
different cell types (Birney et al., 2007). Proximal promoter (pP), distal promoter (dP), murine
erythroleukemia cells (MEL differentiated with 2% DMSO), bone marrow (BM), embryonic
stem cells (ES-Bruce4), mouse embryonic fibroblasts (MEF).
3.3 Discussion
In this chapter, I have used bioinformatic analysis of ChIP-Seq data for mature erythroid cells
and HPC7 hematopoietic progenitor cells to identify cell type-specific enhancer elements which
are located upstream of the Lmo2 gene in the mouse genome (Birney et al., 2007; Cheng et al.,
2009; Soler et al., 2010; Wilson et al., 2010). Furthermore, I used ChIP-Seq data to identify cell
type-specific epigenetic features at these DREs and the binding of specific chromatin-associated
proteins with a possible role in regulating the expression of the Lmo2 gene in erythroid cells.
3.3.1 TFs bind to the distal regulatory elements
Multiple TFs were bound to many of the identified DREs and other intergenic regions at the
Lmo2/Caprin1 locus both in erythroid and in HPC7 cells with the highest density of
transcription factor binding at the 75 DRE (Figures 3-2 and 3-3). Previous studies in circulating
erythroid cells of transgenic mice have shown that the human 75 DRE has the strongest enhancer
activity in erythroid cells and drives the expression of a reporter gene cooperatively with the
Lmo2 proximal promoter and +1 enhancer element (Landry et al., 2009). Interestingly, one of
the TFs bound to the 75 DRE is the LMO2 protein itself (Figures 3-2 and 3-3). In
addition to the 75DRE, LMO2 was also bound to the 12, 25 and 40 DREs,
indicating that LMO2 might be involved in the regulation of its own transcription (Figures
3-2 and 3-3). In addition to LMO2, two other TFs (TAL1, GATA1) were bound to the 75DRE in
erythroid cells and to the 75 and 40DREs in the HPC7 cell line (Figure 3-3). Of note,
these three TFs (TAL1, GATA1, LMO2), along with LDB1, are part of an oligomeric
complex that controls transcription of other important haematopoietic genes such as α-globin
(Hba), β-globin (Hbb) and erythroid Kruppel-like factor (Eklf) in erythroid cells (Anderson et
al., 1998; Anguita et al., 2004; Osada et al., 1995; Song et al., 2007; Song et al., 2010; Valge-
Archer et al., 1994; Wadman et al., 1997). Furthermore, my analysis of TF binding in erythroid
cells also shows that LDB1 binds to the 75DRE along with TAL1 and GATA1 (Figure 3-2).
Hence, this oligomeric complex might be involved in the regulation of Lmo2 in mouse erythroid
cells as well. Interestingly, GATA1 (one of the components of the LMO2 oligomeric
complex) was the only one of the five TFs whose erythroid ChIP-Seq data were analysed (TAL1,
MTGR1, KLF1, LDB1, GATA1) that was bound to the three
DREs in both erythroid and HPC7 cells, suggesting that this TF might play an important
role in the regulation of Lmo2 transcription (Figures 3-2 and 3-3). In fact, studies by Wang et al.
have already shown that GATA1 is involved in the regulation of Lmo2 by binding to its
promoter (Wang et al., 2007).
Analysis of the ChIP-Seq data from the HPC7 cell line showed that two ETS factors,
ERG and FLI1, have the highest number of binding sites across the entire Lmo2/Caprin1 locus
(Figure 3-3). Both TFs bind not only to the DREs but also to other intergenic regions across
the Lmo2/Caprin1 locus in the HPC7 cell line (Figure 3-3). Hence, it seems that these two TFs
might play an important role in the regulation of the Lmo2 gene. In agreement with my
analyses, two studies have shown that FLI1 and ERG both play important roles in the
regulation of the Lmo2 gene (Landry et al., 2005; Oram et al., 2010). One study, conducted
in human T-ALL samples, showed that ERG regulates LMO2 transcription by binding to the
LMO2 intermediate promoter, whereas the study conducted by Landry et al. showed that the
transcription factor FLI1 regulates transcription of the human LMO2 gene by binding to its
proximal promoter (Landry et al., 2005; Oram et al., 2010). Interestingly, my analyses suggest
that, in addition to controlling Lmo2 transcription individually, these two ETS factors might be
part of a common circuitry that regulates Lmo2 transcription in mouse erythroid cells,
because multiple intergenic regions are co-bound by FLI1 and ERG (Figure 3-3). In fact, a study
has shown that ERG and FLI1 can together mediate the aberrant expression of LMO2 in
T-cells by binding to the gene promoter in a subset of T-ALL patients (Oram et al., 2010).
In addition to ERG and FLI1, another transcription factor that has
been implicated in the regulation of Lmo2 transcription is PU.1 (Wang et al., 2007). Analysis of
TF binding across the entire Lmo2-Caprin1 locus showed that, in the HPC7 cell
line, PU.1 binds to the 75, 25 and +1 DREs, and also to a region located between the 90 and
75DREs (Figure 3-3). Hence, analysis of the HPC7 ChIP-Seq data suggests that PU.1
might play a role in regulating Lmo2 transcription by binding to the upstream DREs,
thereby helping Lmo2 to be transcribed in a tissue-specific manner.
In summary, binding of multiple erythroid and HPC7 cell-specific TFs
to the DREs and other intergenic regions at the Lmo2/Caprin1 locus suggests that transcription
of Lmo2 in erythroid cells requires the coordinated action of various erythroid cell-specific TFs.
These data also underline the potential role of these upstream sequences, which are co-bound by
multiple TFs, in the regulation of Lmo2 transcription.
3.3.2 The DREs have erythroid cell-specific epigenetic features
Analysis of the available ChIP-Seq data from the mouse ENCODE project to identify p300
binding, DNaseI hypersensitivity and the histone modification pattern across the entire
Lmo2/Caprin1 locus indicated that the DREs have multiple erythroid cell specific chromatin
features (Birney et al., 2007). My analyses showed that many of the DREs were not only bound
by the transcriptional co-activator protein p300 (also known as EP300) but were also associated
with increased sensitivity to DNaseI (Figure 3-4). According to recent studies the presence of
p300 binding and increased sensitivity to DNase I treatment suggests that the regions in question
might have tissue specific enhancer activities (Blow et al., 2010; Visel et al., 2009; Xi et al.,
2007). In this case, I have identified three DREs (75, 25, and 12), all of which have both
features (Figure 3-4). In agreement with my analysis, a previous study has already shown
that, in circulating erythroid cells of transgenic mice, the human 75 DRE has the strongest
enhancer activity and drives the expression of a reporter gene cooperatively
with the Lmo2 proximal promoter and +1 enhancer element (Landry et al., 2009). Hence, it
seems that the other two DREs (12 and 25) may also function as enhancers of the Lmo2 gene.
Interestingly, comparison of the DNaseI hypersensitivity data between erythroid cells
and kidney cells showed that DNaseI hypersensitivity at the 75, 25 and 12 DREs was erythroid
cell-specific, as peaks were absent from kidney cells (Figure 3-4). In addition to DNaseI
hypersensitivity and p300 binding, the entire Lmo2/Caprin1 region was marked with high levels
of histone H3 lysine 4 monomethylation (H3K4me1), which is believed to be an epigenetic
mark for enhancers (Figure 3-4)
(Chen et al., 2008; Gross and Garrard, 1988; Ren et al., 2007; Visel et al., 2009; Visel et al.,
2010; Wu, 1980). However, as these data could not be compared with any other cell type,
it is difficult to determine whether this is an erythroid cell-specific feature. In summary, both
the entire Lmo2/Caprin1
locus and the individual DREs were marked with specific epigenetic features in erythroid
cells, suggesting that some of the DREs, especially the 75, 25, and 12 DREs, as well as some other
intergenic regions in the locus, might be involved in erythroid cell-specific regulatory activity.
3.3.3 Intergenic transcription occurs at the distal regulatory elements
Intergenic transcripts were detected at various locations across the entire Lmo2/Caprin1 locus
(Figure 3-5). My analysis suggests that the entire region encompassing the 90, 75, 70, 64, and
58 DREs is transcribed at moderate levels in erythroid cells (Figure 3-5). This picture seems very
similar to what happens at the Hbb LCR, wherein a large domain encompassing all the DREs
and the region in between the regulatory elements is transcribed at fairly high levels (Gribnau et
al., 2000; Miles et al., 2007; Tuan et al., 1992). Although the exact function of these transcripts
is not known, they seem to be related in some way to the regulation of Lmo2
transcription, especially because they were mostly transcribed from DREs that seem
to play an important role in enhancing Lmo2 transcription in erythroid cells.
3.3.4 CTCF and RAD21 bind to multiple regions across the Lmo2/Caprin1
locus
The proteins CTCF and RAD21 have already been implicated in regulating gene expression in a
tissue-specific manner at the murine β-globin locus and at the human cytokine (IFNG) gene
locus by mediating the formation of cell type-specific higher-order chromatin conformations
(Chien et al., 2011; Hadjur et al., 2009). In this chapter, I have shown the presence of both
CTCF and RAD21 at multiple locations across the entire Lmo2/Caprin1 locus. Hence, analysis
of the ChIP-Seq data suggests that these two proteins might be involved in the regulation of the
Lmo2 gene, probably by mediating the formation of a higher-order chromatin structure that brings
the DREs and the Lmo2 gene into spatial proximity, thereby ensuring a higher level of transcription
of Lmo2 in erythroid cells (Birney et al., 2007).
3.3.5 The big picture
In this chapter, bioinformatic and ChIP-Seq data analyses have helped me identify the unique
features of the Lmo2/Caprin1 locus in mouse erythroid cells (Birney et al., 2007; Cheng et
al., 2009; Soler et al., 2010; Wilson et al., 2010). I have not only identified tissue-specific DREs
at this locus but have also identified erythroid cell-specific chromatin features and key
epigenetic marks of the entire Lmo2/Caprin1 locus. Hence, an in-depth analysis of the
transcription factor binding pattern and the DNaseI hypersensitivity pattern, together with a
close look at the binding of p300, CTCF, and RAD21 and the identification of histone signatures,
has helped me to build a big picture of how the Lmo2/Caprin1 locus can be epigenetically different
between kidney and erythroid cells, thereby helping me to understand how the Lmo2 gene might
be regulated in erythroid cells (Birney et al., 2007).
3.4 Future work
Much of the ChIP-Seq data used in this study were obtained from the HPC7
cell line (Pinto do et al., 1998; Wilson et al., 2010). HPC7 cells are a model hematopoietic
stem/progenitor cell line (an early fetal multipotent hematopoietic progenitor cell line) with
the potential to form any of the hematopoietic cell types; addition of the necessary growth
factors, such as Steel factor and erythropoietin, can efficiently induce differentiation of HPC7
cells into both primitive and definitive erythroid cells (Pinto do et al., 1998). Hence, in future, ChIP
studies performed in erythroid cells for the TFs (ERG, FLI1, GATA2, LMO2, LYL1, PU1,
RUNX1, MEIS1) whose data are currently unavailable would be extremely helpful in
identifying the key TFs that regulate Lmo2 transcription in mature erythroid cells. In addition,
ChIP-Seq studies of the histone modification H3K4me1 in kidney cells could be used to determine
whether the H3K4me1 modification pattern observed at the Lmo2/Caprin1 locus in
erythroid cells is tissue-specific. Furthermore, my analysis shows that the histone
acetyltransferase p300 binds to the 75DRE in erythroid cells; however, due to the lack of
ChIP-Seq data for p300 in other cell types, it is difficult to determine whether this p300
binding pattern is specific to erythroid cells. Hence, additional ChIP-qPCR data
would be extremely helpful in understanding the tissue-specific regulation of Lmo2 in erythroid
cells.
4 Chapter 4: Chromatin-Chromatin interactions at the Lmo2 locus
4.1 Introduction
The three-dimensional organization of a locus plays an important role in regulating
transcription of a gene in a tissue-specific manner (Palstra et al., 2003). For example, studies
of the mouse β-globin locus have shown that when the gene is actively transcribed, all of its distal
regulatory elements are located in close spatial proximity to each other within the
nuclear space, forming a compartment termed the Active Chromatin Hub (de Laat and
Grosveld, 2003; Palstra et al., 2003). In this chapter I have investigated whether the Lim domain
only 2 (Lmo2) gene is regulated in erythroid cells in a manner similar to that observed at the
β-globin locus because, as at the β-globin locus, several distal regulatory elements
have been identified upstream of the Lmo2 gene in the human and mouse genomes (Landry
et al., 2009). These elements are capable of enhancing reporter gene expression in erythroid
cells and hence may be responsible for the high-level transcription of Lmo2 in erythroid cells
(Landry et al., 2009). Nevertheless, it is unclear how these elements regulate transcription of
Lmo2 and whether or not they function cooperatively in the endogenous context. I have therefore
investigated whether chromatin-chromatin interactions exist between the Lmo2
proximal promoter and the upstream regulatory elements. Chromatin-chromatin interactions
between DREs and several genes, including Hba, Shh, TH2, HoxB1 and olfactory receptor genes,
have typically been detected using the chromosome conformation capture (3C) technique, so in this
chapter I have used the same technique to identify chromatin-chromatin interactions between the
DREs and the Lmo2 proximal promoter (Amano et al., 2009; Lomvardas et al., 2006;
Spilianakis and Flavell, 2004; Vernimmen et al., 2007; Wurtele and Chartrand, 2006). The
focus of this chapter is thus to investigate long-range regulation of Lmo2 transcription in the context
of its endogenous genomic location using the 3C technique, and to identify proteins
that might be involved in mediating these chromatin-chromatin interactions.
4.2 Results
4.2.1 The 75 distal regulatory element contacts the Lmo2 proximal promoter
To investigate whether chromatin loops that bring the 75 DRE into proximity with the Lmo2
promoter form in erythroid cells, I performed 3C experiments in adult erythroid cells
isolated from mouse anaemic spleens 5 days after the initiation of phenylhydrazine treatment
(Dekker et al., 2002; Dickerman et al., 1976). On day five the anaemic spleen is composed of
>85% mature globin-expressing erythroid cells (Osborne et al., 2004). For comparison I used
kidney as a tissue in which Lmo2 is not transcribed at robust levels. I confirmed robust
transcription of Lmo2 in isolated anaemic spleen by measuring the levels of the primary
transcript by RT-qPCR (Figure 4-1). By contrast, Lmo2 primary transcript levels were more
than fifteen-fold lower in kidney. I also examined the primary transcript levels of the cell cycle
associated protein Caprin1, which is located 172 kb upstream of Lmo2. I found that Caprin1 is
transcribed at similar levels in both adult erythroid and kidney cells (Figure 4-1).
My previous analysis of the ChIP-Seq data revealed p300 binding and the highest density of
bound TFs at the 75 DRE; hence, I performed the first set of locus-wide 3C experiments using
the HindIII fragment containing the 75 DRE as the anchor fragment (Figure 4-3; restriction
fragment map and primers shown in Figure 4-2). My 3C analyses revealed a
significantly increased interaction between the 75 DRE and the Lmo2 proximal promoter
fragment in erythroid cells compared to kidney cells. Of note, the
Lmo2 proximal promoter fragment also contains the +1 enhancer element found to cooperate
with the 75 DRE for optimal expression in circulating erythrocytes of transgenic mice (Landry
et al., 2009). I also found increased interaction in erythroid cells with the two fragments
upstream of the Lmo2 proximal promoter (Figure 4-3). This region contains the 3 DRE
identified by Landry et al. (2009), which appeared not to have enhancer activity in transgenic
analysis. These increased interactions were not detected in cells isolated
from kidney, where Lmo2 is transcribed at very low levels (Figure 4-3). I did not identify
increased interaction of the 75 DRE with the fragment containing the distal promoter and the 25
DRE in either anaemic spleen or kidney (Figure 4-3). The 75 DRE also showed
significantly increased interaction in anaemic spleen compared to kidney with a fragment
overlapping the 90 DRE, but not with other DREs, suggesting that the 75 and 90 DREs are more
closely associated in erythroid cells (Figure 4-3). The fragment upstream of the 90 DRE also
showed increased interaction in anaemic spleen compared to kidney. I found no increased
interaction between the 75 DRE and the transcribed Caprin1 gene in either tissue type (Figure
4-3). Hence, my results suggest that chromatin-chromatin interactions between the 75 DRE and
the Lmo2 proximal promoter and +1 enhancer occur when Lmo2 is transcribed in erythroid cells
but not in kidney, where the gene is transcribed only at very low levels (Figure 4-3).
Figure 4-1. Lmo2 primary transcripts are abundant in anaemic spleen erythroid cells
Primary transcript levels in adult mouse anaemic spleen (red) and kidney (blue) cells for: Lmo2
(exon2-intron2), Caprin1 (exon3-intron2), Slc4a1 (exon1-intron1), Pkd2 (intron2-exon3),
Epn1 (exon1-intron1), Gapdh (exon1-intron1) and Vh16 (genic). Levels were quantitatively
assessed by RT-qPCR and expressed relative to Gapdh. Epn1 and Gapdh are ubiquitously
expressed reference genes, Slc4a1 is an erythroid cell specific transcript, Pkd2 is a kidney
specific transcript, Vh16 is not expressed in either tissue.
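As an illustration of what "expressed relative to Gapdh" can mean in practice, the following minimal sketch computes a relative transcript level with the common 2^-ΔCt calculation. This is an assumption: the text does not state which quantification method was used, the calculation presumes roughly 100% amplification efficiency, and the Ct values and function names below are invented for the example rather than taken from this study.

```python
def relative_expression(ct_target, ct_reference):
    """Level of a target transcript relative to a reference gene (2^-dCt)."""
    # Assumes roughly one doubling of product per PCR cycle.
    return 2.0 ** -(ct_target - ct_reference)


def fold_difference(level_a, level_b):
    """How many times higher level_a is than level_b."""
    if level_b <= 0:
        raise ValueError("level_b must be positive")
    return level_a / level_b


# Invented Ct values for illustration only; not measurements from this study.
spleen_lmo2 = relative_expression(ct_target=24.0, ct_reference=20.0)
kidney_lmo2 = relative_expression(ct_target=28.0, ct_reference=20.0)
print(fold_difference(spleen_lmo2, kidney_lmo2))
```

Normalizing each tissue to its own reference gene signal before comparing tissues is what allows a fold difference between anaemic spleen and kidney to be read independently of how much RNA went into each reaction.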
Figure 4-2. The Lmo2/Caprin1 region on mouse chromosome 2
Primers used in chromosome conformation capture (3C) and HindIII restriction sites are shown
across the Lmo2/Caprin1 region of mouse chromosome 2. Promoters and distal regulatory
elements (DREs) are depicted in red and black respectively. Anchor fragments used in the
Caprin1, 75 DRE and Lmo2 3C experiments are marked with an asterisk. Distal promoter (dP),
proximal promoter (pP).
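To make the restriction fragment map more concrete, the sketch below shows how HindIII fragments can be derived from a DNA sequence. HindIII recognizes AAGCTT and cuts after the first A; everything else here (the function names and the toy sequence) is hypothetical and is not the procedure or the sequence used to draw the map.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence
CUT_OFFSET = 1           # HindIII cuts between the first A and the second A


def hindiii_cut_positions(seq):
    """Return 0-based cut positions for every HindIII site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(HINDIII_SITE)
    while start != -1:
        positions.append(start + CUT_OFFSET)
        start = seq.find(HINDIII_SITE, start + 1)
    return positions


def hindiii_fragments(seq):
    """Return (start, end) coordinates of the fragments left after digestion."""
    cuts = [0] + hindiii_cut_positions(seq) + [len(seq)]
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if b > a]


toy_sequence = "GGGAAGCTTCCCCAAGCTTTT"  # made-up sequence with two sites
print(hindiii_fragments(toy_sequence))
```

In a 3C experiment each such fragment is a unit of resolution: an element such as the 75 DRE is assayed through the whole HindIII fragment that contains it.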
Figure 4-3. The 75 DRE interacts with the Lmo2 proximal promoter in anaemic spleen
erythroid cells
Quantitative chromosome conformation capture (3C) was performed to detect chromatin-
chromatin interactions between the 75 DRE (distal regulatory element) upstream of Lmo2 and
the rest of the Lmo2-Caprin1 region of mouse chromosome 2. The profile of interactions
identified in anaemic spleen (red) and kidney (blue) is displayed. Black box indicates the
anchor fragment at the 75 DRE and alternating intensities of grey boxes indicate the fragments
investigated for interactions. Significantly increased interaction in anaemic spleen compared to
kidney was detected at the Lmo2 proximal promoter (pP) and at a region upstream of the 75
DRE. Data points are an average of three independent biological replicates. Error bars depict
the SEM, ** p<0.01, and *** p<0.001.
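The summary statistics named in this legend (mean of biological replicates with SEM error bars) can be sketched as follows. The replicate values are invented, the function name is hypothetical, and the sketch deliberately stops short of a p-value because the legend does not say which significance test was applied.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(replicates):
    """Mean and standard error of the mean for one fragment in one tissue."""
    if len(replicates) < 2:
        raise ValueError("SEM needs at least two replicates")
    # SEM = sample standard deviation / sqrt(number of replicates)
    return mean(replicates), stdev(replicates) / sqrt(len(replicates))


# Invented relative interaction frequencies from three biological replicates.
anaemic_spleen = [1.0, 2.0, 3.0]
average, sem = mean_and_sem(anaemic_spleen)
print(average, sem)
```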
4.2.2 Several upstream distal regulatory elements contact the Lmo2 promoter
After detecting chromatin-chromatin interactions between the 75 DRE and the Lmo2 pP, I
wanted to investigate if any of the other DREs also contact the Lmo2 pP in a tissue specific
manner. I performed a locus-wide 3C using the HindIII fragment containing the Lmo2 proximal
promoter as the anchor fragment. 3C analyses detected significantly increased interaction
between the Lmo2 pP and multiple HindIII fragments across the locus in erythroid cells
compared to kidney cells (Figure 4-4). Interestingly, the HindIII fragments that I found to
contact the Lmo2 pP in erythroid cells contained the different DREs that I had already identified,
namely the 12, 25, 70, 75 and 90 DREs. I also identified two broad domains interacting with the
Lmo2 pP in a tissue specific manner. The first domain is located close to the Lmo2 pP and
contains the distal promoter as well as the 12 and 25 DREs. Of note is the observation that
another promoter, termed the intermediate promoter, has been identified immediately upstream
of the 12 DRE (Oram et al., 2010). The second interaction domain located further upstream of
Lmo2 contains the 70, 75 and 90 DREs as well as the HindIII fragment upstream of the 90 DRE
(Figure 4-4). Interestingly, the HindIII fragment upstream of the 90 DRE interacted with the
Lmo2 pP at a very high frequency in erythroid cells compared to all the other HindIII
fragments belonging to the second domain (HindIII fragments containing the DREs 90, 70, 75,
and 64) that were found to interact with the Lmo2 pP (Figure 4-4). This observation is especially
striking because the interaction frequency between two HindIII fragments is expected to decrease
progressively as the distance between them increases. In this case the
distance between the fragments is more than 90 kb, so the high interaction frequency
in erythroid cells between the HindIII fragment located upstream of the 90 DRE and the fragment
containing the Lmo2 pP is noteworthy. I also detected a peak in the relative
interaction frequency at the 35 DRE; however, this interaction was detected in both kidney and
erythroid cells (Figure 4-4).
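The reasoning about distance can be illustrated with a small sketch that flags fragments whose interaction frequency is higher than that of the next fragment closer to the anchor, that is, fragments that break the expected decline with distance. This is a simplified heuristic written for illustration, not the analysis performed here, and the distances and frequencies in the toy profile are invented.

```python
def fragments_breaking_decay(profile):
    """Return distances whose frequency exceeds that of the next-nearest fragment.

    profile: iterable of (distance_from_anchor_kb, relative_interaction_frequency).
    """
    ordered = sorted(profile)  # nearest fragment first
    flagged = []
    for (_, prev_freq), (dist, freq) in zip(ordered, ordered[1:]):
        if freq > prev_freq:   # frequency rises although distance increases
            flagged.append(dist)
    return flagged


# Invented profile: frequency falls with distance, then rises again at 95 kb.
toy_profile = [(60, 0.3), (10, 0.8), (95, 0.6), (30, 0.5)]
print(fragments_breaking_decay(toy_profile))
```

A fragment flagged in this way is a candidate for a specific looping contact, since random collisions alone would be expected to give a steadily decreasing profile.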
Figure 4-4. Distal regulatory elements interact with the Lmo2 proximal promoter in
anaemic spleen erythroid cells
Quantitative chromosome conformation capture (3C) was performed to detect chromatin-
chromatin interactions between the Lmo2 proximal promoter (pP) and distal regulatory elements
(DREs). The profile of interactions identified in anaemic spleen (red) and kidney (blue) is
displayed. Black box indicates the anchor fragment at the Lmo2 pP and alternating intensities of
grey boxes indicate the fragments investigated for interactions. Significantly increased
interaction in anaemic spleen compared to kidney was detected at the 12, 25, 70, 75 and 90
DREs. Data points are an average of five independent biological replicates. Error bars depict
the SEM, * p<0.05, ** p<0.01, and *** p<0.001.
4.2.3 The Caprin1 promoter does not interact with the identified distal
regulatory elements
Caprin1 is a ubiquitously expressed gene located 172 kb upstream of Lmo2 and transcribed from
the opposite strand. As Caprin1 is transcribed in erythroid cells, I was interested in investigating
whether the Caprin1 promoter physically interacts with the DREs located between
Caprin1 and Lmo2. My initial 3C experiments performed with the 75 DRE as the anchor
fragment did not show any increase in the relative interaction frequency between the Caprin1
promoter and the 75 DRE in erythroid cells compared to kidney cells (Figure 4-3). However,
as Caprin1 could physically interact with other DREs located between Caprin1 and Lmo2, I
performed locus-wide 3C experiments using the HindIII fragment containing the Caprin1
promoter as the anchor fragment (Figure 4-5). My results did not show any significant peaks in
the relative interaction frequency with the Caprin1 promoter in the entire region between the
Caprin1 and Lmo2 genes (Figure 4-5). Furthermore, I did not identify any significant
differences between anaemic spleen and kidney in the relative interaction frequency of any HindIII
fragment with the Caprin1 promoter (Figure 4-5). Hence, my
locus-wide 3C experiments with the Caprin1 promoter showed that in erythroid cells the
Caprin1 promoter does not interact with any of the identified DREs located upstream of the
Lmo2 gene.
Figure 4-5. The distal regulatory elements upstream of Lmo2 do not interact with the
Caprin1 promoter
Quantitative chromosome conformation capture (3C) was performed to detect chromatin-
chromatin interactions between the Caprin1 promoter and distal regulatory elements (DREs).
The profile of interactions identified in anaemic spleen (red) and kidney (blue) is displayed.
Black box indicates the anchor fragment at Caprin1 and alternating intensities of grey boxes
indicate the fragments investigated for interactions. Data points are an average of three
independent biological replicates. Error bars depict the SEM; no significant differences were
identified throughout this region.
4.3 Discussion
In this chapter I investigated the presence of chromatin-chromatin interactions throughout the
mouse genomic region containing the Lmo2 and Caprin1 genes. My 3C experiments identified
several erythroid cell specific interactions between the Lmo2 proximal promoter and upstream
DREs. Specifically, a cluster of three transcription factor bound DREs 70-90 kb upstream interacts
with the Lmo2 promoter, as do the more proximal 12 and 25 DREs, each of which is close to an
alternate promoter (intermediate and distal respectively). My 3C experiments revealed no
significant interactions between the Caprin1 promoter and the DREs, suggesting that these elements
are specific to Lmo2.
4.3.1 The 75 distal regulatory element and several other upstream distal
regulatory elements contact the Lmo2 proximal promoter
My 3C data confirmed specific interaction of the 75 DRE with the Lmo2 proximal promoter and
interaction of the Lmo2 proximal promoter with a broad region containing the 75 DRE as well
as the 90 and 70 DREs. Previous studies in circulating erythroid cells of transgenic mice have
shown that the 75 DRE has the strongest enhancer activity in erythroid cells and drives the
expression of a reporter gene, cooperatively with the Lmo2 proximal promoter and +1 enhancer
element (Landry et al., 2009). I have shown that the 75 DRE functions by contacting the Lmo2
proximal promoter/+1 DRE region to form an erythroid cell specific chromatin loop.
Furthermore, I identified a broad region of interaction with the Lmo2 proximal promoter region
containing the 70, 75, 90 DREs and a region located upstream of the 90 DRE. This suggests
that in the endogenous context the three DREs coordinately regulate Lmo2 transcription in
erythroid cells.
I also identified increased interaction between fragments containing the 25 and 12 DREs and the
Lmo2 proximal promoter in erythroid cells compared to kidney cells. This suggests that this
more proximal region also contributes to transcriptional regulation of endogenous Lmo2 through
increased interaction with the Lmo2 proximal promoter region. All three of these regions
contain a promoter (distal, intermediate and proximal) close to the identified DRE (25, 12 and
+1); therefore, the increased interaction I identified could be mediated by the DREs, the promoters
or both. However, in chapter 3 I showed that the 25 and 12 DREs overlap with multiple
transcription factor peaks in HPC7 cells and are bound by p300 in erythroid cells. These
data suggest that the 12 and 25 DREs may have a regulatory role.
Interestingly, I did not identify interactions of the 75 DRE with the fragments containing the 25
or 12 DREs. Transgenic analysis of these elements by Landry et al. (2009) revealed that the 25
and 12 DREs conferred expression in the fetal liver, whereas only the 75 DRE conferred
expression in fetal liver as well as in circulating blood cells, suggesting that these elements have
different functional roles in regulating Lmo2 expression. The fact that I
did not identify specific interactions between the 75 DRE and the fragment containing the 25 or
12 DREs suggests that the interactions of these two regions with the proximal promoter are
mutually exclusive. These mutually exclusive interactions could occur in different sub-
populations of cells within the anaemic spleen or the interactions could be dynamic within
individual cells with the proximal promoter region alternately contacting the 25-12 region and
the 70-90 region.
I did identify one distal chromatin-chromatin interaction with the Lmo2 proximal promoter in
both erythroid cells and kidney cells. This interaction occurred at the 35 DRE that also showed
comparable, though low, levels of intergenic transcription in both tissues as noted in chapter 3.
In transgenic mice a construct containing the 35 DRE and the proximal promoter showed similar
levels of LacZ reporter gene expression in endothelial cells when compared with the proximal
promoter alone (Landry et al., 2009). The chromatin-chromatin interactions that I identified
between the 35 DRE and the Lmo2 proximal promoter in both erythroid and kidney cells
suggest that this region has a subtle role in regulating low-level transcription of Lmo2 in endothelial
cells.
4.4 Future work
In this chapter I have shown that the Lmo2/Caprin1 locus adopts an erythroid cell specific
conformation that correlates with the enhanced level of transcription of Lmo2 in erythroid
cells. An interesting future study would be to investigate how the locus is
organized in a cell type where the gene is not expressed, for example in mouse embryonic stem
cells (ESCs). An important aspect to investigate is the manner in which the chromatin
loops are acquired as ESCs differentiate into hematopoietic progenitor cells and mature erythroid
cells, in which the Lmo2 gene is transcribed at high levels. Capturing the spatial and temporal
changes that occur as the Lmo2 gene goes from a non-expressed state to a highly transcribed
state would provide invaluable information about how changes in chromatin conformation
correlate with the regulation of Lmo2 transcription.
An important conclusion that I derived from my combined analyses in this chapter and
chapter 3 is that the distal regulatory elements located upstream of the Lmo2 gene function
through a chromatin looping mechanism. Analysis of ChIP-Seq data in chapter 3 suggests that
these chromatin loops are supported by cohesin associated with CTCF and transcription factor
bound regions. Hence, an important experiment that can be conducted in the future is to
knock down CTCF and cohesin in erythroid cells and identify the changes
that occur not only in the conformation of the Lmo2/Caprin1 locus but also in the levels of
transcription of both genes. According to my model, the Caprin1 gene does not interact with
any of the DREs, possibly because of the presence of the insulator protein CTCF at three
different regions located between the 90 DRE and the Caprin1 gene promoter. Depletion of
CTCF may allow the DREs to interact with Caprin1, potentially enhancing transcription
of the gene. Furthermore, my model also suggests that the cohesin subunit RAD21 mediates the
formation of the chromatin loops between the DREs and the Lmo2 proximal promoter; therefore,
depletion of the cohesin complex should result in the loss of these
chromatin loops, thereby decreasing the level of Lmo2 transcription in erythroid cells. A
knockdown of cohesin and CTCF would thus provide concrete evidence to support
my model, in which the DREs function through a chromatin looping mechanism
and the formation of these loops is supported by the presence of cohesin and CTCF
bound regions.
5 Chapter 5: General Discussion
5.1 Discussion
In the post genome era, the availability of multiple low-cost, high-throughput technologies has
helped in sequencing genomes of several organisms, and the functions of the distal regulatory
elements that lie embedded within the non-coding part of these sequenced genomes are at last
beginning to be unveiled. Previous studies have shown that distal regulatory elements can
regulate gene transcription in a tissue specific manner and that specific chromatin features such
as histone modification patterns can help in identifying these tissue specific DREs (de Laat and
Grosveld, 2003; Heintzman et al., 2009; Heintzman et al., 2007; Palstra et al., 2003; Tolhuis et
al., 2002). The purpose of the work presented in this thesis is to integrate various biological data
together to understand the role of the DREs and other factors in the tissue specific regulation of
Lmo2.
In this study using a combination of ChIP-Seq data for mature erythroid cells and HPC7
hematopoietic progenitor cells I identified binding of multiple TFs and other chromatin
associated proteins to many of the DREs that have been identified upstream of the Lmo2 gene
(Birney et al., 2007; Cheng et al., 2009; Soler et al., 2010; Wilson et al., 2010). Furthermore, my
3C data confirmed the presence of increased interactions between the Lmo2 proximal promoter
and a broad region containing the 75 DRE as well as the 90 and 70 DREs in erythroid cells
compared with kidney cells.
Although I identified erythroid cell specific chromatin loops between upstream DREs and the
Lmo2 proximal promoter, the question remains as to which factors mediate these
looping interactions. Possible candidates that could mediate the formation of these loops
are the cohesin complex subunit RAD21, CTCF, and the TFs LMO2, LDB1, GATA1, and
KLF1. Both CTCF and RAD21 have been shown to mediate the formation of chromatin loops at
other genomic locations (Chien et al., 2011; Hadjur et al., 2009). Analyses of the ChIP-Seq data
revealed the presence of several CTCF and cohesin (RAD21) bound regions across the entire
Lmo2/Caprin1 locus, indicating that CTCF and RAD21 could also mediate the formation of
chromatin loops at the Lmo2/Caprin1 locus (Figures 3-6 and 3-7) (Birney et al., 2007).
Cohesin (RAD21) is bound within both the distal and proximal interacting domains, specifically
at the 75 and 25 DREs as well as at the proximal promoter, suggesting that cohesin bound at the
upstream DREs supports erythroid cell specific looping interactions with the Lmo2 proximal
promoter. Cohesin is also recruited to CTCF occupied regions, and I identified several
CTCF/RAD21 bound regions throughout the Lmo2 upstream region, all of which occur within
erythroid cell specific interacting domains, though the majority of the CTCF sites were not
specific to erythroid cells (Parelho et al., 2008; Rubio et al., 2008). This is similar to the
findings at the Hbb locus, where CTCF bound regions, invariant between cell types, formed cell
type specific chromatin loops (Hou et al., 2010). CTCF is often associated with insulator
activity; the CTCF/RAD21 occupied region upstream of the 90 DRE could have an important
role in preventing Caprin1 from interacting with the DREs that enhance Lmo2 transcription
(Phillips and Corces, 2009). As overexpression of Caprin1 inhibits cell division, it is
critical to prevent its aberrant up-regulation in rapidly dividing erythroid cells (Grill et al.,
2004). I did identify one CTCF/RAD21 bound region, downstream of the 75 DRE, which was
enriched predominantly in erythroid cells and may be critical in generating the tissue-specific
looping pattern that I identified.
My results suggest that chromatin-chromatin interactions throughout the Lmo2 locus are
supported by cohesin recruited both to CTCF bound regions (upstream of 90, downstream of 75)
and to transcription factor and p300 bound DREs not associated with CTCF (90, 70, 25,
12). The results also suggest that in the endogenous context the three DREs coordinately
regulate Lmo2 transcription in erythroid cells. I also identified increased chromatin-chromatin
interactions between fragments containing the 25 and 12 DREs and the Lmo2 proximal
promoter in erythroid cells compared to kidney cells. This suggests that this more proximal
region also contributes to transcriptional regulation of endogenous Lmo2 through increased
interaction with the Lmo2 proximal promoter region. As all three of these regions contain a
promoter (distal, intermediate and proximal) close to the identified DRE (25, 12 and +1),
the increased interaction that I identified could be mediated by the DREs, the promoters or
both. However, the 25 and 12 DREs were found to overlap multiple transcription factor peaks
in HPC7 cells and are bound by p300 in erythroid cells, suggesting that they have a regulatory role.
In addition to CTCF and RAD21, other proteins that might mediate the formation of the
chromatin loops are the TFs LMO2, LDB1, GATA1, and KLF1. As LMO2 itself was bound to
the upstream DREs, specifically at 75, 25 and 12, all of which showed increased interaction with
the Lmo2 promoter, my results suggest that the LMO2 oligomeric complex can play an
important role in mediating the formation of chromatin loops. In support of this, a recent study
found LDB1 (a member of the LMO2 complex) at regions of chromatin interaction with the
LDB1 bound Hbb-b1 promoter (Soler et al., 2010). GATA1 and KLF1 are also bound within
the distal interacting region. These TFs have been shown to be required, though not sufficient, for
chromatin looping between Hbb-b1 and the LCR (Drissen et al., 2004; Kooren et al., 2007;
Vakoc et al., 2005) and may have a similar role in regulating looping within the Lmo2 locus.
In summary, integration of a wide range of biological data obtained
from ChIP-Seq, 3C, and RT-qPCR experiments has helped me identify the long-range
chromatin-chromatin interactions that occur between the Lmo2 proximal promoter and two
broad regions, 3-31 and 66-105 kb upstream of Lmo2, containing transcription factor bound
regulatory elements, suggesting that these elements cooperate in regulating high-level
transcription of Lmo2 in erythroid cells. Furthermore, my data support a model in which
these distal regulatory elements function through a chromatin looping mechanism supported by
cohesin associated with CTCF and transcription factor bound regions, thereby enhancing the
transcription of Lmo2 in mouse erythroid cells.
5.2 Summary
The contributions of the work presented in this thesis include highlighting the role of chromatin
organization in regulation of Lim domain only 2 (Lmo2) gene transcription and identifying
various other biological factors that help in the formation of an erythroid cell specific
conformation of the entire Lmo2/Caprin1 locus. The ChIP-Seq data sets that have been used in
this study identified the wide range of biological factors (such as presence of TFs and other
chromatin associated proteins, histone modification patterns, genomic features, presence of
intergenic transcripts) that help to regulate the transcription of Lmo2 in erythroid cells (Birney et
al., 2007; Cheng et al., 2009; Soler et al., 2010; Wilson et al., 2010).
The significance of the study lies in explaining how multiple distal regulatory elements regulate
transcription of Lmo2 in erythroid cells and whether or not they function cooperatively in the
endogenous context. My study not only shows that strong interactions exist between upstream
regulatory elements and the Lmo2 gene promoter in erythroid cells, but also identifies the
specific regulatory elements (i.e. the 12, 25, 70, 75 and 90 DREs) that might play a more
important role in regulating the Lmo2 gene in erythroid cells. I also found that the Lmo2-Caprin1
locus adopts a tissue-specific conformation in erythroid cells. This tissue-specific
organization of the locus brings several, but not all, of the DREs into proximity with the Lmo2
proximal promoter while excluding the Caprin1 promoter. A distal region covering 39 kb and
containing three DREs (90, 75 and 70) forms a strong interaction with the Lmo2 proximal
promoter in erythroid cells. In addition, a more proximal region containing the 12 and 25 DREs
as well as the distal and intermediate promoters interacts with the Lmo2 proximal promoter in
erythroid cells. Furthermore, a CTCF bound region upstream of the 90 DRE interacts with the
Lmo2 proximal promoter and may function as an insulator, preventing the interaction of the
Caprin1 promoter with the erythroid cell specific DREs. In addition, multiple locations within
the Lmo2/Caprin1 locus have erythroid cell specific epigenetic signatures, such as enriched
histone modification patterns and DNaseI sensitivity, which is in agreement with previous
studies that have shown a correlation between tissue-specific epigenetic marks, chromatin
organization, and tissue-specific regulation of specific genes. Hence, it can be concluded that a
combination of ChIP-Seq data, bioinformatics and 3C analyses has helped not only in
identifying distal regulatory elements located upstream of the Lmo2 gene in the mouse
genome but also in understanding how these enhancer-gene interactions are
mediated within erythroid cells so as to regulate transcription of Lmo2 in a tissue-specific
manner.
References
Amano, T., Sagai, T., Tanabe, H., Mizushina, Y., Nakazawa, H., and Shiroishi, T. (2009). Chromosomal dynamics at the Shh locus: limb bud-specific differential regulation of competence and active transcription. Dev Cell 16, 47-57.
Anderson, K.P., Crable, S.C., and Lingrel, J.B. (1998). Multiple proteins binding to a GATA-E box-GATA motif regulate the erythroid Kruppel-like factor (EKLF) gene. J Biol Chem 273, 14347-14354.
Anguita, E., Hughes, J., Heyworth, C., Blobel, G.A., Wood, W.G., and Higgs, D.R. (2004). Globin gene activation during haemopoiesis is driven by protein complexes nucleated by GATA-1 and GATA-2. EMBO J 23, 2841-2852.
Banerji, J., Rusconi, S., and Schaffner, W. (1981). Expression of a beta-globin gene is enhanced by remote SV40 DNA sequences. Cell 27, 299-308.
Barrett, T., Troup, D.B., Wilhite, S.E., Ledoux, P., Evangelista, C., Kim, I.F., Tomashevsky, M., Marshall, K.A., Phillippy, K.H., Sherman, P.M., et al. (2011). NCBI GEO: archive for functional genomics data sets-10 years on. Nucleic Acids Res 39, D1005-D1010.
Bell, A.C., and Felsenfeld, G. (2000). Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature 405, 482-485.
Birney, E., Stamatoyannopoulos, J.A., Dutta, A., Guigo, R., Gingeras, T.R., Margulies, E.H., Weng, Z., Snyder, M., Dermitzakis, E.T., Thurman, R.E., et al. (2007). Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799-816.
Blackwood, E.M., and Kadonaga, J.T. (1998). Going the distance: a current view of enhancer action. Science 281, 60-63.
Blow, M.J., McCulley, D.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., et al. (2010). ChIP-Seq identification of weakly conserved heart enhancers. Nat Genet 42, 806-810.
Boehm, T., Foroni, L., Kaneko, Y., Perutz, M.F., and Rabbitts, T.H. (1991). The rhombotin family of cysteine-rich LIM-domain oncogenes: distinct members are involved in T-cell translocations to human chromosomes 11p15 and 11p13. Proc Natl Acad Sci U S A 88, 4367-4371.
Cai, S., Lee, C.C., and Kohwi-Shigematsu, T. (2006). SATB1 packages densely looped, transcriptionally active chromatin for coordinated expression of cytokine genes. Nat Genet 38, 1278-1288.
Carter, D., Chakalova, L., Osborne, C.S., Dai, Y.F., and Fraser, P. (2002). Long-range chromatin regulatory interactions in vivo. Nat Genet 32, 623-626.
Chambeyron, S., and Bickmore, W.A. (2004). Chromatin decondensation and nuclear reorganization of the HoxB locus upon induction of transcription. Genes Dev 18, 1119-1130.
Chapman, R.D., Heidemann, M., Albert, T.K., Mailhammer, R., Flatley, A., Meisterernst, M., Kremmer, E., and Eick, D. (2007). Transcribing RNA polymerase II is phosphorylated at CTD residue serine-7. Science 318, 1780-1782.
Chen, C.Y., Morris, Q., and Mitchell, J.A. (2012). Enhancer identification in mouse embryonic stem cells using integrative modeling of chromatin and genomic features. BMC Genomics 13, 152.
Chen, X., Xu, H., Yuan, P., Fang, F., Huss, M., Vega, V.B., Wong, E., Orlov, Y.L., Zhang, W., Jiang, J., et al. (2008). Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106-1117.
Cheng, Y., Wu, W., Kumar, S.A., Yu, D., Deng, W., Tripic, T., King, D.C., Chen, K.B., Zhang, Y., Drautz, D., et al. (2009). Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications, and mRNA expression. Genome Res 19, 2172-2184.
Chien, R., Zeng, W., Kawauchi, S., Bender, M.A., Santos, R., Gregson, H.C., Schmiesing, J.A., Newkirk, D.A., Kong, X., Ball, A.R., Jr., et al. (2011). Cohesin mediates chromatin interactions that regulate mammalian beta-globin expression. J Biol Chem 286, 17870-17878.
Cho, Y., Song, S.H., Lee, J.J., Choi, N., Kim, C.G., Dean, A., and Kim, A. (2008). The role of transcriptional activator GATA-1 at human beta-globin HS2. Nucleic Acids Res 36, 4521-4528.
Chung, J.H., Whiteley, M., and Felsenfeld, G. (1993). A 5' element of the chicken beta-globin domain serves as an insulator in human erythroid cells and protects against position effect in Drosophila. Cell 74, 505-514.
Conaway, R.C., and Conaway, J.W. (2011). Function and regulation of the Mediator complex. Curr Opin Genet Dev 21, 225-230.
Corden, J.L. (1993). RNA polymerase II transcription cycles. Curr Opin Genet Dev 3, 213-218.
Cuddapah, S., Jothi, R., Schones, D.E., Roh, T.Y., Cui, K., and Zhao, K. (2009). Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res 19, 24-32.
Cui, K., Zang, C., Roh, T.Y., Schones, D.E., Childs, R.W., Peng, W., and Zhao, K. (2009). Chromatin signatures in multipotent human hematopoietic stem cells indicate the fate of bivalent genes during differentiation. Cell Stem Cell 4, 80-93.
de Laat, W., and Grosveld, F. (2003). Spatial organization of gene expression: the active chromatin hub. Chromosome Res 11, 447-459.
Degner, S.C., Verma-Gaur, J., Wong, T.P., Bossen, C., Iverson, G.M., Torkamani, A., Vettermann, C., Lin, Y.C., Ju, Z., Schulz, D., et al. (2011). CCCTC-binding factor (CTCF) and cohesin influence the genomic architecture of the Igh locus and antisense transcription in pro-B cells. Proc Natl Acad Sci U S A 108, 9566-9571.
Dekker, J., Rippe, K., Dekker, M., and Kleckner, N. (2002). Capturing chromosome conformation. Science 295, 1306-1311.
Dickerman, H.W., Cheng, T.C., Kazazian, H.H., Jr., and Spivak, J.L. (1976). The erythropoietic mouse spleen - a model system of development. Arch Biochem Biophys 177, 1-9.
Dillon, N. (2006). Gene regulation and large-scale chromatin organization in the nucleus. Chromosome Res 14, 117-126.
Dillon, N., and Sabbattini, P. (2000). Functional gene expression domains: defining the functional unit of eukaryotic gene regulation. Bioessays 22, 657-665.
Dostie, J., Richmond, T.A., Arnaout, R.A., Selzer, R.R., Lee, W.L., Honan, T.A., Rubio, E.D., Krumm, A., Lamb, J., Nusbaum, C., et al. (2006). Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res 16, 1299-1309.
Dostie, J., Zhan, Y., and Dekker, J. (2007). Chromosome conformation capture carbon copy technology. Curr Protoc Mol Biol Chapter 21, Unit 21.14.
Drissen, R., Palstra, R.J., Gillemans, N., Splinter, E., Grosveld, F., Philipsen, S., and de Laat, W. (2004). The active spatial organization of the beta-globin locus requires the transcription factor EKLF. Genes Dev 18, 2485-2490.
Ethier, S.D., Miura, H., and Dostie, J. (2012). Discovering genome regulation with 3C and 3C-related technologies. Biochim Biophys Acta 1819, 401-410.
Ferraiuolo, M.A., Rousseau, M., Miyamoto, C., Shenker, S., Wang, X.Q., Nadler, M., Blanchette, M., and Dostie, J. (2010). The three-dimensional architecture of Hox cluster silencing. Nucleic Acids Res 38, 7472-7484.
Fisch, P., Boehm, T., Lavenir, I., Larson, T., Arno, J., Forster, A., and Rabbitts, T.H. (1992). T-cell acute lymphoblastic lymphoma induced in transgenic mice by the RBTN1 and RBTN2 LIM-domain genes. Oncogene 7, 2389-2397.
Fitzgerald, T.J., Neale, G.A., Raimondi, S.C., and Goorha, R.M. (1992). Rhom-2 expression does not always correlate with abnormalities on chromosome 11 at band p13 in T-cell acute lymphoblastic leukemia. Blood 80, 3189-3197.
Foroni, L., Boehm, T., White, L., Forster, A., Sherrington, P., Liao, X.B., Brannan, C.I., Jenkins, N.A., Copeland, N.G., and Rabbitts, T.H. (1992). The rhombotin gene family encode related LIM-domain proteins whose differing expression suggests multiple roles in mouse development. J Mol Biol 226, 747-761.
Forrester, W.C., Epner, E., Driscoll, M.C., Enver, T., Brice, M., Papayannopoulou, T., and Groudine, M. (1990). A deletion of the human beta-globin locus activation region causes a major alteration in chromatin structure and replication across the entire beta-globin locus. Genes Dev 4, 1637-1649.
Fuda, N.J., Ardehali, M.B., and Lis, J.T. (2009). Defining mechanisms that regulate RNA polymerase II transcription in vivo. Nature 461, 186-192.
Fullwood, M.J., Liu, M.H., Pan, Y.F., Liu, J., Xu, H., Mohamed, Y.B., Orlov, Y.L., Velkov, S., Ho, A., Mei, P.H., et al. (2009). An oestrogen-receptor-alpha-bound human chromatin interactome. Nature 462, 58-64.
Garcia, I.S., Kaneko, Y., Gonzalez-Sarmiento, R., Campbell, K., White, L., Boehm, T., and Rabbitts, T.H. (1991). A study of chromosome-11p13 translocations involving TCR-beta and TCR-delta in human T-cell leukemia. Oncogene 6, 577-582.
Gardiner-Garden, M., and Frommer, M. (1987). CpG islands in vertebrate genomes. J Mol Biol 196, 261-282.
Gershenzon, N.I., and Ioshikhes, I.P. (2005). Synergy of human Pol II core promoter elements revealed by statistical sequence analysis. Bioinformatics 21, 1295-1300.
Gonzalez-Zulueta, M., Bender, C.M., Yang, A.S., Nguyen, T., Beart, R.W., Van Tornout, J.M., and Jones, P.A. (1995). Methylation of the 5' CpG island of the p16/CDKN2 tumor suppressor gene in normal and transformed human tissues correlates with gene silencing. Cancer Res 55, 4531-4535.
Gowri, P.M., Yu, J.H., Shaufl, A., Sperling, M.A., and Menon, R.K. (2003). Recruitment of a repressosome complex at the growth hormone receptor promoter and its potential role in diabetic nephropathy. Mol Cell Biol 23, 815-825.
Gribnau, J., Diderich, K., Pruzina, S., Calzolari, R., and Fraser, P. (2000). Intergenic transcription and developmental remodeling of chromatin subdomains in the human beta-globin locus. Mol Cell 5, 377-386.
Grill, B., Wilson, G.M., Zhang, K.X., Wang, B., Doyonnas, R., Quadroni, M., and Schrader, J.W. (2004). Activation/division of lymphocytes results in increased levels of cytoplasmic activation/proliferation-associated protein-1: prototype of a new family of proteins. J Immunol 172, 2389-2400.
Gross, D.S., and Garrard, W.T. (1988). Nuclease hypersensitive sites in chromatin. Annu Rev Biochem 57, 159-197.
Hacein-Bey-Abina, S., Von Kalle, C., Schmidt, M., McCormack, M.P., Wulffraat, N., Leboulch, P., Lim, A., Osborne, C.S., Pawliuk, R., Morillon, E., et al. (2003). LMO2-associated clonal T cell proliferation in two patients after gene therapy for SCID-X1. Science 302, 415-419.
Hadjur, S., Williams, L.M., Ryan, N.K., Cobb, B.S., Sexton, T., Fraser, P., Fisher, A.G., and Merkenschlager, M. (2009). Cohesins form chromosomal cis-interactions at the developmentally regulated IFNG locus. Nature 460, 410-413.
Handoko, L., Xu, H., Li, G., Ngan, C.Y., Chew, E., Schnapp, M., Lee, C.W., Ye, C., Ping, J.L., Mulawadi, F., et al. (2011). CTCF-mediated functional chromatin interactome in pluripotent cells. Nat Genet 43, 630-638.
Harnicarova, A., Kozubek, S., Pachernik, J., Krejci, J., and Bartova, E. (2006). Distinct nuclear arrangement of active and inactive c-myc genes in control and differentiated colon carcinoma cells. Exp Cell Res 312, 4019-4035.
He, H.H., Meyer, C.A., Shin, H., Bailey, S.T., Wei, G., Wang, Q., Zhang, Y., Xu, K., Ni, M., Lupien, M., et al. (2010). Nucleosome dynamics define transcriptional enhancers. Nat Genet 42, 343-347.
Heintzman, N.D., Hon, G.C., Hawkins, R.D., Kheradpour, P., Stark, A., Harp, L.F., Ye, Z., Lee, L.K., Stuart, R.K., Ching, C.W., et al. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108-112.
Heintzman, N.D., and Ren, B. (2009). Finding distal regulatory elements in the human genome. Curr Opin Genet Dev 19, 541-549.
Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van Calcar, S., Qu, C., Ching, K.A., et al. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39, 311-318.
Hou, C., Dale, R., and Dean, A. (2010). Cell type specificity of chromatin organization mediated by CTCF and cohesin. Proc Natl Acad Sci U S A 107, 3651-3656.
Jaenisch, R., and Bird, A. (2003). Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet 33 Suppl, 245-254.
Jin, F., Li, Y., Ren, B., and Natarajan, R. (2011). Enhancers: multi-dimensional signal integrators. Transcription 2, 226-230.
Johnson, K.D., Grass, J.A., Boyer, M.E., Kiekhaefer, C.M., Blobel, G.A., Weiss, M.J., and Bresnick, E.H. (2002). Cooperative activities of hematopoietic regulators recruit RNA polymerase II to a tissue-specific chromatin domain. Proc Natl Acad Sci U S A 99, 11760-11765.
Jones, P.A., and Baylin, S.B. (2002). The fundamental role of epigenetic events in cancer. Nat Rev Genet 3, 415-428.
Jothi, R., Cuddapah, S., Barski, A., Cui, K., and Zhao, K. (2008). Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res 36, 5221-5231.
Kagey, M.H., Newman, J.J., Bilodeau, S., Zhan, Y., Orlando, D.A., van Berkum, N.L., Ebmeier, C.C., Goossens, J., Rahl, P.B., Levine, S.S., et al. (2010). Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430-435.
Kharchenko, P.V., Tolstorukov, M.Y., and Park, P.J. (2008). Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat Biotechnol 26, 1351-1359.
Kim, A., Song, S.H., Brand, M., and Dean, A. (2007). Nucleosome and transcription activator antagonism at human beta-globin locus control region DNase I hypersensitive sites. Nucleic Acids Res 35, 5831-5838.
Kim, T.K., Hemberg, M., Gray, J.M., Costa, A.M., Bear, D.M., Wu, J., Harmin, D.A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., et al. (2010). Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182-187.
Komarnitsky, P., Cho, E.J., and Buratowski, S. (2000). Different phosphorylated forms of RNA polymerase II and associated mRNA processing factors during transcription. Genes Dev 14, 2452-2460.
Kooren, J., Palstra, R.J., Klous, P., Splinter, E., von Lindern, M., Grosveld, F., and de Laat, W. (2007). Beta-globin active chromatin Hub formation in differentiating erythroid cells and in p45 NF-E2 knock-out mice. J Biol Chem 282, 16544-16552.
Kurukuti, S., Tiwari, V.K., Tavoosidana, G., Pugacheva, E., Murrell, A., Zhao, Z., Lobanenkov, V., Reik, W., and Ohlsson, R. (2006). CTCF binding at the H19 imprinting control region mediates maternally inherited higher-order chromatin conformation to restrict enhancer access to Igf2. Proc Natl Acad Sci U S A 103, 10684-10689.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921.
Landry, J.R., Bonadies, N., Kinston, S., Knezevic, K., Wilson, N.K., Oram, S.H., Janes, M., Piltz, S., Hammett, M., Carter, J., et al. (2009). Expression of the leukemia oncogene Lmo2 is controlled by an array of tissue-specific elements dispersed over 100 kb and bound by Tal1/Lmo2, Ets, and Gata factors. Blood 113, 5783-5792.
Landry, J.R., Kinston, S., Knezevic, K., Donaldson, I.J., Green, A.R., and Gottgens, B. (2005). Fli1, Elf1, and Ets1 regulate the proximal promoter of the LMO2 gene in endothelial cells. Blood 106, 2680-2687.
Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10, R25.
Larson, R.C., Fisch, P., Larson, T.A., Lavenir, I., Langford, T., King, G., and Rabbitts, T.H. (1994). T cell tumours of disparate phenotype in mice transgenic for Rbtn-2. Oncogene 9, 3675-3681.
Lecuyer, E., Herblot, S., Saint-Denis, M., Martin, R., Begley, C.G., Porcher, C., Orkin, S.H., and Hoang, T. (2002). The SCL complex regulates c-kit expression in hematopoietic cells through functional interaction with Sp1. Blood 100, 2430-2440.
Lettice, L.A., Heaney, S.J., Purdie, L.A., Li, L., de Beer, P., Oostra, B.A., Goode, D., Elgar, G., Hill, R.E., and de Graaff, E. (2003). A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum Mol Genet 12, 1725-1735.
Li, B., Carey, M., and Workman, J.L. (2007). The role of chromatin during transcription. Cell 128, 707-719.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-293.
Lomvardas, S., Barnea, G., Pisapia, D.J., Mendelsohn, M., Kirkland, J., and Axel, R. (2006). Interchromosomal interactions and olfactory receptor choice. Cell 126, 403-413.
Mahy, N.L., Perry, P.E., and Bickmore, W.A. (2002). Gene density and transcription influence the localization of chromatin outside of chromosome territories detectable by FISH. J Cell Biol 159, 753-763.
Marshall, N.F., Peng, J., Xie, Z., and Price, D.H. (1996). Control of RNA polymerase II elongation potential by a novel carboxyl-terminal domain kinase. J Biol Chem 271, 27176-27183.
Maston, G.A., Evans, S.K., and Green, M.R. (2006). Transcriptional regulatory elements in the human genome. Annu Rev Genomics Hum Genet 7, 29-59.
McCormack, M.P., and Rabbitts, T.H. (2004). Activation of the T-cell oncogene LMO2 after gene therapy for X-linked severe combined immunodeficiency. N Engl J Med 350, 913-922.
Merkenschlager, M., Hadjur, S., Williams, L.M., Ryan, N.K., Cobb, B.S., Sexton, T., Fraser, P., and Fisher, A.G. (2009). Cohesins form chromosomal cis-interactions at the developmentally regulated IFNG locus. Nature 460, 410-413.
Miles, J., Mitchell, J.A., Chakalova, L., Goyenechea, B., Osborne, C.S., O'Neill, L., Tanimoto, K., Engel, J.D., and Fraser, P. (2007). Intergenic transcription, cell-cycle and the developmentally regulated epigenetic profile of the human beta-globin locus. PLoS One 2, e630.
Mills, F.C., Harindranath, N., Mitchell, M., and Max, E.E. (1997). Enhancer complexes located downstream of both human immunoglobulin Calpha genes. J Exp Med 186, 845-858.
Mishiro, T., Ishihara, K., Hino, S., Tsutsumi, S., Aburatani, H., Shirahige, K., Kinoshita, Y., and Nakao, M. (2009). Architectural roles of multiple chromatin insulators at the human apolipoprotein gene cluster. EMBO J 28, 1234-1245.
Mito, Y., Henikoff, J.G., and Henikoff, S. (2007). Histone replacement marks the boundaries of cis-regulatory domains. Science 315, 1408-1411.
Moreau, P., Hen, R., Wasylyk, B., Everett, R., Gaub, M.P., and Chambon, P. (1981). The SV40 72 base repair repeat has a striking effect on gene expression both in SV40 and other chimeric recombinants. Nucleic Acids Res 9, 6047-6068.
Morris, K.V. (2009). Long antisense non-coding RNAs function to direct epigenetic complexes that regulate transcription in human cells. Epigenetics 4, 296-301.
Murayama, A., Kim, M.S., Yanagisawa, J., Takeyama, K., and Kato, S. (2004). Transrepression by a liganded nuclear receptor via a bHLH activator through co-regulator switching. EMBO J 23, 1598-1608.
Nativio, R., Wendt, K.S., Ito, Y., Huddleston, J.E., Uribe-Lewis, S., Woodfine, K., Krueger, C., Reik, W., Peters, J.M., and Murrell, A. (2009). Cohesin is required for higher-order chromatin conformation at the imprinted IGF2-H19 locus. PLoS Genet 5, e1000739.
Noonan, J.P., and McCallion, A.S. (2010). Genomics of long-range regulatory elements. Annu Rev Genomics Hum Genet 11, 1-23.
Ong, C.T., and Corces, V.G. (2011). Enhancer function: new insights into the regulation of tissue-specific gene expression. Nat Rev Genet 12, 283-293.
Ono, Y., Fukuhara, N., and Yoshie, O. (1998). TAL1 and LIM-only proteins synergistically induce retinaldehyde dehydrogenase 2 expression in T-cell acute lymphoblastic leukemia by acting as cofactors for GATA3. Mol Cell Biol 18, 6939-6950.
Oram, S.H., Thoms, J.A., Pridans, C., Janes, M.E., Kinston, S.J., Anand, S., Landry, J.R., Lock, R.B., Jayaraman, P.S., Huntly, B.J., et al. (2010). A previously unrecognized promoter of LMO2 forms part of a transcriptional regulatory circuit mediating LMO2 expression in a subset of T-acute lymphoblastic leukaemia patients. Oncogene 29, 5796-5808.
Osada, H., Grutz, G., Axelson, H., Forster, A., and Rabbitts, T.H. (1995). Association of erythroid transcription factors: complexes involving the LIM protein RBTN2 and the zinc-finger protein GATA1. Proc Natl Acad Sci U S A 92, 9585-9589.
Osborne, C.S., Chakalova, L., Brown, K.E., Carter, D., Horton, A., Debrand, E., Goyenechea, B., Mitchell, J.A., Lopes, S., Reik, W., et al. (2004). Active genes dynamically colocalize to shared sites of ongoing transcription. Nat Genet 36, 1065-1071.
Palstra, R.J. (2009). Close encounters of the 3C kind: long-range chromatin interactions and transcriptional regulation. Brief Funct Genomic Proteomic 8, 297-309.
Palstra, R.J., Tolhuis, B., Splinter, E., Nijmeijer, R., Grosveld, F., and de Laat, W. (2003). The beta-globin nuclear compartment in development and erythroid differentiation. Nat Genet 35, 190-194.
Parelho, V., Hadjur, S., Spivakov, M., Leleu, M., Sauer, S., Gregson, H.C., Jarmuz, A., Canzonetta, C., Webster, Z., Nesterova, T., et al. (2008). Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell 132, 422-433.
Perissi, V., Aggarwal, A., Glass, C.K., Rose, D.W., and Rosenfeld, M.G. (2004). A corepressor/coactivator exchange complex required for transcriptional activation by nuclear receptors and other regulated transcription factors. Cell 116, 511-526.
Phillips, J.E., and Corces, V.G. (2009). CTCF: master weaver of the genome. Cell 137, 1194-1211.
Pinto do O, P., Kolterud, A., and Carlsson, L. (1998). Expression of the LIM-homeobox gene LH2 generates immortalized steel factor-dependent multipotent hematopoietic precursors. EMBO J 17, 5744-5756.
Ptashne, M. (1986). Gene regulation by proteins acting nearby and at a distance. Nature 322, 697-701.
Ren, B., Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y.T., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van Calcar, S., Qu, C.X., et al. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39, 311-318.
Rhead, B., Karolchik, D., Kuhn, R.M., Hinrichs, A.S., Zweig, A.S., Fujita, P.A., Diekhans, M., Smith, K.E., Rosenbloom, K.R., Raney, B.J., et al. (2010). The UCSC Genome Browser database: update 2010. Nucleic Acids Res 38, D613-D619.
Royer-Pokora, B., Loos, U., and Ludwig, W.D. (1991). TTG-2, a new gene encoding a cysteine-rich protein with the LIM motif, is overexpressed in acute T-cell leukaemia with the t(11;14)(p13;q11). Oncogene 6, 1887-1893.
Rubio, E.D., Reiss, D.J., Welcsh, P.L., Disteche, C.M., Filippova, G.N., Baliga, N.S., Aebersold, R., Ranish, J.A., and Krumm, A. (2008). CTCF physically links cohesin to chromatin. Proc Natl Acad Sci U S A 105, 8309-8314.
Sawado, T., Igarashi, K., and Groudine, M. (2001). Activation of beta-major globin gene transcription is associated with recruitment of NF-E2 to the beta-globin LCR and gene promoter. Proc Natl Acad Sci U S A 98, 10226-10231.
Schoenfelder, S., Sexton, T., Chakalova, L., Cope, N.F., Horton, A., Andrews, S., Kurukuti, S., Mitchell, J.A., Umlauf, D., Dimitrova, D.S., et al. (2010). Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat Genet 42, 53-61.
Schones, D.E., Cui, K., Cuddapah, S., Roh, T.Y., Barski, A., Wang, Z., Wei, G., and Zhao, K. (2008). Dynamic regulation of nucleosome positioning in the human genome. Cell 132, 887-898.
Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., and Cavalli, G. (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458-472.
Sims, R.J., 3rd, Belotserkovskaya, R., and Reinberg, D. (2004). Elongation by RNA polymerase II: the short and long of it. Genes Dev 18, 2437-2468.
Smale, S.T., and Kadonaga, J.T. (2003). The RNA polymerase II core promoter. Annu Rev Biochem 72, 449-479.
Soler, E., Andrieu-Soler, C., de Boer, E., Bryne, J.C., Thongjuea, S., Stadhouders, R., Palstra, R.J., Stevens, M., Kockx, C., van Ijcken, W., et al. (2010). The genome-wide dynamics of the binding of Ldb1 complexes during erythroid differentiation. Genes Dev 24, 277-289.
Song, S.H., Hou, C., and Dean, A. (2007). A positive role for NLI/Ldb1 in long-range beta-globin locus control region function. Mol Cell 28, 810-822.
Song, S.H., Kim, A., Ragoczy, T., Bender, M.A., Groudine, M., and Dean, A. (2010). Multiple functions of Ldb1 required for beta-globin activation during erythroid differentiation. Blood 116, 2356-2364.
Spilianakis, C.G., and Flavell, R.A. (2004). Long-range intrachromosomal interactions in the T helper type 2 cytokine locus. Nat Immunol 5, 1017-1027.
Tolhuis, B., Palstra, R.J., Splinter, E., Grosveld, F., and de Laat, W. (2002). Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol Cell 10, 1453-1465.
Tuan, D., Kong, S., and Hu, K. (1992). Transcription of the hypersensitive site HS2 enhancer in erythroid cells. Proc Natl Acad Sci U S A 89, 11219-11223.
Tuan, D.Y., Solomon, W.B., London, I.M., and Lee, D.P. (1989). An erythroid-specific, developmental-stage-independent enhancer far upstream of the human "beta-like globin" genes. Proc Natl Acad Sci U S A 86, 2554-2558.
Vakoc, C.R., Letting, D.L., Gheldof, N., Sawado, T., Bender, M.A., Groudine, M., Weiss, M.J., Dekker, J., and Blobel, G.A. (2005). Proximity among distant regulatory elements at the beta-globin locus requires GATA-1 and FOG-1. Mol Cell 17, 453-462.
Valge-Archer, V.E., Osada, H., Warren, A.J., Forster, A., Li, J., Baer, R., and Rabbitts, T.H. (1994). The LIM protein RBTN2 and the basic helix-loop-helix protein TAL1 are present in a complex in erythroid cells. Proc Natl Acad Sci U S A 91, 8617-8621.
Van der Ploeg, L.H., Konings, A., Oort, M., Roos, D., Bernini, L., and Flavell, R.A. (1980). gamma-beta-Thalassaemia studies showing that deletion of the gamma- and delta-genes influences beta-globin gene expression in man. Nature 283, 637-642.
Vassetzky, Y., Gavrilov, A., Eivazova, E., Priozhkova, I., Lipinski, M., and Razin, S. (2009). Chromosome conformation capture (from 3C to 5C) and its ChIP-based modification. Methods Mol Biol 567, 171-188.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. (2001). The sequence of the human genome. Science 291, 1304-1351.
Vernimmen, D., De Gobbi, M., Sloane-Stanley, J.A., Wood, W.G., and Higgs, D.R. (2007). Long-range chromosomal interactions regulate the timing of the transition between poised and active gene expression. EMBO J 26, 2041-2051.
Visel, A., Blow, M.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., et al. (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854-858.
Visel, A., Blow, M.J., McCulley, D.J., Li, Z.R., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., et al. (2010). ChIP-Seq identification of weakly conserved heart enhancers. Nat Genet 42, 806-810.
Volpi, E.V., Chevret, E., Jones, T., Vatcheva, R., Williamson, J., Beck, S., Campbell, R.D., Goldsworthy, M., Powis, S.H., Ragoussis, J., et al. (2000). Large-scale chromatin organization of the major histocompatibility complex and other regions of human chromosome 6 and its response to interferon in interphase nuclei. J Cell Sci 113 (Pt 9), 1565-1576.
Wadman, I.A., Osada, H., Grutz, G.G., Agulnick, A.D., Westphal, H., Forster, A., and Rabbitts, T.H. (1997). The LIM-only protein Lmo2 is a bridging molecule assembling an erythroid, DNA-binding complex which includes the TAL1, E47, GATA-1 and Ldb1/NLI proteins. EMBO J 16, 3145-3157.
Wallace, J.A., and Felsenfeld, G. (2007). We gather together: insulators and genome organization. Curr Opin Genet Dev 17, 400-407.
Wang, Q., Zhang, M., Wang, X., Yuan, W., Chen, D., Royer-Pokora, B., and Zhu, T. (2007). A novel transcript of the LMO2 gene, LMO2-c, is regulated by GATA-1 and PU.1 and encodes an antagonist of LMO2. Leukemia 21, 1015-1025.
Warren, A.J., Colledge, W.H., Carlton, M.B., Evans, M.J., Smith, A.J., and Rabbitts, T.H. (1994). The oncogenic cysteine-rich LIM domain protein rbtn2 is essential for erythroid development. Cell 78, 45-57.
Wijgerde, M., Gribnau, J., Trimborn, T., Nuez, B., Philipsen, S., Grosveld, F., and Fraser, P. (1996). The role of EKLF in human beta-globin gene competition. Genes Dev 10, 2894-2902.
Williams, R.R., Broad, S., Sheer, D., and Ragoussis, J. (2002). Subchromosomal positioning of the epidermal differentiation complex (EDC) in keratinocyte and lymphoblast interphase nuclei. Exp Cell Res 272, 163-175.
Wilson, N.K., Foster, S.D., Wang, X., Knezevic, K., Schutte, J., Kaimakis, P., Chilarska, P.M., Kinston, S., Ouwehand, W.H., Dzierzak, E., et al. (2010). Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell 7, 532-544.
Wu, C. (1980). The 5' ends of Drosophila heat-shock genes in chromatin are hypersensitive to DNase I. Nature 286, 854-860.
Wurtele, H., and Chartrand, P. (2006). Genome-wide scanning of HoxB1-associated loci in mouse ES cells using an open-ended Chromosome Conformation Capture methodology. Chromosome Res 14, 477-495.
Xi, H., Shulha, H.P., Lin, J.M., Vales, T.R., Fu, Y., Bodine, D.M., McKay, R.D., Chenoweth, J.G., Tesar, P.J., Furey, T.S., et al. (2007). Identification and characterization of cell type-specific and ubiquitous chromatin regulatory structures in the human genome. PLoS Genet 3, e136.
Yang, J., and Corces, V.G. (2011). Chromatin insulators: a role in nuclear organization and gene expression. Adv Cancer Res 110, 43-76.
Yu, M., Riva, L., Xie, H., Schindler, Y., Moran, T.B., Cheng, Y., Yu, D., Hardison, R., Weiss, M.J., Orkin, S.H., et al. (2009). Insights into GATA-1-mediated gene activation versus repression via genome-wide chromatin occupancy analysis. Mol Cell 36, 682-695.
Zhao, Z., Tavoosidana, G., Sjolinder, M., Gondor, A., Mariano, P., Wang, S., Kanduri, C., Lezcano, M., Sandhu, K.S., Singh, U., et al. (2006). Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat Genet 38, 1341-1347.
Appendices
Appendix 1. Coordinates of distal regulatory elements located upstream of the Lmo2
promoter in the mouse genome
Distal regulatory elements are named according to their distance upstream of the annotated Lmo2
transcription start site overlapping the proximal promoter. Coordinates are given for homology
regions identified by BLAT. All fragments were mapped to the NCBI m37 mouse assembly
(mm9).
Distal Regulatory Element  Chromosome  Starts at  Ends at
90                         Chr2        103720746  103720876
                           Chr2        103720495  103720607
                           Chr2        103720233  103720316
                           Chr2        103720614  103720670
                           Chr2        103720684  103720729
                           Chr2        103720909  103720959
75                         Chr2        103736385  103736662
                           Chr2        103733907  103733995
                           Chr2        103734073  103734138
                           Chr2        103741698  103741839
                           Chr2        103740691  103740810
70                         Chr2        103743108  103743214
                           Chr2        103743323  103743439
                           Chr2        103740817  103740859
                           Chr2        103743246  103743284
                           Chr2        103742632  103742668
                           Chr2        103743288  103743312
                           Chr2        103741882  103741906
                           Chr2        103741847  103741854
64                         Chr2        103746643  103746795
                           Chr2        103746829  103746975
                           Chr2        103746393  103746425
                           Chr2        103746616  103746634
                           Chr2        103746616  103746634
58                         Chr2        103753051  103753386
                           Chr2        103753006  103753038
47                         Chr2        103764161  103764463
43                         Chr2        103768003  103768252
40                         Chr2        103770972  103771182
                           Chr2        103770692  103770883
                           Chr2        103770897  103770953
35                         Chr2        103776285  103776669
                           Chr2        103776677  103776702
25                         Chr2        103786638  103786800
                           Chr2        103786084  103786206
                           Chr2        103785920  103785949
                           Chr2        103786575  103786598
                           Chr2        103786541  103786557
                           Chr2        103786363  103786383
                           Chr2        103785810  103785823
                           Chr2        103786233  103786243
                           Chr2        103786568  103786573
12                         Chr2        103798740  103798811
                           Chr2        103797914  103797994
                           Chr2        103798059  103798101
3                          Chr2        103807666  103807799
                           Chr2        103807840  103807879
                           Chr2        103807626  103807654
                           Chr2        103807906  103807921
+1                         Chr2        103811961  103812138
                           Chr2        103811728  103811830
                           Chr2        103811901  103811939
                           Chr2        103811837  103811880
+7                         Chr2        103818142  103818313
                           Chr2        103818037  103818138
                           Chr2        103817938  103817999
                           Chr2        103818354  103818374
                           Chr2        103818324  103818338
                           Chr2        103818018  103818031
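The naming convention in Appendix 1 can be checked with simple arithmetic. The Python sketch below is illustrative only and was not part of the original analysis. It assumes, hypothetically, that the end of the first proximal promoter homology block listed in Appendix 2 (103810400) approximates the annotated Lmo2 transcription start site, and it uses only the first homology block of a few elements. The names REFERENCE_BP, FIRST_BLOCK_START and position_kb are invented for this example. Negative values lie upstream (lower coordinates), consistent with the +1 and +7 labels for downstream elements.

```python
# Hypothetical reference point (an assumption, not the annotated TSS):
# end of the first proximal promoter homology block in Appendix 2.
REFERENCE_BP = 103810400

# Start of the first homology block for selected elements (Appendix 1, mm9).
FIRST_BLOCK_START = {
    "90": 103720746,
    "75": 103736385,
    "64": 103746643,
    "47": 103764161,
    "12": 103798740,
    "+7": 103818142,
}


def position_kb(start_bp, reference_bp=REFERENCE_BP):
    """Signed position in kb relative to the reference; negative is upstream."""
    return (start_bp - reference_bp) / 1000.0


for name, start in FIRST_BLOCK_START.items():
    print(f"{name:>3}  {position_kb(start):+7.1f} kb")
```

Under this assumption the computed positions fall within about 1 kb of the element names, which is the expected level of agreement given that the reference point is only an approximation of the transcription start site.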
Appendix 2. Coordinates of the Lmo2 proximal and distal promoters in the mouse genome
The coordinates of the proximal and distal promoters of the Lmo2 gene in the mouse
genome are listed in the table. Coordinates are given for homology regions identified by BLAT.
All fragments were mapped to the NCBI m37 mouse assembly (mm9).
Promoter Element                   Chromosome  Starts at  Ends at
Distal Promoter (dp)               Chr2        103788235  103788347
                                   Chr2        103788107  103788162
                                   Chr2        103788392  103788432
                                   Chr2        103788641  103788683
                                   Chr2        103788499  103788515
                                   Chr2        103788081  103788091
                                   Chr2        103788372  103788379
                                   Chr2        103788202  103788208
                                   Chr2        103788222  103788228
                                   Chr2        103788168  103788172
Proximal promoter (pP)             Chr2        103810212  103810400
                                   Chr2        103810495  103810533
                                   Chr2        103810398  103810445
                                   Chr2        103810212  103810400
Proximal promoter extended (pPex)  Chr2        103810620  103810724
                                   Chr2        103810734  103810812
                                   Chr2        103809864  103809907
                                   Chr2        103810495  103810545
                                   Chr2        103809630  103809688
                                   Chr2        103810398  103810445
                                   Chr2        103810140  103810170
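Some homology blocks in Appendix 2 overlap or are listed twice, so the sequence covered by an element is smaller than the sum of its block lengths. The Python sketch below shows one way to merge the blocks before counting bases. It is illustrative only: it assumes the coordinates are 1-based and inclusive (the table does not state the convention), and the function names merge_blocks and covered_bp are hypothetical.

```python
def merge_blocks(blocks):
    """Merge overlapping or duplicated (start, end) blocks; ends are inclusive."""
    merged = []
    for start, end in sorted(blocks):
        if merged and start <= merged[-1][1]:
            # Block overlaps the previous one: extend the previous block.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bp(blocks):
    """Number of distinct bases covered, assuming inclusive coordinates."""
    return sum(end - start + 1 for start, end in merge_blocks(blocks))


# Proximal promoter (pP) homology blocks as listed in Appendix 2 (Chr2, mm9).
PROXIMAL_PROMOTER = [
    (103810212, 103810400),
    (103810495, 103810533),
    (103810398, 103810445),
    (103810212, 103810400),
]

print(merge_blocks(PROXIMAL_PROMOTER))
print(covered_bp(PROXIMAL_PROMOTER))
```

For the proximal promoter, the four listed blocks collapse to two non-overlapping intervals. Adjacent blocks that do not share a base are deliberately left separate so that gaps between homology regions remain visible.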
Appendix 3. Primers used in quantitative chromosome conformation capture and RT-qPCR
Specific primers are listed for the chromosome conformation capture (3C) and RT-qPCR analyses. L denotes a left primer and R a right primer; primers used to test HindIII restriction digestion efficiency are marked REX.
3C primers Sequence 5'-3'
LMO2-3C-pP-L AGGAGAGAAACAACAACCCTTT
LMO2-3C-upstream pP-L GGGGACCTAGGTTTTTCTCCT
LMO2-3C-downstream 12E-L TTCAGACTTCTGACATCCTTATTTC
LMO2-3C-12E-L CTGCCTTACCTTGAGCTTGG
LMO2-3C-25E-L ACCCTTGGCAATTAACGTGT
LMO2-3C-downstream 35E-L TGTGGAACACCAACTTTTCCT
LMO2-3C-35E-L CTGGGCCAAGGGGTATAGAG
LMO2-3C-47E-L CTACTCCCGCTCAAAACTGC
LMO2-3C-downstream 58E-R CTTGGTACCCAGGAACTAGCA
LMO2-3C-64E-R CTCCCCTCCCTCAAACATTA
LMO2-3C-70E-L GGACTACGGAGCTGAAACCA
LMO2-3C-75E-R CCCTACAACATGCATCTCCA
LMO2-3C-upstream 75E-R TGCTTGATCATGGTTACAGGTC
LMO2-3C-downstream 90E-R TTGGGGTCATTATCTCTTTGCT
LMO2-3C-90E-L GGCCCTTATAATTTGGCACA
LMO2-3C-blankregion1-L TGGCACACATCTACAAGAGCA
LMO2-3C-blankregion2-L TCTCTGAACTGTTCCCTGGAG
LMO2-3C-upstream Caprinpr-L TTTCATCAAGTGCATCTTTGC
LMO2-3C-Caprinpr-L CCAGAGAGGCTGTTGGTTACT
LMO2-3C-downstream Caprinpr-L CATTGGTATGTTCATTACCTAGACA
Alpha aortic actinHIII-4-3C-L CCCTAGTCAGCCATCTCCTCT
Alpha aortic actinHIII-5-3C-L TGCAGTTATGTTCCACAGCAG
RT-qPCR Primers Sequence 5′-3′
CTCF peak1-L ATGCTGGTTTGTCATCTCCTGA
CTCF peak1-R AGTGCATGAGGATGTGCAATTA
CTCF peak 2-L ACTCACAGATTTGCTGGAGAGAC
CTCF peak 2-R TGGTGATTAACCAACTTCAGACA
CTCF peak 3-L GGTTGCTATGGTTGCAGATAGAG
CTCF peak3-R CAGGAGTTGTGTAGACCGAGAAT
u90-L TCATCAGCACTTACAGCCTCA
u90-R ATATGGCTGCAACAATTTCTGA
90DRE-L ATTCTTGTTTATGTAGGGGTGATGT
90DRE-R GATACCATAAAATCAGAGGCAGGTA
d90-L CCCCCTTTAGAGTACTGCACTG
d90-R CCAGATACGATGCCTGTGATAG
d90A-L TCTTCCAGTTCAATATGCTCCTAAC
d90A-R GGAGAGGTCTGATACAGTCGTTTTA
d90B-L ATATTTAGAAAGGCCAGAATTTTGCT
d90B-R GTTTGGGAATTATAGCCCTACGATA
90-75A-L ACTGAGGAAGTGCAGCAGATTAAC
90-75A-R AAGAATTTCAGCGAACTCTAAGGA
u75A-L GCAGGTGGTATTGTTTAGTGAGGTA
u75A-R AGAAGCATGGGGTAGTGGATT
u75B-L ACCAAGCGGAGGCTGTATTA
u75B-R TTAGCTGCCTCAGAAGATAATGG
75DRE-L CAGCTAACTGTTACAGGAGAAGGAG
75DRE-R TGGGATCTGGGAGAGTATACTACAG
d75-L GGAAGTAAGGGAGACCCATTG
d75-R TTTGCTAGAAATCCCAACGTG
u70-L AACTATGGGGAGCATAAGCAAA
u70-R CAGGCAAATATCTAGGGGAAAA
70DRE-L AAAGGGGCCAGCTAGGAG
70DRE-R CTCAACCTGTTGGCGTATCC
47DRE-R GTCCGAACCTTTCAGTGTTCTC
d70-R TACTGGGTTAAAGAAGGGGTGA
u64A-L ACCAAGTTGGCTAAGGGTAGTTTT
u64B-L GAGGAGCCAGAGTTAAACCAAGT
u64AB-R GGACACCTAATAACGTGTTAGGATTAG
u64-L GGCTAAGGGTAGTTTTGCAGAG
u64-R CGATCAGACTGAGTGTGTGAGA
64DRE-L AAGGATCAGTGTGGAACTTGC
64DRE-R2 TGGCGACAGCACAGAAATAG
d64-L CCTGCTGTTTATGCAACACTTC
d64-R GCCTAACAAACTGGGATTCACT
64-58ER GGTAGCAATCTGGATATCTTGGAG
64-58EL ATGTATCCTTCAGAGGAGGCATAG
64-58CR TCTGCTTAATTGTTGGGCCTCT
64-58CL TAAGTCAACCTGCCGTTAATTGTA
u58-L AATAAGAGAGGAGAACGCAGTATGA
u58-R CATGTTTAGAAACAGAGGGTTATGC
58DRE-L TTCAGAACTCCCCGAAGAGA
58DRE-R CTCAGTTCCAAACCGCTCAG
d58-L TGGCATTGATTTTCCCTATTTT
d58-R GCCCTGTACCTACCTCAAGATG
58-47AR TCCTTTTACGGAACATGATGAACT
58-47AL AGAGTGACTTCAATTTGGACCATT
58-47BR CATTTTAGCTTCCCAAATGGTTAT
58-47BL AGATACGTGACCTAAACAGCATTC
58-47CL GAAGACTGCCTCGGTTTATTCTTA
58-47CR ATTTACACCTTGTCCTGATTTCGT
58-47DL GTCTATAACACAGATGACCCATGC
58-47DR AGTTTGGGAACAATCAGAAGCTAT
47DRE-L CAGTGCATGGAGTTAATGGAAA
47DRE-R ACTACAACTTGGTGCTGGCAAT
43DRE-L GTGGGCCAATTAGTGTCTGG
43DRE-R CCCCAGGCTTTGTTCTACATT
40DRE-L GAGGGAGGGAGTTCGTAACA
40DRE-R AATAATGAATGCGCGTCTCC
35DRE-L GGCATGATCGATACAAGACAGA
35DRE-R GCACTTAAATGGAACTCCCAAC
d35-L GCCACATACCATCTAAACAGCA
d35-R CTACTGGTGCCCTGTCCTACTC
u25-L ATGACTGGATTCACCACCTTG
u25-R GCTAACCACATCAAACCAACC
25DRE-L GGGGATGAATGCATGATAGACT
25DRE-R GGCTGAAGGGAAACTGTGTAAC
d25-L AGAACAGCCAGGTGAGATGAA
d25-R AGGCATCATCCTAACCAGTGA
u12-L1 CTTTTCAACTCCCGGAGGAT
u12-R1 GGGAGAGGTACCTTCTTCAAGC
u12-L2 GAAGTACTGCGGTCCTTGATATG
u12-R2 TATTCTTATACAAGCATGGGCATC
12DRE-L GCAAAAAGTTGCCAGATAAAAGATA
12DRE-R ACATTGTAAGTCTTCGAGGTAGGTG
d12-L GGGATGTTAAAAGGGATCCTG
d12-R CATGAGCGAGCAGAATTTGAC
dP_Int_Lmo2-L ACTTTGCTGACTTCCACAAGGAC
dP_Int_Lmo2-R GATGTAATCCCTGTGACTCCTGAT
d-dPL1 CTAAAGTCACGAGAAGGACCAAA
d-dPR1 CCAAAGACTCCTTACTTGCTCAG
d-dPL2 CTGCACCCTAGATGAATAACACC
d-dPR2 ACTGTTTGGGTATGCTACACTCG
d-dPL3 AAGGACTTGGAATAACCTTGCTAGT
d-dPR3 TGGTAGTAGGAACACTCTCTCGTCT
pP_Int_Lmo2-L GATGGAAGGTTAAGTCCTGAGCA
pP_Int_Lmo2-R AAAGAGAGAGAGCGAATCATCCAG
Lmo2 Exon2-L2 ATCGAAAGGAAGAGCCTGGAC
Lmo2 Intron2 -R2 GGTCGATCCCAGTTACAGCTTC
Pkd2_In2-L GGAGGGAAAGAGCTGACCTTA
Pkd2_EX3-R AGCTCATCATGCCGTAGGTC
Vh16 genic-L GGAGGGTCCACTAAACTCTCTTG
Vh16 genic-R GCATAGCCTTTTCCACTCTCATC
GapdhE1I1-L CTTCTTGTGCAGTGCCAGGTGA
GapdhE1I1-R CGCACCAGCATCCCTAGACC
Slc4a1E1I1-L TGGGAGCTCAGCCAGTCACA
Slc4a1E1I1-R CGGGACAGATGCCAAAGGAC
Caprin1E3-L2 CCTTTCCCCTTTATTCATTCG
Caprin1I2-R2 AGCAATGGTCAGTGTTTCAAGTT
EpnE1I1-L CTGGAAGCCCGGTATAAGC
EpnE1I1-R GTACAAAAGCAGCCACAAGC
LMO2 pP REX -L AGGAGAGAAACAACAACCCTTT
LMO2 pP REX -R TGCCTCCCCAACTGTGTAAT
ULmo2 pPREX-L GGGGACCTAGGTTTTTCTCCT
ULmo2 pPREX-R GGAAGTTCCTTCCCGATAAAA
25E REX -R TTTGGCTGATGCAGAGAATG
25E REX -L ACCCTTGGCAATTAACGTGT
70E REX -L GGACTACGGAGCTGAAACCA
70E REX -R CTCCCCTCCCTCAAACATTA
75E REX -R AGCCAGGCACAAATTACCTC
75E REX -L GTGGCACTCTCTGCTGACC
UCaprin1 REX -R TCCCTGTCAAACTGATGCAC
UCaprin1 REX -L TTTCATCAAGTGCATCTTTGC
Caprin1pr REX -R TTTCCCAAGTAGGTCCCTGA
Caprin1pr REX -L CCAGAGAGGCTGTTGGTTACT
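As a quick sanity check on a primer list like the one above, basic properties such as length, GC fraction and an approximate melting temperature can be computed directly from the sequences. The Python sketch below is illustrative only: the thesis does not state how the primers were designed or evaluated, the helper names are hypothetical, and the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a rough estimate intended for short oligonucleotides.

```python
def clean_sequence(raw):
    """Upper-case a primer and drop whitespace left over from text extraction."""
    return "".join(raw.split()).upper()


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = clean_sequence(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C using the Wallace rule."""
    seq = clean_sequence(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two primers copied from the table (the LMO2 pP REX pair).
primers = {
    "LMO2 pP REX -L": "AGGAGAGAAACAACAACCCTTT",
    "LMO2 pP REX -R": "TGCCTCCCCAACTGTGTAAT",
}

for name, seq in primers.items():
    print(name, len(clean_sequence(seq)), round(gc_fraction(seq), 2), wallace_tm(seq))
```

For this pair the rule gives estimates of 62 and 60 °C, that is, within a few degrees of each other; a nearest-neighbour model would be needed for anything more precise.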
Appendix 4. Restriction digestion efficiency in chromosome conformation capture
Restriction digestion efficiency was between 85 and 95% at several HindIII restriction sites.
Abbreviations: Lmo2 proximal promoter (pP), distal regulatory element (DRE), Caprin1 promoter
(Caprin1P). “U” denotes a restriction fragment upstream of the indicated element.
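The caption does not give the formula used to obtain these percentages. One commonly used qPCR-based calculation compares an amplicon spanning the restriction site with a control amplicon that contains no site, in digested and undigested samples. The Python sketch below assumes that approach; the function name and all Ct values are hypothetical and are not taken from the thesis.

```python
def digestion_efficiency(ct_site_digested, ct_control_digested,
                         ct_site_undigested, ct_control_undigested):
    """Percent digestion at one restriction site from qPCR Ct values.

    Assumed formula: 100 - 100 / 2 ** (dCt_digested - dCt_undigested),
    where each dCt is Ct(site-spanning amplicon) - Ct(control amplicon).
    It also assumes roughly 100% amplification efficiency for both amplicons.
    """
    delta_digested = ct_site_digested - ct_control_digested
    delta_undigested = ct_site_undigested - ct_control_undigested
    return 100.0 - 100.0 / 2 ** (delta_digested - delta_undigested)


# Hypothetical Ct values: cutting delays the site-spanning amplicon by 4 cycles
# relative to the control, which corresponds to 1/16 of the template remaining uncut.
print(digestion_efficiency(27.0, 22.0, 23.0, 22.0))  # prints 93.75
```

With these illustrative numbers the result falls inside the 85 to 95% range reported in the caption.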
Appendix 5. Transcription factor bound regions at the 75 and 12DREs
A) The 75 distal regulatory element (DRE) overlaps multiple transcription factor peaks from
erythroid (MEL and G1E-ER4) cell ChIP-Seq data. B) Similarly, the 12 DRE overlaps GATA1
and LDB1 peaks from erythroid (MEL and G1E-ER4) cell ChIP-Seq data. In both panels,
homology regions are indicated by black boxes joined by a line that delineates the human enhancer
construct used to generate transgenic mice. Overlapping p300 peaks from mouse ENCODE
ChIP-Seq data are also shown at the 75 and 12 DREs.
Appendix 6. QPCR products of gene expression profile run on agarose gel
The QPCR products of the gene expression profile (A-Vh16, B-Gapdh, C-Epn1, D-Slc4a1,
E-Pkd2, F-Caprin1, G-Lmo2, H-Lmo2 proximal promoter, I-Lmo2 distal promoter, J-downstream
of Lmo2 distal promoter) were run on a 2% agarose gel to check for the presence of a single
amplicon. bp-base pairs.
Appendix 7. QPCR products of intergenic transcripts at the 90, 75, and 12 DREs run on agarose gel
The QPCR products of the amplified intergenic transcripts were run on a 2% agarose gel to
check for the presence of a single amplicon (A-d90EA, B-d90EB, C-u90, D-90DRE, E-d90EC,
F-CTCF Peak1, G-CTCF Peak2, H-CTCF Peak3, I-u75E, J-75TFBS, K-75EA, L-75EB,
M-77MID, N-d75E, O-u75, P-u12E, Q-12DRE, R-d12E). bp-base pairs, u-upstream,
d-downstream.
Appendix 8. QPCR products of the intergenic transcripts at the 25, 35, 40, 43, 47, 58, 64, and 70
DREs run on agarose gel
The QPCR products of the amplified intergenic transcripts were run on a 2% agarose gel to
check for the presence of a single amplicon (A-35DRE, B-d35E, C-58DRE, E-u58E, F-u64EA,
G-u64EB, H-u64EE, I-64DRE, J-d64E, K-u70E, L-70DRE, M-d70E, N-40DRE, O-43DRE,
P-47DRE, Q-d25E, R-25DRE). bp-base pairs, u-upstream, d-downstream.
Appendix 9. QPCR products of the intergenic transcripts located between the 47 and
58DREs, 64 and 58DREs, and between the 90 and 75DREs run on agarose gel
The QPCR products of the amplified intergenic transcripts were run on a 2% agarose gel to
check for the presence of a single amplicon. 90-75: located between the 90 and 75DREs, 64-58:
located between 64 and 58DREs, 58-47: located between 58 and 47DREs.
Appendix 10. Chromosome conformation capture (3C) products run on agarose gel (75E
as the anchor fragment)
All the 3C products with 75E as the anchor fragment were run on a 2% agarose gel. pP-Lmo2
proximal promoter, b-blank region, d-downstream, u-upstream, bp-base pairs.
Appendix 11. Chromosome conformation capture (3C) products with Lmo2 proximal
promoter as anchor fragment run on agarose gel
All the 3C products with the Lmo2 proximal promoter (pP) as the anchor fragment were run on a 2%
agarose gel. Cp-Caprin1 promoter, b-blank region, d-downstream, u-upstream, bp-base pairs.
Appendix 12. Chromosome conformation capture (3C) products with Caprin1 promoter as
anchor fragment run on agarose gel
All the 3C products with the Caprin1 promoter (Cp) as the anchor fragment were run on a 2% agarose
gel. pP-Lmo2 proximal promoter, b-blank region, d-downstream, u-upstream, bp-base pairs.
Appendix 13. Primary intergenic transcript levels in adult mouse anaemic spleen and
kidney cells at the Lmo2/Caprin1 locus
Primary intergenic transcript levels as observed in adult mouse anaemic spleen and kidney.
Levels were quantitatively assessed by RT-qPCR and expressed relative to Gapdh. Gapdh is a
ubiquitously expressed reference gene.
Chromosome Starts at Ends at Erythroid Kidney
chr2 103719773 103719855 0.07 0.00
chr2 103720768 103720926 0.19 0.00
chr2 103721765 103721936 0.52 0.00
chr2 103733981 103734083 0.00 0.00
chr2 103734036 103734162 0.20 0.00
chr2 103736359 103736503 0.76 0.00
chr2 103733874 103733987 0.70 0.02
chr2 103735453 103735544 1.83 0.00
chr2 103737677 103737813 0.72 0.00
chr2 103741899 103741916 1.74 0.00
chr2 103743566 103743701 1.67 0.02
chr2 103740379 103740594 0.98 0.01
chr2 103744683 103744817 1.40 0.02
chr2 103747146 103747258 0.83 0.00
chr2 103753011 103753170 5.77 0.04
chr2 103753441 103753558 4.69 0.03
chr2 103764283 103764406 0.32 0.03
chr2 103768070 103768257 0.32 0.00
chr2 103770958 103771051 0.24 0.01
chr2 103776571 103776653 0.25 0.14
chr2 103776976 103777175 0.25 0.04
chr2 103785900 103786097 0.09 0.01
chr2 103787110 103787196 0.03 0.00
chr2 103785656 103785825 0.16 0.00
chr2 103796380 103796571 2.50 0.02
chr2 103797876 103798028 0.55 0.00
chr2 103799435 103799515 21.57 0.07
chr2 103788221 103788401 0.14 0.02
chr2 103788891 103788984 0.06 0.01
chr2 103810515 103810638 5.89 0.20
chr2 103746604 103746751 2.95 0.01
chr2 103798226 103798325 0.55 0.00
chr2 103712293 103712439 0.15 0.00
chr2 103705781 103705877 0.15 0.00
chr2 103688916 103689025 0.04 0.01
chr2 103810841 103811055 15.07 1.28
chr2 103623318 103623538 1.00 0.43
chr2 103746245 103746326 4.73 0.03
chr2 103722341 103722435 0.84 0.00
chr2 103725075 103725176 0.9 0.00
chr2 103731605 103731706 0.18 0.00
chr2 103752577 103752695 3.2 0.00
chr2 103792261 103792371 0.00 0.00
chr2 103794182 103794283 0.00 0.00
chr2 103729128 103729266 0.41 0.00
chr2 103749894 103749988 0.42 0.00
chr2 103750643 103750736 1.71 0.03
chr2 103754295 103754405 0.83 0.00
chr2 103755261 103755406 0.69 0.00
chr2 103759636 103759735 0.43 0.03
chr2 103753751 103753850 2.26 0.00