Identification of fusion transcripts with retroviral elements and its application as a cancer biomarker
Yun-Ji Kim1, Jae-Won Huh2, Dae-Soo Kim3, Hong-Seok Ha1, Kung Ahn1, Ja-Rang Lee1, Yi-Deun Jung1, and Heui-Soo Kim1
1 Division of Biological Sciences, College of Natural Sciences, Pusan National University, Busan 609-735, Republic of Korea
2 National Primate Research Center (NPRC), KRIBB, Ochang, Chungbuk 363-883, Republic of Korea
3 Korea Bioinformation Center, KRIBB, Daejeon 305-806, Korea
http://www.primate.or.kr
Abstract
Approximately 45% of the human genome is estimated to consist of transposable elements (TEs). TEs have been reported to affect adjacent genes by altering their transcriptional regulation. Most TEs are transcriptionally silent in normal tissues; however, some have been found to be expressed specifically in cancer cell lines. Here we investigated cancer-specific fusion transcripts containing TEs using bioinformatics and experimental approaches. To identify candidate cancer markers, we adopted an analysis pipeline that screens expressed human sequences for cancer-specific expression, and we developed a database from the results. In our analysis of the EST database, a total of 999 genes fused with transposable elements were found to be cancer-specific. To confirm the candidate marker transcripts, experimental validation was conducted by RT-PCR analysis in tumor/adjacent normal tissues and corresponding cancer cell lines. Our results could contribute to the understanding of human cancers in relation to transposable elements.
References
1. Kim TH, Jeon YJ, Kim WY, Kim HS: HESAS: HERVs expression and structure analysis system. Bioinformatics 2005, 15:1699-1970.
2. Kim DS, Kim TH, Huh JW, Kim IC, Kim SW, Park HS, Kim HS: LINE FUSION GENES: a database of LINE expression in human genes. BMC Genomics 2006, 7:139.
Figure. Hypothetical model for retroelements in the human genome. Panels: supplying the promoter or enhancer (transcription change); exonization in UTR and CDS region; alternative promoter; alternative polyadenylation.
Figure. Classification of retroelements (RNA intermediate) by the presence (+) or absence (-) of LTR elements, env, and RT. Retroposons: SINE (human Alu, with poly(A)) and LINE (L1, with ORF1, ORF2, and poly(A)). Retrotransposons: yeast Ty1/copia and truncated HERVs (LTR-ORF1-ORF2-LTR), and human THE1 (LTR-LTR). Retroviruses: full-length HERVs/exogenous retrovirus (LTR-gag-pol-env-LTR).
Figure. Location of transposable element fusion ESTs (y-axis: percent of exons, %). 5′UTR: 11%; CDS: 82%; 3′UTR: 6%.
Figure. Genes (%) by transposable element fusion EST count. x-axis: transposable element fusion EST counts (1, 2, 3, 4, 5, 6, 11, 17); y-axis: genes (%). Bar labels: 79.8%, 13.6%, 3.8%, 1.6%, 0.7%, 0.2%, 0.1%, 0.1%.
Aims
Most TEs are transcriptionally silent in normal human tissues; however, some TEs have been found to be expressed in placental tissues and cancer cell lines. L1 antisense promoter-driven transcription has been detected in both human tumor cells and normal cells, while HERV LTR elements have shown bidirectional promoter activity (Medstrand et al., 2001; Nigumann et al., 2002; Dunn et al., 2003; Sin et al., 2006). These elements could contribute to organismal complexity through transcriptional diversity (Landry et al., 2003). Here, we developed a database for understanding the mechanism of cancer development in relation to TEs in human EST sequences, and conducted experimental validation using RT-PCR in tumor/adjacent normal tissues and corresponding cancer cell lines to confirm the candidate marker transcripts.
RT-PCR & Real-time PCR
Bioinformatics: NCBI, BLAST, MEGA3
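The screening idea described in the abstract, keeping genes whose TE-fusion ESTs are seen only in cancer-derived libraries, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the record fields (`gene`, `te_fused`, `library`), the EST and gene names, and the strict "cancer libraries only" rule are all assumptions made for the example.

```python
from collections import defaultdict


def cancer_specific_genes(est_records):
    """Return genes whose TE-fused ESTs come only from cancer libraries.

    Each record is a dict with hypothetical keys:
      gene     - gene symbol the EST maps to
      te_fused - True if the EST overlaps a transposable element
      library  - "cancer" or "normal" (tissue source of the EST library)
    """
    sources = defaultdict(set)
    for rec in est_records:
        if rec["te_fused"]:
            sources[rec["gene"]].add(rec["library"])
    # Keep a gene only if every TE-fused EST came from a cancer library.
    return sorted(g for g, libs in sources.items() if libs == {"cancer"})


# Toy records (hypothetical, not data from the poster).
records = [
    {"est": "EST_A", "gene": "GENE1", "te_fused": True, "library": "cancer"},
    {"est": "EST_B", "gene": "GENE1", "te_fused": True, "library": "cancer"},
    {"est": "EST_C", "gene": "GENE2", "te_fused": True, "library": "cancer"},
    {"est": "EST_D", "gene": "GENE2", "te_fused": True, "library": "normal"},
    {"est": "EST_E", "gene": "GENE3", "te_fused": False, "library": "cancer"},
]

print(cancer_specific_genes(records))  # ['GENE1']
```

In this toy input, GENE2 is excluded because one of its fusion ESTs comes from a normal library, and GENE3 is excluded because its EST is not TE-fused. A real screen would likely also need thresholds on EST counts and library annotation quality.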
Table. Distribution of transposable element family in region of transposable element exonization

Fusion region    SINE Family   LINE Family   LTR Family   DNA Family   Others
within genes
CDS              619           280           85           76           1
5′UTR            76            30            33           5            0
3′UTR            44            20            14           5            0
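As a quick arithmetic check, the family counts in the table above can be summed per region and expressed as a share of all fusion exons. The sketch below only re-derives percentages from the tabulated counts; the variable and function names are illustrative, not part of the original analysis.

```python
# Counts copied from the family-by-region table.
TABLE = {
    "CDS":   {"SINE": 619, "LINE": 280, "LTR": 85, "DNA": 76, "Others": 1},
    "5'UTR": {"SINE": 76,  "LINE": 30,  "LTR": 33, "DNA": 5,  "Others": 0},
    "3'UTR": {"SINE": 44,  "LINE": 20,  "LTR": 14, "DNA": 5,  "Others": 0},
}


def region_shares(table):
    """Return each region's percentage of the total fusion-exon count."""
    totals = {region: sum(counts.values()) for region, counts in table.items()}
    grand_total = sum(totals.values())
    return {region: 100 * n / grand_total for region, n in totals.items()}


shares = region_shares(TABLE)
for region, pct in shares.items():
    print(f"{region}: {pct:.1f}%")
# CDS: 82.4%
# 5'UTR: 11.2%
# 3'UTR: 6.4%
```

Rounded to whole percentages, these agree with the 11%, 82%, and 6% shown for the 5′UTR, CDS, and 3′UTR in the location chart.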
Figure. AKR1C2 (aldo-keto reductase family 1, member C2), Chr. 10 p15.1. Sequences shown: NM_2058453.1, NM_001354.4, CB106780. Annotated elements: LTR/MaLR MLT1L, LINE/L1, LTR/MaLR MSTA. RT-PCR in liver normal (N) and liver cancer (C) tissues at 30, 32, and 34 cycles; product 300 bp; GAPDH control 120 bp.
Figure. TJP2 (tight junction protein 2, zona occludens 2), Chr. 9 q21.11. Sequences shown: NM_004817.2, NM_201629.1, AW604158. Annotated element: AluJo/FRAM; legend: coding region, untranslated region. RT-PCR in colon normal (N) and colon cancer (C) tissues; product 168 bp; GAPDH control 120 bp.
Table. Potential splice sites utilized by transposable element fusion exons

Type of potential    SINE Family   LINE Family   LTR Family   DNA Family
splicing site
Acceptor & donor     83            68            50           12
Acceptor site        271           110           33           28
Donor site           216           80            43           18
Table. Transposable elements: occurrences and percent by subfamily

Family   Subfamily   Occurrences   Percent (%)
SINE     Alu         20            1.44
SINE     AluJ        171           12.35
SINE     AluS        244           17.62
SINE     MIR         250           18.05
SINE     FAM         2             0.14
SINE     FRAM        18            1.30
SINE     FLAM        37            2.67
LINE     HAL         13            0.94
LINE     L1HS        1             0.07
LINE     L1P         18            1.30
LINE     L1M         153           11.05
LINE     L2          151           10.90
LINE     L3          25            1.81
LTR      MaLR        67            4.84
LTR      ERV1        40            2.89
LTR      ERVL        27            1.95
LTR      ERVK        6             0.43
DNA      Charlie     9             0.65
DNA      HSMAR2      2             0.14
DNA      Kanga1      1             0.07
DNA      MARNA       3             0.22
DNA      MER         61            4.40
DNA      Tigger      14            1.01
DNA      Zaphod2     1             0.07
Others   Charlie     1             0.07

Table. Transposable elements fusion in gene region

Family        Subfamily   5′UTR   CDS   3′UTR
SINE Family   Alu         0       20    0
SINE Family   AluJ        20      131   12
SINE Family   AluS        13      190   15
SINE Family   AluY        3       37    5
SINE Family   MIR         33      198   7
SINE Family   FAM         0       2     0
SINE Family   FRAM        0       16    2
SINE Family   FLAM        7       25    3
LINE Family   HAL         0       11    0
LINE Family   L1HS        0       1     0
LINE Family   L1P         1       12    5
LINE Family   L1M         6       125   6
LINE Family   L2          22      111   7
LINE Family   L3          1       20    2
LTR Family    MaLR        16      40    6
LTR Family    ERV1        13      23    3
LTR Family    ERVL        4       16    5
LTR Family    ERVK        0       6     0
DNA Family    Charlie     0       9     0
DNA Family    HSMAR2      0       2     0
DNA Family    Kanga1      0       0     1
DNA Family    MARNA       0       3     0
DNA Family    MER         5       50    3
DNA Family    Tigger      0       11    1
DNA Family    Zaphod2     0       1     0
Others        Charlie     0       1     0
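The subfamily counts by gene region can be cross-checked against the family-level table by summing each family's subfamilies per region. The sketch below is only a consistency check on the tabulated numbers; the data structure and function names are illustrative assumptions.

```python
# (5'UTR, CDS, 3'UTR) counts per subfamily, copied from the gene-region table.
SUBFAMILY_COUNTS = {
    "SINE": {"Alu": (0, 20, 0), "AluJ": (20, 131, 12), "AluS": (13, 190, 15),
             "AluY": (3, 37, 5), "MIR": (33, 198, 7), "FAM": (0, 2, 0),
             "FRAM": (0, 16, 2), "FLAM": (7, 25, 3)},
    "LINE": {"HAL": (0, 11, 0), "L1HS": (0, 1, 0), "L1P": (1, 12, 5),
             "L1M": (6, 125, 6), "L2": (22, 111, 7), "L3": (1, 20, 2)},
    "LTR": {"MaLR": (16, 40, 6), "ERV1": (13, 23, 3), "ERVL": (4, 16, 5),
            "ERVK": (0, 6, 0)},
    "DNA": {"Charlie": (0, 9, 0), "HSMAR2": (0, 2, 0), "Kanga1": (0, 0, 1),
            "MARNA": (0, 3, 0), "MER": (5, 50, 3), "Tigger": (0, 11, 1),
            "Zaphod2": (0, 1, 0)},
    "Others": {"Charlie": (0, 1, 0)},
}

# (5'UTR, CDS, 3'UTR) totals per family, copied from the family-level table.
FAMILY_TOTALS = {
    "SINE": (76, 619, 44),
    "LINE": (30, 280, 20),
    "LTR": (33, 85, 14),
    "DNA": (5, 76, 5),
    "Others": (0, 1, 0),
}


def family_sums(counts):
    """Sum subfamily counts column-wise (5'UTR, CDS, 3'UTR) for each family."""
    return {family: tuple(sum(col) for col in zip(*subs.values()))
            for family, subs in counts.items()}


sums = family_sums(SUBFAMILY_COUNTS)
mismatches = {f: (sums[f], FAMILY_TOTALS[f])
              for f in FAMILY_TOTALS if sums[f] != FAMILY_TOTALS[f]}
print(mismatches or "subfamily sums match the family totals")
```

With the values as tabulated, every family's subfamily sums match the family-level totals for all three regions.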
Diagram labels: DATABASE; Computational data; Experimental data (tumor/adjacent normal tissues).