Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome
Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome. Brooke Peterson-Burch, Voytas Laboratory, Iowa State University. Beyond genes. Most DNA in eukaryotes doesn’t code for anything necessary for the survival and replication of the organism.
![Page 1: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/1.jpg)
Diversity and survival strategies of LTR
retrotransposons in the Arabidopsis genome
Brooke Peterson-Burch
Voytas Laboratory
Iowa State University
![Page 2: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/2.jpg)
Beyond genes
Most DNA in eukaryotes doesn’t code for anything necessary for the survival and replication of the organism.
How did that sequence get there?
Why isn’t it eliminated?
Genome sequences can teach us about genome evolution and the part that retroelements play.
![Page 3: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/3.jpg)
What’s a retroelement?
Type of transposable element
An mRNA copy of the parental element ‘genome’ is reverse transcribed into DNA and inserted into a new location in the host
Transposition is replicative
![Page 4: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/4.jpg)
Retroelement genomes
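[Figure: genome maps of the retroelement groups, with labels as follows.
Retroviridae (HIV-1): LTR, gag (MA, CA, NC, p6), pol (PR, RT, RH, IN), env (SU, TM), LTR, plus vif, vpr, vpu, tat, rev, nef.
Retroposons: gag, RT, RH, EN, poly(A) tail (AAAn).
Pseudoviridae: MA, CA, NC, PR, IN, RT, RH.
Metaviridae: MA, CA, NC, PR, RT, RH, IN.
Dirs: gag, RT, RH, λ recombinase.
BEL: gag, PR, RT, RH, IN.]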
![Page 5: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/5.jpg)
Retro living…
[Figure: life-cycle diagram. The element is transcribed into mRNA, which is then translated. The HIV-1 and Pseudoviridae genome maps from the previous slide are repeated.]
![Page 6: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/6.jpg)
Retroelement life cycle
[Figure: life-cycle diagram, packaging step. The element's products are packaged into a particle. Only viruses escape the host cell. The HIV-1 and Pseudoviridae genome maps are repeated.]
![Page 7: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/7.jpg)
Retroelement life cycle
[Figure: life-cycle diagram, reverse transcription step. The packaged mRNA is reverse transcribed into cDNA. The HIV-1 and Pseudoviridae genome maps are repeated.]
![Page 8: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/8.jpg)
Retroelement life cycle
[Figure: life-cycle diagram, integration step. Integrase (IN) inserts the cDNA into the host genome, producing a new copy. The HIV-1 and Pseudoviridae genome maps are repeated.]
![Page 9: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/9.jpg)
Retroelements play a major role in the structure and evolution of many genomes
Genome sequences provide a great resource for diversity, distribution, and element identification studies
![Page 10: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/10.jpg)
Retroelements and Genomes
Genome data-mining can help answer questions about:
• Number of elements
• Types of elements
• Diversity
• Physical distribution
• Impact on host
• Odd or interesting elements
• Evolutionary history
• Element sequence and domain characteristics
![Page 11: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/11.jpg)
Diversity of the Pseudoviridae
![Page 12: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/12.jpg)
A retroelement family tree
[Figure: retroelement family tree with branches labeled Retroposons, Pseudoviridae, BEL, Dirs, Retroviridae, and Metaviridae.]
![Page 13: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/13.jpg)
A. thaliana captures all plant Pseudoviridae diversity
[Figure: phylogenetic tree of Pseudoviridae elements with bootstrap values and a 0.1 scale bar. Named taxa include Melmoth, Tgmr, Tst1, AtRE1, Evelknievel, Hopscotch, Retrofit, Lueckenbuesser (G), Osser (G), Endovir1-1, SIRE-1, ToRTL1, Opie-2, PREM-2, Art1, Tpv2-6, copia (I), RIRE1, BARE-1, Sto-4, Tnt1-94, Tto1, Panzee, Ta1-3, Tca5 (F), 1731, Ty4 (F), Ty1 (F), Tca2 (F), Ty5-6p (F), and Mosqcopia (I). Several further taxa are labeled by number only (e.g. 2 2904626, 5 21307623, 1 16648808, 5 14977057). An inset repeats the retroelement family tree (Retroposons, Pseudoviridae, BEL, Dirs, Retroviridae, Metaviridae).]
![Page 14: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/14.jpg)
Mapping proteases to HIV-1 structure helps explain patterns of conservation
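[Figure: element genome map labeled LTR, MA, CA, NC, PR, IN, RT, RH, LTR, with the protease (PR) label set apart from the others.]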
![Page 15: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/15.jpg)
Integrase: what’s happening in the back?
[Figure: schematic comparison of integrase (IN) across elements. All share a common region carrying the conserved H, H, C, C and D, D, E residues. The (Meta/Retro)viridae C-terminus carries a GPF/Y motif; the Pseudoviridae C-terminus carries GKGY and GPF/Y motifs and a proline-rich region. Elements shown: Del, Athila5-1, MMLV, SnRV, Tf1, Ty3-2, gypsy, HIV1, osvaldo, RSV, WDSV, BARE-1, copia, Endovir1-1, Retrofit, Ty1, Ty5, Melmoth, 1731, Osser, Tnt1-94, Opie-2, and Mosqcopia. Per-element columns give spacer lengths, total lengths (133 to 476), and presence (+) or absence (-) of a chromodomain and of an ILGD motif. A genome map (LTR, MA, CA, NC, PR, IN, RT, RH, LTR) marks the IN region.]
![Page 16: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/16.jpg)
Putative env gene is conserved across species
[Figure: maps of Calypso, Endovir, SIRE-1, Athila4-6, and Cyclops-2 on a 2000 to 12000 nt scale, showing gag, pol, and a putative env region ("env?"), with identity labels of 24% and 29%.]
![Page 17: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/17.jpg)
Retroviruses independently evolved at least twice in plants
[Figure: phylogenetic tree, scale bar 0.1 changes, with clades labeled Retroviridae, Metaviridae, Pseudoviridae, and "Putative retroviruses". Taxa: HIV-1, Rousv, MoMLV, Ty3, Gypsy, Del1, Reina, Cyclops, Calypso, Fababean, Athila4-6, Grande, Tat4-1, Cinful-1, MAG, SURL, Ty1, copia, Tto1, Tnt1-94, Ta1-3, Art1, ToRTL1, Opie-2, Endovir1-1, SIRE-1, Tst1, Retrofit, Hopscotch, Evelknievel, Osser, and Ty5-6p.]
![Page 18: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/18.jpg)
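Retrovirus env-like coding regions show a bipartite structural organization
[Figure: comparison of the env-like proteins of Endovir1-1, ToRTL1, and SIRE-1. Length labels are 668 aa, 648 aa, and 476 aa, and pairwise identity labels are 31% ID and 24% ID; which label belongs to which element is not recoverable from the extracted text. The HIV-1 genome map is shown for reference, with env divided into SU and TM.]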
![Page 19: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/19.jpg)
Gag surprises…
[Figure: Gag comparison between the putative retrovirus group and the (Hemi/Pseudo)virus group, with regions labeled A, B, and C, above a genome map (LTR, MA, CA, NC, PR, IN, RT, RH, LTR).]
Gag is much larger in the retroviral lineage
Sequence and structural conservation is evident
![Page 20: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/20.jpg)
Diversity of the Pseudoviridae family summary
• Enzymatic regions appear to be highly constrained, other than the IN C-terminus.
• Arabidopsis LTR retrotransposons are representative of plant elements in the family.
• The putative retroviruses represent a uniquely evolving Pseudoviridae lineage bearing numerous changes in the retrotransposon genome. Sub-lineage differences suggest areas to focus experimental efforts for functional studies.
• Gag shows greater sequence conservation than previously thought.
![Page 21: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/21.jpg)
Summary continued…
• env-like coding regions have been evolutionarily conserved, indicating a functional role for the ORF.
• Features suggestive of viral env proteins have been identified in all LTR retrotransposon env-like ORFs.
• Putative env proteins have evolved in at least two independent plant LTR retrotransposon lineages, giving credence to the hypothesis that retroviruses evolved from retrotransposons.
![Page 22: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/22.jpg)
Organization of the retroelement populations of the Arabidopsis genome
![Page 23: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/23.jpg)
Do retroelements of higher eukaryotes choose where they integrate?
Is yeast a good model?
Multicellular organism genome projects have noted that transposable element numbers are markedly increased near centromeres. This project quantitatively documents these anecdotal observations for the Arabidopsis genome.
![Page 24: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/24.jpg)
Completed genome?
[Figure: chromosome map on a 10 to 90 Mb scale, with labels 2, 3, 4, X, and 28.0.]
![Page 25: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/25.jpg)
RetroMap: a graphical tool for simplifying whole-genome analysis of retroelements
![Page 26: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/26.jpg)
RetroMap Features
RetroMap provides the following tools to work with genome data:
• Parse BLAST results
• Assign lineages or arbitrary groupings to retroelements
• View chromosomal locations
• Identify and extract LTRs
• Identify and extract full-length elements
• Assign ages to complete LTR retroelements
• Extract sequence(s) for hits
• Visualize hit open reading frames
• Generate information about neighboring annotated features (Arabidopsis thaliana only)
• Generate tab-delimited data files of retroelement information for direct import into statistical software packages
![Page 27: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/27.jpg)
Overview of how RetroMap generates retroelement data for a genome
![Page 28: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/28.jpg)
Starting eprobe sequences
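[Figure: tree of the starting probe sequences, scale bar 0.1, with bootstrap values 996, 1000, 1000, 861, 946, and 988. Probes: L1 Hs, R2 Dm, R1 Dm, Jockey Dm, Tca2 Ca, Ty5 Sp, copia Dm, Art1 At, Endovir1-1 At, SIRE1 Gm, Pao Bm, BEL Dm, Mazi Dm, Roo Dm, Prt1 Pbla, Dirs1 Dd, PAT Pred, HIV1, RSV, SnRV, MMLV, WDSV, Cer1 Ce, Osvaldo Db, Athila At con, Ty3 Sc, sushi Fr, Tf1 Spom, and one garbled label ("TAtRL ta11").]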
![Page 29: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/29.jpg)
A. thaliana LTR retrotransposon genome overview
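[Figure: two trees of A. thaliana elements. Metaviridae (scale bar 0.2, rooted) with the sublineages Tat, Athila, and Metavirus; Pseudoviridae (scale bar 0.1, rooted).]

| Group | Full-length | Solo LTRs | RT only | A. thal DNA |
|---|---|---|---|---|
| Retroposon | -- | -- | 311 | 0.22% |
| Pseudoviridae | 220 | 483 | 83 | 1.25% |
| Metaviridae | 217 | 2803 | 143 | 3.16% |
| Athila | 47 | -- | -- | 0.60% |
| Tat | 48 | -- | -- | 0.50% |
| Metavirus | 88 | -- | -- | 0.64% |
| Totals | 437 | 3286 | 537 | 4.63% |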
![Page 30: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/30.jpg)
A. thaliana retroelements consist of retroposons and only two LTR families
Pseudoviridae elements are significantly shorter (p=.0001)
![Page 31: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/31.jpg)
Dating LTR retrotransposons
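[Figure: element map with gag and pol between two LTRs; the LTRs are identical at the time of insertion.]
Relative ages can be estimated from the sequence divergence (genetic distance) of the LTRs, e.g.

$$T = \frac{d}{2k}$$

where $d$ is the genetic distance, $d = 1 - (\%\,\text{identity} \div 100)$, and $k$ is the nucleotide substitution rate for the genome.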
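The dating formula can be sketched in a few lines of Python. This is a minimal illustration, not RetroMap's code: the function names are hypothetical, and the substitution rate `ASSUMED_RATE` is a placeholder chosen for the example, not a value taken from the slides. The factor of 2 reflects that both LTRs accumulate substitutions independently after insertion.

```python
def genetic_distance(percent_identity: float) -> float:
    """d = 1 - (% identity / 100), as defined on the slide."""
    return 1.0 - percent_identity / 100.0


def insertion_age(percent_identity: float, k: float) -> float:
    """T = d / (2k); k is in substitutions per site per year."""
    if k <= 0:
        raise ValueError("substitution rate k must be positive")
    return genetic_distance(percent_identity) / (2.0 * k)


# Placeholder rate for illustration only (assumption, not from the slides).
ASSUMED_RATE = 1.5e-8

if __name__ == "__main__":
    age = insertion_age(97.0, ASSUMED_RATE)
    print(f"Estimated age for LTRs at 97% identity: {age:,.0f} years")
```

Because the estimate scales inversely with `k`, ages computed this way are best read as relative ages unless the rate is well supported for the genome in question.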
![Page 32: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/32.jpg)
Pseudos are younger than Metas; the Athila sublineage is the oldest tested
![Page 33: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/33.jpg)
A. thaliana RT distributions
![Page 34: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/34.jpg)
Going solo
Homologous recombination loops out and deletes retroelement internal sequences
[Figure: a full-length element flanked by host DNA recombines between its two LTRs, leaving a solo LTR flanked by host DNA.]
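As a toy illustration of the outcome described above, the sketch below removes everything between the end of the 5' LTR and the end of the 3' LTR, which is the net result of recombination between the two LTRs. The function name and the coordinate convention (0-based, end-exclusive) are assumptions made for the example.

```python
def form_solo_ltr(seq: str, ltr5: tuple, ltr3: tuple) -> str:
    """Return seq after LTR-LTR recombination.

    ltr5 and ltr3 are (start, end) coordinates of the two LTRs,
    0-based and end-exclusive. One LTR copy is kept; the internal
    region and the second LTR copy are removed.
    """
    if ltr5[1] > ltr3[0]:
        raise ValueError("the 5' LTR must end before the 3' LTR starts")
    return seq[:ltr5[1]] + seq[ltr3[1]:]


if __name__ == "__main__":
    # host DNA in lowercase, LTR in uppercase, internal region as a label
    full_length = "ccgg" + "TGTACA" + "gagpol" + "TGTACA" + "ttaa"
    print(form_solo_ltr(full_length, (4, 10), (16, 22)))
```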
![Page 35: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/35.jpg)
Where have they been?
![Page 36: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/36.jpg)
No family distribution is random
Metaviridae Athila and Tat are found preferentially inside heterochromatic regions; other groups are not
Pseudoviridae and retroposon distributions are not significantly different
Solo LTRs show the same distributions as full-length family members
![Page 37: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/37.jpg)
Hypotheses
• Retroelement lineages show ‘universal’ organizational characteristics on the family level
• General retroelement abundance at centromeres is due to reduced elimination…the ‘graveyard scenario’
• Metaviridae in Arabidopsis are targeted to heterochromatin
![Page 38: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/38.jpg)
Conclusions
Heterochromatic regions DO appear to act as graveyards, at least in the case of the Pseudoviridae (and presumably the retroposons)
Younger Pseudoviridae elements tend to be found outside of heterochromatin
Solo LTR distributions indicate that homologous recombination between LTRs is not greatly inhibited in heterochromatin
The Metaviridae lineages appear to use targeting in their interactions with the host genome
![Page 39: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/39.jpg)
Acknowledgements
So many people helped make this research happen; I couldn’t have done it without their support and input.
Special thanks go to the many members of the Voytas lab, past and present, undergrads too!
I’ve been lucky to have good collaborators who are interesting and fun to work with. These have included Dr. Nettleton, Dr. Wright, Dr. Laten from Loyola University, and always Dr. Voytas.
To the head honcho: no one can say it hasn’t been a crazy, crazy ride. Thanks. :o)
![Page 40: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/40.jpg)
![Page 41: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/41.jpg)
Basic Hit Redundancy Elimination Scheme
[Figure: four cases of hits aligned against a query sequence, labeled case 1 to case 4.]
1) Simple match: no overlap with the nearest hit, no compression.
2) Overlap case(s): both hits are merged into one representing their combined maximum extent on the database sequence.
3) Two non-overlapping hits which should be combined:
a) Left checks its boundary position on its query sequence and determines if the other hit falls within that range. If so, merge.
b) Right repeats the procedure if Left failed to indicate a merge.
4) An example of a merge case which may lead to false positives.
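The scheme above can be sketched as follows. This is one possible reading of the slide, not RetroMap's actual implementation: the `Hit` class, the function names, and the interpretation of "that range" in case 3 (the stretch of database sequence the unmatched remainder of the query could still cover) are assumptions. Coordinates are 0-based, and query and database are assumed to be in the same units.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    db_start: int  # start on the database (genome) sequence
    db_end: int    # end on the database sequence
    q_start: int   # start on the query sequence
    q_end: int     # end on the query sequence


def should_merge(left: Hit, right: Hit, query_len: int) -> bool:
    """Decide whether two hits, ordered left to right on the database, merge."""
    # Case 2: the hits overlap on the database sequence.
    if right.db_start <= left.db_end:
        return True
    # Case 3a: Left projects the unmatched tail of its query forward.
    left_reach = left.db_end + (query_len - left.q_end)
    if right.db_start <= left_reach:
        return True
    # Case 3b: Right projects the unmatched head of its query backward.
    right_reach = right.db_start - right.q_start
    return left.db_end >= right_reach


def merge_hits(hits: list, query_len: int) -> list:
    """Collapse redundant hits; case 1 hits pass through unchanged."""
    merged = []
    for hit in sorted(hits, key=lambda h: h.db_start):
        if merged and should_merge(merged[-1], hit, query_len):
            prev = merged[-1]
            # Keep the combined maximum extent of both hits.
            merged[-1] = Hit(prev.db_start, max(prev.db_end, hit.db_end),
                             min(prev.q_start, hit.q_start),
                             max(prev.q_end, hit.q_end))
        else:
            merged.append(hit)
    return merged
```

As case 4 warns, the projection in case 3 is permissive: two unrelated elements that happen to lie close together can satisfy it and be merged into one false hit.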
![Page 42: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/42.jpg)
BLAST false-positive amplification problem
[Figure: BLAST round 1, run with an RT query, returns hits labeled RT and RT with an adjacent LTR. BLAST round 2 returns a larger set of hits labeled RT, RT with LTR, and LTR alone.]
![Page 43: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/43.jpg)
LTR prediction
• Works only for hits of a sequence interior to LTRs
• Blast2Sequences is used to detect repeats
• 10 kb of sequence upstream and downstream are compared
• Innermost matching repeats are taken to be the LTRs
[Figure: a hit on the genome sequence with 10 kb windows on either side, compared with Blast2Sequences.]
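The prediction steps above can be sketched as follows. To keep the example self-contained, exact-match detection with Python's `difflib.SequenceMatcher` stands in for Blast2Sequences, so it will miss diverged LTRs that a real alignment would find. The function name, the `min_len` threshold, and the coordinate convention (0-based, end-exclusive) are assumptions made for the example.

```python
from difflib import SequenceMatcher


def predict_ltrs(genome: str, hit_start: int, hit_end: int,
                 flank: int = 10_000, min_len: int = 100):
    """Return ((l_start, l_end), (r_start, r_end)) for the predicted LTRs, or None."""
    up_start = max(0, hit_start - flank)
    upstream = genome[up_start:hit_start]
    downstream = genome[hit_end:hit_end + flank]

    # Compare the two flanks and keep matching blocks long enough to be LTRs.
    matcher = SequenceMatcher(None, upstream, downstream, autojunk=False)
    repeats = [m for m in matcher.get_matching_blocks() if m.size >= min_len]
    if not repeats:
        return None

    # "Innermost" = smallest combined distance from the hit on both sides.
    def distance_from_hit(m):
        return (len(upstream) - (m.a + m.size)) + m.b

    best = min(repeats, key=distance_from_hit)
    left = (up_start + best.a, up_start + best.a + best.size)
    right = (hit_end + best.b, hit_end + best.b + best.size)
    return left, right
```

Taking the innermost repeat pair keeps the prediction tight around the hit, but it is also why tandem, nested, or internally repetitive elements can be mispredicted.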
![Page 44: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/44.jpg)
LTR Identification Errors
[Figure: three diagrams, each showing hits with 10 kb flanking windows and the resulting predicted element.]
• Tandem elements (Hit1 and Hit2)
• Nested elements (Hit and Hit2)
• Degenerate or simple internal repeat elements (diagram labeled pA, pA)
![Page 45: Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome](https://reader036.vdocuments.net/reader036/viewer/2022062322/56814685550346895db3a75b/html5/thumbnails/45.jpg)
Sample distribution data
Sample hit neighbors annotation data