Using CRISPRs in micro-evolution studies

Algorithmics, text combinatorics and applications in bioinformatics

Institut de Génétique et Microbiologie
GPMS: Génomes, Polymorphisme et Minisatellites
http://minisatellites.u-psud.fr/

Supervised by: Christine POURCEL, Gilles VERGNAUD
Presented by: Ibtissem GRISSA
28/09/2007



Outline of the talk

• Background
- CRISPR properties
- Bacterial defense system

• Results
- Bioinformatics tools: CRISPRFinder, CRISPRdb
- Micro-evolution studies

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)

CASS: CRISPR + cas

• Structure:
- Leader (AT-rich)
- DR (direct repeat, 24 – 47 bp)
- Spacers
- Degenerated DR
- cas

TTTGATTATTGCCTGTGCGGCAGTGAACTCAGGGGACTGGCGAACAATGTCTTTCATGATTTTCTAAGCTGCCTGTGCGGCAGTGAACGAAAAGGTAAGATGGGCAAGCTTCTAGTAGTTTTTCTAAGCTGCCTGTGCAGCAGTGAACATTATCTGAATGGCATTTTCTTTGGCGCAGATTTTCTAAGCTGCCTGTGCGGCAGTGAACAGTAAGATAATACGATAACATCCTGTTTGTAAAATACTTAT

• Observed in prokaryotic genomes:
- almost all archaea (29/31)
- 40% of eubacteria (156/391)

Examples of CRISPRs

Each row shows the repeat number, the DR, the spacer that follows it and the spacer length in bp. Asterisks mark DR positions that are not conserved across the rows shown.

Shewanella sp. ANA-3 (CRISPR_2), positions 4626121 to 4626448

  1 AGGTTTTGCTGCCTTTTCGGCGGGTATC TCAAAGTCAACTTGTAAATGACGATTTTCACG 32
  2 ATTTTCAGCTGCCTATTCGGCAGGTCAC AGTTTGGGGCTGAGTTTGCCATTTTCCTAAAT 32
  3 ATTTTCAGCTGCCTATTCGGCAGGTCAC GATGAAGCAGACCACCTCGATTACCCCACGCT 32
  4 ATTTTCAGCTGCCTATTCGGCAGGTCAC ACTATTTATCAAGACCTTCTTTAAAATCAAAC 32
  5 ATTTTCAGCTGCCTATTCGGCAGGTCAC AGTTTGGGGCTGAGTTTGCCATTTTCCTAAAC 32
  6 ATTTTCAGCTGCCTATTCGGCAGGTCAC
     **  **       *      *   **

Yersinia pestis KIM (CRISPR_4), positions 2875721 to 2875928

  1 TTATTGGGCTGCCTGTGCGGCAGTGAAC GTTATACCCCGCGCAGGGAGTGAAGCGTTGAC 32
  2 TTTCTAAGCTGCCTGTGCGGCAGTGAAC TTAAGTTCTTTTTGTCAGCATCTTTAATAAAT 32
  3 TTTCTAAGCTGCCTGTGCGGCAGTGAAC CTGAAATACAAATAAAATAAATCGTCGAACAT 32
  4 TTTCTAAGCTGCCTGTGCGGCAGTGAAC
      ** **

Sulfolobus tokodaii str. 7 (CRISPR_2), positions 32702 to 39896 (rows 1 to 6 and 11 to 108 not shown)

  7 GATGAATCCCAAAAGGAATTGAAAG TGATTGATCACAATGAGAAGACTGTAAAGCTGATAAAC 38
  8 GATGAATCCCAAAAGGAATTGAAAG TGTTGAGGCATAAATTAATCTATCCTTAATGAAAAAT 37
  9 GATGAATCCCAAAAGGAATTGAAAG TTCTTCCTCAGCCTCCATTTTGTTTATGATTTGTAGTGCC 40
 10 GATGAATCCCAAAAGGAATTGAAAG TTCAATAATCTCTATCTTTCCAAAATCTGTAAATGAAGAC 40
...
109 GATGAATCCCAAAAGGAATTGAAAG AAAGCACAGTCAATAACGTTATCTGGTATCATATTATCAAA 41
110 GATGAATCCCAAAAGGAATTGAAAG CTTTCTCCTTCCCTCTGATCTCTCGCTGAATTGAAAAGA 39
111 GATGAATCCCAAAAGGAATTGAAAG GTAAGTATTGATGCTAACATTGACTTCGCTGTCCCAGGGGC 41
112 GATGAATCCCAAAAGGAATTGAAGG AAGTATAATAACGATAGTACTAAAATTAATTGATCC 36
113 GATGATTCTCAAAAGGAATTGATAA
         *  *             ***

1 16 CRISPRs

1 248 motifs

CRISPRs: a bacterial defense system

• CRISPR spacers generally originate from mobile elements (plasmids, phages) (Y. pestis, S. thermophilus, S. solfataricus, S. pyogenes…)

• CRISPRs are transcribed and subsequently processed into micro RNAs by the cas gene machinery: an RNA interference (RNAi) system to block phage reproduction.

Cas proteins and CRISPR spacer sequences constitute a bacterial immune system that works by a mechanism similar to that of RNAi in higher organisms.

Bacterial defense against phage invaders

"CRISPR Provides Acquired Resistance Against Viruses in Prokaryotes", Barrangou, Horvath et al, Science 2007

System: Streptococcus thermophilus (used to make yogurt and cheese)

- Infection with phage -> incorporation of phage-related spacers within CRISPR1
- Such bacteria become resistant to further infection by similar phage strains
- If the spacer is taken out, the resistance is lost
- At least one cas gene is necessary for resistance to phage
- At least one cas gene is necessary to generate phage-resistant bacteria

Producing more phage-resistant bacterial strains for industrial use?

CRISPRFinder tool

- CRISPRs can be found relatively easily using existing software tools

BUT

- Output is not appropriate for this purpose
- Background (tandem repeats, ...) requires further post-processing and manual curation
- Difficulty in defining the DR consensus endpoints and the degenerated DR
- Sensitivity (short repeats are generally neglected)
- Absence of an easy and intuitive Web tool

Dedicated software tool for the identification and preliminary analysis of CRISPRs:
- Precision
- Intuitive and easy to use
- Web service

CRISPRFinder Workflow

Sequence(s)
-> maximal repeats
-> CRISPR possible localizations
-> Identification of candidate DRs
-> CRISPR structure check
-> Tandem repeats elimination
-> Questionable CRISPRs (?) or Confirmed CRISPRs

Size thresholds shown on the diagram:
- DR: 23 bp - 55 bp
- sequence between two DRs: 25 bp - 60 bp
- spacer size relative to the DR: [0.6 DR - 2.5 DR]

Maximal repeats are found with Vmatch (REPuter).
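The size thresholds of the workflow can be sketched as a simple filter. This is only an illustration, not the CRISPRFinder implementation: the function name `passes_structure_check` is hypothetical, and it assumes that 23-55 bp applies to the DR, 25-60 bp to each spacer, and 0.6-2.5 to the ratio of spacer length to DR length.

```python
def passes_structure_check(dr_length, spacer_lengths):
    """Return True if a candidate array satisfies the assumed size thresholds."""
    # Assumed rule 1: the DR length must fall in the 23-55 bp window.
    if not 23 <= dr_length <= 55:
        return False
    for spacer in spacer_lengths:
        # Assumed rule 2: each spacer must fall in the 25-60 bp window.
        if not 25 <= spacer <= 60:
            return False
        # Assumed rule 3: each spacer is 0.6 to 2.5 times the DR length.
        if not 0.6 * dr_length <= spacer <= 2.5 * dr_length:
            return False
    return True


# A 28 bp DR with three 32 bp spacers, as in one of the example arrays.
print(passes_structure_check(28, [32, 32, 32]))
# A 20 bp repeat is below the assumed DR window.
print(passes_structure_check(20, [32, 32]))
```

A real detector would also have to handle degenerated DRs and tandem repeats, which this filter ignores.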

CRISPRFinder Output

http://crispr.u-psud.fr/Server/CRISPRfinder.php


CRISPRdb

http://crispr.u-psud.fr/crispr/CRISPRdatabase.php


Spacer dictionary creator

Example of use: the micro-evolution of the Y. pestis species

Evolution of the CRISPR structure

• Spacer gain: polarized insertion adjacent to the leader sequence
• Interstitial loss of spacers by recombination between two DRs
• Conservation of spacer order

[Diagram: Leader, DR and spacers a, b, c, d; acquisition of spacer e; last DR duplication]
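The second and third rules above translate into simple checks on spacer strings. The sketch below is an illustration under assumptions: alleles are written as strings of one-letter spacer names, and the helper names `derivable_by_loss` and `order_conserved` are hypothetical, not part of any tool from the talk.

```python
def is_subsequence(candidate, reference):
    """True if candidate can be read in order inside reference."""
    remaining = iter(reference)
    # 'in' consumes the iterator, so order is enforced.
    return all(spacer in remaining for spacer in candidate)


def derivable_by_loss(ancestor, descendant):
    """True if descendant results from ancestor by spacer deletions only."""
    return is_subsequence(descendant, ancestor)


def order_conserved(allele_1, allele_2):
    """True if the spacers shared by two alleles appear in the same order."""
    shared = set(allele_1) & set(allele_2)
    kept_1 = [s for s in allele_1 if s in shared]
    kept_2 = [s for s in allele_2 if s in shared]
    return kept_1 == kept_2


print(derivable_by_loss("abcde", "abde"))  # spacer c lost
print(order_conserved("abcd", "badc"))     # shared spacers in a different order
```

Two alleles that fail `order_conserved` cannot be explained by polarized gain and interstitial loss alone.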

CRISPR YP1 in three different strains

Each row shows a DR, the sequence that follows it and the name of the spacer.

Strain Java9
tttgattatTGCCTGTGCGGCAGTGAAC ATATTCTCGAGCGATAGCAATAGCCATTCCAC e
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCGGTCAAACAAATTTAGGCGACGATTTAACA f
TTTCTAAGCTGCCTGTGCGGCAGTGAAC AAAAAGAATTTGGGATTAAAGTTACCCATCAG g
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCAATGCCTGAATCTCTGGCGTGATAGCTGCGG h
TTTCTAAGCTGCCTGTGCGGCAGTGAAC AGTAAGATAATACGATAACATCCTGTTTGTAA

Strain 02-449
tttgattatTGCCTGTGCGGCAGTGAAC TCAGGGGACTGGCGAACAATGTCTTTCATGAT a
TTTCTAAGCTGCCTGTGCGGCAGTGAAC GAAAAGGTAAGATGGGCAAGCTTCTAGTAGTT b
TTTCTAAGCTGCCTGTGCGGCAGTGAAC ATTATCTGAATGGCATTTTCTTTGGCGCAGAT c
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCGCCATTCCGTGAACCTGAGCGCGTTCGCGA d
TTTCTAAGCTGCCTGTGCGGCAGTGAAC ATATTCTCGAGCGATAGCAATAGCCATTCCAC e
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCGGTCAAACAAATTTAGGCGACGATTTAACA f
TTTCTAAGCTGCCTGTGCGGCAGTGAAC AAAAAGAATTTGGGATTAAAGTTACCCATCAG g
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCAATGCCTGAATCTCTGGCGTGATAGCTGCGG h
TTTCTAAGCTGCCTGTGCGGCAGTGAAC ACGTCATCCTGAAGGCTAGGCAGCTCGGCTTC o
TTTCTAAGCTGCCTGTGCGGCAGTGAAC AGTAAGATAATACGATAACATCCTGTTTGTAA

Strain 195P
tttgattatTGCCTGTGCGGCAGTGAAC TCAGGGGACTGGCGAACAATGTCTTTCATGAT a
TTTCTAAGCTGCCTGTGCGGCAGTGAAC GAAAAGGTAAGATGGGCAAGCTTCTAGTAGTT b
TTTCTAAGCTGCCTGTGCGGCAGTGAAC ATTATCTGAATGGCATTTTCTTTGGCGCAGAT c
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCGCCATTCCGTGAACCTGAGCGCGTTCGCGA d
TTTCTAAGCTGCCTGTGCGGCAGTGAAC ATATTCTCGAGCGATAGCAATAGCCATTCCAC e
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCGGTCAAACAAATTTAGGCGACGATTTAACA f
TTTCTAAGCTGCCTGTGCGGCAGTGAAC AAAAAGAATTTGGGATTAAAGTTACCCATCAG g
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCAATGCCTGAATCTCTGGCGTGATAGCTGCGG h
TTTCTAAGCTGCCTGTGCGGCAGTGAAC ACGTCATCCTGAAGGCTAGGCAGCTCGGCTTC o
TTTCTAAGCTGCCTGTGCGGCAGTGAAC GAAATTGTGGGTGTAGATGTTGCAGACGCCTC v
TTTCTAAGCTGCCTGTGCGGCAGTGAAC TCTGACGTTGCCTGTGTTGCCGCTCTCGTATT w
TTTCTAAGCTGCCTGTGCGGCAGTGAAC AGTAAGATAATACGATAACATCCTGTTTGTAA
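Reading spacers off an array like the ones above amounts to locating each DR copy and keeping what lies between two consecutive copies. The following is a minimal sketch, assuming a known DR consensus and a tolerance of a few mismatches for the degenerated copy; `extract_spacers` and `max_mismatches` are hypothetical names, and this is not how CRISPRFinder itself locates repeats.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def extract_spacers(array, dr, max_mismatches=6):
    """Return the sequences lying between consecutive (approximate) DR copies."""
    array, dr = array.upper(), dr.upper()
    n = len(dr)
    starts, i = [], 0
    while i + n <= len(array):
        if hamming(array[i:i + n], dr) <= max_mismatches:
            starts.append(i)  # a DR copy starts here
            i += n            # jump past it
        else:
            i += 1
    return [array[a + n:b] for a, b in zip(starts, starts[1:])]


# Beginning of the first YP1 array shown above (spacers e and f).
DR = "TTTCTAAGCTGCCTGTGCGGCAGTGAAC"
FIRST_DR = "tttgattatTGCCTGTGCGGCAGTGAAC"
SPACER_E = "ATATTCTCGAGCGATAGCAATAGCCATTCCAC"
SPACER_F = "TCGGTCAAACAAATTTAGGCGACGATTTAACA"
array = FIRST_DR + SPACER_E + DR + SPACER_F + DR
print(extract_spacers(array, DR))
```

The tolerance of 6 mismatches is chosen here only because the first DR copy of this array differs from the consensus at 6 positions; a general tool would need a more careful model of degenerated repeats.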

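[Figure: map of the Yersinia pestis CO92 chromosome (4,653,728 bp) with Ori and Ter, the CRISPR loci YP1 (2769), YP2 (2895) and YP3 (1773), a prophage (2363-2409), and the genes metA, aceB, atpA, AspC, nuoM, phosphotransferase and usher protein]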

http://crispr.u-psud.fr/crispr/MultipleAnalysis/CRISPRdetector1.php

A phylogenetic tool?

Identical spacers point to a common ancestor.
- Spoligotyping in M. tuberculosis (inactive CRISPR)

strain s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19 s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39 s40 s41 s42 s43

199 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

18 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

314 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

310 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

312 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1

207 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

307 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

171 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

318 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1

304 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

306 1 1 1 1 1 1 0 0 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

303 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

334 1 1 0 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

57 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

Spoligotyping of M. tuberculosis

Evolution of Y. pestis CRISPR

CRISPR YP1 evolution (Pourcel et al 2005)

Y. pestis:
- 109 sequenced alleles
- 29 spacers

Tentative evolutionary scenario for YP1 CRISPR

[Figure: evolutionary scenario; each node gives the genotype(s), the spacer content and, where shown, the biovar]

- geno 1: abcdemn
- pestoides Georgia: a2j2b2k2l2m2
- geno 10, 26, 33, 34, 35, 38, 45: abcdefgh + l/q/r/s/u/x/y/z
- geno 11-13: efgh
- geno 37: abcdfgh
- geno 47: abcdefghovw
- geno 49: abcdefghop
- geno 44, 48, 50: abcdefgho
- geno 9, 14-25, 27-32, 36, 39-43, 46, 51: abcdefgh (orientalis)
- a2b2c2d2e2
- geno 3: abcdjk (antiqua, Africa)
- geno 2: abcdj (antiqua, Africa)
- intermediate: abcd (antiqua)
- geno 54: abci (medievalis)
- geno 61: abct (medievalis)
- geno 6, 7, 8, 52, 53, 55-60: abc (antiqua (Asia), medievalis (Iran))
- a2b2c2d2
- prediction for Y. pestis ancestor: abcdef
- "91001": adf
- pestoides China: a2b2c2d2
- ancestor: abcde

"CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies", POURCEL et al, Microbiology 2005

Spacer dictionary creator

Y. pestis evolution (Antiqua -> Medievalis -> Orientalis)?

Create a binary file of all the spacers entered (0: absent, 1: present)

CRISPR_DR29_2

[Figure: binary spacer matrix with columns s135 down to s31 for strains FSLF2-515, FSLJ2-003, FSLN1-017 and J2818]

str1 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23.24.25.26.27.28.29.30.31.32.33.69.70.72.73.74.75.76.77.79.80.81.82.83.84.85.86.87.88.89.90.97.98.99.100.101.104.

str2 34.35.39.40.41.42.43.44.45.46.47.48.49.50.59.60.61.63.64.65.66.69.72.77.91.92.93.94.97.98.101.102.103.104.

str3 51.52.53.54.55.56.57.59.60.61.62.63.65.66.67.68.69.71.72.76.78.79.80.95.96.98.99.102.103.104.105.

str4 36.37.38.39.40.41.42.49.50.58.59.60.61.63.64.65.67.72.77.103.

[Figure: tree with leaves Str2, Str1, Str4, str3]
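Building the 0/1 table from dotted spacer lists can be sketched in a few lines. This is an illustration only: the function name `spacer_matrix` and the toy strains `strA` and `strB` are hypothetical, and the output layout of the actual Spacer dictionary creator is not reproduced here.

```python
def spacer_matrix(strains):
    """Turn dotted spacer lists into a presence/absence table.

    strains maps a strain name to a string such as "1.2.3.5."
    (a trailing dot is tolerated). Returns (columns, rows).
    """
    parsed = {
        name: [int(s) for s in description.strip(".").split(".")]
        for name, description in strains.items()
    }
    # One column per spacer seen in at least one strain.
    columns = sorted(set().union(*parsed.values()))
    rows = {
        name: [1 if c in spacers else 0 for c in columns]
        for name, spacers in parsed.items()
    }
    return columns, rows


# Toy input, not taken from the slides.
toy = {"strA": "1.2.3.5.", "strB": "2.3.4"}
columns, rows = spacer_matrix(toy)
print("strain", " ".join(f"s{c}" for c in columns))
for name, row in rows.items():
    print(name, " ".join(map(str, row)))
```

The same function accepts the longer str1 to str4 lists unchanged, since they use the same dotted notation.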

SpacerPHYL

Comparison of two methods: parsimony and distance

INPUT (oriented sequences: leader on the left):
>Yersinia_pestis_CO92
b.c.e.h.i.j.k.l.m
>Yersinia_pestis_KIM
a.b.f.g.h.i.j.l
>Yersinia_pestis_Antiqua
a.b.c.d.j.k
>Yersinia_pestis_Microtus
a.d.f
>Yersinia_pestis_Nepal516
a

1) Pairwise alignment

a.b.-.-.f.g.h.i.&
-.b.c.d.f.-.h.i.j
-.*.-.-.*.-.*.*.&

& ending gap
- gap
* match

2) Distance matrix

3) Tree

4) Alignment and construction of a phylogenetic tree

• valueOpeningGap = 20
• valueEndingGap = -10
• valueManyGaps = 20
• valueFirstMatch = 100
• valueNextMatch (i) = 100 + i
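The pairwise alignment step can be illustrated with a plain longest-common-subsequence alignment of spacer lists. This sketch deliberately does not use the scoring values listed above, whose exact role in SpacerPHYL is not given in the slides; `align_spacers` is a hypothetical name, and only identical spacers are allowed to pair.

```python
def align_spacers(x, y):
    """Align two ordered spacer lists; returns (row_x, row_y, marks)."""
    n, m = len(x), len(y)
    # lcs[i][j] = maximum number of matched spacers between x[i:] and y[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if x[i] == y[j]:
                lcs[i][j] = 1 + lcs[i + 1][j + 1]
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    row_x, row_y, marks = [], [], []
    i = j = 0
    while i < n or j < m:
        if i < n and j < m and x[i] == y[j]:
            row_x.append(x[i]); row_y.append(y[j]); marks.append("*")
            i += 1; j += 1
        elif j == m or (i < n and lcs[i + 1][j] >= lcs[i][j + 1]):
            row_x.append(x[i]); row_y.append("-"); marks.append("-")
            i += 1
        else:
            row_x.append("-"); row_y.append(y[j]); marks.append("-")
            j += 1
    # Columns after the last match are ending gaps, written '&'.
    k = len(marks) - 1
    while k >= 0 and marks[k] != "*":
        marks[k] = "&"
        if row_x[k] == "-":
            row_x[k] = "&"
        if row_y[k] == "-":
            row_y[k] = "&"
        k -= 1
    return row_x, row_y, marks


row_x, row_y, marks = align_spacers("a.b.f.g.h.i".split("."),
                                    "b.c.d.f.h.i.j".split("."))
for row in (row_x, row_y, marks):
    print(".".join(row))
```

A distance between two strains could then be derived from the number of unmatched columns, but the weighting used by the actual tool is not specified in the source.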

CRISPR, a tool for micro-evolution analysis

- Intra-species analysis (evolutionary history)
- Strain identification
- Epidemiological studies

A good phylogenetic tool?

Ancient species:
- highly polymorphic in spacer composition
- absence of CRISPRs

- Strain differentiation: ok
- Phylogenetic relations: not sufficient

To sum up

- CRISPRFinder for CRISPR identification
- CRISPR database and related tools
- Spacer dictionary creator

http://crispr.u-psud.fr/Server/CRISPRfinder.php

http://crispr.u-psud.fr/crispr/CRISPRHomePage.php