Sarah Pyfrom
CRISPR-associated Proteins
Research Questions
• Which Cas-proteins does our species share with the 10 other species we chose to study, and how do they compare?
• How do Cas-proteins function in relation to CRISPR units?
• [Edit]: Why did JGI change its annotation?
Cas Proteins
• Proteins that are almost always associated with (near) CRISPR sequences
• Originally four major families
• Now, at least 45 families total
JGI annotation
“Old” Cas-Proteins: Cas1, Cas2, Cas3, Cas4, TM1800, TM1801
“New” Cas-Proteins: Cas1, Cas2, Cas4, Cas5, Cas6, Csh1, Csh2
Changes:
• TM1800 = Cas5
• TM1801 = Csh2
• Hypothetical protein = Csh1
• Part of hypothetical protein = Cas6
• Cas3 = hypothetical protein
• Cas4:
MTDSSGDPVDRFLAAARDESAELPFRLTGVMFQYYVVCERELWFLSRDVEIDRDTPAIVRGSDVDDSAYADKRRDVRVDGIIAIDVLDSGEILEVKPSSSMTEPARLQLLFYLWYLDRVTGVEKTGVLAHPAEKRRETVELTPETSAEVESAIEGIRAVVTAESPPPAEEKPVCDSCAYHDFCWSC (red = original Cas4)
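The renaming listed under "Changes" can be written down as a small lookup table, which is handy when comparing results produced under the old and new annotations. This is only a sketch: the names `OLD_TO_NEW` and `rename_old_labels` are made up for illustration, and the table covers only the one-to-one changes from the old labels. The two new proteins carved out of hypothetical proteins (Csh1 and Cas6) have no old label to map from, so they are left out.

```python
# Old JGI label -> new JGI label, taken from the "Changes" list above.
# Hypothetical helper names; not part of any JGI tool.
OLD_TO_NEW = {
    "Cas1": "Cas1",                  # unchanged
    "Cas2": "Cas2",                  # unchanged
    "Cas3": "hypothetical protein",  # no longer annotated as Cas3
    "Cas4": "Cas4",                  # same name, but the protein was shortened
    "TM1800": "Cas5",
    "TM1801": "Csh2",
}


def rename_old_labels(labels):
    """Translate a list of old annotation labels to the new ones.

    Labels that are not in the table (for example "Transposase")
    are returned unchanged.
    """
    return [OLD_TO_NEW.get(label, label) for label in labels]
```

For example, `rename_old_labels(["TM1800", "TM1801"])` gives `["Cas5", "Csh2"]`.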
Map of CRISPR region
(Figure: gene map with regions labeled CRISPR, Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Csh1, Csh2, TM1800, TM1801, Transposases, Unidentified and Hypothetical proteins)
Cas1 (from Sulfolobus solfataricus)
• High-affinity nucleic acid binding protein
• Binds DNA, RNA and DNA–RNA hybrids in a sequence non-specific, multi-site binding mode
• Promotes the hybridization of complementary nucleic acid strands
From: SSO1450 – A CAS1 protein from Sulfolobus solfataricus P2 with high affinity for RNA and DNA
Cas2
• Function unknown
Cas3
• Usually similar to helicases
• Unwinds double-stranded DNA
• Thought to be involved in DNA metabolism and repair
Cas4
• Often resembles RecB exonucleases
• Breaks down nucleic acid strands
• Thought to be involved in DNA metabolism
From: GenBank
Cas5 and Cas6
• Often found with Cas1 and Cas6
• Share an N-terminal region of about 43 amino acids in length
• Are usually 210-265 amino acids long
• Characterized by a GhGxxxxxGhG motif at the C-terminus, where h indicates a hydrophobic residue
From: EMBL IPR013422 profile page (http://www.ebi.ac.uk/interpro/IEntry?ac=IPR013422)
From: Sanger PF09559 profile page (http://pfam.sanger.ac.uk/family/PF09559)
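The GhGxxxxxGhG motif above can be searched for with a regular expression. What follows is a rough sketch, not the method used by InterPro or Pfam (those rely on profile models). The set of residues counted as "hydrophobic" (`HYDROPHOBIC`) is an assumption, since the source only says "h indicates a hydrophobic residue", and the function name and the `c_terminal_window` parameter are invented for illustration.

```python
import re

# Assumption: residues treated as hydrophobic for the "h" positions.
# The profile pages do not spell out this set; adjust as needed.
HYDROPHOBIC = "AVILMFWC"

# G h G x x x x x G h G, where x is any residue.
MOTIF = re.compile(rf"G[{HYDROPHOBIC}]G.{{5}}G[{HYDROPHOBIC}]G")


def find_glycine_motif(seq, c_terminal_window=None):
    """Return (0-based start, matched text) for each non-overlapping motif hit.

    If c_terminal_window is given, only the last that many residues
    are searched, since the motif is expected near the C-terminus.
    """
    seq = seq.upper()
    offset = 0
    if c_terminal_window is not None:
        offset = max(0, len(seq) - c_terminal_window)
    return [(m.start() + offset, m.group())
            for m in MOTIF.finditer(seq[offset:])]
```

On the toy sequence `"MKTGLGAAAAAGVGKK"` (made up, not a real Cas protein), `find_glycine_motif` reports one hit starting at position 3. A regular expression only finds exact matches to the pattern, so a negative result does not rule out family membership.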
Csh1 and Csh2?
• Protein families determined for ease of alignment
• Often large differences between species
• Alignment easier if protein “soup” is divided into more readily-compared subgroups
• CRISPRs are thought to create stable secondary RNA structures
• Spacers remain associated with their direct repeat (DR) neighbors
• This provides a way for Cas-proteins to recognize the spacers and facilitate the immune response
From: Evolutionary conservation of sequence and secondary structures in CRISPR repeats
Cas-Proteins and Immunity
• Thought to act like Slicer and Dicer (eukaryotic counterpart)
• Create siRNA that will inhibit/break down invading RNA
• Not known if Cas-proteins are involved in integrating pathogenic DNA into spacers
Video of eukaryotic siRNA process: http://www.youtube.com/watch?v=D-77BvIOLd0
Alignments of Cas
Compared Cas1, Cas2, Cas3 etc. proteins across all 10 species…
Comparison with other species (based on “old” proteins):
Species Cas1 Cas2 Cas3 Cas4 TM1800 TM1801
H. vallismortis
H. volcanii
H. sulfurifontis X X
H. sinaiiensis X X X X X
H. californiae X X X
H. utahensis X X X X X X
H. mucosum X X X X X X
H. mediterranei X X X X X X
H. denitrificans X X X X X X
H. mukohataei X X X X X X
Phylogenetic tree comparing amino acid sequences for all Cas-proteins
(Figure: tree for Halomicrobium mukohataei, Haloarcula sinaiiensis, Haloarcula californiae, Haloferax denitrificans, Haloferax mediterranei, Haloferax sulfurifontis, Haloferax mucosum and Halorhabdus utahensis; leaves are labeled 1, 2, 3, 4, 1800 and 1801, presumably Cas1, Cas2, Cas3, Cas4, TM1800 and TM1801)
Cas1 and Cas2 did not change
Cas4
• JGI revision shortened this protein
• Would expect low sequence similarity near end of protein
TM1801 (Csh2)
• Revision by JGI simply renamed this protein
• Would expect sequence similarity
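One way to check the expectation for Cas4 (similarity dropping near the end of the protein) is to compute percent identity in windows along an existing pairwise alignment. The sketch below assumes the two sequences have already been aligned by some other tool and padded with `-` for gaps. The function name, the window size and the toy sequences in the usage note are illustrative assumptions, not the analysis actually run for these slides.

```python
def windowed_identity(aligned_a, aligned_b, window=10):
    """Fraction of identical residues in consecutive alignment windows.

    Both inputs must be rows of the same alignment (equal length,
    "-" for gaps). Gap positions never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    scores = []
    for start in range(0, len(aligned_a), window):
        pairs = list(zip(aligned_a[start:start + window],
                         aligned_b[start:start + window]))
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        scores.append(matches / len(pairs))
    return scores
```

With two toy rows, `windowed_identity("MTDSSGDPVDRFLAAARDES", "MTDSSGDPVD----------")` returns `[1.0, 0.0]`: identical at the start, nothing shared at the end. A protein that was only renamed, as with TM1801 (Csh2), would be expected to score high in every window.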
Map of CRISPR region
(Figure: gene map of the CRISPR region with the Cas, Csh, TM1800, TM1801, transposase and hypothetical protein genes labeled)
In conclusion:
We don’t know much…
…but we do know everything that everybody else knows.
Questions?
References
Kunin, V., Sorek, R., Hugenholtz, P. (2007). Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biology. http://genomebiology.com/2007/8/4/R61. Accessed 24 Nov 2009.
Haft, D.H., Selengut, J., Mongodin, E.F., Nelson, K.E. (2005). A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. http://www.ncbi.nlm.nih.gov/pubmed/16292354. Accessed 24 Nov 2009.