www.sciencemag.org/cgi/content/full/science.aad8282/DC1

Supplementary Materials for

Structures of a CRISPR-Cas9 R-loop complex primed for DNA cleavage

Fuguo Jiang, David W. Taylor, Janice S. Chen, Jack E. Kornfeld, Kaihong Zhou, Aubri J. Thompson, Eva Nogales,* Jennifer A. Doudna*

*Corresponding author. E-mail: [email protected] (J.A.D.); [email protected] (E.N.)

Published 14 January 2016 on Science First Release DOI: 10.1126/science.aad8282

This PDF file includes:

Materials and Methods Figs. S1 to S11 Tables S1 and S2 References

Other Supplementary Material for this manuscript includes the following: (available at www.sciencemag.org/cgi/content/full/science.aad8282/DC1)

Movies S1 to S3

Materials and Methods

SpyCas9 expression and purification

Recombinant wild-type Streptococcus pyogenes Cas9 (SpyCas9) was expressed in

Escherichia coli strain BL21 (DE3) (Novagen) and purified as described previously (14).

Briefly, WT-SpyCas9 protein expression was induced by addition of isopropyl β-D-1-

thiogalactopyranoside (IPTG) to 0.2 mM for 20 h at 18°C. The harvested cells (~10 g

cell pellet) were resuspended in 100 mL immobilized metal affinity

chromatography (IMAC) buffer containing 25 mM sodium phosphate pH 8.0, 600 mM

NaCl, 5 mM imidazole and 5% glycerol. The cell suspension was sonicated on ice and

further clarified by centrifugation at 20,000g for 45 min. The soluble fraction of His-

SUMO-tagged SpyCas9 fusion protein was then loaded onto a Nickel-charged

nitrilotriacetic acid (NTA) agarose resin (Qiagen) and eluted from NTA resin with IMAC

buffer supplemented with 150 mM imidazole, pH 7.5. After removing the 6xHis-SUMO-

tag by overnight proteolysis at 4°C in 50 mM HEPES pH 7.5, 200 mM NaCl, 5 mM DTT

and 5% glycerol, the protein solution was adjusted to 100 mM NaCl before loading onto

a prepacked SP Sepharose High Performance strong cation exchange column (HiTrap SP

HP, GE Healthcare). The tag-free SpyCas9 was then eluted with a linear NaCl gradient of

100-600 mM over 30 column volumes, followed by heparin-Sepharose chromatography

(HiTrap Heparin HP, GE Healthcare) and gel filtration chromatography (HiLoad 16/60

Superdex200, GE Healthcare). The purified WT-SpyCas9 protein was dialyzed into 50

mM Tris-HCl pH 8.0, 150 mM NaCl, 5 mM DTT, 10% glycerol and 20 mM EDTA,

concentrated to 10.5 mg ml⁻¹, flash-frozen in liquid nitrogen and stored at −80°C.

UV absorption at 280 nm was used to estimate the final SpyCas9 protein

concentration.

Selenomethionine (SeMet)-labeled WT-SpyCas9 was overexpressed in Rosetta

(DE3) cells grown in M9 minimal medium supplemented with 60 mg ml⁻¹ L-SeMet

(Sigma-Aldrich) and specific amino acids (lysine, phenylalanine and threonine at 100 mg ml⁻¹;

leucine, isoleucine and valine at 50 mg ml⁻¹) to inhibit endogenous methionine

biosynthesis. The SeMet-modified proteins were purified identically to the native

SpyCas9 protein.

RNA transcription and purification

Large-scale in vitro transcription was used to synthesize the single-guide RNA

used in this study, with the recombinant T7 RNA polymerase and linearized plasmid

DNA as template (Integrated DNA Technologies). Briefly, 20-mL transcription reactions

were conducted in the transcription buffer containing 50 mM Tris-HCl pH 8.1, 40 mM

MgCl₂, 2 mM spermidine, 10 mM DTT, 0.1% Triton X-100, 4 mM each NTP, 100 μg mL⁻¹

His-tagged T7 RNA polymerase, and ∼500 μg linearized DNA template. After

transcription at 37°C for 4-8 h, the target RNA was separated from abortive transcripts

and n+1 additions by 8–10% denaturing urea polyacrylamide gel electrophoresis

(PAGE). The target RNA was then eluted from the gel by passive diffusion into DEPC-treated water overnight

at 4°C. The eluted RNA was concentrated and buffer exchanged into RNase-free water

using a spin concentrator with a 10-kDa cutoff (Millipore). The purified in vitro

transcribed sgRNA was refolded immediately after concentration by heating the RNA to

95°C for 5 min and cooling it slowly to room temperature. RNA concentrations were

measured by UV absorbance at 260 nm using a NanoDrop Spectrophotometer (Thermo

Scientific).

Reconstitution of the SpyCas9–sgRNA–dsDNA ternary complex

Reconstitution of the Cas9–sgRNA–dsDNA complex was carried out by mixing

purified SpyCas9, in vitro transcribed sgRNA and target duplex DNA in a 1:1.5:2.0

molar ratio at room temperature for at least 30 min. Notably, 20 mM

ethylenediaminetetraacetic acid (EDTA) was used to chelate metal ions (Mg²⁺ in

particular) in order to prevent DNA cleavage during complex reconstitution and

subsequent crystallization. The ternary complex was further purified by analytical size-

exclusion chromatography through a Superdex 200 Increase 10/300 GL column (GE

Healthcare) equilibrated with the buffer containing 30 mM Tris-HCl pH 7.5, 200 mM

NaCl, 5 mM DTT, and 20 mM EDTA. Eluates were simultaneously monitored by

ultraviolet absorbance at 260 nm and 280 nm. The eluted ternary complex was

concentrated with a spin concentrator (10-kDa cutoff) to 6.0 mg ml⁻¹ prior to

crystallization. The selenomethionine (SeMet)-substituted Cas9–sgRNA–

dsDNA complex was assembled similarly, except that 2 mM Tris(2-

carboxyethyl)phosphine hydrochloride (TCEP) was used instead of 5 mM DTT. The

concentration of the Cas9–sgRNA–dsDNA ternary complex was estimated from UV

absorbance at 260 nm, using the sum of the SpyCas9 and sgRNA extinction coefficients.

Crystallization and data collection

Crystallization conditions for the purified Cas9–sgRNA–dsDNA complex were

identified with a Mosquito high-throughput crystallization robot (TTP Labtech). The

ternary complex was crystallized in both native and SeMet-derivative forms by the

sitting-drop vapour-diffusion method at 18°C using ammonium sulfate as a precipitant.

Crystals of the SpyCas9–sgRNA–dsDNA ternary complexes were further optimized by

the hanging drop vapor diffusion method at room temperature (~22°C). Aliquots (3.0 μl)

of 6.0 mg ml⁻¹ Cas9–sgRNA–dsDNA complex were mixed with 3.0 μl of reservoir

solution containing 1.8 M ammonium sulfate, 100 mM Tris-HCl pH 8.0 and 10 mM

EDTA. Rod-like crystals appeared after 1-2 days and reached a maximum size of 0.10

× 0.15 × 0.30 mm over the course of 4 days. Both SeMet-labeled and 5-

iododeoxyuridine (5-IdU)-substituted SpyCas9–sgRNA–dsDNA crystals were grown under

conditions similar to those used for the native crystals. For cryoprotection, crystals were transferred into

either a 4.5 M saturated ammonium sulfate solution for 10 s or a well solution containing

10% glycerol and 10% xylitol for 30-45 s before being flash-frozen in liquid nitrogen.

Diffraction data were collected at 100 K on beamline 8.3.1 at the Advanced Light Source

(ALS). All diffraction data were indexed and integrated using Mosflm (22) and further

scaled in SCALA with anisotropy correction (23). The crystals belong to space group

F222 and contain one Cas9–sgRNA–dsDNA complex per asymmetric unit, with a

corresponding Matthews coefficient of 4.20 Å³ Da⁻¹ and a solvent content of 70.71%.

This high solvent content may explain the limited resolution of the crystals.
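The relationship between the Matthews coefficient and the solvent content quoted above can be checked with a short calculation. This is a minimal sketch, assuming the commonly used approximation that solvent fraction = 1 − 1.23/V_M (which presumes a protein-like partial specific volume and is only approximate for a protein–nucleic acid complex) and assuming Z = 16 asymmetric units per cell for space group F222. The function names are illustrative and not taken from any crystallographic package; the cell edges are the native values from Table S1.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit-cell volume in cubic angstroms for an orthorhombic cell (all angles 90 degrees)."""
    return a * b * c


def solvent_fraction(v_m):
    """Approximate solvent fraction from the Matthews coefficient (A^3 per Da).

    Assumption: the 1.23 constant corresponds to a typical protein partial
    specific volume; nucleic acids are denser, so treat the result as an estimate.
    """
    return 1.0 - 1.23 / v_m


def implied_mass_per_asu(cell_volume, z, v_m):
    """Molecular mass (Da) per asymmetric unit implied by a given Matthews coefficient."""
    return cell_volume / (z * v_m)


# Native cell edges from Table S1; Z = 16 is an assumption for F222
# with one complex per asymmetric unit.
volume = orthorhombic_cell_volume(147.9, 230.1, 417.6)
solvent = solvent_fraction(4.20)
mass = implied_mass_per_asu(volume, 16, 4.20)
print(f"solvent fraction ~ {solvent:.4f}, implied mass per ASU ~ {mass / 1000:.0f} kDa")
```

Under these assumptions the reported V_M of 4.20 Å³ Da⁻¹ reproduces the reported solvent content, and the implied mass per asymmetric unit is in the range expected for one Cas9–sgRNA–dsDNA complex.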

Structure determination and refinement

Two-wavelength MAD (multi-wavelength anomalous dispersion) phasing method

was used to determine the crystal structure of Cas9–sgRNA–dsDNA ternary complex

(catalytically-active unwound but pre-cleavage state). The initial MAD density map was

obtained by locating 22 selenium atoms in datasets PEAK and HREM (high energy

remote) using the HySS (24) routine implemented in the Phenix graphical user interface

(25). Individual protein domains derived from the PAM-containing partial duplex DNA-

bound structure (PDB ID 4UN3), including the RuvC domain, helical domain 1 and the CTD

domain, and the sgRNA from the ssDNA-bound structure (PDB ID 4OO8) were

sequentially positioned, using MOLREP (26) or manually, into the interpretable MAD

electron density map calculated at 3.8 Å resolution by SOLVE (27), followed by

solvent flattening with RESOLVE (28). After rigid body refinement and one subsequent

round of simulated annealing refinement in Phenix, nearly the entire unwound DNA

substrate (except the 5′-most end of the non-target DNA, nucleotides from position −20 to

−10) and the rest of the protein domains, including the highly flexible helical domains 2

and 3 in the REC lobe and the most mobile HNH domain from the NUC lobe, could be

discerned unambiguously. Manual model building was performed in COOT (29), guided by

the (2Fo − Fc), (Fo − Fc), B-factor sharpened (2Fo − Fc) and simulated

annealing omit |Fo − Fc| electron density maps, and was judged successful when

Rwork and Rfree decreased during refinement. A near-complete model of the Cas9–

sgRNA–dsDNA ternary complex was then used for rigid body refinement with the

program Phaser (30) in Phenix against the 3.4 Å-resolution native dsDNA-bound data set.

Five cycles of simulated annealing refinement were conducted in Phenix to remove

model bias. Subsequent iterative rounds of structure refinement (grouped and restrained

coordinate refinement, restrained individual isotropic atomic displacement refinement, as

well as translation-libration-screw refinement) were performed using the program

Phenix.refine (25) and Refmac5 (31) in the CCP4 suite. The final refined model has Rwork

and Rfree values of 23.1% and 29.6%, respectively. Model validation with MolProbity

(32) showed 92.0% of the residues in the most favored and 6.4% in the allowed regions

of the Ramachandran plot. Data collection and refinement statistics for Cas9–sgRNA–

dsDNA structure are summarized in Supplementary Table S1. Structural analysis was

performed in Coot (29) and all structure figures were prepared using PyMOL

(Schrödinger LLC), except for the cryo-EM figures, which were generated with

Chimera (UCSF, Computer Graphics Lab).

Fluorescence-based thermal shift assays

Thermal shift assays were performed using the Applied Biosystems 7900HT Fast

Real-Time PCR machine with 10 µM of SpyCas9 diluted in 20 mM Tris-HCl pH 7.5, 150

mM NaCl, 10 mM EDTA, 5 mM DTT, and assayed with the appropriate concentration of

ligand in a total reaction volume of 50 μl as previously described (33, 34). A 5x dilution

of SYPRO Orange dye (Invitrogen) was used as a probe with fluorescence recorded at

530 nm while the temperature was increased from 25 to 80°C with a ramp of 0.2 °C over

the course of 60 min. Control experiments were carried out in the absence of SpyCas9

protein or nucleic acids (sgRNA and DNA substrates) to check that no fluorescence

signal was recorded. The protein melting temperature (Tm) corresponding to the midpoint

for the protein unfolding transition was calculated by fitting to a Boltzmann sigmoidal

curve using the software Prism 6 (GraphPad Software, Inc.). Changes in the unfolding

transition temperature relative to the protein (SpyCas9 alone) control curve (ΔTm) were

further calculated for each ligand to estimate the protein stability upon sgRNA and target

DNA binding.
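The Tm and ΔTm calculation described above can be sketched as follows. The fits in this study were done in Prism 6; the code below is only an illustrative stand-in that replaces nonlinear regression with a coarse grid search so that it needs nothing beyond the standard library. The function names, the grid spacing, the candidate slope values and the synthetic melt curves are all assumptions made for the example, not data or settings from the paper.

```python
import math


def boltzmann(temp, bottom, top, tm, slope):
    """Boltzmann sigmoid: fluorescence as a function of temperature."""
    return bottom + (top - bottom) / (1.0 + math.exp((tm - temp) / slope))


def fit_tm(temps, fluor, slopes=(0.5, 1.0, 1.5, 2.0, 3.0, 4.0), step=0.1):
    """Estimate (Tm, slope) by least squares over a grid of candidate values.

    Assumption: the plateaus are taken as the minimum and maximum of the data,
    so the input should cover only the unfolding transition and its baselines.
    """
    bottom, top = min(fluor), max(fluor)
    t_min, t_max = min(temps), max(temps)
    n_steps = int(round((t_max - t_min) / step))
    best = None
    for i in range(n_steps + 1):
        tm = t_min + i * step
        for slope in slopes:
            sse = sum((boltzmann(t, bottom, top, tm, slope) - f) ** 2
                      for t, f in zip(temps, fluor))
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]


def delta_tm(tm_with_ligand, tm_protein_alone):
    """Shift in melting temperature relative to the protein-alone control."""
    return tm_with_ligand - tm_protein_alone


# Hypothetical, noise-free melt curves from 25 to 80 degrees C.
temps = [25.0 + 0.5 * i for i in range(111)]
apo = [boltzmann(t, 100.0, 1000.0, 45.0, 2.0) for t in temps]
bound = [boltzmann(t, 100.0, 1000.0, 50.0, 2.0) for t in temps]

tm_apo, _ = fit_tm(temps, apo)
tm_bound, _ = fit_tm(temps, bound)
shift = delta_tm(tm_bound, tm_apo)
print(f"Tm (protein alone) ~ {tm_apo:.1f}, Tm (with ligand) ~ {tm_bound:.1f}, dTm ~ {shift:.1f}")
```

Real SYPRO Orange traces usually decline after the unfolding peak, so in practice the fit would be restricted to the transition region before a sigmoid model is applied.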

Cryo-electron microscopy

Target-bound Cas9 complexes in a buffer containing 30 mM Tris pH 8.0, 150 mM

NaCl, 20 mM EDTA, 5 mM DTT and 0.1% glycerol were frozen in vitreous ice.

Immediately after glow-discharging for 8 seconds using a Solaris plasma cleaner, 4 μl

droplets of the sample (~0.25 mg ml⁻¹) were placed onto C-flat grids containing 4 μm

holes with 2 μm spacing between holes (Protochips Inc.) with a thin layer of carbon over

the holes. The grids were rapidly plunged into liquid ethane using an FEI Vitrobot

MarkIV maintained at 4 °C and 100% humidity after being blotted for 5 seconds with a

blot force of 20. Data were acquired using an FEI Titan Krios transmission electron

microscope (at the HHMI Cryo-EM Shared Resource at Janelia Research Campus)

operating at 300 keV at a nominal magnification of ×29,000 (1.01 Å at the specimen

level) with defocus ranging from −2.0 to −4.5 μm. A total of ~5,600 micrographs were

automatically recorded on a Gatan K2 Summit direct electron detector operated in

counting mode using the MSI-Template application within the automated

macromolecular microscopy software LEGINON (35). We collected a 6 s exposure

fractionated into 20 frames of 300 ms each, at a dose rate of 8 e⁻ Å⁻² s⁻¹.
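The exposure parameters above fix the accumulated dose and the per-frame dose, which can be worked out directly. This is a small arithmetic sketch; the function name is illustrative, and it assumes the quoted figure is a dose rate in electrons per square angstrom per second that stays constant over the exposure.

```python
def dose_summary(dose_rate, exposure_s, n_frames):
    """Return (total dose, frame duration in s, dose per frame).

    dose_rate is in electrons per square angstrom per second and is
    assumed constant during the exposure.
    """
    total_dose = dose_rate * exposure_s
    frame_time = exposure_s / n_frames
    dose_per_frame = total_dose / n_frames
    return total_dose, frame_time, dose_per_frame


total, frame_time, per_frame = dose_summary(dose_rate=8.0, exposure_s=6.0, n_frames=20)
print(f"total ~ {total:.1f} e/A^2, frame ~ {frame_time * 1000:.0f} ms, per frame ~ {per_frame:.1f} e/A^2")
```

With the stated values, the 20 frames of 300 ms account for the full 6 s exposure.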

3D reconstruction and analysis

All image pre-processing was performed in Appion (36). The contrast transfer

function (CTF) of each micrograph was estimated using CTFFind3 (37). Particles were

selected concurrently with data collection with a template-based particle picker using

reference-free 2D class averages of negatively stained Cas9 complexes as templates (13).

We selected ~425,000 particles using this approach.

Individual movie frames were aligned and averaged using ‘dosef_gpu_driftcorr’

drift-correction software from UCSF. These drift-corrected micrographs were binned by

8, and bad micrographs and bad regions of micrographs were removed using the ‘manual

masking’ command within Appion (36). Reference-free 2D alignment and classification

was performed in RELION (38) using 100 classes to remove junk, leaving a starting

stack of ~320,000 particles. Using the negative stain DNA-bound Cas9 structure (EMD-

5860) (13) low-pass filtered to 60 Å as a reference, we performed 3D classification with

3 classes within RELION and selected the best model with the largest number of particles

(~215,000) for further processing. The other two classes likely represented unassembled

complexes and junk. We performed a second round of 3D classification using 3 classes

within RELION and selected particles from the two most abundant classes corresponding

to RNA-bound Cas9 and target DNA-bound Cas9 for further processing. We performed

3D autorefine in RELION on ~160,000 and ~50,000 particles of RNA-bound Cas9 and

DNA-bound Cas9 complexes, respectively, using the corresponding class model low-pass

filtered to 60 Å as a starting model to produce structures at 4.6 and 6.7 Å resolution after

post processing.

In order to improve the accuracy of the alignment of individual particle images of

both RNA- and DNA target-bound datasets, we performed alignparts_lmbfgs on movies

(39) to correct for beam-induced motion. We used a local standard deviation of 500

pixels to group the particles for this alignment. We performed exposure weighting using

the curve derived by Grant et al. (40). For the DNA-bound dataset, these “shiny

particles” were then used for another round of 3D autorefine. For the RNA-bound Cas9

dataset, these “shiny particles” were subjected to a final round of 3D classification. We

performed 3D autorefine on the most abundant class of ~135,000 shiny particles to obtain

a final model of RNA-bound Cas9. This protocol yielded final RNA- and DNA-bound

Cas9 reconstructions at 4.47 and 6.06 Å resolution, respectively. The local resolution was

calculated using the “blocres” function in the Bsoft package (41). We used RELION to

automatically calculate B-factors and apply them to the post-processed maps. The applied

B-factors were between −160 and −200 Å². All reported resolutions are based on the

gold standard 0.143 FSC criterion using two independent half-maps. While we calculated

the local resolution of both maps, the volumes were filtered according to their global

resolution estimate.
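The 0.143 FSC criterion mentioned above amounts to finding the spatial frequency at which the half-map correlation first drops below the threshold and reporting its reciprocal. The sketch below shows that step only, using linear interpolation between sampled shells. It is not the RELION implementation; the function name and the three-point FSC curve are hypothetical values chosen for illustration.

```python
def resolution_at_threshold(frequencies, fsc_values, threshold=0.143):
    """Return the resolution (in angstroms) where the FSC first falls below threshold.

    frequencies: spatial frequencies in 1/angstrom, in increasing order.
    fsc_values:  FSC between two independent half-maps at those frequencies.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc_values)):
        prev_fsc, cur_fsc = fsc_values[i - 1], fsc_values[i]
        if prev_fsc >= threshold > cur_fsc:
            # Linear interpolation between the two shells that bracket the crossing.
            fraction = (prev_fsc - threshold) / (prev_fsc - cur_fsc)
            crossing = frequencies[i - 1] + fraction * (frequencies[i] - frequencies[i - 1])
            return 1.0 / crossing
    return None


# Hypothetical FSC curve, not data from this study.
example_freqs = [0.10, 0.20, 0.30]
example_fsc = [0.900, 0.243, 0.043]
example_resolution = resolution_at_threshold(example_freqs, example_fsc)
print(example_resolution)
```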

Docking and analysis

The final reconstructions were segmented using Segger in Chimera (42) based on

inspection of the maps and available crystal structures (13-16) rigidly docked into the

maps using Fit in Map (42) in Chimera. All cryo-EM figures and movies were generated

using UCSF Chimera.

Pseudoatomic models of regions missing from the crystal structure were

generated using docking of models into the cryo-EM density. To model the PAM-distal

DNA duplex in the dsDNA-bound Cas9 structure, we first docked the crystal structure of

the Cas9–sgRNA–dsDNA complex (from this study) into the cryo-EM density using Fit

in Map (42). We then docked a 10-bp B-form DNA duplex in the density in our structure

to create an R-loop model. To model the ten 5′-most nucleotides (nt 1-10) of the sgRNA

in the RNA-bound Cas9 EM structure, we docked the crystal structure of the Cas9–sgRNA

binary complex (PDB 4ZT0) into the cryo-EM density using Fit in Map (42). Next, we modeled

the backbone of single-stranded RNA using COOT (29) by tracing the EM density of the

5′ end of the guide RNA. The electrostatic potential surface for the dsDNA-bound Cas9

crystal structure (Fig. S11) was calculated using the Adaptive Poisson-Boltzmann Solver

(APBS) software package (43). Briefly, the PDB file for DNA-bound Cas9 was

converted to a PQR file using PDB2PQR (44). The PQR file was analyzed through APBS

using the default settings to generate the surface potential map.

Fig. S1

Visualization of the wide central channel formed between the NUC and REC lobes,

and the narrow side tunnel located within the NUC lobe. The molecular surface is

shown in the same orientation as in Fig. 1D. (A) Surface representation of the Cas9–

sgRNA–dsDNA ternary complex, with target DNA colored dark blue, non-target DNA in

purple and sgRNA in orange. For clarity, the HNH domain is not shown. (B) Electrostatic

potential surface representation of the ternary complex, calculated using the PyMOL

APBS plugin. The HNH nuclease domain is omitted for clarity. (C) Surface

representation of the Cas9 protein with the HNH domain depicted as ribbon diagram,

showing that the non-target DNA strand runs across the side tunnel formed between the

HNH and RuvC nuclease domains.

Fig. S2

Evidence for correct sequence register of the crystallographic model using 5-

iododeoxyuridines (5-IdU) in place of three thymidines on the non-target strand. (A)

Nucleotide sequences of the target dsDNA and sgRNA used in this study, where X

indicates the 5-IdU in the non-target DNA strand. (B) The sequence register of non-target

DNA strand in our crystallographic model was confirmed by carrying out molecular

replacement against the 5-IdU derivative dataset (4.2 Å resolution) with the refined

model from the 3.4 Å-resolution native dataset as the search model. The anomalous

difference Fourier map showed an electron density feature (green mesh) with a peak

height of 8.0 standard deviations (σ) above the mean adjacent to the 5 position of

residues IdU(+1) and IdU(−8). Analysis of dissolved crystals on a 7 M urea denaturing

PAGE gel demonstrated that the dsDNA substrate in the crystalline state is intact (data

not shown). Therefore, the lack of electron density for residue IdU(−13) indicates

crystallographic disorder of the 5′ end of the non-target DNA strand, not DNA

scission.

Fig. S3

Structural comparison of SpyCas9 in different states showing concerted

conformational reorganization upon dsDNA binding. (A) Superposition of existing

DNA-bound structures. The overlaid structures are viewed from the front (left) and side

(right) orientations after optimal superposition of the nuclease lobe (except for HNH),

indicating additional conformational rearrangements that occur in Cas9 upon dsDNA

substrate binding. This structural reorganization makes Cas9 more compact upon dsDNA

binding. The movements of the mobile modules from the ssDNA-bound and PAM-

containing partial duplex bound structures (cyan and beige, respectively) to their

positions in the dsDNA-bound structure are indicated by arrows. (B-D) Overlaying DNA-

bound structures shown side-by-side individually for clarity. The DNA-bound complexes

are color-coded as in Fig. 1.

Fig. S4

Comparison of the Cas9 enzyme structure in the R-loop complex with prior nucleic

acid-bound complexes indicates that the closing of the Cas9 conformation upon

dsDNA-substrate binding drives the helical domain 2 to directly contact the target

strand. For clarity, only Cas9’s HNH and helical domain 2 are shown. Notably, the

helical domain 2 in the REC lobe does not make contacts with RNA-DNA heteroduplex

in previously reported structures, including Cas9–sgRNA pre-targeting complex (A, PDB

ID 4ZT0), ssDNA-bound (B, PDB ID 4OO8) and PAM-containing partial duplex-bound

(C, PDB ID 4UN3) structures. However, this domain makes hydrogen-bonding

interactions with the deoxyribose-phosphate backbone in the target DNA strand through

residues Ser267 and Asp269 in the dsDNA-bound structure (D). Indeed, deletion of this

domain resulted in markedly decreased Cas9 activity (~50% loss). It is also worth noting

that the HNH domain makes very few contacts with the rest of the Cas9 protein in prior

structures, while it forms extensive interactions with the helical domain 2 upon binding to

dsDNA. These newly formed inter-domain interactions appear to play an important role

in locking the HNH domain in an activated conformation for subsequent DNA strand

scission.

Fig. S5

Thermal stability assays of SpyCas9 in the absence or presence of sgRNA and/or

DNA substrates. (A) Representative raw fluorescence data collected from a quantitative

real-time thermocycler. SYPRO Orange fluorescence is used as reporter of protein

thermal stability. Protein melting temperature (Tm) was determined by fitting a

Boltzmann sigmoid curve to the fluorescence intensity data. (B) The bar graph shows the

mean melting temperature difference (ΔTm) compared with the protein alone and the error

bars represent the standard deviation from three independent measurements.

Fig. S6

Position of the HNH domain (green) relative to the target DNA cleavage site (red) at

different stages. Arrows indicate the location of the HNH catalytic residue His840

in all available structures. Prior structures, including the Cas9–sgRNA pre-

targeting complex (A, PDB ID 4ZT0), ssDNA-bound (B, PDB ID 4OO8) and PAM-

containing partial duplex-bound (C, PDB ID 4UN3) states, have the HNH active site

positioned in an inactive conformation, neither pointed toward nor located near the

cleavable site on the target DNA strand. Superposition of the existing structures suggests

that the HNH domain curves outward toward the target DNA strand to reach an active

conformation upon binding to dsDNA (D). Of note, the target DNA strand and sgRNA

complexes in existing DNA-bound Cas9 structures (with sgRNA scaffold omitted for

clarity) superimposed well. Therefore, the overlaying RNA-DNA complexes (color-

coded as in Fig. 1) are shown in each panel to depict the relative distance between HNH

domain and cleavable phosphate in the target DNA strand.

Fig. S7

Cryo-EM of Cas9 ribonucleoprotein particles. (A) A drift-corrected cryo-EM

micrograph of dsDNA target-bound Cas9 recorded on a Gatan K2 Summit direct electron

detector. The scale bar indicates 100 nm. (B) Reference-free 2D class averages of Cas9

showing the clam-like particles with some α-helices and RNA or DNA extensions clearly

visible. The width of the boxes is ~288 Å. (C) Fourier shell correlation (FSC) curves

between independent half maps for the final reconstructions, showing the resolution to be

~4.5- and ~6.0-Å using the 0.143 FSC criterion for the sgRNA-bound and target-bound

Cas9 structures, respectively. (D,E) Local resolution analysis of sgRNA-bound (D) and

DNA target-bound (E) Cas9 structures. (F) Region of density and docked atomic models

for the Arg of the RNA-bound structure. This density is consistent with the estimated

resolution of ~4.5-Å. (G,H) Euler angle distributions for the RNA-bound (G) and DNA-

bound (H) Cas9 structures, respectively. There is a predominant orientation of RNA-

bound Cas9.

Fig. S8

Classification and refinement workflow used in this study. A total starting stack of

~425,000 particles was subjected to 2D reference-free alignment and classification. A

subset of ~320,000 particles from good 2D classes was selected for 3D classification

using 3 classes and the negative stain DNA-bound Cas9 structure (EMD-5860) low-pass

filtered to 60 Å as a reference. The class with the most particles (~215,000) was

subjected to an additional round of 3D classification using its model as a reference. This

yielded a DNA-bound Cas9 dataset, which was subsequently refined to 6.7 Å resolution,

and an RNA-bound Cas9 dataset, which was subsequently refined to 4.6 Å resolution. To

further improve the reconstruction, both the RNA- and DNA-bound datasets were

subjected to “particle polishing” using alignparts_lmbfgs from John Rubinstein’s lab on

movies (39). The “polished” DNA-bound Cas9 dataset was subjected to a final round of

3D autorefine yielding a 6.06 Å reconstruction. The “polished” RNA-bound Cas9 dataset

was subjected to another 3D classification. A final dataset of ~135,000 “polished” RNA-

bound Cas9 particles was subjected to 3D autorefine yielding a 4.47 Å reconstruction. All

steps were performed in RELION (38), with the exception of particle polishing.

Fig. S9

Beam-induced motion correction for Cas9 RNPs. Representative 5-fold exaggerated

per-particle movements (rainbow lines) from a cryo-EM micrograph of DNA-bound Cas9

complexes. The x-axis and y-axis show the x dimension of the movie and the y

dimension of the movie in pixels, respectively. The far right y-axis shows the color code

corresponding to each of the 20 frames of the movie.

Fig. S10

Architecture of the Cas9-R loop complex. (A, C) Cryo-EM reconstructions of sgRNA-

bound Cas9 (A) and Cas9-sgRNA bound to a 40-bp target dsDNA (C) at 4.5- and 6.0-Å

resolution (using the 0.143 gold standard Fourier Shell Correlation criterion),

respectively. Subunits are segmented and colored as indicated. (B, D) Docking of the

Cas9-sgRNA crystal structure (PDB 4ZT0) into our sgRNA-bound Cas9 cryo-EM

reconstruction (B) and our pseudo-atomic Cas9-R loop structure created by docking the

crystal structure of Cas9–sgRNA bound to a 30-bp target dsDNA (Fig. 1) and modeling

the PAM-distal 10-bp of dsDNA into the target-bound cryo-EM structure (D) using

COOT (29).

Fig. S11

Surface features of DNA-bound Cas9 based on electrostatic potential. The molecular

surface of Cas9 colored according to electrostatic potential. Red and blue indicate

negatively charged and positively charged surface features, respectively. The black

dashed line indicates the un-modeled region of the non-target DNA strand. The inset

shows an enlarged view of this region.

Table S1. Crystallographic data and refinement statistics

Dataset Native Se-MAD (peak) Se-MAD (remote) 5-IdU

X-ray source ALS 8.3.1 ALS 8.3.1 ALS 8.3.1 ALS 8.3.1

Space group F2 2 2 F2 2 2 F2 2 2 F2 2 2

Cell dimensions

a, b, c (Å) 147.9, 230.1, 417.6 147.6, 231.8, 418.1 147.6, 231.8, 418.1 149.8, 233.4, 420.5

α, β, γ (°) 90, 90, 90 90, 90, 90 90, 90, 90 90, 90, 90

Wavelength (Å) 1.116 0.980 0.957 1.115

Resolution (Å)* 119.2-3.4 (3.6-3.4) 119.2-3.6 (3.8-3.6) 92.9-3.8 (4.0-3.8) 93.7-4.2 (4.4-4.2)

Rmerge (%)* a 0.096 (0.520) 0.232 (1.207) 0.175 (1.116) 0.162 (0.952)

Rpim (%)* b 0.056 (0.314) 0.050 (0.453) 0.02 (0.245) 0.07 (0.459)

I/σI* 6.7 (1.9) 10.0 (2.1) 10.9 (2.3) 10.6 (1.8)

CC1/2* c 0.99 (0.97) 0.98 (0.62) 0.99 (0.60) 0.99 (0.65)

Completeness (%)* 99.9 (99.7) 98.9 (98.5) 99.8 (98.7) 99.9 (100.0)

Redundancy* 3.8 (3.6) 8.6 (8.0) 8.2 (7.8) 4.2 (4.0)

Wilson B factor (Å2) 114 120.5 137.9 154.8

Figure of merit d 0.42

Refinement

Resolution (Å) 69.7-3.4

No. reflections 182405

Unique reflections 48447

Rwork / Rfree e 0.231/0.296

No. atoms

Protein 10146

Nucleic Acid 3491

Ligands 15

B-factors (Å2)

mean 131

Protein 129

Nucleic Acid 135

Ligands 132

R.m.s. deviations

Bond lengths (Å) 0.01

Bond angles (°) 1.49

Ramachandran plot

% favoured 92.00

% allowed 6.40

% outliers 1.60

Molprobity

Clashscore 14

* Highest resolution shell is shown in parentheses.

a. $R_{\mathrm{merge}} = \sum_{hkl} |I - \langle I \rangle| / \sum I_{hkl}$, where $I$ is the observed intensity for a reflection and $\langle I \rangle$ is the average intensity of a unique reflection obtained from symmetry-related and multiple measurements.

b. Rpim is the precision-indicating (multiplicity-weighted) Rmerge.

c. CC1/2 is the percentage of correlation between intensities from random half-datasets.

d. Figure of merit = $\langle | \sum P(\alpha) e^{i\alpha} / \sum P(\alpha) | \rangle$, where $\alpha$ is the phase and $P(\alpha)$ is the phase probability distribution.

e. $R = \sum_{hkl} \big| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \big| / \sum_{hkl} |F_{\mathrm{obs}}|$, where $F_{\mathrm{calc}}$ and $F_{\mathrm{obs}}$ are calculated and observed structure factor amplitudes, respectively. Rfree was calculated in the same way as Rwork, using the 5% of reflections excluded from refinement. Rwork is calculated for the remaining reflections.
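The R-factor definition in footnote e can be written out as a short function. This is a minimal sketch rather than the calculation performed by Phenix or Refmac5, which also apply scaling between observed and calculated amplitudes; the function name and the two-reflection example are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes the calculated amplitudes are already on the same scale as the
    observed ones. The same function gives Rwork or Rfree depending on
    which subset of reflections is passed in.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(obs) - abs(calc)) for obs, calc in zip(f_obs, f_calc))
    denominator = sum(abs(obs) for obs in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for two reflections.
example_r = r_factor([100.0, 200.0], [90.0, 220.0])
print(example_r)
```

Rwork would be evaluated on the reflections used in refinement and Rfree on the 5% set held out from it.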

Table S2. List of nucleic acid reagents used in this study

Description DNA substrates used in this study

λ1 DNA sequence

(Top: non-target strand;

Bottom: target strand;

PAM colored red)

5’-GGCGCATAAAGATGAGACGCTGG-3’

|||||||||||||||||||||||

3’-CCGCGTATTTCTACTCTGCGACC-5’

λ1 target DNA duplex

(30bp) used in

crystallization and

thermal shift assay

5’-GGCGCATAAAGATGAGACGCTGGCGATTAG

||||||||||||||||||||||||||||||

CCGCGTATTTCTACTCTGCGACCGCTAATC-3’

λ1 target DNA duplex

(40bp) used in cryo-EM

studies

5’-GCACAACTACGGCGCATAAAGATGAGACGCTGGCGATTAG

||||||||||||||||||||||||||||||||||||||||

CGTGTTGATGCCGCGTATTTCTACTCTGCGACCGCTAATC-3’

λ1 target DNA strand

(20 nt)

5’-GCGTCTCATCTTTATGCGCC-3’

λ1 PAM-containing

partial duplex DNA

5’-TGGCGATTAG

||||||||||

CCGCGTATTTCTACTCTGCGACCGCTAATC-3’

λ1 target DNA duplex

(30 bp) with a mutated

PAM sequence

(highlighted in green)

5’-GGCGCATAAAGATGAGACGCTCACGATTAG

||||||||||||||||||||||||||||||

CCGCGTATTTCTACTCTGCGAGTGCTAATC-3’

λ4 DNA duplex (30 bp,

lacking a λ1 target

sequence, colored

purple)

5’-GAGATATAGCCTGGTGGTTCTGGCGATTAG

||||||||||||||||||||||||||||||

CTCTATATCGGACCACCAAGACCGCTAATC-3’
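The strand relationships in Table S2 can be verified programmatically. The sketch below assumes the first 20 nt of the λ1 non-target strand form the protospacer and the following 3 nt form the PAM, and it uses the 5′-NGG-3′ motif recognized by SpyCas9. The helper names are illustrative; the sequences are copied from the table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def split_protospacer_pam(non_target, protospacer_len=20):
    """Split a non-target strand (5' to 3') into protospacer and 3-nt PAM."""
    protospacer = non_target[:protospacer_len]
    pam = non_target[protospacer_len:protospacer_len + 3]
    return protospacer, pam


def is_ngg_pam(pam):
    """True if the 3-nt sequence matches 5'-NGG-3'."""
    return len(pam) == 3 and pam[1:] == "GG"


# lambda1 non-target strand (protospacer plus PAM) from Table S2.
non_target = "GGCGCATAAAGATGAGACGCTGG"
protospacer, pam = split_protospacer_pam(non_target)
target_strand = reverse_complement(protospacer)

# Same positions in the 30-bp duplex with the mutated PAM.
_, mutated_pam = split_protospacer_pam("GGCGCATAAAGATGAGACGCTCACGATTAG")

print(target_strand, pam, is_ngg_pam(pam), mutated_pam, is_ngg_pam(mutated_pam))
```

The reverse complement of the protospacer matches the 20-nt λ1 target DNA strand listed in the table, and the mutated duplex no longer carries an NGG motif at the PAM position.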

Movie S1

Architecture of the sgRNA-bound Cas9. This movie shows 360° rotations of the

sgRNA-bound Cas9 reconstruction with and without the crystal structure (PDB 4ZT0)

(14) docked into the structure. First, the sgRNA is shown with the ten 5′-most nucleotides

(nts. 1–10) of the guide region of the RNA built into the EM density. Second, the Hel I–

III, Arg and CTD domains are added; third, the RuvC and HNH domains are added to

show the architecture of the entire complex. The crystal structure for sgRNA-bound Cas9

is docked as in Fig. S10B.

Movie S2

Architecture of the dsDNA target-bound Cas9. This movie shows 360° rotations of the

DNA-bound Cas9 reconstruction with and without the crystal structure (this study)

docked into the structure. First, the sgRNA is shown, excluding the 20 nts of the guide

region; second, the sgRNA:target-strand DNA heteroduplex, PAM duplex, non-target

DNA strand and PAM-distal duplex are added; third, the Hel I–III, Arg and CTD

domains are added; fourth the RuvC and HNH domains are added to show the

architecture of the entire complex. The crystal structure for DNA-bound Cas9 is docked

as in Fig. S10D.

Movie S3

Conformational rearrangements of RNA-bound Cas9 upon target-DNA binding and R-loop formation. This movie shows a morph between the RNA-bound Cas9 and DNA-bound Cas9. First, RNA-bound Cas9 is shown. Next, to emphasize the conformational rearrangements required for binding target DNA and formation of an R-loop, the domains of Cas9 from the RNA-bound structure are morphed from their RNA-bound to their DNA-bound positions. Once in the DNA-bound state, the sgRNA:target-strand DNA heteroduplex, PAM duplex, non-target DNA strand and PAM-distal DNA duplex density are shown in medium blue. The movie then repeats in reverse (from the DNA-bound to the RNA-bound state). The morph was generated using the morph map function in UCSF Chimera. The two maps were aligned based on the sgRNA density, and morph maps were generated for each individual domain.

References

1. B. Wiedenheft, S. H. Sternberg, J. A. Doudna, RNA-guided genetic silencing systems in bacteria and archaea. Nature 482, 331–338 (2012). Medline doi:10.1038/nature10886

2. R. Sorek, C. M. Lawrence, B. Wiedenheft, CRISPR-mediated adaptive immune systems in bacteria and archaea. Annu. Rev. Biochem. 82, 237–266 (2013). Medline doi:10.1146/annurev-biochem-072911-172315

3. R. Barrangou, L. A. Marraffini, CRISPR-Cas systems: Prokaryotes upgrade to adaptive immunity. Mol. Cell 54, 234–244 (2014). Medline doi:10.1016/j.molcel.2014.03.011

4. L. A. Marraffini, CRISPR-Cas immunity in prokaryotes. Nature 526, 55–61 (2015). Medline doi:10.1038/nature15386

5. J. van der Oost, E. R. Westra, R. N. Jackson, B. Wiedenheft, Unravelling the structural and mechanistic basis of CRISPR-Cas systems. Nat. Rev. Microbiol. 12, 479–492 (2014). Medline doi:10.1038/nrmicro3279

6. F. Jiang, J. A. Doudna, The structural biology of CRISPR-Cas systems. Curr. Opin. Struct. Biol. 30, 100–111 (2015). Medline doi:10.1016/j.sbi.2015.02.002

7. M. M. Jore, M. Lundgren, E. van Duijn, J. B. Bultema, E. R. Westra, S. P. Waghmare, B. Wiedenheft, U. Pul, R. Wurm, R. Wagner, M. R. Beijer, A. Barendregt, K. Zhou, A. P. Snijders, M. J. Dickman, J. A. Doudna, E. J. Boekema, A. J. Heck, J. van der Oost, S. J. Brouns, Structural basis for CRISPR RNA-guided DNA recognition by Cascade. Nat. Struct. Mol. Biol. 18, 529–536 (2011). Medline doi:10.1038/nsmb.2019

8. M. D. Szczelkun, M. S. Tikhomirova, T. Sinkunas, G. Gasiunas, T. Karvelis, P. Pschera, V. Siksnys, R. Seidel, Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc. Natl. Acad. Sci. U.S.A. 111, 9798–9803 (2014). Medline doi:10.1073/pnas.1402597111

9. M. Rutkauskas, T. Sinkunas, I. Songailiene, M. S. Tikhomirova, V. Siksnys, R. Seidel, Directional R-Loop Formation by the CRISPR-Cas Surveillance Complex Cascade Provides Efficient Off-Target Site Rejection. Cell Reports 10, 1534–1543 (2015). doi:10.1016/j.celrep.2015.01.067

10. T. R. Blosser, L. Loeff, E. R. Westra, M. Vlot, T. Künne, M. Sobota, C. Dekker, S. J. Brouns, C. Joo, Two distinct DNA binding modes guide dual roles of a CRISPR-Cas protein complex. Mol. Cell 58, 60–70 (2015). Medline doi:10.1016/j.molcel.2015.01.028

11. M. Jinek, K. Chylinski, I. Fonfara, M. Hauer, J. A. Doudna, E. Charpentier, A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012). Medline

12. G. Gasiunas, R. Barrangou, P. Horvath, V. Siksnys, Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl. Acad. Sci. U.S.A. 109, E2579–E2586 (2012). Medline doi:10.1073/pnas.1208507109

13. M. Jinek, F. Jiang, D. W. Taylor, S. H. Sternberg, E. Kaya, E. Ma, C. Anders, M. Hauer, K. Zhou, S. Lin, M. Kaplan, A. T. Iavarone, E. Charpentier, E. Nogales, J. A. Doudna, Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343, 1247997 (2014). Medline doi:10.1126/science.1247997

14. F. Jiang, K. Zhou, L. Ma, S. Gressel, J. A. Doudna, A Cas9-guide RNA complex preorganized for target DNA recognition. Science 348, 1477–1481 (2015). Medline doi:10.1126/science.aab1452

15. H. Nishimasu, F. A. Ran, P. D. Hsu, S. Konermann, S. I. Shehata, N. Dohmae, R. Ishitani, F. Zhang, O. Nureki, Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935–949 (2014). Medline doi:10.1016/j.cell.2014.02.001

16. C. Anders, O. Niewoehner, A. Duerst, M. Jinek, Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014). Medline doi:10.1038/nature13579

17. H. Nishimasu, L. Cong, W. X. Yan, F. A. Ran, B. Zetsche, Y. Li, A. Kurabayashi, R. Ishitani, F. Zhang, O. Nureki, Crystal Structure of Staphylococcus aureus Cas9. Cell 162, 1113–1126 (2015). Medline doi:10.1016/j.cell.2015.08.007

18. F. J. M. Mojica, C. Díez-Villaseñor, J. García-Martínez, C. Almendros, Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733–740 (2009). Medline doi:10.1099/mic.0.023960-0

19. S. H. Sternberg, B. LaFrance, M. Kaplan, J. A. Doudna, Conformational control of DNA target cleavage by CRISPR-Cas9. Nature 527, 110–113 (2015). Medline doi:10.1038/nature15544

20. M. Newman, T. Strzelecka, L. F. Dorner, I. Schildkraut, A. K. Aggarwal, Structure of Bam HI endonuclease bound to DNA: Partial folding and unfolding on DNA binding. Science 269, 656–663 (1995). Medline doi:10.1126/science.7624794

21. Y. W. Yin, T. A. Steitz, Structural basis for the transition from initiation to elongation transcription in T7 RNA polymerase. Science 298, 1387–1395 (2002). Medline doi:10.1126/science.1077464

22. T. G. G. Battye, L. Kontogiannis, O. Johnson, H. R. Powell, A. G. W. Leslie, iMOSFLM: A new graphical interface for diffraction-image processing with MOSFLM. Acta Crystallogr. D Biol. Crystallogr. 67, 271–281 (2011). Medline doi:10.1107/S0907444910048675

23. M. D. Winn, C. C. Ballard, K. D. Cowtan, E. J. Dodson, P. Emsley, P. R. Evans, R. M. Keegan, E. B. Krissinel, A. G. Leslie, A. McCoy, S. J. McNicholas, G. N. Murshudov, N. S. Pannu, E. A. Potterton, H. R. Powell, R. J. Read, A. Vagin, K. S. Wilson, Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67, 235–242 (2011). Medline doi:10.1107/S0907444910045749

24. R. W. Grosse-Kunstleve, P. D. Adams, Substructure search procedures for macromolecular structures. Acta Crystallogr. D Biol. Crystallogr. 59, 1966–1973 (2003). Medline doi:10.1107/S0907444903018043

25. P. D. Adams, P. V. Afonine, G. Bunkóczi, V. B. Chen, I. W. Davis, N. Echols, J. J. Headd, L. W. Hung, G. J. Kapral, R. W. Grosse-Kunstleve, A. J. McCoy, N. W. Moriarty, R. Oeffner, R. J. Read, D. C. Richardson, J. S. Richardson, T. C. Terwilliger, P. H. Zwart, PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010). Medline doi:10.1107/S0907444909052925

26. A. Vagin, A. Teplyakov, Molecular replacement with MOLREP. Acta Crystallogr. D Biol. Crystallogr. 66, 22–25 (2010). Medline doi:10.1107/S0907444909042589

27. T. Terwilliger, SOLVE and RESOLVE: Automated structure solution, density modification and model building. J. Synchrotron Radiat. 11, 49–52 (2004). Medline doi:10.1107/S0909049503023938

28. T. C. Terwilliger, Maximum-likelihood density modification. Acta Crystallogr. D Biol. Crystallogr. 56, 965–972 (2000). Medline doi:10.1107/S0907444900005072

29. P. Emsley, K. Cowtan, Coot: Model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004). Medline doi:10.1107/S0907444904019158

30. A. J. McCoy, R. W. Grosse-Kunstleve, P. D. Adams, M. D. Winn, L. C. Storoni, R. J. Read, Phaser crystallographic software. J. Appl. Cryst. 40, 658–674 (2007). doi:10.1107/S0021889807021206

31. G. N. Murshudov, P. Skubák, A. A. Lebedev, N. S. Pannu, R. A. Steiner, R. A. Nicholls, M. D. Winn, F. Long, A. A. Vagin, REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr. 67, 355–367 (2011). Medline doi:10.1107/S0907444911001314

32. V. B. Chen, W. B. Arendall 3rd, J. J. Headd, D. A. Keedy, R. M. Immormino, G. J. Kapral, L. W. Murray, J. S. Richardson, D. C. Richardson, MolProbity: All-atom structure validation for macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21 (2010). Medline doi:10.1107/S0907444909042073

33. F. H. Niesen, H. Berglund, M. Vedadi, The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat. Protoc. 2, 2212–2221 (2007). Medline doi:10.1038/nprot.2007.321

34. F. Jiang, A. Ramanathan, M. T. Miller, G. Q. Tang, M. Gale Jr., S. S. Patel, J. Marcotrigiano, Structural basis of RNA recognition and activation by innate immune receptor RIG-I. Nature 479, 423–427 (2011). Medline doi:10.1038/nature10537

35. C. Suloway, J. Pulokas, D. Fellmann, A. Cheng, F. Guerra, J. Quispe, S. Stagg, C. S. Potter, B. Carragher, Automated molecular microscopy: The new Leginon system. J. Struct. Biol. 151, 41–60 (2005). Medline doi:10.1016/j.jsb.2005.03.010

36. G. C. Lander, S. M. Stagg, N. R. Voss, A. Cheng, D. Fellmann, J. Pulokas, C. Yoshioka, C. Irving, A. Mulder, P. W. Lau, D. Lyumkis, C. S. Potter, B. Carragher, Appion: An integrated, database-driven pipeline to facilitate EM image processing. J. Struct. Biol. 166, 95–102 (2009). Medline doi:10.1016/j.jsb.2009.01.002

37. J. A. Mindell, N. Grigorieff, Accurate determination of local defocus and specimen tilt in electron microscopy. J. Struct. Biol. 142, 334–347 (2003). Medline doi:10.1016/S1047-8477(03)00069-8

38. S. H. W. Scheres, RELION: Implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012). Medline doi:10.1016/j.jsb.2012.09.006

39. J. L. Rubinstein, M. A. Brubaker, Alignment of cryo-EM movies of individual particles by optimization of image translations. J. Struct. Biol. 192, 188–195 (2015). Medline doi:10.1016/j.jsb.2015.08.007

40. T. Grant, N. Grigorieff, Measuring the optimal exposure for single particle cryo-EM using a 2.6 Å reconstruction of rotavirus VP6. eLife 4, e06980 (2015). Medline doi:10.7554/eLife.06980

41. J. B. Heymann, D. M. Belnap, Bsoft: Image processing and molecular modeling for electron microscopy. J. Struct. Biol. 157, 3–18 (2007). Medline doi:10.1016/j.jsb.2006.06.006

42. E. F. Pettersen, T. D. Goddard, C. C. Huang, G. S. Couch, D. M. Greenblatt, E. C. Meng, T. E. Ferrin, UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004). Medline doi:10.1002/jcc.20084

43. N. A. Baker, D. Sept, S. Joseph, M. J. Holst, J. A. McCammon, Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc. Natl. Acad. Sci. U.S.A. 98, 10037–10041 (2001). Medline doi:10.1073/pnas.181342398

44. T. J. Dolinsky, P. Czodrowski, H. Li, J. E. Nielsen, J. H. Jensen, G. Klebe, N. A. Baker, PDB2PQR: Expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 35 (Web Server), W522–W525 (2007). Medline doi:10.1093/nar/gkm276
