Standard practices for X-ray crystallographic
structure determination in the Nowick laboratory
(Version 1.0.4)
Patrick J. Salveson, Adam G. Kreutzer, and Nicholas L. Truex
Contents
Contributions to this guide
A note about this guide
Programs needed
Collecting data
  Rigaku (In-house)
  SSRL
  ALS
Processing data
  Transferring images from the Rigaku
  Transferring images to SSRL
  Setting up a working directory
  Processing data collected on the Rigaku with XDS
  Processing data sets collected on ALS/SSRL with XDS
  Processing data sets with iMosflm
  Merging multiple data sets
  BLEND
Phenix
  Xtriage
  Reflection file editor
  HySS and AutoSol
  Phaser
Building models
  Anatomy of a model
  Introduction to Coot
  Fitting a model to density
  Changing the numbers of residues/ligands in Chains
  Adding normal amino acids to your model
  Adding unnatural amino acids or ligands to your model
  Numbering residues in a completed macrocycle
  Placing additional macrocycles in your model
Refinement
  Workflow
  Introduction to phenix.refine
  Files made and used by phenix.refine
  Forcing amide restraint for N-to-C cyclization
  Adding waters to your model
Depositing
  Preparing a PDB for deposition
Miscellaneous
  Making CIF files for ligand/amino acids with eLBOW
  Fixing issues with hydrogens “blowing off” of your model in Coot
Contributions to this guide
Latest revision: November 18th 2015
P.J.S. wrote the Building models and Refinement sections and the Merging multiple
data sets, Making CIF files for ligand/amino acids with eLBOW, Processing data
sets collected on ALS/SSRL with XDS, Processing data sets with iMosflm,
Fixing issues with hydrogens “blowing off” of your model in Coot, and Preparing a
PDB for deposition subsections, and compiled the guide as a LaTeX document.
A.G.K. wrote the Transferring image files to SSRL, Setting up project-directory,
and Processing data sets collected on the Rigaku with XDS subsections.
N.L.T. wrote the subsections Setting up a working directory on SSRL, BLEND,
and Reflection file editor.
The majority of these tips and procedures exist because countless unmentioned sources
(particularly R.K.S.) passed their knowledge down from student to student, in addition to
a number of academic sources that may be added to this guide in the future.
A note about this guide
This guide contains strategies used to solve X-ray crystallographic structures by members of
the Nowick Laboratory. The guide mainly focuses on which buttons to press to accomplish
certain tasks. This guide was written to be useful for future members of the Nowick lab,
and to help ensure that certain skills do not become lost over time. This guide may also
be useful for scientists outside of the Nowick lab, but it is written primarily to help with
solving the kinds of X-ray crystallographic structures studied in the Nowick lab, especially
macrocyclic β-sheets.
Programs needed
Solving X-ray crystallographic structures often entails the use of various SSH clients, in
addition to a number of auxiliary programs. This is a list of programs our lab typically uses.
These are not the only options. They can be found with simple Google searches. This guide
does not include links, to avoid having to check needlessly whether the links still work.
Manipulation of atomic coordinates: OSX and Windows: Coot.
SSH client: OSX: FileZilla. Windows: Xftp from the Xmanager Enterprise suite.
Remote access of SSRL and ALS: OSX and Windows: NoMachine.
Easy model viewing: OSX and Windows: PyMOL.
Validation of ligand properties: OSX and Windows: MacroModel.
Collecting data
Rigaku (In-house)
Turning on the Rigaku. You will need to book time on the instrument. There is a sign-up
sheet on the door of the Rigaku room. You will also need to sign the log book sitting on
the computer. It takes about 1 hour for the nitrogen stream to become cool enough to start
collecting images.
Turning on the Rigaku is straightforward. There is a laminated printout near the computer
that details the steps. You will need to make sure the liquid nitrogen is full. This
entails making sure the big tank isn’t empty, opening the big tank, and pressing the “red
button” on the “blue box” to fill the dewar. The dewar is equipped with an autofill device,
so it will shut off the flow of N2 once it is full (you will see the gas stop flowing out of the
vent tubes on the dewar). Once this happens you need to close off the big tank, as there is a
leak in the tubing that hasn’t yet been fixed. A full dewar should last for about one day. It’s
probably a good idea to refill the dewar periodically, especially if you are collecting a data
set overnight.
Screening. After mounting your crystal, open the program CrystalClear 2.0. It will ask
you to initialize the instrument. The left side of the UI has a series of buttons; you will go
from top to bottom when collecting/screening crystals. To collect additional images
(say, after mounting a new crystal, or with longer exposures), go to initial images, click add
scans, and then collect.
Once you have collected some images of a crystal, you need to attempt to index them
(if there is diffraction). Click Index, then specify which images you want to try to index. If
the program is able to index the crystal (i.e. assign a space group that isn’t P1), you should
collect a data set.
Rule of thumb: don’t be greedy. If a crystal indexes, just collect a data set.
SSRL
Needs to be written.
ALS
Needs to be written.
Processing data
DISCLAIMER: SKIP DIRECTLY TO THE BLEND SECTION FOR UPLOADING AND
PROCESSING MULTIPLE DATA SETS FROM THE RIGAKU
We prefer to use XDS to process our data sets; however, XDS will sometimes fail to
index. In these cases you can try the program iMosflm. This section details how to
process the image files from collection into MTZ files that you will use in Phenix.
Transferring images from the Rigaku
Use a large USB drive to transfer your images from the Rigaku. There is a shortcut on the
desktop to the folder where the image files are stored. After navigating to the Nowick
folder, open your folder and drag your image files to your USB drive.
Transferring images to SSRL
We process the images on the SSRL servers, but the images and files need to be stored in
certain places. This subsection explains where on SSRL to transfer your images. You will
need to open your FTP client to transfer your images (see the “Programs needed” section).
You can also open NoMachine at this point to help with making directories.
Use the same group login information to access SSRL with your FTP client and NoMachine.
After logging in, you will always be in the Nowick group home directory:
/mnt/home1/nowick
The image files should NOT be saved in the home directory. Instead, the image files
should be saved in the data directory. To access this directory, navigate up until you
reach:
/mnt
Then navigate to the Nowick data directory, which is located at:
/mnt/data/u1/nowick
If you do not have a directory here yet, you can make one for yourself by typing the following
command in a terminal window:
mkdir yourname
This is YOUR data directory, where you will upload and store all of your image files:
/mnt/data/u1/nowick/yourname
Make a new directory for storing the image files associated with the data you just collected.
The following command is an easy way to name the directory using a terminal window:
mkdir PEPTIDENAME_TRAYNUMBER_WELLNUMBER
Throughout the remainder of this guide, the above directory name will be referred to as
“WHATEVER” for the sake of saving space.
Setting up a working directory on SSRL
After transferring your image files to SSRL, open NoMachine on your computer and log in
to SSRL. Open a terminal and navigate to your folder in the “home directory”. If you do not
have a directory here yet, you can make one for yourself by typing the following command
in the terminal window:
mkdir yourname
Make a new directory (with the same name as the data directory you just created) for
storing processing files. This is your working directory. Use the following format to name
the directory:
mkdir PEPTIDENAME_TRAYNUMBER_WELLNUMBER
Navigate to the working directory. Now create a symbolic link from this directory to your
data directory with the following command:
ln -s /data/nowick/yourname/WHATEVER/Images images
For the sake of space, the name “WHATEVER” stands for the data directory. You will need to
use the actual name of the data directory. The subsequent steps will vary, depending on
how you will be processing your data. These steps are presented in the following sections.
Processing one data set from the Rigaku with XDS
Using the GUI file browser on NoMachine, copy the files listed below into the working
directory: XDS.INP, XDSCONV.INP. Open a Data Processing terminal by clicking
the red circle icon at the bottom of the screen. Double click on a host server that is not
being used (i.e. all of the numbers are close to 0). In the Data Processing terminal, navigate
to your new working directory.
Open the XDS.INP file in gedit:
gedit XDS.INP
Change the beam center coordinates (recorded from the Rigaku software) and tell XDS which
images to process and where to find them, then save and close. XDS.INP is annotated to help
you find where these values are.
Run xds:
xds -par
If it runs successfully, record the space group and cell dimensions, and run pointless:
pointless xdsin XDS_ASCII.HKL hklout pointless.mtz
Next run aimless:
aimless hklin pointless.mtz hklout aimless.mtz | tee aimless.log
exit
Continue on to Phenix.
Processing data sets collected on ALS/SSRL with XDS
Create your project-directory as outlined in Setting up project-directory. There is no
need to copy the XDS.INP or XDSCONV.INP files into the project-directory. Inside
of project-directory run autoxds (in a processing terminal). You will need to point it
at your images link and tell it the number of the last image to be used. Here is a minimal
example of running autoxds:
autoxds images/NAMEOFIMAGE_001.img -last 100
There are additional parameters you can set in this command. Google “autoxds” for a
website (maintained by SSRL) with examples and options. If autoxds is successful it will
make a directory in project-directory named project-directory-xds, in which there will be an
aimless.mtz file. You can move on to Phenix with this aimless file.
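If you do not want to count images by hand, the number to give to -last can be read off the
image file names. The following is a minimal sketch, not part of autoxds or the SSRL setup:
last_image_number is a hypothetical helper, and it assumes that your image names end in an
underscore followed by digits and .img, as in the example command.

```shell
# last_image_number DIR
# Prints the largest trailing number found among the DIR/*_<digits>.img
# file names (e.g. NAMEOFIMAGE_100.img -> 100), or 0 if there are none.
last_image_number() {
    local dir="$1" f n max=0
    for f in "$dir"/*.img; do
        [ -e "$f" ] || continue              # no matches: the glob stays literal
        f="${f##*/}"                         # keep only the file name
        n="${f##*_}"                         # drop everything up to the last "_"
        n="${n%.img}"                        # drop the extension
        case "$n" in ''|*[!0-9]*) continue ;; esac    # skip non-numeric suffixes
        # strip leading zeros so that e.g. "008" is compared as 8
        while [ "${n#0}" != "$n" ] && [ "${#n}" -gt 1 ]; do n="${n#0}"; done
        if [ "$n" -gt "$max" ]; then max="$n"; fi
    done
    echo "$max"
}
```

With the function defined in your terminal, last_image_number images prints the number you
would type after -last.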
Processing one data set with iMosflm
Create your project-directory as outlined in Setting up project-directory. There is
no need to copy the XDS.INP or XDSCONV.INP files into the project-directory (it
doesn’t hurt if they are there). In a processing terminal, in your project-directory, enter:
imosflm
This will launch the iMosflm user interface (Figure 1). Add images with the green plus
over a blue circle button on the top left, then click on the indexing tab. iMosflm will try to
index. If it succeeds, it will add items to the screen with one highlighted; this is the space
group iMosflm thinks is most likely for your crystal. Then click the cell refinement button
(on the left). Click Process. Then move to the integration tab. Again click Process. Once this
is done click Quick Scale. If no error messages popped up through this series of steps,
iMosflm should have made an aimless.mtz file in your project-directory. You can move on to
Phenix with this aimless file.
Figure 1: UI of iMosflm. Add images with the green plus over a blue circle button on the top left.
Merging multiple data sets
Two strategies to merge data sets are presented here: pointless/aimless or XSCALE/XDSCONV.
The former is simpler; the latter offers more control. A third strategy using the
program “BLEND” is presented in the next subsection. All of this work will be done on
SSRL.
Project folder setup: This applies to both strategies. Create a project-directory.
Inside project-directory create two new directories, bin1 and bin2. Inside bin1 make a
link to the first set of images. Inside bin2 make a link to the second set of images. If there
are more than two data sets, make binN with a link to the respective images.
Inside each of the bin directories, run xds or autoxds. Both will put a file named XDS_ASCII.HKL
somewhere inside of binN (if you ran autoxds it is probably inside of a newly created
folder). You will use these XDS_ASCII.HKL files in both methods of merging.
Pointless/aimless: navigate up to the project-directory and run pointless with the
following command:
pointless xdsin bin1/...../XDS_ASCII.HKL bin2/...../XDS_ASCII.HKL hklout
pointless.mtz
where the ..... is the rest of the path from the bin directory (inside project-directory) to the
XDS_ASCII.HKL file.
After doing this, run aimless as described above, using pointless.mtz as the input file.
The resulting MTZ file can be used in Xtriage and HySS/Phaser.
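Because autoxds puts XDS_ASCII.HKL inside a newly created folder, it can help to list where
the files actually are before typing the pointless command. The following is a minimal sketch
that assumes the bin1, bin2, ... layout described above; list_xds_hkl is a hypothetical helper
and is not part of XDS or CCP4.

```shell
# list_xds_hkl PROJECT_DIR
# Prints, one per line and sorted, the path of every XDS_ASCII.HKL found
# below the bin* directories of PROJECT_DIR, relative to PROJECT_DIR.
list_xds_hkl() {
    # The subshell keeps the cd from changing the caller's directory.
    ( cd "$1" && find bin* -type f -name XDS_ASCII.HKL | sort )
}
```

The paths it prints are relative to the project-directory, so they can be pasted into the
pointless command as they are.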
XSCALE/XDSCONV: navigate up to the project-directory. You will need to add
a file named XSCALE.INP to this directory. XSCALE.INP contains the information that
xscale will use to merge the data sets. Google “xscale” for examples of what
XSCALE.INP files look like. The output file should be in the *.ahkl format.
Once you have XSCALE.INP pointing towards the two (or more) XDS_ASCII.HKL
files, run xscale. You will then need to add an XDSCONV.INP file to your project-directory.
Again, Google “xdsconv” for examples of what XDSCONV.INP files look like.
Run xdsconv and follow the onscreen instructions, which will tell you to run f2mtz and
cad (it will give you the commands to enter). This should give you an MTZ file which can
be used in Xtriage and HySS/Phaser.
BLEND
This subsection describes the use of BLEND to merge diffraction data. The subsection also
contains a brief overview of uploading images to SSRL and processing them, as thousands of
images from multiple data sets need to be processed before merging the data sets in BLEND.
Setting up the data directory on SSRL. If you have never uploaded images to SSRL before,
please review the subsection “Transferring images to SSRL” before you proceed.
Navigate to your data directory:
/mnt/data/u1/nowick/yourname
Make a new directory for storing the images associated with the data you just collected. Use
the following format to name this directory:
mkdir PEPTIDENAME_TRAYNUMBER_WELLNUMBER
Navigate into the new directory, and make a series of new directories called:
bin1, bin2, bin3...
The following command is a simple way to make these directories by using a terminal window:
mkdir bin1 bin2 bin3
Upload a set of 360 images to each bin. Look at the last three numbers of the image file
names to establish a set: bin1 should contain images with the numbers 001.img–360.img;
bin2 should contain images with the numbers 400.img–760.img; bin3 should contain images
with the numbers 800.img–1160.img. The numbers may vary slightly for your data.
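The sorting rule can be written down as a small function, which is handy if you want to
check which bin an image belongs in before uploading. This is only a sketch: bin_for_image is
a hypothetical helper, the ranges are the example numbers from this subsection (1–360 for
bin1, 400–760 for bin2, 800–1160 for bin3), and you should adjust them to your own data.

```shell
# bin_for_image NUMBER
# Prints the bin for an image number (given without leading zeros), using
# the example ranges from this subsection. Prints nothing and returns 1
# for numbers that fall outside every range.
bin_for_image() {
    local n="$1"
    if   [ "$n" -ge 1 ]   && [ "$n" -le 360 ];  then echo bin1
    elif [ "$n" -ge 400 ] && [ "$n" -le 760 ];  then echo bin2
    elif [ "$n" -ge 800 ] && [ "$n" -le 1160 ]; then echo bin3
    else return 1
    fi
}
```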
Setting up the working directory. This subsection contains a brief overview of setting
up a working directory, but in the context of using BLEND afterward. If you have never
created a working directory on SSRL, please review the subsection “Setting up a working
directory on SSRL” before proceeding here.
Make a working directory in your home folder. This directory will be located at the
following location:
/mnt/home1/nowick/yourname
Use the same format as the data directory to name the working directory:
mkdir PEPTIDENAME_TRAYNUMBER_WELLNUMBER
Navigate into the working directory. Create an array of bins for each data set stored in the
data directory. The following command is a simple way to make these directories by using
a terminal window:
mkdir bin1 bin2 bin3
Navigate into bin1. Create a symbolic link from bin1 to the respective bin in the data
directory. Do the same for the other bins. The symbolic link is created by typing the
following command in the terminal window:
ln -s /data/nowick/yourname/WHATEVER/bin1 images
For the sake of space, the name “WHATEVER” is used for the data directory. You will
need to use the actual name of the data directory. Navigate to bin2 and bin3, and create
symbolic links.
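The mkdir and ln -s steps above can be done for all bins in one go. The following is a minimal
sketch: link_bins is a hypothetical helper, and both directory arguments are placeholders for
your own working directory and data directory.

```shell
# link_bins WORK_DIR DATA_DIR BIN...
# For each named bin, creates WORK_DIR/BIN and a symbolic link
# WORK_DIR/BIN/images that points at DATA_DIR/BIN.
link_bins() {
    local work="$1" data="$2" b
    shift 2
    for b in "$@"; do
        mkdir -p "$work/$b" || return 1
        # -n: if "images" already exists as a link, replace the link itself
        # rather than creating a new link inside the directory it points to
        ln -sfn "$data/$b" "$work/$b/images" || return 1
    done
}
```

For three bins you would call it with your working directory, your data directory, and then
bin1 bin2 bin3.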
Processing the data in each bin with XDS. This subsection contains a brief overview of
processing data sets with XDS, but in the context of using BLEND afterward. If you have
never processed data in XDS, please review the subsection “Processing one data set from
the Rigaku with XDS” before proceeding here.
Place the XDS.INP and XDSCONV.INP files into bin1. Open a Data Processing terminal
by clicking the red circle icon at the bottom of the screen. Double click on a host server
that is not being used (i.e. all of the numbers are close to 0). Navigate to bin1 using the
terminal window and open the XDS.INP file in gedit by typing the following command:
gedit XDS.INP
Edit the XDS.INP file and run XDS for each of the bins. Change beam center coordinates
(recorded from Rigaku software) and tell XDS what images to process. The XDS.INP is
annotated to help you find these values. Close and save the file. Then run XDS by typing
the following command:
xds -par
Copy XDS.INP and XDSCONV.INP into bin2 and bin3, edit the copied input
files accordingly, and run xds in each bin. The following commands copy the
XDS.INP and XDSCONV.INP files to bin2 from the terminal window (repeat for bin3):
cp XDS.INP ../bin2
cp XDSCONV.INP ../bin2
After running XDS in each bin, you are ready to merge the data using BLEND.
Merging the data with BLEND. These instructions describe how to merge multiple data sets
in BLEND. The program merges .hkl files from either XDS or iMosflm. To merge these
files, create a new directory called bin123 by typing the following command in the terminal
window:
mkdir bin123
Copy the XDS_ASCII.HKL file from each of bin1, bin2, and bin3 into the new bin123. The following
command is a simple way to perform this task using the terminal window. While
in bin1, the command copies the XDS_ASCII.HKL file from bin1 to bin123 and renames it:
cp XDS_ASCII.HKL ../bin123/XDS_ASCII_1.HKL
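The same copy-and-rename can be done for every bin with a loop. The following is a minimal
sketch: collect_hkl is a hypothetical helper, and it assumes each bin holds its XDS_ASCII.HKL
directly (as it will after running xds in that bin).

```shell
# collect_hkl WORK_DIR DEST BIN...
# Copies WORK_DIR/BIN/XDS_ASCII.HKL into WORK_DIR/DEST as XDS_ASCII_1.HKL,
# XDS_ASCII_2.HKL, ... in the order in which the bins are given.
collect_hkl() {
    local work="$1" dest="$2" i=1 b
    shift 2
    mkdir -p "$work/$dest" || return 1
    for b in "$@"; do
        cp "$work/$b/XDS_ASCII.HKL" "$work/$dest/XDS_ASCII_$i.HKL" || return 1
        i=$((i + 1))
    done
}
```

Called from the working directory as collect_hkl . bin123 bin1 bin2 bin3, it also creates
bin123 if it does not exist yet.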
After transferring and renaming each file, navigate into bin123. Initialize the blend
program by typing the following into the terminal window and pushing enter:
blend -a /mnt/home1/nowick/yourname/WHATEVER/bin123
After initializing, type the following into the terminal window and push enter twice
(capitalization does not matter):
ANOMALOUS ON
When the program finishes running, type the following into the terminal window:
gedit CLUSTERS.txt
Make a note of the numbers in the third column, under the heading Cluster Height
(e.g. 1.919 and 2.845). Close the window. Initialize the blend program again by typing the
following into the terminal window, using a number larger than the largest cluster height
(here, 2.845):
blend -s 3
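Choosing the number for blend -s is simple arithmetic: take the largest cluster height and go
to the next whole number above it. The following is a minimal sketch of that step only.
blend_cutoff is a hypothetical helper, the heights are typed in by hand rather than read from
CLUSTERS.txt, and rounding up to a whole number is a convenience rather than a requirement
of BLEND.

```shell
# blend_cutoff HEIGHT...
# Prints the smallest whole number strictly greater than the largest
# (non-negative) cluster height given on the command line.
blend_cutoff() {
    printf '%s\n' "$@" | awk '
        NR == 1 || $1 + 0 > max { max = $1 + 0 }   # track the largest height
        END { print int(max) + 1 }'
}
```

For the example heights 1.919 and 2.845, blend_cutoff 1.919 2.845 prints 3, which matches the
command above.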
After initializing the program, type the following into the terminal window and push
enter (capitalization does not matter):
ANOMALOUS ON
Push enter again and wait for the program to finish. The program produces a new
directory within bin123 called “merged_files”. The new reflection files are stored within
this directory and are called “scaled_001.mtz”, “scaled_002.mtz”, etc. You are now ready to
proceed to Phenix and look at the merged reflection files using Xtriage.
Phenix
Once you have an MTZ file, you can start using programs within PHENIX to phase, refine,
and solve your structure. PHENIX is launched on SSRL by opening a Processing
terminal and entering:
phenix
This command will launch the PHENIX user interface, from which you can access other
programs such as Xtriage, phenix.refine, or Phaser (to name a few) (Figure 2).
Figure 2: UI of PHENIX. You can open the programs within PHENIX by clicking on the blue bars on the right side of the window. The space on the left lists current and past projects.
Xtriage
Dr. Ryan Spencer has compiled an HTML guide (at: http://www.chem.uci.edu/~jsnowick/groupweb/Crystallography/crystallography.html)
in which he explains how to use an aimless.mtz in Xtriage to assess the quality of your data.
The guide is illustrated with annotated images that should guide you through Xtriage.
Reflection file editor
Before running HySS and AutoSol, your MTZ file needs to be edited with the reflection file
editor. This gets rid of the “N(+)” and “N(-)” data in your file, which AutoSol does not
know how to interpret. Under Reflection tools, click the Reflection file editor utility.
Figure 3: UI of Reflection file editor.
Click on the “Add file” button to add your .mtz file to the editor. In the table called
“All input arrays”, select the arrays IMEAN,SIGIMEAN and I(+),SIGI(+),I(-),... and add
them to the table “Output arrays:” by clicking the + button at the bottom of the page.
Figure 4: Example of how to edit the reflection file.
Click the “Output options” tab. Change the output options so that the file goes to the
desired location and has a sensible name.
Figure 5: Changing the output options in the Reflection file editor.
Then click the Run button at the top of the window. If the run was successful, a message
should appear with the heading “Status” and with the number of reflections written. You
are ready to proceed to HySS and AutoSol.
HySS and AutoSol
These programs are used to generate phase information for your data set with anomalous
dispersion techniques (such as SAD). Dr. Ryan Spencer has compiled an HTML guide
(at: http://www.chem.uci.edu/~jsnowick/groupweb/Crystallography/crystallography.html)
in which he explains how to use these two programs to phase your data. The guide is
illustrated with annotated images that should guide you through the process.
Phaser
Phaser is used to determine your phases by molecular replacement or isomorphous replacement.
Instructions for its use need to be written.
Building models
This section is written to guide you through building your first molecule in your model from
scratch. This will almost entirely be done in Coot. The subsections are not organized in a
linear fashion; you will often be jumping around them throughout this process.
Anatomy of a model
A model is composed of various chains. Chains can include polypeptides (e.g. residues
1–16 of a peptide) or ligands. Chain IDs are listed as letters (e.g. Chain A or Chain
B). Generally, each type of ligand will be in its own Chain (e.g. all water molecules will be
listed under Chain H whereas all Cl atoms will be under Chain I).
Introduction to Coot
We use Coot to build models, as well as manipulate them during the refinement process.
The UI for Coot is pretty simple. See Figure 6 for more detail. Loading files is simple: for
electron density maps, File -> Auto Open MTZ, select the desired MTZ file and click
Open. For models (i.e. PDB files), File -> Open Coordinates, select the desired PDB
file and click Open. Coot will often give the map and model odd colors. You can
change them with Edit -> Bond Colors or Edit -> Map. Green for bond colors and
blue for the map provide decent contrast.
Figure 6: UI of Coot. Various tools exist in the menus along the top. Commonly used tools have icons running along the right side of the window. The center of the screen displays the map as a mesh and the model as lines/sticks. The colors of both can be adjusted. The pink box in the center of the screen indicates where items will be placed, should you ever add residues/ligands to the model.
Fitting a model to density
The main command you will use to fit a residue or model to density is called Real-space
Refinement (blue circle on top right panel in Figure 6). Clicking on this, then a residue
(or range of residues) will move them to what Coot thinks is best. You have the choice to
accept or reject the new placement in addition to dragging the atoms to try to get Real-
space Refinement to readjust. You can also manually adjust a residue or molecule with
other tools on the right panel. There is no best way to get a residue to fit; it will take a lot
of time and practice with Coot. Ornithine turns are particularly challenging.
Changing the numbers of residues/ligands in Chains
Use “Calculate -> renumber residues” in Coot to alter the numbers of residues in a
Chain. You will need to select which PDB file and Chain you would like to alter as well as
which residue(s) you want to renumber. You renumber residues by telling Coot what offset
you would like. To change a residue number from 2 to 3, you would enter residues “2 to
2” with an offset of “1”. If you wanted to change residue 2 to 1, you would enter “2 to 2”
with an offset of “-1”. This can also be done with ranges of residues. See Figure 7 for an
illustration of this.
Figure 7: Example of renumbering residues in Coot. In this picture, residues 1, 2, and 3 of Chain A will be changed to 3, 4, and 5.
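The offset is just the new residue number minus the old one. As a minimal sketch of that
arithmetic (renumber_offset is a hypothetical helper for checking your numbers, not something
Coot provides):

```shell
# renumber_offset OLD NEW
# Prints the offset to enter in Coot so that residue OLD becomes residue NEW.
# For a range, use the old and new numbers of the first residue in the range.
renumber_offset() {
    echo $(( $2 - $1 ))
}
```

For the range in Figure 7, where residues 1, 2, and 3 become 3, 4, and 5, renumber_offset 1 3
gives an offset of 2.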
Adding normal amino acids to your model
This is fairly straightforward. On the right-hand side of Coot there are two commands
you will use. First, click Add residue, followed by the C-terminus of a Chain.
This will add an alanine, with the proper numbering, to the Chain you clicked. Then use
Simple mutate to mutate the placed alanine to the desired amino acid. You are limited to
the standard twenty amino acids for this.
Adding unnatural amino acids or ligands to your model
This requires much more effort than adding simple amino acids. You will use this approach
for ornithines (both turns and α-linked), all N-methyl amino acids, as well as various ligands
(such as methyl-pentanediol etc.) that may exist in your map.
1. Import the CIF file to Coot for the amino acid/ligand you want to place. CIF files are
saved on SSRL in a folder /nowick/Ryan/CIF.... It is advisable to download this folder to
your own computer and keep it somewhere easy to get to. To add a CIF, “File -> Import
CIF dictionary...”, select the CIF file for the residue/ligand you would like to load. Coot
already has CIFs for some ligands, such as HEPES or MPD. It does not have CIFs for
ornithine, or N-methyl amino acids.
2. Add the amino acid. File -> Get Monomer. A window will pop up in which you will need
to enter the 3-letter code for the residue/ligand you want to place. You can open the CIF file
in a text editor to view the three-letter code. Clicking Okay will place the residue/ligand
at the location of the pink square. The residue/ligand will be placed in a separate PDB file
from your model; you now need to merge the placed item with your model.
3. Before merging the molecules, you need to change the Chain ID and Residue
Number of the placed residue/ligand to what they need to be in the final model. This is
easiest to explain with a real example. Let’s say the next amino acid you need to place is an
Orn turn. It needs to be in Chain D, Residue 2. First use “Calculate -> renumber
residues” to change its residue number. Note that you will need to change the Molecule you are
working in (top drop-down menu in Figure 7). Now use “Calculate -> Change Chain
ID” to change the ID. Again you will need to select the proper molecule as you did for
changing the residue number. See Figure 8 for an example of changing a chain.
Figure 8: Example of changing Chain ID in Coot. In this picture, all of Chain A will be changed to Chain C.
4. Now you can merge the molecules. Calculate -> Merge Molecules. Check the
ligand you want to merge, as well as the model you want to merge into (i.e. your model),
then click Merge. See Figure 9 for an example.
Figure 9: Example of merging molecules in Coot. In this picture, DVA will be merged into the model. Both molecules need to be checked in the Append/Insert Molecule(s) field. Additionally, the model needs to be selected in the drop-down menu.
5. The above will copy the residue/ligand into your model, but it will not remove the
residue/ligand you added in step 2. File -> Delete Molecules and Maps. Check the
residue/ligand you added (don’t worry, if you did the merging steps right the coordinates
are in your model), and click Delete Marked Molecules and Maps. All done.
It is highly recommended that you save your coordinates at this point.
Numbering residues in a completed macrocycle
Numbering your macrocycles like this makes building the model much simpler. This entire
guide is based on this numbering scheme. If you deviate from it, you may not be able to
copy and paste things, i.e. it will require more thinking on your part.
Numbering is easiest to explain with a real-world example. Consider the peptide
orn-LVFFAED-orn-AIIGLMV, where LVFFAED is the top strand, AIIGLMV
is the bottom strand, and there are two ornithine turn units connecting them. Your final
macrocycles should be numbered as follows:
V1-Orn2-L3-V4-F5-F6-A7-E8-D9-Orn10-A11-I12-I13-G14-L15-M16
Note that the last residue of the bottom strand (V) is numbered 1, followed by the first
ornithine as residue 2.
Use “Calculate -> renumber residues” in Coot to adjust the numbering to match
the style above. At this point you should have a completed macrocycle.
It is highly recommended that you save your coordinates at this point.
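If you want to double-check the numbering for a different peptide, the scheme can be written
as a short function. This is a sketch inferred from the single example above, so treat the rule
it encodes (the last residue of the sequence becomes residue 1 and the rest follow in order) as
an assumption; macrocycle_numbering is a hypothetical helper.

```shell
# macrocycle_numbering RESIDUE...
# Takes the residues in sequence order, starting with the first Orn, and
# prints them numbered so that the last residue is 1, the first Orn is 2,
# and the remaining residues follow in order.
macrocycle_numbering() {
    local total=$# i=1 out r last=""
    for r in "$@"; do last="$r"; done      # find the last residue
    out="${last}1"
    for r in "$@"; do
        [ "$i" -lt "$total" ] || break     # the last residue is already placed as 1
        i=$((i + 1))
        out="$out-$r$i"
    done
    echo "$out"
}
```

Running macrocycle_numbering Orn L V F F A E D Orn A I I G L M V reproduces the numbering
shown for the example peptide.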
Placing additional macrocycles in your model
Often your structure will have more then one macrocycle in the asymmetric unit. It is easiest
to just “Copy-Paste” the first structure you made into the density, as opposed to building
each one from scratch. This section will tell you how to “Copy-Paste”.
1. Move the pink box to a place where you would like to place the next macrocycle.
Place another copy of your completed macrocycle in Coot, File -> Open Coordinates.
You must select Recenter Molecule here. Click Open. See figure 10 for an example.
35
Figure 10: Example of placing a new molecule Coot. Note, you need to select RecenterMolecule Here in the dropdown menu, otherwise it will place this model on top of theexisting one.
2. re-orient your newly placed macrocycle in the density.
3. Now change the Chain ID to the next chain, alphabetically (i.e. if you already have
Chain A in your molecule, change the newly placed molecule to Chain B). Now merge this
with your model. Close the macrocycle like you did in Adding unnatural amino acids
or ligands to your model. Repeat this until all your macrocycles have been placed.
It is highly advisable that you save your coordinates after you merge each
36
macrocycle.
Refinement
This section explains how to use phenix.refine. It is not meant to be an
exhaustive guide on how to get your Rwork or Rfree to be amazing. Think of this section as
a button-pushing guide to phenix.refine.
Workflow
We use phenix.refine to refine our structures on SSRL. This requires using the program
NoMachine to log into SSRL’s servers. You will manipulate the coordinates of your models
from the refinement on your own computer. This requires an SSH client to download the
results of each refinement and re-upload your modified model.
Introduction to phenix.refine
Phenix.refine is found in the PHENIX software suite. To start phenix.refine, log into
SSRL, open a Data Processing Terminal, and launch phenix. Phenix.refine is found
on the right under Refinement (see figure 2 for location). Phenix.refine has three
tabs: an Input Data tab (Figure 11), a Refinement Settings tab (Figure 12), and an
Output tab (Figure 13). See the figures and their legends for more information.
Figure 11: Input Data tab of phenix.refine. You will need to add an MTZ, a PDB, and any CIF files for “odd” ligands/residues that are in your model. The majority of the labels will autofill once those three items are loaded.
Figure 12: Refinement Settings tab of phenix.refine. Here is where you outline the refinement strategy. We highly recommend increasing the number of processors to 8. The Modify selections for... dropdown menu can be changed.
Figure 13: Output tab of phenix.refine. For the most part you will not change anything on this tab, unless you are making omit maps.
Files made and used by phenix.refine
Phenix.refine will make a folder named Refine_NUMBER, inside of which will be placed
a number of files (Figure 14). After each refinement, you should download the Refine_NUMBER
folder to your computer. Load the *.mtz and *.pdb in Coot. Make your modifications and
reupload your new PDB into the Refine_NUMBER folder on SSRL.
Figure 14: Examples of files made by phenix.refine. The ones you will most often be using are *_data.mtz, *.mtz, and *.pdb.
Phenix.refine requires an MTZ file. Give it the *_data.mtz (make sure you are always
using the *_data.mtz from the most recent refinement). Also add your modified PDB as
well as any CIF files. Run a new refinement. Rinse and repeat.
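Since it is easy to grab files from an older round by mistake, a small Python sketch like the one below can point you at the most recent refinement folder. It assumes the folders are named Refine_ followed by a number and that the data file ends in _data.mtz; the function name latest_refine_dir is made up for illustration, so adjust the patterns if your folders are named differently.

```python
import re
from pathlib import Path


def latest_refine_dir(project_dir):
    """Return the highest-numbered Refine_<N> folder, or None if there is none."""
    numbered = []
    for path in Path(project_dir).iterdir():
        match = re.fullmatch(r"Refine_(\d+)", path.name)
        if path.is_dir() and match:
            numbered.append((int(match.group(1)), path))
    if not numbered:
        return None
    # Compare on the integer so that Refine_10 sorts after Refine_9.
    return max(numbered, key=lambda item: item[0])[1]


latest = latest_refine_dir(".")  # run from the folder that holds Refine_1, Refine_2, ...
if latest is None:
    print("No Refine_<N> folders found here.")
else:
    print("Most recent refinement:", latest)
    for mtz in sorted(latest.glob("*_data.mtz")):
        print("MTZ to give phenix.refine:", mtz)
```

The numeric comparison matters: sorting the folder names as plain text would put Refine_10 before Refine_2.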
Forcing amide restraint for N-to-C cyclization
On the Refinement Settings tab (Figure 12), click on the Custom Geometry Restraints
button. A new window will pop up (Figure 15). You need to specify the distance,
angle, and plane for connecting residue 1 to residue 16. The rest of the instructions in this
section assume you followed the numbering guide in Numbering residues in a completed
macrocycle. If you didn’t, you won’t be able to copy and paste.
Figure 15: Custom Geometry Restraints window in phenix.refine.
The following needs to be done for each macrocycle in your model. It is best
to do one, then click “Update and Exit” to save your progress, as this window
is prone to crashing.
Bonds. Click Add. Select (residue 1, atom N) and (residue 16, atom C). Click Update
Selections. Click Other Options, set ideal bond length to 1.326, and sigma to 0.01.
Angle. Change to Angle tab. Click Add. Select (residue 16, atom O) and (residue 16,
atom C) and (residue 1, atom N). Click Update Selections. Click Other Options, set
ideal angle to 123 and sigma to 1.6.
Plane. Change to Planes tab. Click Add. Paste/type the following into the selection:
(chain ’A’ and resid ’ 1 ’ and name ’ CA ’ and altloc ’ ’ ) or (chain ’A’ and resid ’ 1 ’ and
name ’ N ’ and altloc ’ ’ ) or (chain ’A’ and resid ’ 16 ’ and name ’ C ’ and altloc ’ ’ ) or
(chain ’A’ and resid ’ 16 ’ and name ’ O ’ and altloc ’ ’ ) or (chain ’A’ and resid ’ 16 ’ and
name ’ CA ’ and altloc ’ ’ ) or (chain ’A’ and resid ’ 1 ’ and name ’ H ’ and altloc ’ ’ )
Change the Chain ID to match each chain in your model. Click Update Selections.
Click Other Options, set planarity sigma to 0.02.
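If your model has several macrocycles, typing these selections for each chain gets tedious. The Python sketch below writes them out per chain. It is only an illustration: the function name cyclization_selections is made up, and the selections use the plain unquoted form of the Phenix atom-selection syntax (chain A and resid 1 and name CA) rather than the quoted form shown above. Treat that as an assumption and confirm with Update Selections that each one picks exactly the atoms you expect.

```python
def cyclization_selections(chain, first=1, last=16):
    """Return bond, angle, and plane selections closing one macrocycle.

    `first` and `last` are the residue numbers joined by the amide bond
    (1 and 16 if you followed the numbering scheme in this guide).
    """
    def atom(resid, name):
        return f"(chain {chain} and resid {resid} and name {name} and altloc ' ')"

    plane_atoms = [(first, "CA"), (first, "N"), (last, "C"),
                   (last, "O"), (last, "CA"), (first, "H")]
    return {
        "bond": [atom(first, "N"), atom(last, "C")],
        "angle": [atom(last, "O"), atom(last, "C"), atom(first, "N")],
        "plane": " or ".join(atom(resid, name) for resid, name in plane_atoms),
    }


for chain in "AB":  # hypothetical model with two macrocycles
    print(chain, cyclization_selections(chain)["plane"])
```

The ideal values and sigmas (1.326 and 0.01 for the bond, 123 and 1.6 for the angle, 0.02 for the plane) are still entered under Other Options as described above.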
Adding waters to your model
Simply check Update Waters on the Refinement Settings tab (Figure 12) and run
the refinement. In subsequent refinements you will need to uncheck Update Waters. If
you do not, it will replace any waters you may have moved or deleted. You can also specify
various simple ions (Cl, Na, etc.) in the adjacent field. It will only place these when Update
Waters is checked.
Depositing
Preparing a PDB for deposition
Open the PDB file in a text editor. In the REMARK section of the PDB file, there is a line
that states: “IF THIS FILE IS FOR PDB DEPOSITION: REMOVE ALL FROM THIS
LINE UP”. Go ahead and do that.
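If you would rather not do this by hand, the same step can be sketched in Python. The function name strip_refinement_header is made up for illustration, and the sketch assumes the marker text appears on a single line exactly as quoted above.

```python
MARKER = "IF THIS FILE IS FOR PDB DEPOSITION: REMOVE ALL FROM THIS LINE UP"


def strip_refinement_header(lines):
    """Drop every line up to and including the marker line.

    If the marker is not found, the lines are returned unchanged.
    """
    for index, line in enumerate(lines):
        if MARKER in line:
            return list(lines[index + 1:])
    return list(lines)


example = [
    "REMARK refinement statistics written by phenix.refine",
    "REMARK " + MARKER + ".",
    "CRYST1 ...",
    "END",
]
print(strip_refinement_header(example))
```

Work on a copy of the PDB file so the original refinement output stays intact.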
You will need to renumber your residues in each chain, in Coot. From the Numbering
residues in a completed macrocycle section you should have the following:
V1-Orn2-L3-V4-F5-F6-A7-E8-D9-Orn10-A11-I12-I13-G14-L15-M16
Change it to this:
Orn1-L2-V3-F4-F5-A6-E7-D8-Orn9-A10-I11-I12-G13-L14-M15-V16
The easiest way to do this is to offset residue 1 by +16 (i.e. set it to residue 17) then
offset residues 2–17 by -1.
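The two offsets can be checked with a short Python sketch before you do them in Coot. The function names are made up for illustration; the logic is just the two steps above applied to each residue number.

```python
import re


def deposition_number(build_number, length=16):
    """Map a build-time residue number to its deposition number.

    Step 1: residue 1 is offset by +length (1 becomes 17 for a 16-mer).
    Step 2: every residue is then offset by -1.
    """
    shifted = build_number + length if build_number == 1 else build_number
    return shifted - 1


def renumber_for_deposition(labels):
    """Renumber labels such as 'V1' or 'Orn2' and return them in the new order."""
    renumbered = []
    for label in labels:
        name, number = re.fullmatch(r"([A-Za-z]+)(\d+)", label).groups()
        renumbered.append((deposition_number(int(number), len(labels)), name))
    return [f"{name}{number}" for number, name in sorted(renumbered)]


build = "V1-Orn2-L3-V4-F5-F6-A7-E8-D9-Orn10-A11-I12-I13-G14-L15-M16".split("-")
print("-".join(renumber_for_deposition(build)))
```

For the example macrocycle this gives the deposition numbering shown above, with the valine moved from position 1 to position 16.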
Remove any TER terms that separate the ligands (i.e. waters, chlorines etc.). There
should be a TER between your last peptide chain and the first ligand as well as between
each peptide chain. There should not be a TER between the last ligand and the END located
at the end of the PDB file.
You will need to change the atom ID for the α-ammonium of the ornithine turns. They
are currently NQ.¹ In a text editor, find all NQ’s and replace them with “N ”. Note there is
a space there; it is very important that you maintain the number of characters in each line.
Change all CQ’s to “CA” in the same fashion.
Change the residue ID for ORT or ORA to ORN.
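Because the PDB format is column based, a find-and-replace that changes the line length will corrupt the file. The Python sketch below shows one way to make the atom and residue renames while keeping every line the same length. It assumes the standard PDB layout (atom name in columns 13-16, residue name in columns 18-20) and that the atom names are written as " NQ " and " CQ " within that four-character field. The function name is made up for illustration, and unlike a global find-and-replace it only touches ATOM, HETATM, and ANISOU lines belonging to ORT or ORA residues.

```python
ATOM_RENAMES = {" NQ ": " N  ", " CQ ": " CA "}  # four-character atom-name field
TURN_RESIDUES = {"ORT", "ORA"}                   # build-time names for the Orn turns
COORD_RECORDS = ("ATOM  ", "HETATM", "ANISOU")


def fix_ornithine_line(line):
    """Rename NQ/CQ and ORT/ORA on one PDB line without changing its length."""
    if not line.startswith(COORD_RECORDS) or line[17:20] not in TURN_RESIDUES:
        return line
    atom_name = ATOM_RENAMES.get(line[12:16], line[12:16])
    return line[:12] + atom_name + line[16] + "ORN" + line[20:]


# A made-up coordinate line for illustration only.
example = "HETATM    2  NQ  ORT A   2      10.000  20.000  30.000  1.00 20.00           N"
fixed = fix_ornithine_line(example)
assert len(fixed) == len(example)
print(fixed)
```

Other records that mention these names (LINK records, for example) are not handled by this sketch, so check the file in a text editor afterwards.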
Once the above changes are made, you can head over to rcsb.org and deposit the structure.
¹ The NQ designation for the α-amine is necessary to build the δ-linkage in Coot. If they’re set to N, Coot has a hard time dealing with the δ-linked Orns. Same issue with α-carbons being named CQ.
Miscellaneous
This section is a collection of tips and guides that do not logically fit into other sections.
Making CIF files for ligand/amino acids with eLBOW
Making a CIF file in eLBOW requires the use of PyMOL, MacroModel, and PHENIX.
The first step is to build a PDB of the molecule you need a CIF for (PyMOL is good
for this). If you are making an amino acid, place alanine in PyMOL and build off the side
chain. Amino acids need to be the di-radical (i.e. not -NH2 and -CO2H). Save your newly
built molecule as a PDB file.
Now you will use eLBOW in PHENIX (found under Ligands). Select input file as a
PDB file with hydrogen and/or CONECT records. Use simple optimization. Select the PDB
file you just built as the Geometry File. You can specify the Ligand ID as a three letter
code. The three letter code SHOULD NOT EXIST IN THE CIF DICTIONARY
COOT HAS. It’s a good idea to open Coot and use Get Monomer to search for a three
letter code that does not have a corresponding structure.
After running eLBOW, you will need to check the CIF file against a realistic model.
eLBOW will often have weird bond lengths and angles, so you need to fix these. Build
the molecule in MacroModel, and minimize it with the MMFF or MMFFs force fields. If
you’re making an amino acid, model the N-methylamide, N-acetyl derivative. Compare all
of the bond lengths and angles in your CIF to those you measure after the minimization is
complete. Replace the values in the CIF to match the measured values. Your CIF is now
complete.
Fixing issues with hydrogens “blowing off” of your model in Coot
Often when you do a Real-space Refinement in Coot, some hydrogens (typically
on ligands/amino acids for which you had to make the CIF file) will fly off the molecule
and end up in random spaces. This is an issue with the CIF file for the residue(s) you are
refining. Make sure the hydrogen atoms that are being “blown off” are named the same in
the CIF file for that residue and in your model. You can view the proper names of the atoms by
viewing the CIF file in Coot: Edit -> Restraints, select the residue. A new window will pop
up; select the “Atoms” tab. To fix the problem, make sure each atom name in the PDB
matches the atom name in the CIF.
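If there are many atoms to check, the comparison can also be sketched in Python. This is a rough illustration rather than a real CIF parser: it assumes the restraint CIF lists its atoms in a loop whose columns start with _chem_comp_atom. (as eLBOW output typically does), that each atom is on one line, and that the PDB follows the standard column layout. All function names are made up for illustration.

```python
import re

TOKEN = re.compile(r"\"[^\"]*\"|'[^']*'|\S+")


def unquote(token):
    """Remove one pair of matching quotes around a CIF value, if present."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    return token


def cif_atom_names(cif_text):
    """Collect the atom_id values from the _chem_comp_atom loop."""
    columns, names = [], set()
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.startswith("_chem_comp_atom."):
            columns.append(line.split(".", 1)[1].split()[0])
        elif columns:
            if not line or line.startswith(("_", "#", "loop_", "data_")):
                break  # end of the atom loop
            fields = [unquote(token) for token in TOKEN.findall(line)]
            names.add(fields[columns.index("atom_id")].strip())
    return names


def pdb_atom_names(pdb_text, resname):
    """Collect atom names (columns 13-16) for one residue type in a PDB file."""
    return {
        line[12:16].strip()
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM  ", "HETATM")) and line[17:20].strip() == resname
    }


def unmatched_atoms(pdb_text, cif_text, resname):
    """Atom names used in the model that the CIF does not define."""
    return sorted(pdb_atom_names(pdb_text, resname) - cif_atom_names(cif_text))
```

Read both files as text and pass them in with the three-letter code of the residue. Any names returned are the ones to fix so that the model and the CIF agree.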