Structure Calculations With Xplor
Douglas
1/21/09
The concept
Xplor-NIH is a highly sophisticated computer program that provides an interface between
computational bioinformatics and experimental structural biology. For our purposes, we will consider
Xplor-NIH as a tool to minimize a hybrid energy function.
topology
@TOPPAR:topallhdg.pro
end
segment
name=" "
chain
@TOPPAR:toph19.pep
sequence @@seq.txt end
end
end
write structure
output=col1.psf
end
stop
Let’s consider what this script does in a bit of detail. First it executes a “topology” statement. The
topology statement is where Xplor-NIH gets the information about the residues that comprise the
macromolecule, including the atoms, connectivity, bond angles, etc. Rather than entering all this
information in every new script, we call the file topallhdg.pro. This file lives in the folder
xplor-nih-X.XX/toppar. The TOPPAR part is a shortcut to this directory. The “end” statement closes the
topology block.
Next the “segment” statement is executed. The segment statement is where Xplor-NIH generates the
molecular structure. The “name” statement can be used to put the molecule name at the end of each
line of any PDB files generated from the molecular structure (in this case it is set to nothing). The
“chain” statement is where Xplor-NIH receives the sequence of residues. The file toph19.pep is a
macro that defines the peptide bond. Next the script executes the “sequence” statement, which reads
in the protein sequence from the file seq.txt. There are “end” statements to close the sequence,
chain and segment blocks. Finally, the “write structure” statement writes the newly created structure
to the file “col1.psf”. The “end” statement closes the write structure block. The “stop” statement exits
the script.
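The sequence file read by the “sequence” statement is plain text. As a minimal sketch, assuming seq.txt holds whitespace-separated three-letter residue names (the names under which the topology file defines residues), the following Python helper converts a one-letter sequence into that form. The function name and the example peptide are illustrative and are not part of the tutorial files.

```python
# Illustrative helper (not part of the tutorial): build the text for seq.txt
# from a one-letter amino acid sequence.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def to_three_letter(sequence, per_line=10):
    """Return three-letter residue names, per_line residues on each line."""
    # Whitespace in the input is ignored; an unknown letter raises KeyError.
    names = [ONE_TO_THREE[aa] for aa in sequence.upper() if not aa.isspace()]
    lines = [" ".join(names[i:i + per_line])
             for i in range(0, len(names), per_line)]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical four-residue peptide, used only to show the output format.
    print(to_three_letter("MQIF"))
```

The output can be redirected into seq.txt. Letters outside the twenty standard amino acids are rejected with a KeyError rather than silently skipped, so a typo in the sequence does not produce a shortened molecule.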
IMPORTANT POINT: Just because your script runs and you get a PSF file does not mean that you
are out of the woods. The real trick is to make sure the nomenclature of your PSF file matches the
nomenclature in your restraint tables. I have provided a script, “check_psf.inp”, to help us find any
inconsistencies. We can run this script and check the output file using the command
%grep '%' check_psf.out
If there are any lines like the following:
%NOESET-ERR: error in selection - no atoms spec.
then there are inconsistencies between the PSF file and the restraint table. Fixing these sorts of
problems can be a major headache. The main tool used to fix these inconsistencies is the “vector do”
statement. The syntax of this statement is
vector do (name="what you want to change it to") (name what it is)
for instance
vector do (name="H1") (name HT1)
changes every instance of HT1 in your PSF file to H1.
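The check and the fix can also be scripted. The sketch below assumes only that Xplor-NIH error lines in check_psf.out begin with “%”, as in the example above. It tallies the error tags and formats a “vector do” rename statement that can be pasted into a script. Both function names are hypothetical helpers, not Xplor-NIH commands.

```python
from collections import Counter


def error_tags(log_text):
    """Count tags such as '%NOESET-ERR' on lines that start with '%'."""
    tags = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("%"):
            # The tag is the text before the first colon
            # (or the whole line if there is no colon).
            tags[line.split(":", 1)[0]] += 1
    return tags


def rename_statement(old_name, new_name):
    """Format a 'vector do' statement that renames old_name to new_name."""
    return 'vector do (name="{}") (name {})'.format(new_name, old_name)


if __name__ == "__main__":
    sample = "%NOESET-ERR: error in selection - no atoms spec.\n"
    print(error_tags(sample))
    print(rename_statement("HT1", "H1"))
```

An empty tally means no “%” lines were found. A large count for a single tag usually points to one systematic naming mismatch rather than many independent ones.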
In the case of 1g6j NMR restraints, we find that every restraint raises an error. Clearly there is
something wrong with our restraint tables. Consider a line from the NOE restraint table.
assi
(( segid " A" and resid 1 and name HA ))
(( segid " A" and resid 2 and name HN ))
The likely culprit is the “segid” statement. At no point have we defined the segid, hence Xplor-NIH
cannot find these atoms in the PSF file. We have two options. We could edit the NOE and dihedral
angle restraint tables and remove the segid statements. Perhaps a more pedagogical solution is to
set the segid to A. Consider the modified file “gen_psf_from_seq.inp”
topology
@TOPPAR:topallhdg.pro
end
segment
name=" "
chain
@TOPPAR:toph19.pep
sequence @@seq.txt end
end
end
write structure
output=ubi.psf
end
stop
In this case we use a “vector do” statement to set the segid for every atom in the molecule. This
change enables us to read in the restraints with no errors.
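The first option, removing the segid clauses from the restraint tables, can be sketched as a small text filter. This Python example assumes every segid clause has the form shown in the NOE line above (segid "..." followed by “and”). The function name is hypothetical, and the filtered table should be checked by hand before use.

```python
import re

# Matches: segid "<anything>" and   (case-insensitive, flexible whitespace)
SEGID_CLAUSE = re.compile(r'segid\s+"[^"]*"\s+and\s+', re.IGNORECASE)


def strip_segid(table_text):
    """Return the restraint table text with every 'segid "..." and' clause removed."""
    return SEGID_CLAUSE.sub("", table_text)


if __name__ == "__main__":
    line = '(( segid " A" and resid 1 and name HA ))'
    print(strip_segid(line))
```

Because the pattern only removes the clause and its trailing “and”, the rest of each selection, including the residue number and atom name, is left untouched.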
2) Calculating an initial extended structure.
To minimize our hybrid energy function, we need a starting structure. An extended structure
can be calculated for this purpose. We will use the file “generate_template.inp” in the nmr tutorial directory as a
template. All the X-PLOR tutorial scripts use the symbol “{====>}” to alert the user to important parts
of the script that might need to be edited. I have edited these lines of the script and run it to generate the
following PDB file.
Original:
{====>}
@g_protein_dihe.tbl {*Read dihedral angle restraints.*}
Modified:
{====>}
restraints
dihedral
reset
end
Let’s compare the two files. The “noe” statement allows us to read in the NOE restraints from the file
“noe.tbl”. The “nres” statement defines the maximum number of restraints in the file, and the “class”
statement enables us to set up different NOE classes, if we choose (here we do not). The “end”
statement closes the noe block. In the original file, control is passed to the file “g_protein_dihe.tbl”,
which lives in the nmr tutorial directory and contains Xplor-NIH statements. The file “dihe.tbl” does not
contain these Xplor-NIH statements, so to be more consistent with the logic of the “sa.inp” script, I
have added the “dihedral” statement to the script.
This script runs for ~6 minutes on my MacBook Pro with a 2.4 GHz Intel Core 2 Duo processor and 4
GB of 667 MHz DDR2 SDRAM. 8 of the 10 calculated structures have no violations and a total energy less than
150 kcal.
{====>}
{*Name(s) of the family of final structures.*}
evaluate ($filename="refine_1_"+encode($count)+".pdb")

{====>}
{*Name(s) of the family of final structures.*}
evaluate ($filename="refine_2_"+encode($count)+".pdb")
Figure 2. Superposition of lowest energy structure calculated from NOE and dihedral restraints
downloaded from BMRB, accession code 4245, (red) and lowest energy structure of 1G6J.pdb.