Leadit Flexx Ug
Protein-Ligand Docker
User & Technical Reference
as Part of LeadIT 2.0
The development of FlexX began back in 1993 in the context of the project RELIWE funded by the German
Federal Ministry of Education and Research (BMBF). In the following three years, Matthias Rarey and Stephan
Weng under the leadership of Thomas Lengauer laid the foundation of FlexX. For his PhD thesis, Matthias developed
the first prototype of FlexX. Subsequently, as the head of the Computational Chemistry group at GMD,
he guided the further development of FlexX and its various extension modules. Today he is the Director of the
Zentrum fuer Bioinformatik at the University of Hamburg and as such initiates a multitude of docking-related
projects and provides guidance to the developers also in his role as a member of the board of BioSolveIT. Since
1996, many people have contributed to this code and the related research. This list is an attempt to acknowledge
all co-workers in this ongoing project.
Developers (in alphabetical order)
Holger Claussen (now BioSolveIT) developed FlexE to enable docking into ensembles of protein structures and
provided a multitude of general enhancements in FlexX
Ingo Dramburg (now self-employed) developed a SMARTS™-based mechanism and chemical rules for
systematic testing and manipulation of compounds
Marcus Gastreich (now BioSolveIT) contributed significantly to an improved chemical model in FlexX
Sally Hindle (now BioSolveIT) developed FlexX-Pharm, which allows docking to take place under constraints
placed in the protein active site
Andreas Kaemper (now Tuebingen Univ.) implemented various enhancements for FlexX, e.g. the Tripos force
field
Bernd Kramer (now 4SC) developed parts of the scoring function and interaction model in FlexX
Markus Lilienthal (now BioSolveIT) implemented various post-optimization procedures and grid support
Gordon Mueller (now BioSolveIT) is steadily improving the robustness of the software underneath the surface
Matthias Rarey (now ZBH) developed the combinatorial extension module FlexX^c, an adjunct for FlexX (cf.
above)
Frank Sonnenburg (now BioSolveIT) developed the Python-wrapper PyFlexX
Stephan Weng (now BioSolveIT) developed major parts of the physico-chemical models behind FlexX
Other contributors (in alphabetical order)
Gerhard Barnickel (Merck KGaA) Joachim Boehm (Roche) Hans Briem (Schering)
Christian Buning (Sanofi-Aventis) Gerhard Klebe (Univ. of Marburg) Guenther Metz (Santhera)
Thomas Mietzner (BASF) Martin Stahl (Roche) Gert Vriend (Univ. of Nijmegen)
Students
Students involved in development and documentation (in alphabetical order): Christoph Bernd, Claus
Hiller, Birte Seebeck, and Marc Zimmermann.
Institutions
We (the developers) acknowledge GMD (now FhG) for supporting the work on FlexX for nearly a decade
and BMBF for funding FlexX-related scientific work. We also thank our early cooperation partners BASF
AG (Ludwigshafen), Merck KGaA (Darmstadt), and Boehringer Ingelheim (Biberach an der Riss) for
enthusiasm, support and many fruitful discussions. Also, we are grateful for support from Tripos during
many years of a fruitful developer-distributor relationship. Finally, we would like to thank all other
companies and academic institutions who collaborated with us and helped us in a multitude of ways to
create this great piece of software.
Copyright
This document contains proprietary information of BioSolveIT GmbH and is protected by copyright.
It is provided together with Software of BioSolveIT under a license agreement and may be used only
in accordance with the terms and conditions of this agreement. The document serves solely for the
purpose of using the Software. No part of the document may be transferred to any third party or
reproduced as a whole or in parts without written permission from BioSolveIT.
Additional copyright notes
Software-basis (C) 2001 by Fraunhofer Gesellschaft (FhI-SCAI)
Getline library (C) 1993 by Chris Thewalt
PVM library (C) 1997 by University of Tennessee, Knoxville TN
Python library (C) 1991-95 by Stichting Math. Centrum, Amsterdam, The Netherlands
Torsion angle data (C) by GMD SCAI, CCDC, BASF AG
© 2011 BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany, www.biosolveit.de
Phone ++49-2241-2525-0, support@biosolveit.de
Foreword
Dear Customer, Dear Users:
BioSolveIT is proud to present FlexX in LeadIT. With this package, you hold in your
hands one of the most cited industrial docking tools. FlexX in LeadIT includes Receptor
Intelligence™, a radically different and simple way to set up and conduct a docking. We
firmly believe that FlexX in LeadIT can be used by computational and medicinal chemists
alike. The entire suite has been designed to abolish learning curves.
FlexX accurately predicts protein-ligand complexes; it does so by fragmenting a ligand at
rotatable bonds and reassembling it within a binding pocket. Averaged over
screening runs, a single docking is completed within 5 to 10 seconds. With the advent of the
optional SCREEN module, dockings take less than 1 second. FlexX has been thoroughly
validated in more than 600 publications to date, and FlexX is continuously improved in
close collaboration with its users.
Here are some highlights of this release:
Aggressively easy Receptor Intelligence to set up and conduct dockings
Project Support including annotations and much more
Direct transfer of data into ReCore with 1 mouse click
KNIME™ support
MOE™ support
PipelinePilot™ drag'n'drop connection
There are many features that we plan to add in due course. The FlexX command line
will continue to exist, including all its powerful features. If you think
something is missing, please let us know. Just send an email to contact@biosolveit.de.
This guide will provide a brief introduction to the basic procedure for setting up a
FlexX docking run.
Thank you for using FlexX! We wish you much pleasure and success with FlexX; please
do give us feedback!
Your developers at BioSolveIT
Contents
I Introduction
1 About FlexX and This Guide
1.1 Running Modes & Workflows - Very Briefly
1.2 How to read this guide
2 Installation
2.1 Directory Structures, Files
2.2 External programs and data
2.2.1 Torsion angles
2.2.2 Flexible ring systems
2.2.3 Parallel script execution
2.3 FlexX and MOE™
2.3.1 How to get more details if FlexX does not start / work?
2.4 Known Limitations
2.4.1 Installation and Startup Issues
2.4.2 Libraries missing?
2.4.3 Ubuntu / Debian Distributions
2.4.4 Token not numeric error under Linux
2.4.5 Insufficient memory under Windows
2.4.6 Problems at Runtime of FlexX
3 Getting Help
3.1 PDF Documentation
3.2 Commandline options and usage help
3.3 Support
II User Guide
4 GUI vs. Commandline: The Best of Both Worlds
4.1 Configuration Overview
4.1.1 Modification of configuration at Installation Level
4.1.2 User Level modification using the GUI's Configuration Dialog
5 FlexX in GUI Mode
5.1 A Typical GUI Workflow (Tutorial)
5.1.1 GUI Protein Preparation
5.1.2 GUI Receptor Definition
5.1.3 Treatment of water molecules in FlexX
5.1.4 Pharmacophore Constraints in the GUI
5.1.5 Ligands in the GUI
5.1.6 Docking within the GUI
5.1.7 Docking Analysis
5.1.8 Exporting Poses from the GUI
5.1.9 Screening
5.2 Other GUI Export Facilities
5.2.1 PipelinePilot™ Support
6 FlexX in Commandline Mode I. Basic Usage
6.1 The FlexX shell
6.1.1 Menu navigation
6.1.2 Upper and lower case
6.1.3 Command parameters
6.1.4 Command escape
6.2 Batch Processing
6.3 Commandline Switches / Startup Options
6.3.1 Loading Projects and receptor files (filename without option)
6.3.2 Commandline options and their arguments
6.3.3 Arguments for batch processing (-a)
6.3.4 Batch mode (-b)
6.3.5 Specifying the execution directory (-d)
6.3.6 Help for command line options (-h, --help, ?)
6.3.7 Output the host ID or system ID (-i)
6.3.8 Logging the FlexX session (-l)
6.3.9 Nice value (-n)
6.3.10 Redirecting output (-o, -om)
6.3.11 Verbosity (-q)
6.3.12 Interface option (-s)
6.3.13 Version information (-v)
6.3.14 Running docking in GUI mode (--rundock)
6.3.15 Export docking poses (--poses)
6.3.16 Maximum number of poses to export (--nof_poses)
6.3.17 Export score table (--scoretab)
6.3.18 Export solutions table (--soltab)
6.3.19 Exiting right after docking (--exit)
6.4 Errors and warnings
6.5 A Typical Commandline Interface Workflow (Tutorial)
6.5.1 Configuration
6.5.2 The ligand
6.5.3 The receptor
6.5.4 Docking
6.5.5 Outputting information
6.5.6 Preparing the input data I: The Ligand
6.5.7 Preparing the input data II: The Protein
6.6 FlexX and FlexV
6.6.1 Choosing your graphics interface
6.6.2 The DRAW commands
6.6.3 DRAWing and DISPLAYing
6.6.4 What are graphics objects?
6.6.5 Details of the SELxxx graphics commands
7 FlexX in Commandline Mode II. Menus and Commands
7.1 Menus
7.2 Global commands
7.2.1 Quitting FlexX (QUIT)
7.2.2 Returning to the main menu (MAIN)
7.2.3 Returning to the parent menu (END)
7.2.4 Online help (HELP)
7.2.5 Viewing the User Guide (MANUAL)
7.2.6 Short online help (?)
7.2.7 Export configuration file (WRITECFG)
7.2.8 Listing environment variable settings (LIST)
7.2.9 Changing values of environment variables (SET)
7.2.10 Selecting the output destination for docking results (SELOUTP)
7.2.11 Sending a command to FlexV (TOFLEXV)
7.2.12 Sending FlexX graphic objects to the visualizer (DISPLAY)
7.2.13 Erasing a graphics object (ERASE)
7.2.14 Executing shell commands (! and EXEC)
7.2.15 Executing internal unit tests (UNITTESTS)
7.3 Commands in the root menu
7.3.1 Deleting everything (DELALL)
7.3.2 A complete docking run (AUTODOCK)
7.3.3 Executing a script file (SCRIPT)
7.4 Working with projects (PROJECT submenu)
7.4.1 Reading (READ)
7.4.2 Run docking (RUNDOCK)
7.5 Working with ligands (LIGAND submenu)
7.5.1 Reading (READ)
7.5.2 Direct SMILES parsing (SMILES)
7.5.3 PDB import (FROMPDB)
7.5.4 Setting up the initialization procedure (SELINIT)
7.5.5 Cleaning up molecules (REINIT)
7.5.6 Ligand information: INFO/MOLINF
7.5.7 Ligand manipulation (TRANSFORM)
7.5.8 Checking SMARTS™ patterns and subgraph occurrence (SMARTS)
7.5.9 Writing (WRITE)
7.5.10 Deleting (DELETE)
7.5.11 Reading reference coordinates (READREF)
7.5.12 Assigning reference coordinates by subgraph matching (MAPREF)
7.5.13 Setting reference coordinates (SETREF)
7.5.14 Calculating the ligand's solvent-accessible surface (SAS)
7.5.15 Selecting admin settings for drawing the ligand (SELADM)
7.5.16 Selecting graphics settings for drawing the ligand (SELGRA)
7.5.17 Selecting colors for drawing the ligand (SELCOL)
7.5.18 Selecting labels for drawing the ligand (SELLAB)
7.5.19 Drawing the ligand (DRAW)
7.5.20 Drawing the ligand at multiple positions (MDRAW)
7.5.21 Listing the graphic items (GRAINF)
7.5.22 Minimizing the ligand coordinates (MINIMIZE)
7.5.23 *Working with ligand conformations (LIGAND/CONFORM submenu)
7.6 Working with proteins (receptors) (RECEPTOR submenu)
7.6.1 Reading (READ)
7.6.2 Summarizing PDB contents (PDBINFO)
7.6.3 ACTIVE
7.6.4 Printing site atom information (ATLIST)
7.6.5 Writing (WRITE)
7.6.6 Deleting (DELETE)
7.6.7 Editing the receptor description file (EDIT)
7.6.8 Outputting the most important information about a receptor (INFO)
7.6.9 *Building the receptor triangle hash table (TRIHASH)
7.6.10 Subpocket Definition (DEEPSITE)
7.6.11 Calculating the protein's solvent-accessible surface (SAS)
7.6.12 Selecting admin settings for drawing the receptor (SELADM)
7.6.13 Selecting graphics settings for drawing the receptor (SELGRA)
7.6.14 Selecting colors for drawing the receptor (SELCOL)
7.6.15 Selecting labels for drawing the receptor (SELLAB)
7.6.16 Drawing the receptor (DRAW)
7.6.17 Listing the graphic items (GRAINF)
7.7 *Changing the static data (DATABASE submenu)
7.7.1 Adjusting the settings of the scoring function (SELSCO)
7.7.2 Listing the settings of the scoring function (LISTSCO)
7.7.3 Decrypting static data files (DECRYPT)
7.8 Docking (DOCKING submenu)
7.8.1 Selecting the base fragments (SELBAS)
7.8.2 Placing base fragments (PLACEBAS)
7.8.3 Building up the complex (COMPLEX)
7.8.4 Local optimization of a complex (OPTIMIZE)
7.8.5 Interactive selection of solutions (SELECT)
7.8.6 Clustering solutions (CLUSTER)
7.8.7 Writing placements in pdf format (WRITE)
7.8.8 Reading placements in pdf format (READ)
7.8.9 Deleting a docking (DELETE)
7.8.10 Sorting the list of placements (SORT)
7.8.11 Outputting the most important quantities of a docking result (INFO)
7.8.12 Outputting a solution tab row (SOLTAB)
7.8.13 Listing solutions (LISTSOL)
7.8.14 Listing solutions sorted by RMSD (LISTRMS)
7.8.15 Listing the matches of all solutions (LISTMAT)
7.8.16 Listing all solutions and matches (LISTALL)
7.8.17 Listing one solution and the corresponding matches (LISTONE)
7.8.18 Performing specific queries on solutions and matches (QUERY)
7.8.19 Performing a specific query a second time (QHIST)
7.8.20 Writing solutions in a table (PRINTSOL)
7.8.21 Writing all energy and matching scores to a csv file (EXPORT)
7.8.22 Selecting the admin settings for drawing placements (SELADM)
7.8.23 Selecting graphics settings for drawing placements (SELGRA)
7.8.24 Selecting colors for drawing placements (SELCOL)
7.8.25 Selecting labels for drawing the placements (SELLAB)
7.8.26 Drawing placements (DRAW)
7.8.27 Drawing multiple placements (MDRAW)
7.8.28 Listing the graphic items (GRAINF)
7.8.29 *Special commands for analyzing docking results (ANALYZE)
8 Additional modules for FlexX
8.1 Parallel Virtual Machine (PVM submenu)
8.1.1 Preliminaries
8.1.2 Starting PVM
8.1.3 Configuring PVM
8.1.4 Executing parallel batch files
8.1.5 Aborting and recovering
8.1.6 Killing a single work process
8.1.7 Working with parallel FlexX
8.1.8 Working with PVM
8.2 Docking of combinatorial libraries
8.2.1 Generating combinatorial libraries from scaffolds (PERMUTE module)
8.2.2 Handling combinatorial libraries (CLIB submenu)
8.2.3 Pharmacophore constraints for combinatorial libraries (CLIB menu)
8.2.4 Docking combinatorial libraries (CDOCK submenu)
8.2.5 Compatibility with other modules
8.2.6 How to use PERMUTE; a tutorial
8.2.7 Handling of ring systems with PERMUTE
8.3 Docking under pharmacophore type constraints
8.3.1 Pharmacophore type constraints
8.3.2 Running a FlexX-Pharm calculation
8.3.3 Configuring FlexX-Pharm
8.3.4 Preparing the input data
8.3.5 Menus and commands
8.3.6 Static data in FlexX-Pharm
8.3.7 Compatibility with other modules
8.3.8 Comments
8.4 Docking into ensembles of protein structures
8.4.1 The ensemble description file
8.4.2 Handling ensembles (ENSEMBLE submenu)
8.4.3 Drawing the incompatibility graph (ENSEMBLE/GRAPH submenu)
8.4.4 Generating ensembles (ENSEMBLE/GENRDF submenu)
8.4.5 Additional receptor commands (RECEPTOR submenu)
8.4.6 Additional docking commands (DOCKING submenu)
8.4.7 FlexE specific parameters
8.4.8 Compatibility with other modules
8.5 Lattice energies and grids
8.5.1 Depth of the lattice points
8.5.2 Gauss function
8.5.3 Working with grid-based energies (RECEPTOR/GAUSS submenu)
8.5.4 Using Gaussians for filter constraints
8.5.5 Buriedness of active site
8.5.6 Program parameters (configuration)
8.6 FlexX-Screen
8.6.1 Evaluating molecule properties
8.6.2 Generating interaction spots
8.6.3 Placebase caching
8.6.4 Compatibility with other modules
8.7 PPI - The Pair Potential Interface
8.7.1 File 1: PPI atom types for the receptor
8.7.2 File 2: PPI types for small molecules, PPI_LIG_TYPES
8.7.3 File 3: PPI types for hetero groups and non-standard-residues, PPI_REC_SUBSTR_TYPES
8.7.4 File 4: The PPI potential file itself
8.7.5 Generating generic amino PPI static data files (AMINO4PPI)
8.7.6 Activating PMF/DrugScore Support with Old Files
8.7.7 Interesting PPI Potentials
9 Scripting
9.1 *Scripts in FlexX's Language
9.1.1 Variables
9.1.2 Script parameter lists
9.1.3 Loops: FOR_EACH/END_FOR, WHILE or FOREVER
9.1.4 Branches: IF/ELSE/ENDIF
9.1.5 One-of-n selection: SELINP
9.1.6 Special script command: SETVAR
9.1.7 Special batch file command: INPUT
9.1.8 Special script command: INCR
9.1.9 Special batch file command: OUTPUT and OUTERR
9.1.10 Special batch file command: TIMER
9.1.11 Special batch file command: PROCSIZE
9.1.12 Special batch file command: WAIT
9.2 Interface to PYTHON
9.2.1 Getting started
9.2.2 Working with PyFlexX
9.2.3 Special commands
9.2.4 Changing the start-up configuration for pyflexx
III Technical Reference
10 Advanced Setup
10.1 Configuring FlexX's Usage
10.1.1 Defining directory paths
10.1.2 Static data files
10.1.3 Defining paths to external programs
10.1.4 Defining control flags
10.1.5 Defining control strings
10.1.6 Defining a PBC server
10.1.7 @PARALLEL: Defining a Parallel Virtual Machine
10.1.8 @ALIASES: Defining aliases for commands
11 Files and file formats
11.1 Molecular input file formats
11.2 Overview of filename extensions
11.3 The receptor description file (.rdf file)
11.3.1 Section 1: Specifying the PDB file
11.3.2 Section 2: Specifying atoms to be read
11.3.3 Section 3: Specifying the active site
11.3.4 Section 4 (optional): Surface calculation and loading
11.3.5 Section 5: Mapping amino acids to templates
11.3.6 Section 6: Including/excluding hetero atoms
11.3.7 Section 7: Specifying alternate locations
11.3.8 Section 8: Specifying torsion angles for hydrogen atoms
11.3.9 Section 9: Resolving ambiguities in the PDB file
11.4 Defining program parameters
11.4.1 Receptor-ligand overlap
11.4.2 Generating conformations
11.4.3 Selecting the base fragment (docking algorithm, phase 1)
11.4.4 Placing the base fragment (docking algorithm, phase 2)
11.4.5 Building up the complex (docking algorithm, phase 3) . . . . . . . . . 291
11.4.6 Applying lter functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
11.4.7 Force eld parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
11.4.8 Combinatorial docking parameters . . . . . . . . . . . . . . . . . . . . . 294
11.4.9 Approximation of the receptor surface . . . . . . . . . . . . . . . . . . . 295
11.4.10 Calculation of the metal coordination . . . . . . . . . . . . . . . . . . . 295
11.4.11 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
11.5 *Chemical parameters (chempar.dat) . . . . . . . . . . . . . . . . . . . . . . . . . 298
11.5.1 Van der Waals radii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
11.5.2 Bond length of heavy atoms . . . . . . . . . . . . . . . . . . . . . . . . . 298
11.5.3 Bond lengths and angles for hydrogens . . . . . . . . . . . . . . . . . . 299
11.5.4 Atom types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
11.5.5 Valence states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
12 CONTENTS
11.6 *Interaction types and compatibilities (contype.dat) . . . . . . . . . . . . . . . . 301
11.7 *Interaction geometries (geometry.dat) . . . . . . . . . . . . . . . . . . . . . . . . 303
11.7.1 Scoring function parameters . . . . . . . . . . . . . . . . . . . . . . . . 303
11.7.2 Associating interaction geometries with molecular groups . . . . . . . 307
11.7.3 Dening interaction geometries . . . . . . . . . . . . . . . . . . . . . . . 308
11.7.4 Computing the energy contributions of matched interaction groups . . 311
11.8 *Amino data (static data le AMINO) . . . . . . . . . . . . . . . . . . . . . . . 312
11.9 *Charges of receptor atoms (static data le CHARGES) . . . . . . . . . . . . . 316
11.10*Assigning data to the ligand: the subgraph data les . . . . . . . . . . . . . . 316
11.10.1 Dening groups of atoms . . . . . . . . . . . . . . . . . . . . . . . . . . 317
11.10.2 Dening subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
11.11*Ligand interaction groups (contact.dat) . . . . . . . . . . . . . . . . . . . . . . 319
11.12*Ligand torsion database (torsion_standard.dat) . . . . . . . . . . . . . . . . . . 320
11.12.1 Constraining amides to planarity . . . . . . . . . . . . . . . . . . . . . . 320
11.12.2 Fixing torsional angles at specied values: a sample case . . . . . . . . 320
11.13SMARTS
TM
support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
11.13.1 Atomic primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
11.13.2 Ring perception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
11.13.3 Aromaticity perception and hybridization states . . . . . . . . . . . . . 327
11.13.4 Implicit hydrogens, valences and formal charges . . . . . . . . . . . . . 328
11.13.5 Bond primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
11.13.6 Logical operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
11.13.7 Recursive SMARTS
TM
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
11.13.8 Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
11.14Dening subgraphs using SMARTS
TM
. . . . . . . . . . . . . . . . . . . . . . . 330
11.15Using templates, vector bindings . . . . . . . . . . . . . . . . . . . . . . . . . . 331
11.16Transforming molecules via SMARTS
TM
. . . . . . . . . . . . . . . . . . . . . . 332
11.16.1 Formal charges and hydrogens . . . . . . . . . . . . . . . . . . . . . . . 332
11.16.2 Atom type assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
11.17Structure correction and atom type assignment . . . . . . . . . . . . . . . . . . 334
11.18Transformation rules (transform.dat) . . . . . . . . . . . . . . . . . . . . . . . . 335
11.19*Ligand formal charges (fcharges.dat) . . . . . . . . . . . . . . . . . . . . . . . . 336
11.20*Automatic correction of localized systems (delocalized.dat) . . . . . . . . . . . 336
11.21*Dening the descriptors for calculating logp values (logp.dat) . . . . . . . . . 337
11.22*Graphics (graphic.dat) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
11.22.1 Colors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
11.22.2 Color modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
11.22.3 Dening atom colors (@atom-colors) . . . . . . . . . . . . . . . . . . . . 340
11.22.4 Dening colors for interaction (contact) types (@contact-colors) . . . . 340
11.22.5 Dening defaults for the graphics settings . . . . . . . . . . . . . . . . 340
11.22.6 Dening colors @colors . . . . . . . . . . . . . . . . . . . . . . . . . . 345
12 *Program interfaces 347
12.1 Interface to MOE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
12.1.1 Triggering FlexX From Within MOE . . . . . . . . . . . . . . . . . . . . 347
12.1.2 MOE as a Ring Conformer Generator . . . . . . . . . . . . . . . . . . . 347
12.2 Interface to Sybyl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
CONTENTS 13
12.3 Interface to WHATIF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
12.4 Interface to SCA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
12.5 Interface to CORINA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
12.6 Interface to CONFORT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
12.7 The FlexV graphical interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
IV Appendix
A Default rdf and edf les 353
A.1 The receptor description: An rdf le . . . . . . . . . . . . . . . . . . . . . . . . 353
A.2 The ensemble description: An edf le . . . . . . . . . . . . . . . . . . . . . . . 357
B Examples of script les 363
B.1 Script 1: dock_one . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
B.2 Script 2: dock_list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
C Additional copyright notes 365
Bibliography 367
Index 370
14 CONTENTS
I INTRODUCTION
1 About FlexX and This Guide
FlexX is a computer program for predicting protein-ligand interactions. For a given protein
and a ligand, FlexX predicts the geometry of the complex as well as an estimate of the
strength of binding. In this first version of FlexX, the protein is assumed to be rigid. Thus,
the protein must be given in a conformation which is similar to the bound state. The docking
algorithm in FlexX works without manual intervention. Nevertheless, in some cases
additional information about the ligand or even the complex is known. You can integrate
this knowledge into the computations with FlexX by performing single steps manually. Thus,
FlexX is ideal for interactive work on protein-ligand complexes as well as for screening a
larger set of ligands in order to find new leads for drug design. In summary, FlexX can be
useful in the following situation:
You have a good three-dimensional model of the protein and you know the location
of the active site. You have a set of ligands and you want to know whether
and how each of them binds to your protein model.
Before you start working with FlexX, we would like to remind you that FlexX is software under
continuous development. We test the program with a continuously growing set
of proteins and ligands, but we are sure that FlexX is not error-free.
To understand and interpret the results produced with FlexX, it helps to know something
about the underlying models and algorithms. This topic is not covered in this User
Guide; we refer to the following literature [2, 3, 14, 16, 17, 18, 19, 22, 24, 21, 20].
1.1 Running Modes & Workflows Very Briefly
We envision the following major workflow with FlexX:
1. Preparation of the binding site using the Receptor Intelligence of the Receptor Preparation
Wizard
This includes selection of chains, receptor protonation, tautomers, etc.
2. Optional: Definition of a receptor-based pharmacophore
3. Reading in a ligand Docking Library (one or more ligands)
4. Docking, possibly followed by altering parameters and further cycles of docking
This workflow is geared towards a small number of ligands and/or a preparational stage. It can
easily and comfortably be accomplished with LeadIT.
For the experts:
A Receptor, once prepared with the Wizard, can be saved and read into the commandline
mode of FlexX.
Precondition:
You must have one or more ligands in a low-energy conformation ready to dock. FlexX
does not minimize ligands before docking. The docking process deals with the translational,
torsional, and ring conformation degrees of freedom only.
For more elaborate, high-throughput, or ultra-HT docking, we assume that most of our users
will appreciate scripts and highly parallelized runs in the Commandline Mode or Batch
Modes (please see Sections 6 and 9.1 for details).
1.2 How to read this guide
Some section or subsection titles are marked with an asterisk. These sections are either
of less importance or very technical and hard to understand. We advise you to
read these sections after you have some experience with FlexX. Most commands and file
formats are self-explanatory, so there is no need to go through the whole User Guide in order
to work with FlexX. However, we highly recommend reading section 6.5.6 before you start
working with your own data.
We have used the following styles or fonts to highlight specific parts of the text. The most
important style is the environment of examples, as follows:
Example
This is an example
The descriptions of commands and global parameters of FlexX have a special list structure,
which is self-explanatory. In the text, we use the following fonts: this is a command, this is
a <parameter>, and this is a filename, a path, or a program. Exceptions to this are the
programs FlexV and FlexX itself.
A syntax description looks like this:
command <parameter> ...
Parameters which occur only in special cases or which are optional are set in square brackets:
[<optional parameter>]. If a line ends with a \-character, the command line is continued
on the next line. Note that in FlexX itself it is not possible to escape a carriage return character
by using a \-character.
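Because FlexX itself does not accept the \-continuation, a wrapped example from this guide has to be joined into a single line before it is entered. The following is a minimal sketch in Python of such a joining step; the function name join_continuations is hypothetical and not part of FlexX or LeadIT.

```python
def join_continuations(text):
    """Join guide-style lines ending in a backslash into single command lines."""
    commands = []
    parts = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.endswith("\\"):
            # Continued on the next line: keep the text before the backslash.
            parts.append(line[:-1].strip())
            continue
        parts.append(line)
        joined = " ".join(part for part in parts if part)
        if joined:
            commands.append(joined)
        parts = []
    # A trailing backslash on the very last line leaves collected parts behind.
    leftover = " ".join(part for part in parts if part)
    if leftover:
        commands.append(leftover)
    return commands
```

Each returned string can then be typed or pasted as one line. Blank lines are skipped, and whitespace around the break is collapsed to a single blank.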
2 Installation
Once LeadIT is installed, FlexX is also installed. Everything is in one single program file.
Some docking components require special consideration; the
GUI will assist you with these. Read on if you are interested in those.
2.1 Directory Structures, Files
Most of the tasks you perform will presumably be done using the graphical user interface
(GUI). For experts, the FlexX engine can also be called in commandline mode, where it is more
versatile and even more powerful. To understand what goes on behind
the scenes, we will briefly review the directory structure and discuss which files have which
contents and purposes.
After the standard installation procedure of LeadIT, you should have the following files
and directories:
Files:
bsit_rcgenerator.svl
flexv
leadit
leadit-*-Linux
leadit_pbc.sh
settings.pxx
Directories:
doc/
examples/
predict/
recore_index/
recore_link_constraints/
tmp/
tutorials
LeadIT will check whether some paths are set correctly. This affects:
a ring conformer generator
the parallel computing scratch files directory (PVM-based, currently supported for
Linux only)
the temporary files directory (recommended to be a local disk directory)
the results output directory
If any of these paths is invalid, LeadIT will pop up a dialog box which assists
you in defining valid locations (invalid locations are highlighted in red while you type).
Once valid entries have been made, the program stores the respective information, and
there is no need to fill in the dialog box again.
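If you prefer to verify such locations yourself before starting LeadIT, for example on a cluster node without a display, a check of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not LeadIT's own validation; the function name and the labels passed in are hypothetical.

```python
import os

def invalid_locations(directories, executables):
    """Return the labels of all entries that do not point to a usable location.

    directories: dict mapping a label to a directory path
    executables: dict mapping a label to the path of a program file
    """
    invalid = []
    for label, path in directories.items():
        if not os.path.isdir(path):
            invalid.append(label)
    for label, path in executables.items():
        # A program must exist as a file and be executable by the current user.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            invalid.append(label)
    return invalid
```

An empty result means that every listed directory exists and every listed program can be executed by the current user.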
2.2 External programs and data
For the experts:
Some features of FlexX are based on external data and software. Those components can be
adjusted:
2.2.1 Torsion angles
The static data information corresponding to the traditional file torsion_standard.dat
contains energetically favorable torsion angles for specific molecular fragments. The static
data file torsion_fine.dat contains 10 degree energy grids for torsion angles of specific
molecular fragments (see Section 11.12). The torsion data in this software package
have been derived by Gerhard Klebe [15] from the Cambridge Structural Database (CSD) licensed
by the Cambridge Crystallographic Data Centre (CCDC). The torsion data are under
copyright of GMD, BASF AG, and CCDC. An end-user license for the torsion data files is
included in the FlexX software license.
2.2.2 Flexible ring systems
The conformations of flexible ring systems can be computed by the 3D structure generator
CORINA [9, 25]. Your CORINA version is suitable for use with FlexX if the driver option
flexx is available (add the CORINA executable to your $path variable, then type
corina -h d to check). CORINA or CORINA-F can be obtained from Molecular Networks GmbH (see
http://www.mol-net.de for detailed information). Alternatively, the program CONFORT
can be used. (Please contact Tripos Inc. for more information on this.) If no ring
conformation generator is available, the flag RING_MODE must be set to 0 (see section 10.1,
Configuring FlexX).
2.2.3 Parallel script execution
FlexX contains a scheduling algorithm for parallel execution of scripts on workstation clusters.
The underlying communication library is PVM (Parallel Virtual Machine) [30], which
can be obtained from http://www.epm.ornl.gov/pvm. In
order to run FlexX scripts in parallel, you need a PVM installation on your platform and a
FlexX-PVM executable. The FlexX-PVM executable has a PVM marker and a PVM copyright
note in its header. More information about FlexX-PVM can be found in section 8.1.
2.3 FlexX and MOE™
FlexX can be used from within Chemical Computing Group's Molecular Operating Environment
suite (MOE™). Many synergies emerge from this, and the fruitful collaboration
between BioSolveIT and CCG is being actively pursued.
Since MOE version 2007.09, MOE comes with a FlexX interface which can very easily be
configured to work with an existing FlexX installation.
If you would like to have a more guided FlexX installation and workflow, please visit our
web page:
http://www.biosolveit.de/FlexX_MOE
Just follow the instructions in the README of the package.
2.3.1 How to get more details if FlexX does not start / work?
The most common problem is that the license is not found. To see whether this is the case,
please take a look at the (shell) output of FlexX by following these steps:
1. Open the Configure Flex* dialog.
2. Click on Browse in the Working Dir line to find out what FlexX's actual working
directory is. This path is reported in the first line of the selection window.
3. Note down the prefix in the Prefix field, and activate the switch Keep Files at the
end of this line.
4. Close the Configure Flex* dialog by clicking on OK.
5. Start a docking by clicking on OK in the Dock dialog.
6. After a docking has finished or was aborted, please go to the working directory determined
in step 2, and sort the files according to when they were changed (Linux:
ls -rtl). The most recent files starting with the prefix that you noted down from the
Prefix field are the temporary FlexX files. The file <prefix>_output.txt, which
is usually bsit_output.txt, will contain FlexX's shell output.
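On systems without ls, step 6 can be reproduced with a short Python sketch. It lists the files in the working directory whose names start with the prefix, oldest first, so that the most recent files appear last as with ls -rtl. The function name is hypothetical; the directory and prefix are whatever you noted down in steps 2 and 3.

```python
from pathlib import Path

def recent_files_with_prefix(working_dir, prefix):
    """Return the files in working_dir starting with prefix, oldest first."""
    candidates = [
        path
        for path in Path(working_dir).iterdir()
        if path.is_file() and path.name.startswith(prefix)
    ]
    # Sort by modification time, so the newest files end up at the bottom.
    return sorted(candidates, key=lambda path: path.stat().st_mtime)
```

With the usual prefix, calling recent_files_with_prefix(working_dir, "bsit") should return bsit_output.txt among the last entries after a docking run.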
2.4 Known Limitations
2.4.1 Installation and Startup Issues
Linux
FlexX requires glibc version 2.3 or later. Please consult your system administrator if
this error (or a similar one) occurs:
leadit: /lib/ld-linux.so.2: version GLIBC_2.3 not found (required by leadit)
leadit: /lib/i686/libpthread.so.0: version GLIBC_2.3.2 not found (required by leadit)
leadit: /lib/i686/libc.so.6: version GLIBC_2.3 not found (required by leadit)
Other issues may involve missing libraries. Please consult the FAQ pages on the web
(http://www.biosolveit.de/faq), or send an email to support@biosolveit.de.
Windows
We have tested FlexX on Windows XP. In the Cygwin environment (http://www.cygwin.com),
FlexX can be started and used; however, Unix-like path specifications
in arguments after calls such as leadit.exe may not be fully supported.
Also, please make sure you have the latest version of your graphics card drivers installed.
Some customers encountered black 3D screens, a problem which
could easily be resolved by updating the graphics drivers.
2.4.2 Libraries missing?
We only use standard shared libraries in LeadIT. In FlexV, libGL may be missing; in this
case, OpenGL is not installed on this machine. Please contact your system vendor or
administrator.
2.4.3 Ubuntu / Debian Distributions
In some Ubuntu and Debian distributions, for example Ubuntu 7.0, the symbolic link
libGL.so pointing to libGL.so.1 is missing.
To remedy this, please go to the /usr/lib directory and issue the following command:
ln -s libGL.so.1 ./libGL.so
The file libGL.so.1 may also carry a higher number, in which case a command such as
ln -s libGL.so.2 ./libGL.so may be the appropriate one.
If the problems persist, please apply the same procedure to the files
libGLX.so and libGLU.so, and restart your X server using CTRL+ALT+BACKSPACE.
2.4.4 "Token not numeric" error under Linux
If you are running a Linux system and get the error message "Token not numeric" while
reading data, this is caused by the language support contained in the C library (in some
languages, floating-point numbers are expected to contain "," instead of "."). Unset the $LANG
variable before running FlexX to circumvent language support.
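When FlexX is started from a wrapper script rather than from an interactive shell, the same effect can be had by handing the program an environment from which LANG has been removed. A minimal sketch in Python follows; the function name is hypothetical. Only $LANG is handled because it is the only variable named above. Whether other locale variables such as LC_ALL or LC_NUMERIC are set on your system is something this sketch does not check.

```python
import os

def environment_without_lang(base_env=None):
    """Return a copy of the environment with the LANG variable removed."""
    env = dict(os.environ if base_env is None else base_env)
    # pop() with a default does nothing if LANG was not set in the first place.
    env.pop("LANG", None)
    return env
```

The result can be passed to a launcher, for example subprocess.run(["./leadit", "--help"], env=environment_without_lang()). The environment of the calling process itself is left untouched.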
2.4.5 Insufficient memory under Windows
In rare cases FlexX does not start under Windows but bails out to the
system. This is usually due to the memory management of the operating system. In all
cases we observed, this could be resolved with the following procedure:
1. Create a c:\config.sys file if it does not exist. (You may need to change C: to another
drive name pointing to your boot hard disk.)
2. Enter something like:
shell=c:\windows\command.com c:\windows /e:2048 /p or
shell=c:\windows\system32\command.com c:\windows /e:2048 /p,
respectively.
3. Reboot your machine.
The problem should be resolved. Please email us if you still encounter problems.
2.4.6 Problems at Runtime of FlexX
If you encounter problems such as
>> DATA ERROR: unexpected token
[occurred in line 21 of file C:\Documents and Settings\user\.leadit\leadit_gfx.cfg]
>> ERROR: font_app medium-r
>> ERROR: ^
>> expecting one of the following tokens:
it may be that there are files left over from the beta stage. In this case, please remove the files
in C:\Documents and Settings\user\.leadit\ (Windows) or /home/user/.leadit/
(Linux). By user, we refer to your login name.
After restarting FlexX, you may have to re-configure the license file path and the path
to the Ring Conformer Generator (corina, ...) in the Parameters & Flags menu.
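If you would rather keep the old files for reference than delete them, they can be moved aside instead. The following Python sketch does this. Moving instead of removing is a choice made for this example and not something FlexX requires; the function name and the backup folder naming are hypothetical.

```python
import shutil
import time
from pathlib import Path

def set_aside_leadit_files(config_dir, backup_root):
    """Move everything in config_dir into a time-stamped folder below backup_root.

    Returns the backup folder, or None if config_dir does not exist.
    """
    config_dir = Path(config_dir)
    if not config_dir.is_dir():
        return None
    stamp = time.strftime("%Y%m%d_%H%M%S")
    backup_dir = Path(backup_root) / ("leadit_backup_" + stamp)
    backup_dir.mkdir(parents=True)
    for entry in config_dir.iterdir():
        # After the move, config_dir is empty, as if the files had been removed.
        shutil.move(str(entry), str(backup_dir / entry.name))
    return backup_dir
```

Under Linux, config_dir would be /home/user/.leadit/ with your login name in place of user.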
3 Getting Help
Help is available in several ways:
PDF documentation (this guide)
Commandline options and usage help
3.1 PDF Documentation
Although this is not the section about the graphical user interface (GUI), note that this guide can
be displayed using the Help menu in LeadIT. You need an application for reading PDF
files installed. The PDF file resides in the doc folder of your installation.
3.2 Commandline options and usage help
If you call FlexX as
leadit --help
you will get a full list of options. Under Windows do this:
Start -> Run -> type in: cmd. Then go to the installation directory (using cd and/or
cd ..), from where you can call LeadIT with commandline switches. One of these is the --help switch,
which will guide you to further possibilities such as running batch jobs and much more.
Under Linux do the same at a console prompt. Here is a typical output:
mymachine> ./leadit --help
______________________________________________________________________________
                              L e a d I T
Finest Drug Design Platform for Teams of Medicinal and Computational Researchers
Copyright
BioSolveIT GmbH       Version: 2.0.2 (26.07.11)
An der Ziegelei 79    Modules: [CORINA_F] [DECRYPT] [CDOCK] [FLEXE] [DOCKING]
                               [PERMUTE] [PHARM] [SCREEN] [PPI] [RECORE] [HYDE]
53757 St. Augustin
Germany               Original Author: Matthias Rarey
www.biosolveit.de     Contact: leadit@biosolveit.de
______________________________________________________________________________
For information about additional contributors and copyright notes
please consult the user guide or type help about.
>>> SYNOPSIS:
LeadIT [--commandline COMMANDLINE-OPTIONS] [OPTIONS]] [<project.fxx> | <receptor.pdb> | <receptor.mol2>]
>>> OPTIONS (available in commandline mode AND graphical user interface) :
(*)  --rundock                   : Start docking after loading the receptor.
(*)  --library=<mol_file>        : Dock this library instead of project library.
(*)  --poses=<pose_file>         : Exports poses to <pose_file>.
                                   Appends poses to file, if it exists.
(*)  --nof_poses=<nof_poses>     : Defines the maximum number of exported poses per ligand,
                                   if --poses is set.
(*)  --scoretab=<scoretab_file>  : Exports score and rms of pose with rank 1.
                                   Appends data to file, if it exists.
(*)  --soltab=<soltab_file>      : Exports scores of all poses to <soltab_file>.
                                   Appends data to file, if it exists.
(*)  --exit                      : Exit LeadIT after docking.
     --nosettings                : Don't read any LeadIT preference files
     -h / --help                 : Printing of this help text
     -i                          : Generate BioSolveIT-HostID for this machine
     -l <logfile>                : All commands and arguments are logged into <logfile>.
     -n <nice val>               : Run LeadIT with nice value <nice val>.
     -o <outfile>                : Redirects the complete output of LeadIT to
                                   <outfile> and the error messages to <outfile.err>.
     -m[<outfile>]               : Redirects the output and the error messages into
                                   the same file <outfile>. If <outfile> is not
                                   defined, stderr is redirected to stdout.
     -q[<verbosity>]             : Quiet / Suppress startup message. VERBOSITY will be set to (opt arg)
                                   <verbosity> = 0: quiet (default), 1: license info, 2: config warnings
     -v                          : Prints the version number and compilation
                                   information.
>>> COMMANDLINE-OPTIONS (only available in commandline mode LeadIT) :
     -a <argstring>              : Run with predefined arguments. <argstring> has the
                                   format: $(par1)=val1;$(par2)=val2;..., (no blanks!).
(**) -b <batchfile>              : Run in batch mode. The sequence of
                                   commands will be read from <batchfile>
     -d <directory>              : Run LeadIT in specified directory.
(**) --recore                    : Run in Recore mode. To get help for Recore, use "LeadIT --recore -h"
(**) -s                          : Run in PVM work process mode (Parallel LeadIT Version)
(**) --server                    : Run LeadIT as caching server
(**) --systeminfo                : Print system information
(*)  These options will be ignored, if -b is set.
(**) These options implicitly force commandline mode for downward compatibility.
     Future versions of LeadIT may require flag --commandline to activate them!
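The options listed above can be assembled programmatically, which helps when many receptors or libraries are to be processed. The sketch below, written in Python, only builds the argument list and the -a argument string; it does not start LeadIT. The function names, the default executable path and all file names in the usage note are assumptions made for this example. The option names themselves are taken from the help text above.

```python
def docking_command(receptor, library=None, poses=None, nof_poses=None,
                    scoretab=None, exit_after=True, executable="./leadit"):
    """Build an argument list that docks after loading the receptor."""
    if nof_poses is not None and poses is None:
        # The help text states that --nof_poses applies only if --poses is set.
        raise ValueError("--nof_poses requires --poses")
    command = [executable, "--rundock"]
    if library is not None:
        command.append("--library=" + str(library))
    if poses is not None:
        command.append("--poses=" + str(poses))
    if nof_poses is not None:
        command.append("--nof_poses=" + str(int(nof_poses)))
    if scoretab is not None:
        command.append("--scoretab=" + str(scoretab))
    if exit_after:
        command.append("--exit")
    command.append(str(receptor))
    return command

def argstring(parameters):
    """Format a dict as $(par1)=val1;$(par2)=val2 for the -a option (no blanks)."""
    pieces = []
    for name, value in parameters.items():
        name, value = str(name), str(value)
        if " " in name or " " in value:
            raise ValueError("blanks are not allowed in the argument string")
        pieces.append("$(" + name + ")=" + value)
    return ";".join(pieces)
```

For example, docking_command("receptor.pdb", library="ligands.mol2", poses="poses.mol2", nof_poses=5) yields a list that can be handed to subprocess.run. Since the options marked with (*) are ignored when -b is set, this sketch deliberately offers no batch file parameter.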
3.3 Support
Commercial customers usually have full support from us. But we will certainly also try
to help academic customers who do not usually have support contracts with us. In either
case, please check our FAQ & Knowledge Base at http://www.biosolveit.de/faq for
answers. In case this does not help, please visit http://www.biosolveit.de/support.
Finally, we are certainly available through email at support@biosolveit.de.
II USER GUIDE
4 GUI vs. Commandline: Pick the Best of Both Worlds
At this point we assume you could start FlexX in LeadIT and that you have a valid license.
(The license is actually only checked out once you perform a docking.)
You now have the choice between a graphically assisted running mode (GUI Mode) or a
Commandline Mode.
4.1 Configuration Overview
Upon start, FlexX consults a fixed set of sources for its configuration and merges these settings
into one combined configuration set. This set concerns both algorithmic and chemistry
parameters. To give you maximum flexibility and easy usage at the same time, we have
introduced a hierarchical system of configuration possibilities:
Depending on their location, the configuration files have different priorities, which means that
parameters of a file can be replaced by parameters of a file with a higher priority. This
hierarchy
enables your systems administrator to set company-specific parameters, or
yourself to adjust everything according to your needs, or,
finally, to set so-called Project-specific parameters, flags, values etc.
The hierarchy of these sources for FlexX is as follows, from the source with the lowest to the
highest priority:
1. Installation Level:
The Installation Level enables the systems administrator to set company-specific
parameters like the path to the license server or to external tools. Parameters defined
here are the base for all users using this executable. The path for this settings file is
<installation_path>/settings.pxx.
2. User Level:
If you want to use, e.g., some modified chemical parameters for every Project, it is not
necessary to change them again and again. You can store them in the settings.pxx of
your home directory and they will be used automatically in your Projects. These modifications
can easily be made via the configuration dialog of the GUI (see Section 4.1.2).
Additionally, the file contains your FlexX graphic settings. The path for this settings
file is <home_dir>/.leadit/settings.pxx.
3. Project Level:
In most cases, you will tune parameters for a specific target. These modifications
should be stored in the so-called Project File with extension .fxx. It is created
when you leave the GUI or when you explicitly save the state of the GUI at a certain
point using File -> Save Project As.
It contains, for example, a receptor definition, a library and pharmacophore constraints.
Section 4.1.2 describes how to save parameters in the current Project.
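The priority rule of the three levels can be illustrated with a few lines of Python. This is a sketch of the principle only and not FlexX's actual implementation: each level is modelled as a plain dictionary, and the values used in the usage note are invented for the illustration.

```python
def merge_settings(installation, user, project):
    """Combine three settings levels; later levels override earlier ones.

    Returns a dict mapping each parameter name to (value, source level).
    """
    combined = {}
    levels = (("installation", installation), ("user", user), ("project", project))
    for source, settings in levels:  # ordered from lowest to highest priority
        for name, value in settings.items():
            combined[name] = (value, source)
    return combined
```

For instance, merge_settings({"RING_MODE": 1, "TEMP": "/tmp/"}, {"RING_MODE": 0}, {"TEMP": "/scratch/"}) reports RING_MODE as coming from the user level and TEMP as coming from the project level. Recording the source alongside the value mirrors the way the configuration dialog reports where a parameter stems from.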
4.1.1 Modification of configuration at Installation Level
The first settings file which is read may reside in the installation directory of FlexX. As
mentioned above, it is called settings.pxx. Systems administrators can exploit this and
prepare a company-wide environment as the superuser and then either copy the resulting
configuration file from their HOME directory to the installation directory or edit
the file manually.
Option 1: Copying an installation pxx file from the installation directory to HOME and back to the
installation directory:
Copy the pxx file from the installation directory to your HOME directory. If you do not
know your HOME directory under Windows, navigate to Start -> Run -> cmd
and enter SET USERPROFILE. The resulting output tells you where the pxx file must
be located to be editable with the GUI. In this directory, the file behaves like a User
Level settings.pxx file.
Edit it to your needs with the configuration dialog of the GUI (see Section 4.1.2 for
details).
After finishing your modifications, copy the file back to the installation directory.
Remember to back up the settings.pxx in your home directory before replacing it as
described above, in case you had a personal one.
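The copy steps of Option 1, including the recommended backup, can be scripted. The Python sketch below copies one settings file over another and keeps any file that was already at the target under the same name with .bak appended. The function name and the .bak suffix are choices made for this example, and the paths are passed in by the caller.

```python
import shutil
from pathlib import Path

def copy_with_backup(source, target):
    """Copy source to target; an existing target is first saved as <name>.bak.

    Returns the path of the backup, or None if nothing had to be backed up.
    """
    source, target = Path(source), Path(target)
    backup = None
    if target.exists():
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)
    # Create the target folder if it is missing, then copy the file.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return backup
```

The first call would copy from the installation directory to the home directory. After editing with the GUI, a second call with the arguments swapped copies the result back.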
Option 2: Manual Edit: Another possibility is to use an editor and to manually edit the
configuration file in the installation directory:
In the installation path of FlexX you will find a settings file with a few parameters which
may be equal for all users. The contents will look similar to:
<?xml version="2.0" encoding="ISO-8859-1"?>
<biosolveit>
  <version major="2" minor="0" patch="2"/>
  <flexdocking>
    <settings>
      <directories>
        <TEMP path="/tmp/"/>
        <PVM_TEMP path="tmp/"/>
        <PREDICT path="predict/"/>
      </directories>
      <external_tools>
        <RCGENERATOR binary="./corina"/>
        <GENERATOR3D binary="./corina"/>
        <FLEXX binary="./leadit"/>
        <FLEXV binary="./flexv"/>
      </external_tools>
      <licenses>
        <license path="__LICENCE__/flexx.lic:."/>
      </licenses>
    </settings>
    <project/>
  </flexdocking>
</biosolveit>
This file has XML syntax and can be edited using standard text editors (do NOT use the
Microsoft Word file format doc when saving the file; it still needs to have the pxx extension!).
If, for example, you would like to define another default for RCGENERATOR, just change the
value of the attribute binary of the element RCGENERATOR.
Please note that it is not possible to modify the settings file of the Installation Level via
the GUI.
Modifying the file by hand can be dangerous. Mistakes or wrong syntax may lead to
an invalid file which can no longer be read by FlexX. To get an overview of the file
structure and all possible elements, you can export the combined configuration set from
FlexX to a file (see 7.2.7).
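One way to reduce the risk of a syntax mistake is to let an XML library make the change instead of typing it. The Python sketch below changes the binary attribute of RCGENERATOR in a shortened fragment of the structure shown above. The fragment, the function name and the example path are assumptions made for illustration. Note that ElementTree's tostring does not reproduce the XML declaration line, so when working on a real settings.pxx that line has to be preserved separately.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical fragment following the element names shown above.
SAMPLE = """<biosolveit>
  <flexdocking>
    <settings>
      <external_tools>
        <RCGENERATOR binary="./corina"/>
      </external_tools>
    </settings>
  </flexdocking>
</biosolveit>"""

def set_rcgenerator(xml_text, new_binary):
    """Return xml_text with the binary attribute of RCGENERATOR replaced."""
    # fromstring raises ET.ParseError if the text is not well-formed XML.
    root = ET.fromstring(xml_text)
    element = root.find("./flexdocking/settings/external_tools/RCGENERATOR")
    if element is None:
        raise ValueError("no RCGENERATOR element found")
    element.set("binary", new_binary)
    return ET.tostring(root, encoding="unicode")
```

Parsing the result once more with ET.fromstring before saving it is a cheap way to confirm that it is still well-formed.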
4.1.2 User Level modification using the GUI's Configuration Dialog
To open the configuration dialog, use File -> Global Preferences and switch to the tab
Parameters&Flags. You will find a table which contains all configurable FlexX parameters.
Figure 4.1: The configuration dialog
Click on a parameter. The text field beneath the table gives you a short description
of it. To change the value of a parameter, double-click on the cell in column Used
In Calculation and enter a new value. If the name turns red, your value isn't in a
reasonable range; otherwise it becomes bold.
The table shows not only the current value of each parameter but also the
source of the parameter. This is done via the checkbox in column My
Default. It can have three different states:
1. Disabled and unchecked: It is a factory default setting or it is defined on the
Installation Level.
2. Enabled and checked: It stems from your user settings file.
3. Enabled but unchecked: It stems from the current Project.
In the same way as you can see the location, you can define the location for a parameter.
The checkbox becomes enabled after you have double-clicked into a field. Leave the checkbox
unchecked to save the value in the current Project, or check it to save it in your user
settings.
To reset a value, use the context menu by right-clicking on the desired value. You can
choose whether you want to restore your user setting or the installation setting.
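The relation between the three checkbox states and the place a value comes from can be pictured as a layered lookup. The sketch below is an illustration only. The parameter names are hypothetical, and the precedence (Project before user settings before installation or factory defaults) is an assumption that the text above does not state explicitly.

```python
from enum import Enum


class Source(Enum):
    """Checkbox state in column My Default for each source of a value."""
    INSTALLATION = "disabled, unchecked"  # factory default or Installation Level
    USER = "enabled, checked"             # user settings file
    PROJECT = "enabled, unchecked"        # current Project


def resolve(name, installation, user, project):
    """Return (value, source) for a parameter.

    Assumed precedence: Project, then user settings, then installation.
    """
    layers = (
        (Source.PROJECT, project),
        (Source.USER, user),
        (Source.INSTALLATION, installation),
    )
    for source, layer in layers:
        if name in layer:
            return layer[name], source
    raise KeyError(name)


# Hypothetical parameter names and values, for illustration only.
installation = {"PARAM_A": 1, "PARAM_B": 2, "PARAM_C": 3}
user = {"PARAM_B": 20, "PARAM_C": 30}
project = {"PARAM_C": 300}

for name in ("PARAM_A", "PARAM_B", "PARAM_C"):
    value, source = resolve(name, installation, user, project)
    print(name, value, source.value)
```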
The other tabs of the Global Preferences dialog influence only your user settings, so
you cannot save these parameters in a Project File! Whenever you want to do this, you have
to go to tab Parameters&Flags as described above. The following list gives an
overview of the tabs:
1. Directories:
Here you can define the most important directories: the path to your ring conformer
generator, to the temporary directory, and to the FlexX results directory. If you enter a path
which doesn't exist, the background of the corresponding text field turns red. You will
also find these parameters in the Parameters&Flags table, but this tab gives you faster and
easier access.
2. License Keys:
A valid license key is required to run a docking in GUI mode or to start
FlexX in commandline mode. Here you can define the path to a license file or
add a license server.
3. PDB Proxy:
You need to use these settings if your corporation has a PDB file repository inside
a protected network. It is also possible to set these parameters in menu Load From
Protein Database....
4. Parallel Computing:
Parallel computing is currently not supported under Windows, so this tab is only
enabled under Linux. For more information on how to use the PVM feature, see 8.1.3.
5. UltraHTS Configuration:
PBC functionality of the SCREEN module is currently not fully supported under Windows,
so this tab is only enabled under Linux. You need a valid license for the SCREEN
module to use this feature. For more information see Sec. 8.6.
5 FlexX in GUI Mode
5.1 A Typical GUI Workflow (Tutorial)
FlexX's workflow consists of 4 major steps:
1. Receptor Definition
2. Ligand and Docking Library Composition
3. Docking
4. Analysis
Steps 1 and 2 are interchangeable.
An optional definition step for pharmacophore specification can be inserted after Step 1 or 2.
We will briefly guide you through all these steps.
For better understanding and an easier introduction to the program, we have created a simple
workflow example. The example will accompany you through several chapters. It can
be read as a beginner's tutorial.
5.1.1 GUI Protein Preparation
Proteins can be loaded from either your file system or directly from a PDB repository. The
GUI currently reads proteins in pdb format only.
To read a protein, simply follow the menu entries
Receptor -> Define...
If your corporation has a PDB file repository inside a protected network, then please
consider using the proxy configuration mechanism. The rest should be self-explanatory.
Once transferred or loaded, the protein will be displayed highlighting any ligand candidate
molecules in thick lines, whereas the protein itself will be in line mode. This setting cannot
currently be altered and serves to better detect the binding site and distinguish ligands and
co-factors from the remaining protein.
Example
The first step shown in this example is to load a protein file.
Click on the Load entry in the Protein menu to open a data management dialog box. We
assume that you are familiar with your operating system.
Now change to the directory example/, choose the file 1o86.pdb and confirm by pressing
the Open button. The data is now loaded into FlexX's memory.
Figure 5.1: Main window with pdb file 1dwd.pdb loaded
The main window is divided into three parts. On the left-hand side the so-called Tree View of all
loaded data is shown. On the right-hand side the 3D view displays the protein and other molecules,
while at the bottom of the window, the FlexX text output and the docking solutions are shown.
The next step will be to define a "Receptor".
5.1.2 GUI Receptor Definition
The receptor definition is of crucial importance to FlexX. There are several chemical
ambiguities or discrepancies that crystallography may not have been able to resolve. Once the
protein is loaded, we need to define what the receptor is. "Receptor" in FlexX nomenclature
is the definition of all chains, co-factors and single atoms/ions which are visible to
the ligand to be docked, i.e., which are sterically present. The receptor definition may also
include water. It is usually sufficient to include just the parts of the protein which are
close to the ligand. The more protein is presented to the ligand to dock, the more likely
you will end up with false positives; in other words, the more protein you offer, the more
options the ligand has to go wrong.
A wizard invoked by Receptor -> Define guides you through these steps. These are:
Choose Receptor
Chain and co-factor selection takes place here. Click on the respective chain to be
included in the docking. A more elaborate selection and inclusion facility will be
implemented soon. You can click in the 3D viewer or in the table on the left-hand side. The
selected atoms have standard colors, i.e., carbon in gray, oxygen in red etc.
Define Binding Site
The binding site cut-out threshold is determined here. Currently there are two methods
implemented:
a) usage of a so-called reference ligand. By this, we understand a ligand in its
bound conformation which can either be defined by clicking on the ligand contained
in the pdb file (on display) or by loading it from an externally available mol2 file. SD
files will be supported soon. A reference ligand can be recognized by its green carbon
framework.
b) a spherical cut-out using a reference atom.
Chemical Ambiguities
Chemical ambiguities refer to the crystallographically unresolved regions and affect
rotamers, alternate locations (multiple coordinates for one atom), protonation,
H-torsions etc. The dialog boxes should be self-explanatory. You can click in either the
3D viewer or the respective dialog boxes on the left-hand side.
Tip! Use your mouse wheel to adjust torsions.
Confirm Receptor
The receptor needs a unique name for later processing in the docking dialog.
If you click Finish, the receptor is defined and after a few seconds is available for further
processing.
The steps of the definition are now explained in more detail alongside the screenshots.
Example
The first step is to choose the protein parts that form the receptor site. In
case several models are found (as often encountered in NMR structures), pick
one from the top drop-down box.
Our loaded protein 1o86 has a ligand which we will use as the reference ligand.
For co-factors there is a checkbox Cofactor(s) to treat them as co-factors and
not as ligands; since, in our case, LPR-702-A is our ligand, we will not activate
it, but rather the chain that surrounds the ligand.
So, please activate the checkbox for Chain A, the blue chain. In the 3D view the
coloring of the selected chains will change from unique colors to standard atom colors.
Figure 5.2: Define Receptor: Choose receptor
The second step defines the boundaries of the actual binding site. As a binding
site, we define the part of the receptor (cp. above) which can form interactions
with a ligand. The candidate molecules which may be used as a reference ligand
are already highlighted in stick mode. If the protein input file contains several
ligands, the GUI will set the focus in the wizard in such a way that all ligands are
visible. Select your reference ligand of choice by clicking on the molecule in
the 3D view. If you have problems hitting the ligand, zoom in a little (for example
using the mouse wheel, see section ??).
An additional step is adjusting the radius around the reference ligand.
Amino acids located within this radius are selected to form the binding site.
In this example we will use the default value of 6.5 Å. In the default algorithm,
the binding site is a cloud-like body: all reference ligand atoms represent
centers of spheres, the envelope of which encloses the binding site.
Figure 5.3: Define Receptor: Binding site
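The radius-based selection described above can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not FlexX's implementation: atoms are plain coordinate tuples, the coordinates are invented for the example, and an amino acid is assumed to belong to the binding site as soon as one of its atoms lies within the radius of any reference ligand atom.

```python
import math


def binding_site_residues(ligand_atoms, protein_atoms, radius=6.5):
    """Return the IDs of residues with at least one atom inside the union
    of spheres of the given radius centered on the ligand atoms."""
    selected = set()
    for residue_id, position in protein_atoms:
        if residue_id in selected:
            continue  # residue already selected through another atom
        if any(math.dist(position, center) <= radius for center in ligand_atoms):
            selected.add(residue_id)
    return selected


# Invented coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [
    ("ASP 189", (4.0, 0.0, 0.0)),   # close to both ligand atoms
    ("GLY 216", (0.0, 6.0, 0.0)),   # 6.0 from the first ligand atom
    ("SER 195", (20.0, 0.0, 0.0)),  # this atom is far away ...
    ("SER 195", (7.0, 0.0, 0.0)),   # ... but this one is 5.5 from the second atom
    ("HIS 57", (0.0, 0.0, 15.0)),   # completely outside the radius
]

print(sorted(binding_site_residues(ligand, protein)))
```

Increasing the radius enlarges the envelope and therefore the amount of protein offered to the ligand.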
Once the reference ligand is selected, the amino acids that are located completely
outside the radius around the reference ligand, as well as the reference ligand itself,
are displayed in a somewhat more transparent view; the amino acids within the radius are
the only ones that are fully visible. These amino acids are the ones which
form the binding site.
Please click the Next button to go on to the next step.
Figure 5.4: Define Receptor: Binding site
In this step the so-called atom assignment of candidate amino acids can be
adjusted. A list of the eligible amino acids is shown left of the 3D view. Click
on a list entry in case you want to switch to another assignment; the
3D view will focus on it. An assignment is changed by clicking on the highlighted
amino acid in the 3D view or by clicking on the combobox in the assignment table.
Please check the correct atom assignment for each amino acid in the
list. In case the input file already contains pointers to a crystallographic
ambiguity, you will see a yellow triangle indicating that special attention is required.
In such a case, you will not be able to proceed with Next until you have
resolved (changed or confirmed) the setting and thus removed the triangle.
When finished, please click the Next button to go on to the next step.
Figure 5.5: Define Receptor: Assignments
In this step the protonation states of amino acids are selected. All amino
acids that form the binding site are listed left of the 3D view. To work on an
amino acid, select it in the list by clicking on the list entry; alternatively select
it in the 3D view. Amino acids with potentially critical protonation/tautomer
states are specially marked by triangles which need to be resolved by
confirmation or alteration. Check the protonation and torsion of these amino
acids carefully; chemically, it can make a large difference whether an amino
acid acts as a donor or as an acceptor. FlexX helps you graphically by
highlighting distances worth considering as arrows. Once all triangles are resolved,
the Next button will no longer be grayed out, and you can continue with
the next step.
Figure 5.6: Define Receptor: Protonation, H-torsions and tautomers
Upon either clicking on an oxygen (red spheres) or ticking the box to the left of
the water entries in the Water table, you will be led to a selection of its
Type. There are three types: Oriented Water (a fully defined water),
Freely Rotatable Water (a spherical particle with water properties, i.e., a
freely spinning water), and Freely Rotatable, Displaceable Water (a ligand
may displace this water). The Oriented Water can be adjusted by
clicking on the H atoms or the lone pairs. As graphical aids, FlexX shows
distance-based help lines. Please note that these are not interaction lines. If
a water-H and an acceptor are at a sensible distance, the color of the help line
changes. In the future, this shall be automated. You can limit the visible waters
using the interaction count dialog right below.
Figure 5.7: Define Receptor: Water
When you enter the metal wizard step, FlexX will automatically look
for neighboring atoms and, using these, propose the most probable metal
coordination. The dialog is self-explanatory; however, we would like
to point out that FlexX automatically defines a pharmacophore for any
metal interaction surfaces found. If there are multiple interaction surfaces,
one of these must be matched, i.e., a ligand must touch at least one
of these surfaces displayed in blue. Please note that this
requires a PHARM license.
Figure 5.8: Define Receptor: Metal
Finally, in this last step, please enter a name for the defined receptor in the
input field. The receptor definition will be saved under this name.
Click the Finish button to complete the definition.
Figure 5.9: Define Receptor: Confirm receptor
5.1.3 Treatment of water molecules in FlexX
In FlexX, water molecules can be handled in different ways. First of all, they can be modeled
as explicit water molecules with two hydrogen atoms. In this case the water has to be
oriented within the binding pocket. If hydrogens are present in the PDB input, FlexX uses these
positions as default. Otherwise you can orient the hydrogens in the wizard (cp. above).
On the other hand, water molecules can be modeled as freely rotatable molecules, which
are oriented by the ligand during docking. In this case they are described as volume-less
dot objects in space. During the docking process, displaceable water molecules can exhibit
interactions (and thus become active, cp. below), or the ligand can overlap with them. In
case they do not have interactions, they become phantom (cp. below). The solution with the
highest score of both described approaches wins, i.e., if the score of a solution in which the
ligand overlaps with the water is higher than the score of a solution where the ligand does not
overlap, then the water is displaced and marked as such.
These are the options and how the user can select them in FlexX:
1. Unselected water molecules
Oxygen atoms of water molecules which are present in the PDB file are displayed as
red spheres in the protonation step of the Receptor definition wizard. This means they
are not selected yet. Clicking on one of these spheres (or checking the box
next to the corresponding water molecule ID in the table underneath the protonation
table) selects it.
2. Selected water molecules
Once a water molecule is selected, there are several options:
Explicit water molecules (Icon water molecule) The water molecule has explicit
hydrogens (white) and orbitals (light blue) which need to be adjusted by the user.
Once the water molecule is in focus (meaning you can do something with it),
it changes from line to stick mode. Orienting the water molecules can be done
by either entering the values for the dihedral angles in the table or by clicking
on possible hydrogen bonding neighbors of the water molecule. This is visually
aided by lines and highlighted atoms whose hydrogen bonding properties are
opposite to those of the atom that is oriented.
Freely rotatable water molecules (Particles; light blue sphere) The water molecule
selected as freely rotatable is displayed as a blue sphere. Interaction geometries
are modeled by spheres; thus they are quasi freely rotatable. Certain angle
dependencies between the interactions prevent meaningless geometries. A minimum
of one interaction with the protein is necessary to orient those geometries. Thus,
these particles normally bridge interactions between ligand and protein. A water
molecule modeled by such a sphere is not displaceable; once it is out of focus, it
will be displayed by a light blue cross.
Freely rotatable and displaceable water molecules (grey blue sphere) This has
essentially the same properties as the particle above, but in addition it is
displaceable, which means the ligand can, depending on its size, either establish
interactions with the water molecule and/or the protein, or, if it is a bigger and more
sterically demanding ligand, displace the water from the active site and
interact with the protein in its stead. The displaceability is indicated by a blinking
grey blue sphere, or a blinking grey blue cross if the water is out of focus.
For the docking solutions, the water molecules are displayed according to their status (cp.
above). Displaced waters are not displayed by default. However, if the user checks a box
above the solution table, the positions of displaced waters are marked by blue spheres. FlexX
distinguishes between three different states of water molecules after a docking, which are written
to a mol2 file if a docking solution is saved as such:
Active
The water is still present with all interactions.
Phantom
The water is still there (not displaced), but has no interactions with the receptor or
the ligand.
Displaced
The ligand has displaced this water molecule, provided it was labeled as displaceable before.
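The three states can be summarized as a small decision rule. The sketch below only restates the definitions above in Python; the function and argument names are hypothetical, and treating "has at least one interaction" as the criterion for Active is a simplifying assumption.

```python
from enum import Enum


class WaterState(Enum):
    ACTIVE = "Active"
    PHANTOM = "Phantom"
    DISPLACED = "Displaced"


def classify_water(displaceable, overlapped_by_ligand, interaction_count):
    """Classify a selected water molecule for one docking solution.

    Assumption: only waters labeled as displaceable can end up Displaced,
    and a remaining water with one or more interactions counts as Active.
    """
    if displaceable and overlapped_by_ligand:
        return WaterState.DISPLACED
    if interaction_count > 0:
        return WaterState.ACTIVE
    return WaterState.PHANTOM


print(classify_water(True, True, 0).value)    # ligand occupies the water position
print(classify_water(True, False, 2).value)   # water bridges two interactions
print(classify_water(False, False, 0).value)  # water present, no interactions
```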
The user can easily sort the solutions according to the state of the water molecules in the
active site.
Hint: By introducing waters (especially displaceable waters) to the docking problem, there
are a lot more possibilities for the docking to go in the wrong direction. The user should not
lightly introduce more degrees of freedom unless the boundary conditions are well known.
5.1.4 Pharmacophore Constraints in the GUI
Once the binding site has been defined, you can add pharmacophore constraints.
Pharmacophore constraints in FlexX are receptor-based; interactions have to be specified as seen
through the eye of the protein. The dialog will guide you.
Docking with pharmacophore constraints requires an extra license for the PHARM module (currently
free of charge to academic Release 2 customers and companies defined as Small Biotech in
BioSolveIT's contracts).
Pharmacophore constraints in FlexX are of two types:
Interaction constraints: As mentioned, these are receptor-bound; this means that
interactions to be formed with the receptor are defined from the perspective of the amino
acids. The dialog will clarify what this means.
Spatial constraints: These constraints are defined by a sphere which resides at any point
in space and requires certain atom types or parts of the ligand to match. The subparts
of compounds to match can be specified in SMARTS, an abbreviated line notation for
molecules. (For more details on SMARTS, please refer to http://www.daylight.com
or the user guide which is available under: HELP -> USER GUIDE.)
In addition, FlexX distinguishes between essential and optional constraints. They can be
combined freely using Boolean expressions with the logical expression syntax.
Interaction constraints can be specified by clicking on the respective interaction surfaces.
The most prominent interaction surface types are pre-selected (h_don, h_acc); others
can be selected as well. You can de-select interaction surfaces by clicking underneath the
interaction types in the respective dialog boxes.
Example
In this example we will define an interaction constraint and a spatial constraint based on the
receptor 1dwd from the examples directory. While this step is not needed for performing a docking
successfully, using pharmacophore constraints produces better docking results in many cases.
Click on the Define entry in the Pharmacophore menu to open the pharmacophore dialog.
The pharmacophore dialog box is divided into four sections: the upper panel is used
for visualizing interaction (IA) types (Tab IA Constraint Definition) and defining
the spatial interaction sphere (Tab Spatial Constraint Definition); the second part
shows the already defined constraints in the form of lists; the individual constraints
can be combined in the third part using the essential/optional dropdown boxes or
the logical expression syntax with the IDs of the pharmacophores, which are annotated
in column 3 of the pharmacophore lines.
Finally, the bottom part is used to enter a name for saving the pharmacophore constraints.
Let us have a look at the Interaction Constraint Definition tab on the top
panel: Different interaction types are sorted into three categories according
to their priority. (Please refer to Section 11.6 on p. 301 for more details
on this.) If an entry is activated by clicking, the interaction surface of
that type is displayed in the 3D widget.
Figure 5.10: Define Pharmacophore: Dialog box
For this example (and just for explanatory reasons!), we shall select three
interactions with hydrogen acceptors. Please click on the entry h_acc and
notice what happens in the 3D widget: the hydrogen acceptors in the
protein are given a red interaction surface. Search for surfaces that interact
with the amidine and sulfonamide group of the reference ligand and click
on the respective interaction surfaces. Each of the selected interactions will
then be listed in the second segment of the dialog. Change the priority of
the interaction between the reference ligand and amino acid GLY 216 from
essential to optional, and do the same for the interaction between atom _OD1
in amino acid ASP 189 and the reference ligand. Optional, as mentioned,
means that this interaction will be used preferentially but it does not
necessarily have to be in the docking solution, because it competes with
other optionals of which a minimum number has to be matched according
to the definitions in the third part of the dialog. An essential interaction,
however, must be formed by a ligand's docking pose.
Figure 5.11: Define Pharmacophore: Interaction constraints
Another type of constraint is the spatial constraint. The Spatial Constraint
Definition tab is shown on the right.
To define a spatial constraint, click on an atom (e.g. a carbon) in a buried
part of the binding site. The coordinates of that atom are assigned to the
pharmacophore dialog as the center of a sphere with a defined radius. Change
the radius of the sphere to 1.5 Å in the dialog box. The sphere will be
displayed in cyan in the 3D widget.
Figure 5.12: Define Pharmacophore: Spatial constraints
Once the spatial constraint is confirmed by clicking the Add to
Constraints button, it will be listed in the middle part of the dialog box
and the sphere will turn pink in the 3D widget.
To complete the definition, the single constraints must be combined into
a set of constraints. In this example we will use the pre-selected priority
method (optional/essential). Two values must be defined: the minimum
number of competing (optional) constraints, and the maximum
number of optionals requested in docking solutions. For your information,
the Number of Essential Constraints is also listed but not
adjustable. Set the minimum number of optionals to 1 and the maximum
number to 3. This reads: at least one optional constraint must be fulfilled,
no matter which one, and at most three optionals may be fulfilled.
Figure 5.13: Define Pharmacophore: Spatial constraints
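The essential/optional rule can be written down as a simple check on the set of constraints a pose fulfills. The following Python sketch is a literal reading of the description above and not FlexX's internal logic; the constraint IDs are hypothetical, and interpreting the maximum as an upper bound on the number of fulfilled optionals is an assumption.

```python
def pose_satisfies(matched, essential, optional, min_optional, max_optional):
    """Check a pose against a set of essential and optional constraints.

    matched      -- IDs of the constraints the pose fulfills
    essential    -- IDs that must all be fulfilled
    optional     -- IDs competing with each other
    min_optional -- minimum number of optionals that must be fulfilled
    max_optional -- assumed upper bound on fulfilled optionals
    """
    if not essential <= matched:
        return False  # at least one essential constraint is missing
    fulfilled_optionals = len(optional & matched)
    return min_optional <= fulfilled_optionals <= max_optional


# Hypothetical constraint IDs: one essential, three optional.
essential = {"c1"}
optional = {"c2", "c3", "c4"}

print(pose_satisfies({"c1", "c3"}, essential, optional, 1, 3))  # essential plus one optional
print(pose_satisfies({"c2", "c3"}, essential, optional, 1, 3))  # essential missing
print(pose_satisfies({"c1"}, essential, optional, 1, 3))        # no optional fulfilled
```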
An alternative, more powerful method of combining constraints uses
logical expressions. An example is shown in the screenshot on the right.
It uses the constraint IDs combined with Boolean operators.
Each pharmacophore definition is saved under the name entered in the
input field at the bottom of the dialog box. Enter "1dwd_pharm" here and
confirm by pressing the OK button.
Figure 5.14: Define Pharmacophore: Spatial constraints and logical expressions
5.1.5 Ligands in the GUI
Ligands are loaded into FlexX using the menu Ligands:
Ligands -> Select & Prepare
At this point, the GUI supports mol2 and SD file formats. A collection of ligands
is called a Library in the GUI.
The top section of the dialog box summarizes which files will contain the ligands you wish
to dock. The currently supported formats are:
mol2
SD
Current limitations:
At this time, a FlexX Project cannot contain more than one library.
A library may have a maximum of 10,000 ligands. If you need to dock more ligands,
please use the commandline and a script (cp. Sec. 9.1). This is to be on
the safe side with on-board memory.
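Before loading a file into the GUI, it can be useful to check how many ligands it contains. The Python sketch below is an external helper and not a FlexX feature. It relies on two format conventions: each record in an SD file ends with a line containing `$$$$`, and each molecule in a mol2 file starts with a line `@<TRIPOS>MOLECULE`. The sample texts are heavily abbreviated placeholders, not valid molecules.

```python
MAX_GUI_LIGANDS = 10_000  # library limit in GUI mode, as stated above


def count_molecules(text: str, file_format: str) -> int:
    """Count the molecules in the text of an SD or mol2 file."""
    lines = [line.strip() for line in text.splitlines()]
    if file_format == "sd":
        # every SD record is terminated by a "$$$$" line
        return sum(1 for line in lines if line == "$$$$")
    if file_format == "mol2":
        # every mol2 molecule begins with this record type indicator
        return sum(1 for line in lines if line == "@<TRIPOS>MOLECULE")
    raise ValueError(f"unsupported format: {file_format}")


def fits_gui_library(count: int) -> bool:
    """True if a library of this size can be loaded in GUI mode."""
    return count <= MAX_GUI_LIGANDS


# Abbreviated placeholder content, for illustration only.
sd_text = "ligand_1\n...\nM  END\n$$$$\nligand_2\n...\nM  END\n$$$$\n"
mol2_text = "@<TRIPOS>MOLECULE\nligand_1\n...\n@<TRIPOS>MOLECULE\nligand_2\n...\n"

print(count_molecules(sd_text, "sd"), count_molecules(mol2_text, "mol2"))
print(fits_gui_library(12_500))
```

For real files, read the text from disk first, for example with `open(path).read()`.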
Example
The second mandatory step in preparing docking is the selection and preparation of ligands. Click
on the Select & Prepare entry in the Ligands menu. A new dialog opens (Fig. 5.15).
Let us import a ligand.
Click on the Load File... button. Select 1dwd_min.mol2 in the
example directory. The dialog will respond with an entry and a 2D view.
Confirm by clicking the OK button.
Note:
It is currently not possible to cut out a ligand from the PDB file and dock it
right away.
Figure 5.15: Prepare ligands
Normally ligands undergo the FlexX-specific initialization during the import; if you want
FlexX to leave your ligands as you prepared them, for example using a third-party tool,
you can switch the preparation off or alter the respective steps after the library has been
loaded by clicking the Preparation column in the top section of the dialog. A new dialog
will pop up in which you can activate or de-activate single initialization steps:
Figure 5.16: The ligand preparation dialog. The steps are worked on level-wise from top to
bottom and left to right.
For SD files you can tell FlexX which field from the input file to use as the molecule ID. To do
so, please click on the column SD Name From in the top section of the dialog, and define
the ID field in the new dialog. (This column will be grayed out automatically if
the format of the loaded file is mol2):
Figure 5.17: SD Files: Choices to pick the molecule ID
5.1.6 Docking within the GUI
Docking requires a license and is invoked through
Docking -> Define
The respective Docking dialog is divided into 5 sections. Only section 1 needs to have valid
entries; the remaining sections can optionally be used to deviate from default settings and
tune your dockings. Therefore, you could dock right away using Apply & Run at this
point.
1. General Docking Information
Here you specify which receptor to choose, which library you would like to dock,
and how many poses you would like to keep for each ligand. Optionally, a previously
defined pharmacophore can be applied.
2. Base Placement
Strategies typically employed to influence the placement of the first fragment (the
so-called "Base Fragment") will be implemented here in due course. The Base Fragment
is the first ligand fragment to be placed in the binding site. It undergoes an especially
elaborate placing algorithm. You have the choice between the standard triangle
algorithm and the Single Interaction Scan (SIS) placement; the new SIS algorithm is
particularly suited for more hydrophobic pockets and pockets with only a few interaction
sites; much success has also been achieved when applying the SIS algorithm to
metalloproteins.
3. Scoring
In later versions, different scoring combinations will be accessible here. What is already
activated by default is so-called access_scaling. This downscales the score of
poses lying at the rim of pockets.
4. Chemical Parameters
This section controls the stereochemistry and other parameters relevant for the chemistry
of docking. Some are yet to be implemented; however, stereochemistry and clash
handling can already be influenced.
5. Docking Details
This dialog lets you control the number of solutions to be taken along while traversing
the solution tree of FlexX.
The docking can be invoked with the Apply&Run button. You can watch the progress at
the console.
Example
Once the receptor is defined and a ligand selected, we are ready to run a first simple docking. Open
the docking dialog box by clicking Define in the Docking menu.
The docking dialog box is divided into several parts. Select the defined receptor and
the ligand library in the first part, General Docking Information. In this example
we will perform a very basic docking, so we will use the default values from the other
parts such as the access_scaling alteration facility shown here by way of example.
Figure 5.18: Define docking
We will use the default values in the sections shown in this screenshot. If requested, you
can trigger the automatic and on-the-fly creation of stereoisomers. Now start the docking
run by clicking the Apply & Run button.
Figure 5.19: Define docking
5.1.7 Docking Analysis
Once the docking has been completed, all the data will be fed into the GUI and appear at
the bottom of the system. You can select poses and look at them in 3D view. The respective
energies (including their increments) of the poses will be displayed in a spreadsheet-like
view.
Example
The result of a docking run is shown in Figure 5.20. A result table has been added below the 3D
view in the main window. It shows a 2D view of the ligand, the total score and the partial scores,
the RMSD, and the similarity. For a 3D view of a ligand in the list, just click on the list entry and the
ligand is displayed in the binding site. To display or remove the reference ligand from the view, use
the control key plus a mouse click on the respective line or select the respective entry in the Tree
View.
Figure 5.20: Main window including the docking result table.
5.1.8 Exporting Poses from the GUI
To export poses in mol2 or SD format, use the context menu of the selected poses in the
spreadsheet area (right mouse click). A dialog lets you specify the output directory for further processing.
5.1.9 Screening
With FlexX you can screen millions of compounds; in GUI mode, however, the number of ligands
is limited to 10,000 to be on the safe side with on-board memory. The procedure to screen 10,000
compounds would be exactly the same as depicted above, with the exception that the ligand file
contains the 10,000 ligands.
However, we strongly recommend using the commandline mode to screen many compounds
because this gives you a much more powerful facility and enables you to compute in parallel (Linux
only at this point).
The basic procedure is very simple:
1. prepare a Project File (an .fxx file) and save it to disk
2. prepare a screening script (a .bat file)
3. use FlexX in batch mode to run the script and export the solutions.
Please refer to Section 6.3 on how to start FlexX with a script as an argument.
For parallel batch mode processing, please have a look at Sec. 8.1, and for virtual UltraHTS, there is
the Section on the SCREEN module for FlexX (Section 8.6 on p. 223).
5.2 Other GUI Export Facilities
5.2.1 PipelinePilot™ Support
FlexX exports a prepared receptor and a selected pharmacophore specification to a PipelinePilot
component. The component does not contain pointers to the original files; instead it contains all
necessary information to get started straight away.
To do so, please use
Export Docking PipelinePilot Component...
to create an XML file to drag and drop into your PipelinePilot User Interface.
The currently supported protocol assumes that you have a PipelinePilot server running under
Windows. You can either
execute FlexX on the PipelinePilot server or
remote-execute FlexX on a Linux machine.
Execute FlexX on remote Linux host
You will have to specify a couple of details for a remote machine ssh log-on. These are:
the host name you will compute on (ssh_host)
your user name on this host (ssh_user)
your password (ssh_password), unless you use Key Files (ask your systems admin) to
log on password-free
in case of Key File usage:
a) the path to a file containing an RSA or DSA private key, in OpenSSH format (ssh_key).
The remote host must be configured to use the corresponding public key. The password will
be ignored if authentication through this method is successful.
b) the password to access the Key File (ssh_key_passphrase).
The Linux machine(s) can be a cluster as long as you have sufficiently many licenses to occupy several
machines at a time. In this case, the ssh_pvm value in the component should be set to True. Please
note that everything else on the compute host must be configured to work with PVM. This comprises
basically:
PVM correctly installed
FlexX executable in $HOME/pvm3/bin/$PVM_ARCH
all parameters in global preferences dialog properly configured
Guidance for parallel docking can be found in the FlexX User Guide and/or in our FAQ &
Knowledge Base at http://www.biosolveit.de/faq
Finally, please be sure to also specify
the name of the executable by parameter FLEXX
the temporary files directory (must be NFS readable!)
the number of solutions you would like to take over to the next component (in screening this
would typically be 1).
6 FlexX in Commandline Mode - I. Basic Usage
To start FlexX in interactive Commandline Mode, start leadit with the --commandline option from
the operating system shell.
Linux/Unix: Go to the directory where you installed LeadIT and type bin/leadit --commandline
Windows: Use Start -> Run, type cmd in the field and press Return. Then change to the LeadIT
installation directory and type leadit --commandline
You are now in the so-called FlexX shell, i.e. you will see the FlexX prompt on the screen:
LEADIT>
6.1 The FlexX shell
6.1.1 Menu navigation
When you see the FlexX prompt on the screen, you can work with the FlexX shell. The FlexX shell is
menu-driven, and the menus are hierarchically organized in a tree structure. Each menu has its own
set of valid commands (called menu commands). You execute these commands by typing their
names. Entering the name of a submenu takes you to that submenu; entering END takes you to the
parent menu. You can also go directly to a menu available in a parent menu by typing its name. The
FlexX prompt always reflects the name and location of the current menu. Some commands are
valid in all menus. These are called global commands.
To get a list of all global commands, menu commands, submenu names and parent menu names that
are valid in a given menu, press the RETURN key at the prompt.
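The navigation rules can be illustrated with a small toy model. This is only a sketch of the behavior described above, not FlexX code: the functions navigate() and prompt() and the SUBMENUS table are hypothetical, and only the menu names LIGAND, RECEPTOR and DOCKING are taken from this guide. The sketch converts input to uppercase, as FlexX does for command and menu names.

```python
# Toy menu tree; the real FlexX menu tree is larger.
SUBMENUS = {"LEADIT": ["LIGAND", "RECEPTOR", "DOCKING"]}


def navigate(path, entry):
    """Return the new menu path after typing `entry` at the prompt."""
    name = entry.upper()                      # menu names are case-insensitive
    if name == "END":                         # go to the parent menu
        return path[:-1] if len(path) > 1 else path
    if name in SUBMENUS.get(path[-1], []):    # submenu of the current menu
        return path + [name]
    if len(path) > 1 and name in SUBMENUS.get(path[-2], []):
        return path[:-1] + [name]             # menu available in the parent menu
    return path                               # not a menu name: stay where we are


def prompt(path):
    """Build the prompt text that reflects the current menu location."""
    return "/".join(path) + ">"


path = navigate(["LEADIT"], "ligand")
print(prompt(path))                           # LEADIT/LIGAND>
path = navigate(path, "receptor")             # jump directly to a sibling menu
print(prompt(path))                           # LEADIT/RECEPTOR>
print(prompt(navigate(path, "end")))          # LEADIT>
```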
6.1.2 Upper and lower case
Since FlexX internally converts command and menu names to uppercase letters, you can type them
in lowercase letters, uppercase letters, or any mixture of both. This does
not apply to command parameters, which are explained next.
6.1.3 Command parameters
Many commands take several parameters. The command will prompt you for all required parame-
ters or you may type the parameter list directly after the command name, with whitespace separating
the single parameters. If the number of parameters you specify in the parameter list directly after the
command name is less than the number of expected parameters for that command, then you will be
prompted to supply the missing parameters. For some commands (e.g. QUERY), some parameters
may be an empty string. This type of parameter is represented by "" (two double quotes) in the
parameter list.
Note: Parameters that contain whitespace (for example, some color names) need special care.
When you enter such a parameter in response to a prompt issued by the command, FlexX treats the
whole answer, including all parts separated by whitespace, as one parameter. If the
parameter is entered in a list directly after the command name, it must be enclosed in double quotes.
Otherwise this single parameter would be interpreted as multiple parameters. In the example below,
both calls of the SELCOL command set the same parameter values.
Example
SUITENAME/LIGAND> selcol
graphic settings for reference coordinates : (yes <1>, no <0>) ....... [0] : 0
>> color mode selection for LIGAND :
0 INVISIBLE
1 ATOM
2 UNIQUE
3 FRAGMENT
4 ENERGY
select color mode by typing its number <0,4> [2] : 1
>> color mode selection for GEOMETRY :
0 INVISIBLE
1 UNIQUE
2 CONTACT
select color mode by typing its number <0,2> [2] : 2
>> color mode selection for SURFACE :
0 INVISIBLE
1 UNIQUE
2 SURF_ATOM
3 CEN_DIST
4 SURFPATCH
select color mode by typing its number <0,4> [2] : 1
surface color : <0,360> [light steel blue] : light green
SUITENAME/LIGAND> selcol 0 1 2 1 "light green"
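The effect of the double quotes can be reproduced with Python's standard shlex module. This is an analogy only: shlex follows POSIX shell quoting rules, and it is an assumption that the FlexX parser splits a parameter list in the same way for simple cases like this one.

```python
import shlex

# With quotes, "light green" stays one parameter.
quoted = shlex.split('selcol 0 1 2 1 "light green"')
# Without quotes, it falls apart into two parameters.
unquoted = shlex.split('selcol 0 1 2 1 light green')

print(quoted)        # ['selcol', '0', '1', '2', '1', 'light green']
print(len(quoted) - 1, len(unquoted) - 1)   # 5 6 (number of parameters)
```

The quoted call yields the five parameters that SELCOL asks for in the interactive session above; the unquoted call yields six.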
6.1.4 Command escape
Typing a
-character at a command parameter prompt escapes execution of the current command
(i.e. the command is not executed and no values from previous prompts are stored).
6.2 Batch Processing
You can start FlexX in batch mode and run scripts. Scripts can be coded in BioSolveIT's proprietary
but very easy-to-learn scripting language (see Sec. 9.1 on p. 253) or with Python (p. 259).
6.3 Commandline Switches / Startup Options
6.3.1 Loading Projects and receptor files (filename without option)
FlexX can be called with a Project File (with extension .fxx) or a receptor file in pdb or mol2 format.
The specified file is loaded after the initialization process of FlexX.
6.3.2 Commandline options and their arguments
To link a command line option and its argument, use blanks as a separator, e.g.
-l <logfile>.
6.3.3 Arguments for batch processing (-a)
If FlexX is started in batch mode (see -b, below), you can define an argument string for the batch
program. The format of the argument string is explained in Section 9.1.
6.3.4 Batch mode (-b)
For users experienced with scripts (see the file formats in Section 9.1) it may sometimes be desirable to start
FlexX in batch mode. One advantage of this mode is that you can redirect the screen output of FlexX
into a file. To start FlexX in batch mode, type leadit -b <script filename>. FlexX will then
execute the script <script filename>. If FlexX is started with the -b option, it never waits for a
key press and terminates whenever an error occurs. -b takes precedence over the options --rundock,
--poses, --nof_poses, --scoretab, --soltab and --exit. FlexX automatically starts in commandline
mode if option -b is given.
6.3.5 Specifying the execution directory (-d)
In order to execute FlexX in an alternative directory, FlexX can be called with option
-d <execute dir>.
6.3.6 Help for command line options (-h, --help, ?)
Type leadit -h to get a short help text about the command line options.
6.3.7 Output the host ID or system ID (-i)
Type leadit -i to output the host or system ID of the machine it is running on.
6.3.8 Logging the FlexX session (-l)
If FlexX is started with the -l <logfile> option, all commands executed are written with their
parameters into a log file named logfile stored in the current directory. This is especially useful
for creating script backbones for further modification. Take care not to overwrite your (edited)
script with another -l call.
6.3.9 Nice value (-n)
The FlexX session can be started with a specic nice value given after the -n option. The nice value
affects the prioritization of the process within the framework of other processes. The common range
is at least from -20 (resulting in the most favorable scheduling) through 19 (the least favorable). The
nice value is effective on Linux only.
6.3.10 Redirecting output (-o, -om)
By default FlexX sends all text output to stdout and error messages to stderr. Starting FlexX with
leadit -o <outputfile> causes text output to be redirected to outputfile and the error
messages to be redirected to outputfile.err. The output of stdout and stderr can be merged
using the parameter -om instead of -o.
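A batch run with redirected output can be assembled from the options described so far. The helper below is a minimal sketch, not an official wrapper: batch_command() is a hypothetical function, the file names dock.bat and dock.out are made up, and it is an assumption that the order of the options does not matter. The function only builds the argument list; nothing is executed.

```python
def batch_command(script, output_file=None, merge_errors=False,
                  exec_dir=None, leadit="leadit"):
    """Build the argument list for a batch run (hypothetical helper)."""
    cmd = [leadit, "-b", script]              # -b <script filename>
    if exec_dir is not None:
        cmd += ["-d", exec_dir]               # -d <execute dir>
    if output_file is not None:
        # -o writes errors to <outputfile>.err, -om merges stdout and stderr
        cmd += ["-om" if merge_errors else "-o", output_file]
    return cmd


cmd = batch_command("dock.bat", output_file="dock.out", merge_errors=True)
print(" ".join(cmd))                          # leadit -b dock.bat -om dock.out

# To actually start the run (requires a LeadIT installation on the PATH):
# import subprocess
# subprocess.run(cmd)
```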
6.3.11 Verbosity (-q)
Type leadit -q <verbosity> to set the internal VERBOSITY flag to the given value. <verbosity> =
0: quiet (default), 1: license info, 2: config warnings
6.3.12 Interface option (-s)
The option -s is an interface option that controls FlexX behavior in combination with calling programs
and should therefore not be used as a regular command line option. FlexX automatically starts in
commandline mode if option -s is given.
6.3.13 Version information (-v)
Type leadit -v to get detailed information about the FlexX version you are using.
6.3.14 Running docking in GUI mode (--rundock)
You can use this option only together with a Project File, otherwise it will be ignored. It starts the
docking process directly after the Project was loaded if it contains a receptor and a library.
6.3.15 Export docking poses (--poses)
You can use this option only together with a Project File and option --rundock, otherwise it will be
ignored. Type leadit <project> --rundock --poses=<filename> to export the calculated
poses to <filename>. The maximum number of poses can be defined with option --nof_poses.
If the specified file already exists, the data will be appended.
6.3.16 Maximum number of poses to export (--nof_poses)
You can use this option only together with option --poses, otherwise it will be ignored. Type
leadit <project> --rundock --poses=<filename> --nof_poses=<number_of_poses>
to export at most <number_of_poses> poses to <filename>.
6.3.17 Export score table (--scoretab)
You can use this option only together with a Project File and option --rundock, otherwise it will be
ignored. Type leadit <project> --rundock --scoretab=<filename> to export the score
and rms value of the pose with rank 1 to <filename>. If the specified file already exists, the data
will be appended.
6.3.18 Export solutions table (--soltab)
You can use this option only together with a Project File and option --rundock, otherwise it will
be ignored. Type leadit <project> --rundock --soltab=<filename> to export the scores
of all calculated poses to <filename> (the format is described in 7.8.13). If the specified file
already exists, the data will be appended.
6.3.19 Exiting right after docking (--exit)
You can use this option only together with a Project File and option --rundock. FlexX exits after
finishing the docking if --exit is given.
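The dependencies between the docking-related options (--poses, --scoretab, --soltab and --exit need a Project File and --rundock; --nof_poses needs --poses) can be captured in a small command builder. This is a sketch under assumptions: rundock_command() is a hypothetical helper, and the file names in the usage line are made up. It only assembles the argument list and does not start LeadIT.

```python
def rundock_command(project, poses=None, nof_poses=None, scoretab=None,
                    soltab=None, exit_after=False, leadit="leadit"):
    """Build a 'leadit <project> --rundock ...' argument list (hypothetical)."""
    if nof_poses is not None and poses is None:
        # FlexX would silently ignore --nof_poses; the sketch reports it instead.
        raise ValueError("--nof_poses has no effect without --poses")
    cmd = [leadit, project, "--rundock"]
    if poses is not None:
        cmd.append(f"--poses={poses}")
    if nof_poses is not None:
        cmd.append(f"--nof_poses={int(nof_poses)}")
    if scoretab is not None:
        cmd.append(f"--scoretab={scoretab}")
    if soltab is not None:
        cmd.append(f"--soltab={soltab}")
    if exit_after:
        cmd.append("--exit")
    return cmd


cmd = rundock_command("screen.fxx", poses="poses.out", nof_poses=3,
                      scoretab="scores.txt", exit_after=True)
print(" ".join(cmd))
```

Because existing export files are appended to rather than overwritten, it may be worth removing stale output files before a new run.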
6.4 Errors and warnings
FlexX divides atypical situations into three categories: warnings, errors, and fatal errors. A
warning is issued if the situation can be handled by FlexX but will probably have a significant
influence on the result. In the case of an error, something went wrong and data is not available. The
current command will be aborted in most cases. A fatal error is produced in cases where FlexX has
to terminate immediately.
6.5 A Typical Commandline Interface Workflow (Tutorial)
This section is meant to be a quick introduction to the world of docking with FlexX. FlexX is
extremely flexible and configurable, and you will learn in the later sections where to tune to get your
desired docking result. Here we would like to give you a rough outline of the first steps to take to
get the program running, acquaint you with the most important commands and show you the very
basics of the docking procedure. We assume you are familiar with basic operating system commands.
6.5.1 Configuration
We assume you have already entered the information about the license file (if you have any problems
with this, see section ??).
In this tutorial we use FlexV as a molecular viewer, therefore you must make sure that the parameter
FLEXV points to a valid executable (see section 4.1 for a description of the configuration dialog).
FlexX does not come with a 3D generator, so the parameters RCGENERATOR and 3DGENERATOR are
not configured. You should point these two variables to the location of your own 3D generator (e.g.
CORINA, CONFORT, CONCORD etc.). Without this information, FlexX cannot use the ring torsion
feature. Please also consider the information about CORINA on page 348.
When you type leadit --commandline in your working directory now, FlexX will say HI, and the
command prompt will be waiting for your valuable input:
______________________________________________________________________________
L e a d I T
Finest Drug Design Platform for Teams of Medicinal and Computational Researchers
Copyright
BioSolveIT GmbH Version: 2.0.2 (26.07.11)
An der Ziegelei 79 Modules: [CORINA_F] [DECRYPT] [CDOCK] [FLEXE] [DOCKING]
[PERMUTE] [PHARM] [SCREEN] [PPI] [RECORE] [HYDE]
53757 St. Augustin
Germany Original Author: Matthias Rarey
www.biosolveit.de Contact: leadit@biosolveit.de
______________________________________________________________________________
For information about additional contributors and copyright notes
please consult the user guide or type help about.
>> Running on gamma (Linuxx86_64 2.6.37.6-0.5-default) with 8 processors.
>> Loaded settings from /local/leadit-2.0.2-Linux-x64//settings.pxx.
>> Loaded settings from /home/user/.leadit//settings_2.0.2.pxx.
>> Licensed modules: LeadIT [CORINA_F] [CDOCK] [FLEXE] [DOCKING] [PERMUTE] [PHARM] [SCREEN] [PPI] [RECORE] [HYDE]
>> PVM status: no pvm daemon, running sequential.
LEADIT>
For this tutorial, we will first load a ligand, then a receptor. (Sometimes the sequence may
be reversed for reasons not discussed here.) Since we do not yet have information about
which atoms make up the active site, we will take the ligand as it is in the reference pdb
file and create a sphere of 6.5 Å radius around every atom of the ligand. All
protein atoms found within these spheres are considered active site atoms. If we answer
yes to the question ". . . complete amino acids . . . ?" (see below), the set of
selected active site atoms is augmented: the remaining atoms of all amino acids that have
been partially selected by the spheres are added as well. Because the active site is derived
from the ligand in this way, it is necessary to start with a ligand.
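The selection rule just described can be sketched in a few lines of Python. This is an illustration of the idea only, not FlexX's implementation: the function select_active_site() and its data layout (a list of residue label and coordinate pairs) are hypothetical, and whether atoms exactly on the sphere surface count as inside is an assumption.

```python
import math


def select_active_site(protein_atoms, ligand_coords, radius=6.5,
                       complete_residues=True):
    """Return the indices of the selected protein atoms.

    protein_atoms: list of (residue_label, (x, y, z)) tuples
    ligand_coords: list of (x, y, z) tuples of the reference ligand
    """
    # Step 1: every protein atom inside the sphere of at least one ligand atom
    selected = {
        i for i, (_, pos) in enumerate(protein_atoms)
        if any(math.dist(pos, lig) <= radius for lig in ligand_coords)
    }
    # Step 2 (optional): complete all residues that were partially selected
    if complete_residues:
        touched = {protein_atoms[i][0] for i in selected}
        selected = {i for i, (res, _) in enumerate(protein_atoms)
                    if res in touched}
    return sorted(selected)


# Made-up coordinates: two atoms of one residue, one atom of another.
protein = [("SER 195", (0.0, 0.0, 0.0)),
           ("SER 195", (10.0, 0.0, 0.0)),
           ("GLY 216", (20.0, 0.0, 0.0))]
ligand = [(3.0, 0.0, 0.0)]

print(select_active_site(protein, ligand, complete_residues=False))  # [0]
print(select_active_site(protein, ligand, complete_residues=True))   # [0, 1]
```

In the second call the atom at distance 7 Å is added although it lies outside the sphere, because its residue was already partially selected.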
Our general procedure is therefore as follows:
1. The ligand
Load a (minimized) ligand as a mol2 file
Load the ligand as a reference, using the coordinates we took from the pdb
2. The protein (receptor)
Load the protein
Create the active site
3. Do the docking:
Base placement
Extend ligand fragments
4. Output some information
OK, let's go. There's no time to make coffee, because FlexX is too fast for that! ;-)
6.5.2 The ligand
For future reference, please remember that a ligand should always be read in in a minimized
conformation. Here we will load a ligand and a reference ligand, against which RMSD values
are calculated after the docking.
The reference ligand is not loaded as a molecule; only the coordinates and atom types are
read and used as a reference (hence the name). You should also load the reference without
protons. FlexX does this by default; should you, for whatever reason, want to load the
reference with protons, you can do so (see below). However, we do not recommend this.
Let us read in the ligand structures now: Type ligand. FlexX changes to the ligand submenu:
LEADIT> ligand
LEADIT/LIGAND>
Now read in the ligand structure as a mol2 file. Depending on your settings (level of verbosity
etc.) you will receive some warnings, errors or nothing at all.
Another thing to remember: Unlike other docking programs, FlexX does not need partial
charges. For scoring with the default FlexX score, FlexX needs formal charges. This is why
you should either remove all partial charges before reading in your ligand or leave this to FlexX.
How to do this is explained in Sections 7.5.4 and 7.5.7.
LEADIT/LIGAND> read ligand_minimised.mol2
>> Attempting to index multi-mol file..
>> Scanning .mol2 file...
>> File ./ligand_minimised.mol2 contains 1 compounds.
>> Applying transformation levels:
>> Level 2 : Preprocess / localize molecule
>> Level 3 : Aromaticity check
>> Level 4 : Assign default protonation
>> Level 5 : Assign formal charges
>> Level 6 : Assign delocalized systems
>> Level 10 : Assign atom-types
>> Type assignment check, OK.
>> Ring 1: conformations computed by CORINA.
>> Ring 2: conformations computed by CORINA.
>> Ring 3: conformations computed by CORINA.
>> Ligand 1dwd loaded from file ./ligand_minimised.mol2.
Current process size: 51088 kB
LEADIT/LIGAND>
As you can see, FlexX gives some information about the ligand it just read in. In this case it
shows the different transformation levels (see Section 7.5.7), the atom (types) it corrected, and the
number of ring systems. Level 4 (protonation) is turned ON by default. If you want to change
any of the initialization levels, you can use the command selinit (see Section 7.5.4).
Next, we will read the same ligand as a reference. As mentioned above, this is necessary
for FlexX to calculate RMSD values for the docked solutions and to generate the active
site.
LEADIT/LIGAND> rr ligand_as_in_pdb.mol2
>> Set reference coordinates by separate mol2 file
Ignore hydrogen atoms [y] :
>> Ligand reference coordinates loaded from file ./ligand_as_in_pdb.mol2.
Current process size: 51092 kB
LEADIT/LIGAND>
As you can see, FlexX ignores the hydrogens by default. If you answer n, FlexX will protonate
the reference as well (not recommended). See more on readref and how to prepare your
reference file in Section 7.5.11.
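To make clear what the reference coordinates are used for, here is a minimal sketch of a heavy-atom RMSD calculation. It is not FlexX's implementation: heavy_atom_rmsd() is a hypothetical function, it assumes that the heavy atoms appear in the same order in both structures, and it ignores molecular symmetry, which a production tool may handle differently.

```python
import math


def heavy_atom_rmsd(pose, reference):
    """RMSD over non-hydrogen atoms.

    pose, reference: lists of (element, (x, y, z)) tuples whose heavy atoms
    are assumed to be listed in the same order.
    """
    heavy_pose = [xyz for element, xyz in pose if element != "H"]
    heavy_ref = [xyz for element, xyz in reference if element != "H"]
    if not heavy_pose or len(heavy_pose) != len(heavy_ref):
        raise ValueError("need the same, non-zero number of heavy atoms")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(heavy_pose, heavy_ref))
    return math.sqrt(squared / len(heavy_pose))


# Made-up coordinates: the pose is shifted by 1 Angstrom along z.
pose = [("C", (0.0, 0.0, 0.0)), ("N", (1.0, 0.0, 0.0)), ("H", (5.0, 5.0, 5.0))]
reference = [("C", (0.0, 0.0, 1.0)), ("N", (1.0, 0.0, 1.0))]

print(heavy_atom_rmsd(pose, reference))  # 1.0
```

The hydrogen in the pose is skipped, which mirrors the default of ignoring hydrogen atoms when the reference is read.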
If you would like to visualize the ligand(s), you may type draw fix (fix stands for fixed
coordinates, i.e. the ligand to be docked) and/or draw ref (ref stands for reference), press
Return, and then type display or the shortcut go to make FlexV pop up. Click on the eye in
FlexV to center the molecule.
LEADIT/LIGAND> draw fix
>> Ligand drawn to graphics object 5
LEADIT/LIGAND> draw ref
>> Ligand drawn to graphics object 3
LEADIT/LIGAND> go
You can see in Figure 6.1 that the two molecules, though they are the same compound, do
not overlap. The ligand loaded second is still missing its surrounding protein, but we will
take care of that in a minute. For more information on draw, see sections 7.5.19 and 6.6.2.
If you want to read a ligand directly from a pdb file (e.g. if you do not have a separate
molecular modeling program to hand), FlexX has two features to do that. The location
of the binding site can be identified automatically if there is a
ligand complexed with the desired protein. To get some information about how the protein
is constituted, you can use the command PDBINFO from the receptor menu. Let's see
what happens if we use this command on a protein (which of course must reside in your
pdb directory). From the LIGAND menu, change directly to the RECEPTOR menu by typing
receptor. Then type pdbinfo 1dwd:
LEADIT/RECEPTOR> pdbinfo 1dwd
>> Reading PDB file 1dwd
>> structure 1dwd contains 2445 atoms in 343 residues.
1dwd LIGAND HOH-*-55-*, # (Unnamed (2 atoms))
1dwd LIGAND HOH-*-53-*, # (Unnamed (2 atoms))
1dwd LIGAND HOH-*-50-*, # (Unnamed (2 atoms))
1dwd LIGAND HOH-*-48-*, # (Unnamed (2 atoms))
1dwd LIGAND MID-*-1-*, # (NAPAP - SEE REMARK 13. 27 H31 N5 O4 S1)
1dwd PEPTIDE *-*-*-I # (Residues: 11)
1dwd PEPTIDE *-*-*-H # (Residues: 258)
1dwd PEPTIDE *-*-*-L # (Residues: 29)
Figure 6.1: FlexV with the reference ligand (gray) in the foreground and the ligand to be
docked (atom color) in the background.
As you can see, there is some information about several compounds in the protein. FlexX
tries to identify the type of compound (PEPTIDE, LIGAND, ION) and generates a unique
identifier pattern for each compound, which is composed of four values separated by a
minus sign:
residue name
alternative location
residue number
chain identier
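The four-part pattern can be taken apart with a few lines of Python. This is a sketch under assumptions: the helper names parse_pattern() and matches() are hypothetical, the example patterns MID-*-1-* and *-*-*-I are read off the pdbinfo output above, and interpreting * as a wildcard is an assumption. Splitting on the minus sign would fail for negative residue numbers.

```python
from typing import NamedTuple


class CompoundPattern(NamedTuple):
    residue_name: str
    alt_location: str
    residue_number: str
    chain_id: str


def parse_pattern(text):
    """Split 'name-altloc-number-chain' into its four fields."""
    parts = text.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected four '-'-separated values, got {text!r}")
    return CompoundPattern(*parts)


def matches(pattern, residue_name, alt_location, residue_number, chain_id):
    """True if every field is '*' (assumed wildcard) or equals the given value."""
    actual = (residue_name, alt_location, str(residue_number), chain_id)
    return all(want == "*" or want == got for want, got in zip(pattern, actual))


napap = parse_pattern("MID-*-1-*")
print(napap)
print(matches(napap, "MID", "", 1, "H"))                       # True
print(matches(parse_pattern("*-*-*-I"), "GLY", "", 5, "H"))    # False
```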
All hetero groups that are not single oxygens (the usual way water is represented
in pdb files) are printed as a LIGAND group. In this special case there are several
pairs of water entities with identical naming and chain identifier, so they are combined in
one group and represented as a single hetero or ligand group. The identifier patterns can be
used to extract ligands directly from pdb files via the command FROMPDB in the LIGAND
menu. Let's try this with the identified NAPAP ligand. Change back to the LIGAND menu.
LEADIT/LIGAND> frompdb 1dwd
Residue(s) [none] : MID
>> Reading PDB file 1dwd
>> structure 1dwd contains 2445 atoms in 343 residues.
>> calculating connectivity for 37 atoms.
>> Ligand loaded as ligand_1dwd.
>> WARNING: Illegal valence state 5 at atom C35|2396 found after transformation.
>> Applying transformation levels:
>> Level 1 : Correct valences and bonds in PDB structures
>> Level 2 : Preprocess / localize molecule
>> Level 3 : Aromaticity check
>> Level 4 : Assign default protonation
>> Level 5 : Assign formal charges
>> Level 6 : Assign delocalized systems
>> WARNING: Illegal valence state 5 at atom C35|2396 found after transformation.
>> Level 10 : Assign atom-types
>> Type assignment check, OK.
>> Ring 1: conformations computed by CORINA.
>> Ring 2: conformations computed by CORINA.
>> Ring 3: conformations computed by CORINA.
>> WARNING: Empty list of torsion angles at bond C17|2378 --> C35|2396.
>> WARNING: No torsions, taking 30 degree grid with arbitrary reference atoms.
Process time used: 0.19 s. Current process size: 51620 kB
LEADIT/LIGAND>
We can shorten the pattern to MID, because there is only one ligand with residue name
MID in the structure. The coordinates of this structure are stored in the fix-coordinate set of
the ligand. At the same time, the reference is initialized with the same coordinates as the
ligand. Thus it is not necessary to read in a reference from a separate file. If you want to
do so anyway, be aware that the atom numbering in the reference and the ligand you read
in with FROMPDB is the same as in the PDB file. Thus, if you read in a separate reference
ligand, atom numbers will probably not match. We therefore recommend writing out both
the ligand and the reference to separate files and then reading them in again. Atom numbers in
these files are set to 1. See also 7.5.3. It is highly recommended to inspect the ligand visually:
bond types and atom types are approximated from bond lengths and some specific
rules that correct minor ambiguities, but sometimes this mechanism fails and it is necessary
to correct things manually. Refer to chapter 7.5.7 for rules-based structure modifications.
Now, with a ligand that shows us where the binding site is located, we can directly import
the PDB. Let's switch to the RECEPTOR menu and read in the pdb file. It is important to
specify the file extension pdb, because FlexX searches for rdf files by default. A default rdf
file is provided in the @ROOTDIR/example/rec directory.
Make sure you always edit this file to your specific needs; it will improve your results
drastically! See section 11.3 for more details on how to edit the rdf file.
To switch to the receptor menu you can either type receptor from the ligand submenu
directly, or you can change to the superordinate menu by typing end, and then type receptor.
6.5.3 The receptor
Generating an rdf file
Usually, from Release 3 onward, you will not have to bother with rdf files; instead, you prepare
your protein using the GUI. A subsequent export as a Project File or as a FlexX
mol2 file can be read in at the commandline as well: Use RECEPTOR -> READ for a mol2
read-in or PROJECT -> READ for the fxx Project File. For some
workflows, however, the traditional way of reading in a protein using the combination of a pdb and
an rdf file will be needed. To make the initial work with FlexX easier, a default rdf file
is included in the static_data directory as a template, which is used directly if a pdb
file is read as a receptor instead of an rdf file. The receptor description file contains the
location of the protein pdb file and provides FlexX with all the details about an active site
(the "pocket") if it exists. This procedure can be used to get docking results quickly and
to get a first impression, for example when a new complex structure is added to the Brookhaven
Protein Data Bank. To get information about the pdb file, you can use the command pdbinfo
again. After obtaining this information, we can now actually read in the receptor.
LEADIT/RECEPTOR> r 1dwd.pdb
>> PDB file given, using generic rdf file
>> Interactive determination of active site
Selection radius (A) [6.5] :
Select always complete amino acids [y] :
>> 274 atoms active in site.
HIS H 57 (his1 ); CYS H 58 (cysh ); TYR H 60A (tyr ); TRP H 60D (trp );
LYS H 60F (lys+ ); TYR H 94 (tyr ); ASN H 95 (asn ); TRP H 96 (trp );
ARG H 97 (arg+ ); GLU H 97A (glu- ); ASN H 98 (asn ); LEU H 99 (leu );
ASP H 102 (asp- ); ILE H 174 (ile ); ASP H 189 (asp- ); ALA H 190 (ala );
CYS H 191 (cysh ); GLU H 192 (glu- ); GLY H 193 (gly ); ASP H 194 (asp- );
SER H 195 (ser ); VAL H 213 (val ); SER H 214 (ser ); TRP H 215 (trp );
GLY H 216 (gly ); GLU H 217 (glu- ); GLY H 219 (gly ); CYS H 220 (cysh );
ASP H 221 (asp- ); TYR H 225 (tyr ); GLY H 226 (gly ); PHE H 227 (phe );
TYR H 228 (tyr );
>> Compute surface atoms, please wait ...
Interaction points (Level 3 2 1): 1907 ( 855 333 719)
Unbound receptor SAS: 13408.200 A^2 lipophilic SAS: 7089.592 A^2
>> Number of active site atoms: 274
>> Receptor loaded from 1dwd.pdb.
Process time used: 1.33 s. Current process size: 55120 kB
With the help of the reference ligand, we cut out an active site definition from the surrounding
pdb file. FlexX asks within which radius of the reference ligand it should create
the active site. 6.5 Å is the default value, which we can safely accept in this example. The next
question FlexX asks is whether it should choose complete amino acids or not. Unless you
have a specific reason not to choose complete amino acids, just press Return here. Depending
on the level of verbosity, FlexX outputs information about the number of active site atoms,
the SAS (solvent accessible surface) and the lipophilic SAS. Once this is done, we can write the
receptor, pocket and surface information as well as the ligand reference structure to their
respective files. This is done with the write command from the LIGAND and the RECEPTOR
menu, respectively.
LEADIT/RECEPTOR> write a 1dwd_poc65.pdb n
LEADIT/RECEPTOR> write s 1dwd_surf.sdf
LEADIT/RECEPTOR> write r 1dwd.rdf
LEADIT/RECEPTOR> ligand
LEADIT/LIGAND> write fix
a stands for (a)ctive site, s for (s)urface, and r for (r)eceptor. You can get this menu also by
simply typing write and then <Return>, or use help write.
After we have written the files, we can adapt the rdf file to our special needs. Read more
about this in section 11.3.
In addition to this mechanism, rdf files can be generated with the SYBYL interface to FlexX,
but we advise users in production to inspect the files carefully, since this is where all
information about the active site is given and modified. Also, protonation-dependent
rules may be given here (please refer to section 6.5.7 for more on this).
You may want to use draw and go to pipe the necessary visualization data to FlexV (it
may be in the background, so bring it to the front and move your mouse across the canvas).
LEADIT/RECEPTOR> draw
>> Receptor drawn to graphics object 4
LEADIT/RECEPTOR> go
Figure 6.2: FlexV with the reference ligand (gray) in the active site of 1dwd; in the
background the ligand to be docked (atom color).
It should be visible that your reference ligand sits nicely in the pocket. Way off lies the
ligand waiting to be cut into pieces in a minute. . .
We are now done with both the ligand and the receptor and may proceed to the docking
stage.
6.5.4 Docking
Since FlexX performs flexible docking based on fragmentation of the ligand, we
have to start with the placement of a suitable base fragment. This may be quite a
sensitive point in the docking procedure! To read more about this, see Sections 7.8.1 and 7.8.2.
For the time being and for this tutorial introduction, we will simply take the defaults and dock
the fragmented ligand with a single command after having switched to the docking menu:
complex all. All necessary steps are executed automatically:
LEADIT/RECEPTOR> docking
LEADIT/DOCKING> complex all
>> No base placements, executing PLACEBAS (with default args).
>> No fragmentations, executing SELBAS (with default args).
>> Base fragment selection
>> Already selected base fragments: 0 of 10.
>> Automatic base selection
>> Fragmentation 0
No.|Connect| Connected by bond | Nof. | Nof. | # IA level |Components
| to | | Conf.| Ster.| 3 2 1 |
-----------------------------------------------------------------------
0| - | -- | 4| 0/ 0| 4 5 0 | 5, 11,
1| 0 | C11|11 -<12>- C12|12 | 1| 0/ 0| 0 0 1 | 4,
2| 1 | C8|8 -<05>- C11|11 | 1| 1/ 0| 0 0 1 | 2,
3| 2 | C9|9 -<12>- C8|8 | 1| 0/ 0| 1 0 0 | 3,
4| 2 | N7|7 -<09>- C8|8 | 1| 0/ 0| 1 0 0 | 1,
5| 4 | C20|20 -<03>- N7|7 | 1| 0/ 0| 1 1 0 | 8,
6| 5 | C19|19 -<10>- C20|20 | 1| 0/ 0| 0 0 1 | 7,
7| 6 | N18|18 -<07>- C19|19 | 1| 0/ 0| 1 0 0 | 6,
8| 7 | S22|22 -<04>- N18|18 | 1| 0/ 0| 2 0 0 | 9,
9| 3 | N1|1 -<03>- C9|9 | 1| 0/ 0| 0 0 5 | 0,
10| 8 | C26|26 -<10>- S22|22 | 1| 0/ 0| 0 9 2 | 10,
>> Fragmentation 1
...
...
energy cutoff (k best solution): -47.156
dock entry with | no. |norm. E.|total E.| rms |#M | overlap (ave.)
----------------------------------------------------------------------
min. norm. energy | 1| -61.899| -45.299| 1.22| 35| 1.64 ( 0.18)
min. energy | 1| -61.899| -45.299| 1.22| 35| 1.64 ( 0.18)
min. rms | 1| -61.899| -45.299| 1.22| 35| 1.64 ( 0.18)
first with rms<1.5| 1| -61.899| -45.299| 1.22| 35| 1.64 ( 0.18)
Final clustering: Number of solutions 461 (before) 293 (after)
Process time used: 10.06 s.
LEADIT/DOCKING>
After a few seconds, the docking ends with a number of solutions ranked according to the
scoring function. Very similar solutions have been clustered and the best solutions according
to energy, RMSD etc. have been listed. In our case, the best-ranked solution with respect to
score is also the one with the lowest RMSD. If you would like to see more solutions, use the
command LISTSOL which will list as many solutions as you like:
LEADIT/DOCKING> listsol
number of solutions to show [30] : 10
SELECTED SOLUTIONS: 1dwd.pdb -- ligand_minimised
+---+-------+-------+-------+-------+------+------+------+------+------+------+------+-----+-----+
|No.|Total |Match- |Lipo- |Ambig- |Clash-|Rot- |RMS- |Simil.|#Match|Avg. |Max. |Frag.|#Inst|
| |Score |Score |Score |Score |Score |Score |Value |Index | |Volume|Volume|No. | |
+---+-------+-------+-------+-------+------+------+------+------+------+------+------+-----+-----+
| 1|-45.299|-40.280|-14.798|-10.642| 3.822|11.200| 0.794| 0.734| 18| 0.179| 1.641| 0| 0|
| 2|-44.131|-40.452|-13.117|-11.685| 4.523|11.200| 5.881| 3.027| 12| 0.241| 2.300| 0| 0|
| 3|-42.680|-40.452|-14.843|-10.930| 6.945|11.200| 5.887| 2.973| 12| 0.386| 2.300| 0| 0|
| 4|-42.392|-43.402|-11.435| -9.358| 5.203|11.200| 4.051| 2.737| 17| 0.274| 2.592| 0| 0|
| 5|-41.976|-40.847|-12.335| -9.729| 4.335|11.200| 3.200| 2.344| 11| 0.234| 2.207| 0| 0|
| 6|-41.647|-41.770|-12.712| -9.415| 5.650|11.200| 2.712| 2.023| 13| 0.312| 2.377| 0| 0|
| 7|-41.169|-40.847|-11.673| -9.461| 4.212|11.200| 3.300| 2.306| 11| 0.232| 2.207| 0| 0|
| 8|-40.500|-41.958|-10.515| -9.188| 4.560|11.200| 4.280| 2.927| 14| 0.252| 2.592| 0| 0|
| 9|-40.466|-41.278|-10.685| -9.347| 4.244|11.200| 2.976| 1.924| 13| 0.234| 2.308| 0| 0|
| 10|-40.376|-41.375|-11.850| -9.755| 6.004|11.200| 5.054| 3.264| 16| 0.315| 2.293| 0| 0|
+---+-------+-------+-------+-------+------+------+------+------+------+------+------+-----+-----+
Process time used: 0.11 s.
LEADIT/DOCKING>
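If the LISTSOL table has been captured as text (for example via output redirection), it can be post-processed with a short script. The parser below is a sketch and not part of FlexX: parse_listsol() and the shortened column names are hypothetical, and the three data rows are copied from the table above.

```python
COLUMNS = ["no", "total", "match", "lipo", "ambig", "clash", "rot",
           "rms", "simil", "n_match", "avg_vol", "max_vol", "frag", "n_inst"]

TABLE = """\
+---+-------+-------+-------+-------+------+------+------+------+------+------+------+-----+-----+
|No.|Total  |Match- |Lipo-  |Ambig- |Clash-|Rot-  |RMS-  |Simil.|#Match|Avg.  |Max.  |Frag.|#Inst|
|  1|-45.299|-40.280|-14.798|-10.642| 3.822|11.200| 0.794| 0.734|    18| 0.179| 1.641|    0|    0|
|  2|-44.131|-40.452|-13.117|-11.685| 4.523|11.200| 5.881| 3.027|    12| 0.241| 2.300|    0|    0|
|  3|-42.680|-40.452|-14.843|-10.930| 6.945|11.200| 5.887| 2.973|    12| 0.386| 2.300|    0|    0|
"""


def parse_listsol(text):
    """Return one dict per solution row; borders and header lines are skipped."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != len(COLUMNS) or not cells[0].isdigit():
            continue
        row = {name: float(value) for name, value in zip(COLUMNS, cells)}
        row["no"] = int(cells[0])
        rows.append(row)
    return rows


rows = parse_listsol(TABLE)
best_by_rms = min(rows, key=lambda r: r["rms"])
print(best_by_rms["no"], best_by_rms["total"], best_by_rms["rms"])
```

For the three rows shown, the solution with the lowest RMSD is also the top-ranked one, in line with the remark above.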
Visualize the best solution according to scoring rank in FlexV:
LEADIT/DOCKING> draw 1
>> Docking drawn to graphics object 5
LEADIT/DOCKING> go
This writes out the first solution for FlexV. Bring FlexV to the front and see how well
you did!
6.5.5 Outputting information
By now you should be curious to see in detail how well or badly it worked. To this
end, we encourage you to use the built-in help function. If you simply press Return
at any time at an ordinary menu prompt, a list of the commands available at this stage will be
output.
Since we would like to get information about a special command called info under the
docking menu, we simply type help info at the docking menu prompt:
LEADIT/DOCKING> help info
Outputting the most important quantities of a docking result (INFO)
[Syntax:] INFO <TABLE FORMAT> <OUTPUT TABLE>
[Description:] Displays the main characteristics of the docking result
...
LEADIT/DOCKING>
If you would like to see a summary of the docking you just performed, type INFO. It
outputs the information in another table, including information about each fragment
FlexX created. This is what the output looks like:
LEADIT/DOCKING> info
Table format [y] :
Receptor : 1dwd.pdb
Ligand : ligand_minimised
----------------------------------------------------------------------
Number of solutions : 293
----------------------------------------------------------------------
Solution with RMS <=: min 1.000 1.500 2.000 2.500
----------------------------------------------------------------------
Energy : -45.299 -45.299 -45.299 -45.299 -45.299
RMS : 0.794 0.794 0.794 0.794 0.794
Rank : 1 1 1 1 1
Best sol. w. rank <=: 1 10 20 50 100
----------------------------------------------------------------------
Energy : -45.299 -45.299 -45.299 -45.299 -45.299
RMS : 0.794 0.794 0.794 0.794 0.794
Rank : 1 1 1 1 1
----------------------------------------------------------------------
Accumulated RMS : 0.794 0.794 0.794 0.794 0.794
Complex buildup history
----------------------------------------------------------------------
Fragment | 0 1 2 3 4 5 6 7 8 9 10
# placements | 1016 790 824 859 895 913 925 732 624 243 293
e_norm (min e_norm) | -94.29 -86.69 -87.74 -88.69 -87.70 -83.71 -82.40 -81.03 -74.88 -67.43 -61.90
e_total (min e_norm) | -23.68 -23.38 -23.35 -27.43 -29.36 -32.61 -31.45 -34.33 -37.86 -40.51 -45.30
RMS (min e_norm) | 0.52 1.81 1.79 1.53 1.81 1.49 1.63 1.42 2.02 1.94 1.22
e_norm (best pred.) | -94.29 -86.03 -86.36 -87.50 -82.45 -83.71 -82.40 -81.03 -70.44 -63.19 -61.90
e_total (best pred.) | -23.68 -22.71 -21.97 -26.25 -24.11 -32.61 -31.45 -34.33 -33.42 -36.27 -45.30
RMS (best pred.) | 0.52 0.87 1.45 1.35 1.47 1.49 1.48 1.42 1.34 1.38 1.22
Rank (best pred.) | 1 8 10 7 257 1 7 1 26 14 1
e_norm (min rmsd) | -94.29 -85.30 -83.66 -84.19 -82.45 -82.74 -81.96 -76.65 -65.88 -63.19 -61.90
e_total (min rmsd) | -23.68 -21.98 -19.27 -22.93 -24.11 -31.64 -31.01 -29.96 -28.86 -36.27 -45.30
RMS (min rmsd) | 0.52 0.74 0.92 1.09 1.47 1.45 1.42 1.40 1.34 1.38 1.22
Rank (min rmsd) | 1 14 96 96 257 4 18 108 329 14 1
Runtime (elapsed process time)
----------------------------------------------------------------------
Base placement : 3.62 s
Complex construction : 6.41 s
Postoptimization : 0.00 s
LEADIT/DOCKING>
Play around with the commands to get used to the procedures behind FlexX's help function.
You should now have an initial overview of the steps needed to compute a
simple docking with one ligand and one protein. For multiple ligands and much more
complex applications, a script is usually more appropriate. Please refer to the index or the
table of contents for more detail. Thank you for going through the tutorial!
By the way: you exit FlexX by typing quit or, if you dislike being asked whether you are
sure, by a simple x.
LEADIT> quit
Are you sure you want to quit LEADIT <y,n> [y] : y
>> Releasing user data.
>> Releasing system data.
>> Bye!
6.5.6 Preparing the input data I: The Ligand
In this and the following section we explain how the input data for FlexX should be
prepared.
Ligand input file formats
The ligand must be given in the SYBYL MOL2 or MOL format. Since Release 2, SDF, PDB
and SMILES are accepted as well. For a description of these file formats we refer to the
SYBYL manuals [32]. For an overview of the default settings and the interdependence of
the important configuration flags of the configuration file in this context, please refer to
Table 11.1.
Atom and bond types
Correct atom and bond types are important for FlexX because they are used to map
physicochemical information such as torsion angles and interaction groups onto the ligand.
If your ligand file has been converted automatically from another file format, check the
types carefully. The following hints may help you to assign the correct types:
- FlexX makes no distinction between *.ar and *.2 atom types. In principle, *.ar types
  should only be used in aromatic ring systems. In contrast, there is a big difference
  between bond types ar and 2. The bond type ar should only be used in aromatic ring
  systems.
- Setting the right type for nitrogen atoms is the most difficult part of type assignment.
  The atom type N.am and the bond type am should only be used in amide groups.
  FlexX distinguishes between N.3 and N.pl3 atoms. In contrast to N.pl3, N.3 atoms
  have a lone pair and can accept hydrogen bonds. In heterocycles, the types N.ar/N.2
  and N.pl3 define where the hydrogens will be attached later and thus where hydrogen
  bond acceptors and donors lie.
- FlexX automatically detects symmetry within the molecule. This is important because
  it can drastically reduce conformational space. The symmetry detection is based on
  atom types and bond types. We therefore recommend that you use symmetric notations
  instead of asymmetric ones. For example, use O.co2 for both oxygens of a carboxylate
  group and the same bond type for both, instead of one double-bonded oxygen and one
  single-bonded, negatively charged oxygen.
Atomic charges
FlexX expects formal charges on the molecules (no partial charges!). The charges are used to
discriminate between charged and non-charged hydrogen bonds and to detect interaction
partners for salt bridges.
Here are some examples of groups that occur frequently in organic compounds:
CO2^-        -0.5  (each oxygen)
NH3^+        +1.0  (nitrogen)
C(NH2)2^+    +0.5  (each nitrogen)
SO3^-        -0.33 (each oxygen)
PO2^-        -0.5  (each oxygen)
PO3^2-       -0.66 (each oxygen)
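The values above are consistent with spreading the total formal charge of a group evenly over its equivalent atoms. The following Python lines are only a sketch of that arithmetic; the function name and the dictionary keys are hypothetical and not part of FlexX. Note that the table appears to truncate rather than round (-0.66 for -2/3).

```python
def delocalized_charge(total_charge, n_equivalent_atoms):
    """Spread a total formal charge evenly over equivalent atoms."""
    if n_equivalent_atoms < 1:
        raise ValueError("need at least one atom")
    return total_charge / n_equivalent_atoms

# (group label, total formal charge, number of atoms sharing the charge)
groups = [
    ("CO2-", -1, 2),      # carboxylate, two oxygens
    ("NH3+", +1, 1),      # ammonium, one nitrogen
    ("C(NH2)2+", +1, 2),  # two nitrogens
    ("SO3-", -1, 3),      # three oxygens
    ("PO2-", -1, 2),      # two oxygens
    ("PO3 2-", -2, 3),    # three oxygens
]

charges = {label: delocalized_charge(q, n) for label, q, n in groups}
for label, value in charges.items():
    print(f"{label:10s} {value:+.3f}")
```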
FlexX has an extended mechanism for the automatic preparation of molecules. It is based on
SMARTS(TM), and SMILES notation is supported as well; refer to section 7.5.7 for more details.
Transformation level 5 is able to convert localized definitions into delocalized ones. This
setting is highly recommended if automatic structure generation is used.
The transformation rules are applied by default on PDB, SDF and SMILES import.
The remaining steps
The ligand molecule should contain hydrogens. If the atom types are assigned correctly, this
step can be performed automatically by SYBYL.
Because bond lengths and angles in the ligand molecule are taken from the input structure,
the molecule should be energy minimized. Non-minimized structures can cause geometry
errors in FlexX.
Fixing parts of the ligand structure: constraining torsional angles/ring conformations
In some applications it makes sense to fix a specific torsion angle or ring conformation. This
can be done with the @<TRIPOS>SET RTI (Record Type Indicator) in the mol2 input file.
For more general applications, such as constraining all amides to planarity, please refer to
the sections on subgraph specification and torsional data specification (sections 11.12 and
11.12.1).
Syntax: @<TRIPOS>SET
<set_name> <set_type> <obj_type> <sub_type> <status> <comment>
<num_members> <member> <member> . . .
The first data line following the @<TRIPOS>SET line consists of several parameters:
the set name, set type, object type, set subtype, status and a user comment. The
second data line contains the number of set members followed by a list of set members. For
more details on these refer to the SYBYL documentation [32]. To fix a torsion angle or ring
at the conformation given in the input file, these parameters must have the following values.
set_name The set name must read FIXTORSION or FIXRING. The name can contain more
characters after these keywords in order to distinguish different sets.
set_type The set type must be STATIC.
obj_type The object type must be BONDS for FIXTORSION or ATOMS for FIXRING.
sub_type The subtype of the set should be <user>.
status The status must be ****.
comment The comment can be any comment from the user.
num_members The number of set members.
member For a FIXTORSION each member is the bond ID of the bond which corresponds to
the torsion angle. For a FIXRING each member is the atom ID of an atom contained in
the ring system.
The following example fixes the torsion angles at bonds 5 and 7 and the conformation of
the ring system containing atom 12:
Example
@<TRIPOS>SET
FIXTORSION STATIC BONDS <user> **** fix torsions at bonds 5 and 7
2 5 7
FIXRING STATIC ATOMS <user> **** fix ring containing atom 12
1 12
Note: It is still possible to fix torsion angles and ring conformations using the old
@<TRIPOS>COMMENT RTI method.
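If many ligands need such sets, the two data lines can be generated rather than typed. The following Python sketch follows the one-line layout of the syntax description above; the helper name tripos_set_record is hypothetical, and the status string "****" is taken from the example, not from the SYBYL specification.

```python
def tripos_set_record(name, members, comment=""):
    """Format the two data lines of one FIXTORSION/FIXRING set."""
    if name.startswith("FIXTORSION"):
        obj_type = "BONDS"   # members are bond IDs
    elif name.startswith("FIXRING"):
        obj_type = "ATOMS"   # members are atom IDs
    else:
        raise ValueError("set name must start with FIXTORSION or FIXRING")
    header = f"{name} STATIC {obj_type} <user> **** {comment}".rstrip()
    member_line = " ".join(str(m) for m in [len(members), *members])
    return f"{header}\n{member_line}"

section = "\n".join([
    "@<TRIPOS>SET",
    tripos_set_record("FIXTORSION", [5, 7], "fix torsions at bonds 5 and 7"),
    tripos_set_record("FIXRING", [12], "fix ring containing atom 12"),
])
print(section)
```

The generated text would be appended to the ligand mol2 file; bond and atom IDs must of course match the IDs used in that file.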
Preparing a reference structure file
You can compare the predicted binding modes of FlexX on the fly with a reference position
of the ligand in the active site. This can be useful for testing the predictive power of FlexX
on specific proteins or for comparing the predicted binding modes with those generated
manually.
The reference structure file must be in SYBYL MOL2 format and can be read with the
READREF command. The numbering scheme of the atoms as well as the chemical elements
must be identical to those in the previously read ligand input file.
Alternatively, a reference structure can be assigned with the MAPREF command. In this
case the assignment is done via subgraph matching, so the atom numbering is of
less importance. The SYBYL atom types must be identical between the reference and the
input molecule in order to find a hit. Bond type comparison is optional. If more than one
hit of the reference structure is found, the first mapping is used, which is arbitrary.
If the reference structure is used for base selection or placement, multiple
mappings are evaluated.
Another way to obtain a reference structure is to extract the ligand directly from a
protein-ligand complex given in PDB format using the FROMPDB command. In that case,
however, visual inspection and manual intervention are usually necessary.
6.5.7 Preparing the input data II: The Protein
We recommend that you simply follow the steps in the GUI (Sec. 5.1.1).
Should you, however, need or prefer to prepare your protein using an
RDF file (cf. Sec. 11.3), there are two decisive steps: the definition of the active site and
the creation or editing of the receptor description file (RDF). Here are the necessary details:
Defining the active site
There are two main ways of defining the active site of the protein:
Definition by a placed ligand: If you take the protein from a PDB file which already
contains a ligand, you can define the active site in FlexX with the following sequence of steps:
- Prepare the ligand input file and a reference structure file as explained in the previous
  section.
- Create the receptor description file (see the following section) without specifying the
  active site (no @pockets entry).
- Start FlexX and load the ligand input file (LIGAND/READ) and the reference
  structure file (LIGAND/READREF, LIGAND/MAPREF, LIGAND/SETREF), or use the
  LIGAND/FROMPDB command.
- Read the PDB file directly by using the default.rdf file specified in your configuration
  settings or a predefined receptor description file (RECEPTOR/READ) and specify a
  selection radius r. All protein atoms at a distance of no more than r from any
  ligand atom are then defined to be part of the active site. In addition, the selection
  can be extended to complete amino acids. Complete amino acids should be selected
  only if there is a good reason to do so. For a FlexX calculation, complete amino acids
  are not required, and adding atoms which are far away from the active site only
  increases the computing time.
- Write the active site file (RECEPTOR/WRITE, option a) and insert the filename into the
  receptor description file.
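The radius criterion in the steps above can be pictured with a few lines of Python. This is only an illustration of the distance rule, not FlexX code: the data layout (dictionaries with "xyz", "chain" and "resnum" keys), the function name and the radius of 6.5 are assumptions made for the example.

```python
import math

def select_active_site(protein_atoms, ligand_coords, radius, complete_residues=False):
    """Keep protein atoms within `radius` of at least one ligand atom."""
    near = [
        atom for atom in protein_atoms
        if any(math.dist(atom["xyz"], xyz) <= radius for xyz in ligand_coords)
    ]
    if not complete_residues:
        return near
    # Optional extension: take every atom of each amino acid that was touched.
    touched = {(atom["chain"], atom["resnum"]) for atom in near}
    return [a for a in protein_atoms if (a["chain"], a["resnum"]) in touched]

protein = [
    {"name": "N",  "chain": "A", "resnum": 57, "xyz": (0.0, 0.0, 0.0)},
    {"name": "CA", "chain": "A", "resnum": 57, "xyz": (12.0, 0.0, 0.0)},
    {"name": "O",  "chain": "A", "resnum": 99, "xyz": (20.0, 0.0, 0.0)},
]
ligand = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

site = select_active_site(protein, ligand, radius=6.5)
site_full = select_active_site(protein, ligand, radius=6.5, complete_residues=True)
print([a["name"] for a in site], [a["name"] for a in site_full])
```

The second call shows why completing amino acids enlarges the site: the CA atom is pulled in although it lies outside the radius.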
Definition with a modeling tool: Alternatively, you can use a modeling tool to define the
active site. Generate a file in PDB format which contains all atoms of the active site. FlexX
uses the atom name, the amino acid name, the amino acid number and the chain ID for
mapping the atoms back onto the PDB input file.
Creating the receptor description file
There is a detailed description of the features of the receptor description file in section 11.3.
The easiest way to create this file is to copy the file default.rdf from the help directory
and edit it as follows:
- In the line beginning with @pdb_file: insert the name of your PDB file.
- In the @atoms record: give the identifiers of the protein atoms to be used for the
  calculations as a sequence of include/exclude commands (one per line). Each command
  takes four parameters, which can also be wildcards (* character): atom name, amino
  acid name, chain ID, amino acid number. For atom names, the form Element* is
  allowed in order to specify all atoms of a specific element. The chain ID can be
  specified by _. If you want to include all atoms of chain A but no hydrogens, you
  would enter the two commands include * * A * and then exclude H* * * *.
- In the @pockets record: insert a list of pocket definitions, each given by a pocket
  name (for reference only) and a pocket file name. All pockets together form the active
  site. If the active site is to be defined interactively with FlexX, either comment the
  @pockets record out (with #) or enter the filename of a non-existing file.
- In the line beginning with @surface_file: insert the name of your surface file or
  comment it out (with #) in order to let FlexX calculate the surface atoms. Once this
  calculation is done you should write out this data (see the receptor WRITE command,
  section 7.6.5, for further information).
- Templates (beginning with @templates): the given list with default settings will be
  sufficient in most cases. Only if there are HIS groups in the active site is it important to
  find out which N atom binds the hydrogen. This information is given by choosing
  the corresponding template (his1 or his2; e.g. in the case of 1DWD you would add
  HIS * 57 his1.)
- Hetero atoms (beginning with @hetero_atoms or @hetero_files): if there are no
  metal atoms, water molecules or other ligands in the active site, you should not
  change anything. If there are hetero atoms to be loaded as part of the protein, you
  must include them on the lines following exclude * * *. For details see section
  11.3 (e.g. in the case of 3CPA you would add include _ZN * 1).
  If a more complex molecule such as a cofactor is to be included, you can prepare the
  cofactor in the same way as a ligand and store it (with the coordinates of the cofactor
  inside the protein) in mol2 file format. You can then add the cofactor with the
  @hetero_files record.
- Alternate locations (beginning with @alternate_locations): if no alternate locations
  are given in your PDB file, do not make changes (otherwise see 11.3).
- H torsions (beginning with @h_torsions): here you should be careful because the
  given default rules do not work in many cases. For example, 0 and 180 are both
  favorable OH torsions of TYR, but the default value is 180. So analyze the hydrogen
  bond network in your protein (with a model builder) and choose the H torsions that are
  most plausible. Then, for each group with a torsion angle differing from the default
  value, add a new line with the correct value (e.g.: tyr A 57 _ce1 _cz _oh 0.).
- Unidentified atoms (beginning with @assign): if there are no unidentified atoms in
  your PDB file, do not make changes (otherwise see 11.3).
It may be a little confusing that some records expect their arguments on the same line
whereas others expect several lines of input. There is, however, quite a simple rule behind
this: if a record can hold only a single entry, the entry must be given on the same line as
the record name. If a record can in principle hold more than one entry, FlexX expects the
entries on separate lines.
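The include/exclude sequence of the @atoms record described above can be pictured as a small filter. The Python sketch below is an assumption-laden illustration, not the FlexX implementation: it assumes that atoms start out excluded, that commands are evaluated in order with later matches overriding earlier ones, and that * behaves like a shell-style wildcard. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

def is_selected(atom, commands):
    """atom: (atom name, amino acid name, chain ID, amino acid number) as strings."""
    selected = False  # assumption: nothing is selected until an include matches
    for action, *patterns in commands:
        if all(fnmatchcase(value, pat) for value, pat in zip(atom, patterns)):
            selected = (action == "include")  # assumption: the last match wins
    return selected

# The two commands from the text: all atoms of chain A, but no hydrogens.
commands = [
    ("include", "*", "*", "A", "*"),
    ("exclude", "H*", "*", "*", "*"),
]

print(is_selected(("CA", "ALA", "A", "10"), commands))   # backbone carbon, chain A
print(is_selected(("HB1", "ALA", "A", "10"), commands))  # hydrogen, chain A
print(is_selected(("CA", "ALA", "B", "10"), commands))   # chain B
```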
Treatment of hydrogen atoms in the context of PDB files
There are a few caveats to consider regarding existing or missing hydrogens in your PDB file.
- Hydrogens need to be given in the correct PDB nomenclature to be recognized and
  used by FlexX, including the PDB torsion angles of the hydrogens. However, several
  tools use slightly different hydrogen names. In that case FlexX is not able to
  recognize the available hydrogens and overwrites the given hydrogen positions.
- If the PDB file provides no hydrogens for an atom, FlexX will automatically add
  the missing hydrogens according to the templates specified in the static data file
  AMINO (see section 11.8). The torsion angles for hydrogens placed solely by FlexX
  are obtained from the @h_torsion record specified in the RDF file (see section 11.3).
- If the PDB file provides at least one hydrogen for an atom, FlexX will not add any
  further hydrogens. In this scenario it is therefore possible, in principle, that atoms
  end up with fewer hydrogen atoms than specified in the respective template of the
  AMINO file!
- Some atoms of the PDB file might have more hydrogen atoms bonded to them than
  the template specifies. In this case FlexX arbitrarily deletes excess hydrogens, which
  might on the one hand delete hydrogens that are specified in the corresponding
  template and on the other hand retain hydrogens that are not even specified in the
  AMINO file.
- If you have an amino acid with more than one possible template tautomer (e.g.
  histidines, see section 11.8), FlexX may change both the protonation state and pattern
  according to the template specified in the RDF file (see section 11.3).
Reading the receptor from a MOL2 file
The receptor can also be read directly from a single MOL2 file without any reference to an
RDF file. Here, the rationale is that the user has carefully prepared the receptor outside of
FlexX and wants to use it in that state for docking. In particular, no amino acid templates
are used on this import route. Thus, it is assumed that a number of tasks have already been
carried out by the user when the receptor is loaded, e.g.:
- FlexX uses the protonation states exactly as described in the receptor MOL2 file. FlexX
  does not add or remove any hydrogen atoms.
- In general, precisely those atoms that are contained in the MOL2 file define the
  receptor. It is impossible to exclude or include any further atoms, cf. sections 11.3.2 and
  11.3.6. A receptor MOL2 file must not be a multi-MOL2 file.
- The user has already resolved any ambiguities that might exist in a PDB file with
  respect to atom locations, see sections 11.3.7 and 11.3.9.
- All torsion angles, including those involving polar hydrogens, are read exactly as
  provided by the receptor MOL2 file, see section 11.3.8.
There is one exception to this concept: FlexX may assign delocalized formal charges.
Whether it does so or not can be controlled via the flag INIT_MOL2_RECEPTOR, see section
10.1.4.
A receptor MOL2 file may contain an optional @<TRIPOS>SET section in order to specify a
binding site:
@<TRIPOS>SET
FLEXX_BINDING_SITE STATIC ATOMS <user>
<number of atoms in set> <atom index 1> <atom index 2> .....
Here, the atom indices must refer to atoms listed in a preceding @<TRIPOS>ATOM section.
Note that the number of atoms and the atom indices form a single logical line. For large
binding sites this line should be divided into several physical lines: use a backslash at the
end of a physical line to indicate that a continuation line follows. Note also that <user>
is not a placeholder but a constant character string.
Example
@<TRIPOS>SET
FLEXX_BINDING_SITE STATIC ATOMS <user>
20 3 8 10 5 37 38 39 63 50 98 \
99 100 102 137 200 203 204 214 299 301
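To check such a set before loading the receptor, the continuation lines can be joined and the declared atom count compared with the number of indices. The Python sketch below does this for the member lines of the example; the function name is hypothetical, and the code assumes that all lines passed in belong to one logical line.

```python
def parse_binding_site(member_lines):
    """Join backslash-continued physical lines and return the atom indices."""
    logical = " ".join(line.rstrip().rstrip("\\") for line in member_lines)
    count, *indices = (int(token) for token in logical.split())
    if count != len(indices):
        raise ValueError(f"expected {count} atom indices, found {len(indices)}")
    return indices

lines = [
    "20 3 8 10 5 37 38 39 63 50 98 \\",
    "99 100 102 137 200 203 204 214 299 301",
]
atoms = parse_binding_site(lines)
print(len(atoms), atoms[0], atoms[-1])
```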
Alternatively, a binding site may be specified via a reference ligand as described previously,
see section 6.5.7. In this case make sure that the reference ligand is loaded before the receptor.
Note that currently the RECEPTOR/ACTIVE command cannot be used for receptors that
were loaded from a MOL2 file.
6.6 FlexX and FlexV
This section will usually only be needed by advanced or expert users, because the GUI
takes over the most common tasks which require graphical assistance. Should you like to
know more about the combination of FlexX with FlexV, our free, generic viewer, then we
encourage you to read on. Please be aware that FlexV functionality will increasingly be
moved to the GUI part of LeadIT.
If you have already completed the FlexX tutorial for commandline usage above, you may
already be familiar with FlexV. FlexV is a graphics tool that is included for free in the FlexX
package to enable visualization of molecules and other FlexX objects.
Therefore, even when using the Commandline Mode, you can still trigger visualization with
FlexV: after download/installation (http://www.biosolveit.de/download), type
bin/flexv at the operating system's shell or simply GO from within FlexX's shell to start
FlexV.
When started for the first time, FlexV generates a .flexv file in your home directory. This file
is used to store visualization preferences and should therefore be separate for each user
(see the FlexV manual for details). The FlexV main window should then appear. You can now
quit FlexV again by pressing the Quit button in the lower-right corner. No license key is
needed for FlexV.
6.6.1 Choosing your graphics interface
In addition to the built-in GUI, FlexX can use FlexV. Please set the global preferences
variable FLEXV accordingly (see Section 4.1.2 on how to change it). This variable must contain
the correct call to the viewer executable. Remember that the FlexV executable came with
FlexX, so it can be found in the FlexX installation directory.
6.6.2 The DRAW commands
In general, molecules etc. are visualized in FlexX by calling the DRAW commands. There
are actually several DRAW commands, each in a different menu, and what is
drawn depends on which menu the user is working in. There are DRAW commands for
the menus RECEPTOR, LIGAND, DOCKING, PHARM and RECEPTOR/GAUSS (FlexX-Pharm
module), CLIB (FlexX^c module), ENSEMBLE and ENSEMBLE/GRAPH (FlexE module). For
example, using the DRAW command in the RECEPTOR menu will produce a graphic of the
protein/receptor, while using DRAW in the LIGAND menu will produce a graphic of the
ligand. A couple of the menus also contain an MDRAW command (multiple DRAW) which allows
you to draw many things with one command.
6.6.3 DRAWing and DISPLAYing
At this point we should explain how graphics are created in FlexX and sent to the
visualization program. At the time of the DRAW command, a graphic is produced and sent to a
file: either a) a temporary runtime file which does not contain information relevant to the
user, or b) a user-specified destination file. The graphics are viewed with the DISPLAY command
(see 7.2 below). An alias "GO" has been set for the DISPLAY command, so type the shorter
command GO if you want to see the results more quickly! When you use this command, the
visualization program is triggered and the file containing the graphics is displayed. FlexV
will only be started if it is not already running; otherwise the graphic is sent to the
already running FlexV. You can also continue working with FlexX while FlexV is running.
For some more technical details about the graphical interfaces see also Program interfaces
(section 12).
6.6.4 What are graphics objects?
The concept of graphics objects is only really relevant if you are working with FlexV. At
each DRAW command, the graphics are actually drawn into graphics objects and the graphics
objects are sent to the file. For each menu a default graphics object has already been set.
For example, the receptor is drawn in graphics object 4 and the ligand in graphics object
5. Graphics in each graphics object are held separately from one another. When they are
displayed in FlexV they can be controlled separately, e.g. switched on and off (see
the FlexV manual for more details about controlling individual objects). There are up to
255 graphics objects available; you can change which graphics object is used by the DRAW
command with the SELADM command (see below).
6.6.5 Controlling what is drawn and how it is drawn: details of the SELxxx graphics
commands
There are several other commands controlling the graphics that you will find in each menu
with a DRAW command. These control how the graphics will appear when they are drawn.
Defaults are of course already set for each DRAW command, so if you are happy with these
you do not need to change anything. However, it is possible to control the graphics using
one or more of these commands. You can control what is drawn (command SELGRA), what
color scheme will be used (command SELCOL), what labeling scheme will be used (command
SELLAB), and where the graphic will be sent (filename etc., command SELADM).
The actual parameters for each of these commands differ depending on which menu you
are working in. If you wish to permanently set your preferences as the defaults for drawing,
you may set them in the GRAPHIC static data file (section 11.22); it may also be useful to
read that section if you really want to get to grips with controlling your graphics.
7 FlexX in Commandline Mode II. Menus and Commands
7.1 Menus
This is the hierarchical menu structure of FlexX. The menus in square brackets are only
available if the corresponding module is licensed.
LEADIT---+--DATABASE
|
+--PROJECT
|
+--RECEPTOR---+--[GAUSS] (module PHARM)
|
+--LIGAND-----+--CONFORM
|
+--[PBC] (module SCREEN)
|
+--DOCKING----+--ANALYZE
|
+--[CLIB] (module CDOCK)
|
+--[CDOCK] (module CDOCK)
|
+--PVM
|
+--[PHARM] (module PHARM)
|
+--[ENSEMBLE]-+--[GENRDF] (module FLEXE)
+--[GRAPH] (module FLEXE)
Typing the submenu name brings you to the submenu; typing END returns you to the parent
menu. You can type commands and menu names in uppercase or lowercase letters.
7.2 Global commands
In contrast to menu commands, global commands are available in every menu. As with
menu names, no distinction is made between uppercase and lowercase letters.
7.2.1 Quitting FlexX (QUIT)
Syntax: QUIT
Description: Quits FlexX after clearing memory.
7.2.2 Returning to the main menu (MAIN)
Syntax: MAIN
Description: Returns to the main menu.
7.2.3 Returning to the parent menu (END)
Syntax: END
Description: Returns to the parent menu. Has the same effect as QUIT when en-
tered in the main menu.
7.2.4 Online help (HELP)
Syntax: HELP <topic>
Description: Displays the description of the specified <topic> on the screen. Possible
topics are all valid commands or menus available in the current menu. The
text is taken directly from this LaTeX source by a simple parser, therefore the text
might not be formatted perfectly in every case.
7.2.5 Viewing the User Guide (MANUAL)
Syntax: MANUAL
Description: Displays the User Guide for FlexX. Starts your local PDF
viewer with flexx_ug.pdf (PDF: Adobe Portable Document Format).
The viewer application can be changed in the configuration dialog
(File -> Global Preferences -> Parameters & Flags). The default
viewer is acroread, which is available for free at
http://www.adobe.com/products/acrobat/readstep2.html.
7.2.6 Short online help (?)
Syntax: ? <topic>
Description: Displays a one-line (very short) help text about the specified <topic>
on the screen. Possible topics are all valid commands or menus available in the
current menu.
7.2.7 Export configuration file (WRITECFG)
Syntax: WRITECFG <filename>
Description: Exports the current configuration in XML format to <filename>.
7.2.8 Listing environment variable settings (LIST)
Syntax: LIST <topic>
Description: Displays a list of environment variables and their current values on
the screen. <topic> can be
flg  Program flags, see section 10.1.4. Flags which must be constant during an
     application are marked.
dir  Program directories (paths), see section 10.1.1
exe  Executables called by FlexX, see section 10.1.3
db   Static data files, see section 10.1.2
par  Program parameters, see section 11.5
mol  Filenames of currently loaded files
all  All of the above.
7.2.9 Changing values of environment variables (SET)
Syntax: SET <variable name> <value>
Description: Sets the variable <variable name> to value <value>. Valid strings
for <variable name> are the names of environment variables for non-constant flags,
directories, and program parameters, see sections 10.1.4, 10.1.1, 11.5.
Example
SET verbosity 3
SET ligand /home/goofy/new_drugs/
7.2.10 Selecting the output destination for docking results (SELOUTP)
Syntax: SELOUTP <destination> [<append>] [<pvm merge>]
Description: Directs the output generated by the LIST, LISTALL, LISTSOL,
LISTMAT, QUERY and INFO commands (see sections 7.5, 7.8) to <destination>.
If <destination> equals the string screen, the output is directed to the
screen, otherwise it is directed into a file. The name of the file is
<destination>.log if <destination> does not contain any suffix, otherwise
the name is <destination>.
This file will be located in the directory stored in the PREDICT environment
variable. If <destination> is a file, you can decide with <append> whether the output
is to be appended to the existing file (<append> = a) or the file is to be overwritten
(<append> = o).
In FlexX-PVM, output files created by SELOUTP and the commands listed above are
automatically merged after parallel script execution. This feature can be switched
off by setting <pvm merge> to no.
If <pvm merge> is set to no, then the built-in batch variable $(PVM_ID) [1] is
automatically appended to the filename.
Important note: For filename usage and file merging within scripts, please refer to
the PVM section on page 147.
Note: The SELOUTP command must be used before the command whose output is
to be redirected (e.g. LIST, LISTALL, LISTSOL, etc.). Afterwards the output stream
can be set back to the screen again. Here is an example:
Example
DOCKING
SELOUTP my_info_0_file.log a y
INFO 0
SELOUTP screen
END
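The naming rule for the output file can be summarized in a few lines. The Python sketch below only mirrors the rule as described above; the function name is hypothetical, "suffix" is assumed to mean a file extension, the PREDICT directory is passed in as a plain string with a made-up example path, and the $(PVM_ID) handling is left out.

```python
import posixpath

def seloutp_target(destination, predict_dir):
    """Return the path of the log file, or None if output goes to the screen."""
    if destination == "screen":
        return None
    _, suffix = posixpath.splitext(destination)
    filename = destination if suffix else destination + ".log"
    return posixpath.join(predict_dir, filename)

print(seloutp_target("my_info_0_file.log", "/home/goofy/predict"))
print(seloutp_target("results", "/home/goofy/predict"))
print(seloutp_target("screen", "/home/goofy/predict"))
```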
7.2.11 Sending a command to FlexV (TOFLEXV)
Syntax: TOFLEXV <command>
Description: You can send a command string to FlexV with TOFLEXV. A command
consists of a single character (the first character of the sent string) followed by an
argument. If there is only one argument, it can be written directly after
the command without a separating blank. If there is more than one argument, the
whole command has to be put in quotation marks, like this:
"<command> [<first_arg> [<second_arg> ...]]".
Currently, the following commands are implemented:
[1] $(PVM_ID): please refer to the PVM section on page 147.
Com. Argument Meaning
. do nothing, can be used for checking the com-
munication line
b BREAK, terminates the communication link to
the application program without terminating.
c [x y z] CENTER, sets the center of rotation to the cen-
teroid of all currently visible objects. If x, y, z
given, center of rotation will be set to the given
coordinates.
d o_id DELETE, deletes the contents of the graphic ob-
ject o_id, which can be 0 to 255 or the string
"all".
i le1 [o_id [mode]]
[matchlist_le le2
[show|skip]]
IMPORT, imports the mol2 le file1.
o_id is optional. If just one o_id is given, FlexV
imports the mol2 le into this slot. If the slot
is not free, it will be overwritten without any
warnings. Using mode "a" after it, you can
append the new mol2 to an occupied slot as a
new slider object.
Its also possible to give a range of object ids,
like "2,3,4,7", "2-4,7". In this case, FlexV trys to
import the mol2 le file1 into the next free
slot. If there is no free slot, FlexV breaks with
a warning. Using mode "a" for append, FlexV
trys to append the new mol2 le to the last
used slot in the given o_id range. If the last
used graphical object is not within the given
range, FlexV breaks with a warning.
Giving a matchlist and a second mol2 le
file2 (optional), FlexV draws lines between
the atoms in the rst and the second given
mol2 les for each matching entry in the
matching le. Matchlist le consists of lines
with following format:
<atom_id_1> <atom_id_2> <label>
<energy>,
<atom_id_1>:integer, atom ID in file1,
<atom_id_2>:integer, atom ID in file2,
<label>:string, label of the line to draw,
<energy>:double, base for the color of the line.
Giving argument "skip", skipps drawing of
the second mol2 le file2, "show" (default)
draws also the second one.
g file [o_id]
    GET, loads the gdf file.
    o_id is optional. If not given, the objects are written to the slots
    that are saved in the gdf file. If just one value is given, FlexV loads
    the gdf file into the given o_id. An existing graphical object will be
    overwritten without any warning.
    It is also possible to give a range of object ids, like "2,3,4,7" or
    "2-4,7". In this case, FlexV tries to load the gdf file into the next
    free slot. If there is no free slot, FlexV breaks with a warning.
l
    LOOKAT, initiates the look-at function.
m mode [o_id]
    Changes the current molecule display mode. Possible values are "lines",
    "ballssticks", "sticks", "spheres" and "smartsticks".
    o_id is optional. If it is left out or set to "all" or "*", all slots
    are changed. Otherwise only the given slots are affected.
    It is also possible to give a range of object ids, like "2,3,4,7" or
    "2-4,7".
p
    Switches to pharm mode and opens the pharm control panel.
r action
    ROCK, switches the rock mode on (action is "on") or off (action is
    "off").
s o_id action
    SWITCH, switches graphic objects on or off. o_id is the number of the
    graphic object; action is the string "on" or "off", "-" (decrement
    visible instance id), "+" (increment visible instance id), or the
    instance id directly. The whole call has to be put into quotes.
x
    EXIT, quits FlexV.
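To illustrate the matchlist format described for the IMPORT command, here is a minimal Python sketch that formats entries into matchlist lines. It assumes that the four fields are separated by single blanks and that labels contain no whitespace; the manual states neither detail. The helper name format_matchlist and the three-decimal energy formatting are hypothetical choices for this example.

```python
from typing import Iterable, List, Tuple

# (atom ID in file1, atom ID in file2, label, energy)
MatchEntry = Tuple[int, int, str, float]


def format_matchlist(entries: Iterable[MatchEntry]) -> List[str]:
    """Return one '<atom_id_1> <atom_id_2> <label> <energy>' line per entry.

    Assumption: fields are blank-separated, so a label must not be empty
    or contain whitespace.
    """
    lines = []
    for atom_id_1, atom_id_2, label, energy in entries:
        if not label or any(ch.isspace() for ch in label):
            raise ValueError(f"label must be non-empty without blanks: {label!r}")
        lines.append(f"{int(atom_id_1)} {int(atom_id_2)} {label} {float(energy):.3f}")
    return lines


if __name__ == "__main__":
    for line in format_matchlist([(1, 12, "hbond", -4.7), (5, 30, "aro", -0.7)]):
        print(line)
```

The resulting lines could be written to a text file and passed as the matchlist file argument of the "i" command.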
7.2.12 Sending FlexX graphic objects to the visualizer (DISPLAY)
Syntax: DISPLAY
Description: The command causes a switch to FlexV and displays the objects pre-
viously drawn with DRAW.
7.2.13 Erasing a graphics object (ERASE)
Syntax: ERASE <graphics object selection>
Description: Deletes the selected objects with the next execution of DISPLAY.
<graphics object selection> specifies the graphic objects to be erased. It can be
either a single number, a list of numbers separated by blanks or ",", a list of
intervals of the form "a-b", or "all". Note that you must enclose the expression
in quotation marks if it contains blanks.
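The selection syntax (single numbers, lists separated by blanks or ",", intervals "a-b", or "all") recurs in several commands. The following Python sketch shows one way such a selection string could be expanded into a list of numbers. It is not FlexX's own parser; the function name parse_selection and the all_ids argument (the numbers that "all" stands for) are assumptions made for this illustration.

```python
from typing import Iterable, List


def parse_selection(text: str, all_ids: Iterable[int]) -> List[int]:
    """Expand a selection such as '2-4,7' or 'all' into a sorted list of ids.

    all_ids is the (hypothetical) set of ids that 'all' refers to.
    """
    text = text.strip()
    if text == "all":
        return sorted(set(all_ids))
    selected = set()
    # Treat commas like blanks, then handle each token separately.
    for token in text.replace(",", " ").split():
        if "-" in token:
            start, end = token.split("-", 1)
            low, high = int(start), int(end)
            if low > high:
                raise ValueError(f"invalid interval: {token!r}")
            selected.update(range(low, high + 1))
        else:
            selected.add(int(token))
    return sorted(selected)


if __name__ == "__main__":
    print(parse_selection("2-4,7", all_ids=range(1, 11)))
```

Duplicates are collapsed and the result is sorted, which is a design choice of this sketch rather than documented behavior.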
7.2.14 Executing shell commands (! and EXEC)
Syntax: ! <unix command>
EXEC <line no> <unix command>
Description: If a user input starts with the !-character, the complete string (without
this first !-character) is passed as a command to the operating system, which will try
to execute it.
In contrast to !, EXEC reads the output written by the unix command to stdout and
stores the output line <line no> to a built-in variable named $(UNIX_OUTP) which
can be accessed later on in a FlexX script.
Note that the <unix command> goes through the parameter processing unit of
FlexX. Therefore, if called in a script, all variables starting with $ are exchanged by
their values.
Important notes: Note that no shell-specific expansions can be performed (for
example, the ~-character cannot be expanded to the home directory name).
Shell commands are disabled in the WWW interface.
Example
!ls -a
!cp dummy.c /home/usr/snoopy/test/
7.2.15 Executing internal unit tests (UNITTESTS)
Syntax: UNITTESTS
Description: This command runs a couple of internal self-tests for FlexX. It is
mainly used for internal quality checks. It prints a dot for each test executed and
summarizes the results. The final result of this command call should always be
OK. Please report to support@biosolveit.de if this is not the case.
7.3 Commands in the root menu
7.3.1 Deleting everything (DELALL)
Syntax: DELALL
Description: DELALL deletes everything in FlexX's main memory except static
data. It thus combines the delete commands of the submenus LIGAND,
RECEPTOR and DOCKING.
7.3.2 A complete docking run (AUTODOCK)
Syntax: AUTODOCK <lig. file> <compare> [<ref. file>] [<ref. filename>]
[<hydro>] <rec. file> <manual> [<base atom list>] <algo type> <build up>
Description: The AUTODOCK command combines all the steps required to
obtain a complex prediction from FlexX. These are:
Reading the ligand (LIGAND/READ <lig. file>)
If you want to compare the predicted placements with reference coordinates, set
<compare> to "y". Then you can decide whether you want to read the reference
coordinates from disk (set <ref. file> to "y") or take the coordinates already loaded
(set <ref. file> to "n"). Depending on <ref. file>, one of the following operations is
performed:
Reading reference coordinates (LIGAND/READREF <ref. filename>)
Setting reference coordinates (LIGAND/SETREF <hydro>)
The subsequent steps are:
Read the receptor (RECEPTOR/READ <rec. file>)
Select the base fragment (DOCKING/SELBAS <manual> [<base atom list>])
Place the base fragment (PLACEBAS <algo type>)
Build up the complex (COMPLEX <build up>)
See the respective commands for the meaning of the parameters.
Requirements: FlexX's workspace should be empty. Perform a DELALL before
AUTODOCK if this is not the case.
7.3.3 Executing a script file (SCRIPT)
Syntax: SCRIPT <filename> [<parameter list> <keep variables>]
Description: Executes the script <filename>. <parameter list> is a list of script
variables with predefined values, which must be separated by ";". If <keep
variables> is answered yes, the list of script variables is not reset, i.e. variables and
their values from previously executed scripts are present during the execution of
the script. See section 9.1 for an explanation of the script language.
7.4 Working with projects (PROJECT submenu)
7.4.1 Reading (READ)
Syntax: READ <filename>
Description: Reads a FlexX Project File <filename> into FlexX's workspace. The
file must have the extension .fxx. You can create a Project in the GUI mode of
FlexX.
Important note: Reading a Project resets all settings of FlexX, i.e. all ligands,
receptors, pharmacophore constraints etc. are deleted, and all settings that might
have been changed with the SET command are reset to their default values, which
are defined in either the FlexX code, the installation, or the user settings files.
Finally, the settings defined in the Project File are set.
7.4.2 Run docking (RUNDOCK)
Syntax: RUNDOCK [<scoretab-filename>] [<solutionstab-filename>]
[<pose-filename>] [<max_number_of_poses>]
Description: Performs a docking run if the FlexX workspace contains a Project.
You can define some optional output files:
1. scoretab-filename:
Export the score and rms value of the pose with rank 1 to
<scoretab-filename>. If the specified file already exists, the data will
be appended.
2. solutionstab-filename:
Export the scores of all calculated poses to <solutionstab-filename>
(the format is described in 7.8.13). If the specified file already exists, the data will
be appended.
3. pose-filename:
Export the calculated poses to <pose-filename>. The maximum number
of poses can be defined with the next parameter. If the specified file already exists,
the data will be appended.
4. max_number_of_poses:
Export at most <max_number_of_poses> poses to the file defined with parameter
<pose-filename>. The default is all.
7.5 Working with ligands (LIGAND submenu)
7.5.1 Reading (READ)
Syntax: READ <filename> [<molecule ID>]
Description: Reads a ligand file <filename> into FlexX's workspace. The FlexX
native file format for ligands is the SYBYL MOL or MOL2 format [32]; the file must
have the corresponding extension .mol or .mol2. The rules explained at the
beginning of section 11 apply to the filename. The default directory for this command
is the path specified in the LIGAND entry.
If the file is a multi-mol2 file (multi-mol files are not supported), <molecule ID>
gives the number of the molecule to be loaded.
For special features and/or recommendations with respect to other formats such as
SMILES, please refer to the table in section 11.1.
The following operations are initiated in the following order:
1. File read-in
2. Identification of ring systems
3. Ring conformer generation (e.g. by CORINA)
4. Molecule initialization (see below)
5. Stereo descriptor and atom equivalence class computation
6. Torsion angle analysis for all acyclic single bonds
7. Interaction type and interaction geometry assignment
The molecule initialization comprises a preprocessing step, formal charge
assignment, an aromaticity analysis and much more. The initialization can
be fully configured with the rules defined in the transform.dat static data file.
Please refer to section 11.18 for the details.
Note: If you use a verbosity level of 5 or higher, FlexX lists an overview of com-
ponents in its output. The respective table contains statistical information about
the number of interaction geometries of a certain interaction level (#IA level) within
each component. Please note that for every atom only one contact type (the one
with the highest interaction level) is counted. For example, if you have a carbon
atom with the contact types phenyl_ring (level 2), phenyl_center (level 2), and
aro (level 1), only one interaction of type 2 is counted.
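The counting rule in the note above (per atom, only the contact type with the highest interaction level is counted) can be sketched in Python as follows. This is an illustration of the rule only, not FlexX code; the input data structure and the function name count_interactions_by_level are assumptions.

```python
from typing import Dict, List, Tuple


def count_interactions_by_level(
    atom_contacts: Dict[str, List[Tuple[str, int]]]
) -> Dict[int, int]:
    """Count one interaction per atom, at that atom's highest level.

    atom_contacts maps a (hypothetical) atom name to a list of
    (contact type, interaction level) pairs.
    """
    counts: Dict[int, int] = {}
    for contacts in atom_contacts.values():
        if not contacts:
            continue  # atoms without contact types contribute nothing
        top_level = max(level for _, level in contacts)
        counts[top_level] = counts.get(top_level, 0) + 1
    return counts


if __name__ == "__main__":
    example = {
        "C1": [("phenyl_ring", 2), ("phenyl_center", 2), ("aro", 1)],
    }
    print(count_interactions_by_level(example))
```

For the carbon atom from the note, the sketch counts a single interaction of level 2 and none of level 1.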
7.5.2 Direct SMILES parsing (SMILES)
Syntax: SMILES <smiles-string> [<molecule-name>]
Description: Generates a ligand directly from a SMILES code using CORINA, if
available.
7.5.3 PDB import (FROMPDB)
Syntax: FROMPDB <pdb-filename> <pattern>
Description: Extracts atoms from a PDB file and evaluates the connectivity. The
connectivity can usually only be approximated, because many bond lengths lie
between single, double and triple bond distances. To fix those ambiguities, FROMPDB uses
the rules from transform.dat to generate a correct molecule. The coordinates
from the PDB source are internally copied to both the reference coordinates and the
fixed coordinates. A READREF or MAPREF is not necessary for a redocking.
Important notes: When using this command to cut out a ligand from a protein
active site in a PDB file, carefully check the ligand and its atom types. Minimization
before docking most often drastically improves the docking results. Use this
command with care! The atom numbering as it occurs in the PDB file is kept only as long
as the (reference) ligand is not written to a file. If you want to read in a different
reference, its atom numbering will probably not match that of the ligand still in memory. In
order to have consistent numbering between the ligand and the reference, we
recommend writing both to separate files and reading them in again (using READREF
and READ for reference and ligand, respectively). Please also consult the tutorial on
this issue (Section 6.5.2).
7.5.4 Setting up the initialization procedure (SELINIT)
Syntax: SELINIT [<list of levels>]
Description: Adjusts the state of transformation rules in the initialization process.
Initially, the transformation levels applied during initialization are set in the
transform.dat file by setting a switch ON or OFF. To adjust this for special
purposes or to perform only selected initialization steps, like protonation or
assignment of formal charges, the levels can be set by this command. There are two
ways of calling this command:
Interactively: When no levels are specified as parameters to the command, FlexX
asks for each level individually whether it should be enabled (ON) or disabled
(OFF). See the example below.
By parameters: The numbers of the levels that should be enabled or disabled can
be specified as parameters to the command. An exclamation mark (!) before
the level number means the level should be disabled; otherwise the mere presence of
the number in the list means it will be enabled. An asterisk means all levels. In
addition to using the number of a level, there are several labels for the initialization
procedures themselves, regardless of the number they have. Currently there are
seven procedures available:
Label  Action                       Default level number
P      PDB import                   1
L      Localization of bonds        2
A      Aromatic systems             3
H      Protonation                  4
F      Formal charges               5
D      Delocalization               6
T      Atom type check/assignment   10
Example
LEADIT/LIGAND> selinit
Level 1: Correct valences and bonds in PDB structures [OFF] : OFF
Level 2: Preprocess molecule [ON] : OFF
Level 3: Aromaticity check [ON] : OFF
Level 4: Assign default protonation [ON] : ON
...
Level 10: Assign atom types [ON] : ON
Example
LEADIT/LIGAND> selinit !* 10 # enables atom type detection, only
LEADIT/LIGAND> selinit !T H # atom type check off, prot. check on
LEADIT/LIGAND> selinit !5 10 # disable level 5 but enable level 10
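The parameter form of SELINIT can be illustrated with a small Python model of the level switches. It is only a sketch of the rules stated above ("!" disables, a bare number or label enables, "*" means all levels). It assumes that the parameters are applied from left to right, which is consistent with the "!* 10" example but not stated explicitly, and it models only the seven labelled levels. The function name apply_selinit is hypothetical.

```python
from typing import Dict, Iterable

# Labels and default level numbers from the table above.
LABEL_TO_LEVEL = {"P": 1, "L": 2, "A": 3, "H": 4, "F": 5, "D": 6, "T": 10}


def apply_selinit(tokens: Iterable[str], state: Dict[int, bool]) -> Dict[int, bool]:
    """Return a new level -> enabled mapping after applying the tokens.

    Assumption: tokens are processed left to right.
    """
    new_state = dict(state)
    for token in tokens:
        enable = not token.startswith("!")
        name = token[1:] if token.startswith("!") else token
        if name == "*":
            targets = list(new_state)
        elif name in LABEL_TO_LEVEL:
            targets = [LABEL_TO_LEVEL[name]]
        else:
            targets = [int(name)]
        for level in targets:
            if level not in new_state:
                raise KeyError(f"unknown level: {level}")
            new_state[level] = enable
    return new_state


if __name__ == "__main__":
    defaults = {1: False, 2: True, 3: True, 4: True, 5: True, 6: True, 10: True}
    print(apply_selinit(["!*", "10"], defaults))
```

With the hypothetical defaults shown, "!* 10" leaves only level 10 enabled, matching the comment in the first example line.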
7.5.5 Cleaning up molecules (REINIT)
Syntax: REINIT
Description: The structure manipulation command TRANSFORM allows complex
manipulations of loaded structures. Because TRANSFORM changes only parts of the
ligand and does not always check the adjustment of atom, torsion and interaction
types, the molecule can end up in a more or less undefined state. If
several TRANSFORM commands were applied to a ligand, use REINIT to clean up
the structure. This uses the rules in transform.dat to bring the molecule into a
defined state. The levels of transformation can be set beforehand with SELINIT. To
recalculate the 3D structure, use MINIMIZE.
Syntax: MINIMIZE
Description: FlexX has a built-in Tripos force field minimization algorithm that
can be applied with this command. This is useful if atom and bond types have been
modified and the molecule no longer has the correct geometry.
7.5.6 Outputting the most important information about a ligand (INFO/MOLINF)
Syntax: INFO <table format>
Description: Displays the main characteristics of a ligand, such as the potential
number of conformations and the number of interactions, on the screen. Three
error flags indicate whether problems have occurred during initialization of the
ligand data structure. (<rc_gen> error) indicates that the ring conformation program
has had problems in generating conformations for a ring system of the molecule. A
(geom error) occurs if the geometry of the ligand looks strange (try an energy
minimization in this case). A (conf error) occurs if no conformation can be constructed
(again, try an energy minimization before loading the ligand). The INFO command
is extremely useful for summarizing errors in a ligand data set. If <table format>
is set to "y", the result is output in a table; otherwise it is output on one line.
If a base fragment is selected (DOCKING/SELBAS) and reference coordinates
are defined (LIGAND/READREF, LIGAND/MAPREF, LIGAND/SETREF), RMSDs
between the reference coordinates and the most similar conformation of
the ligand in FlexX's discrete conformational space are computed (see also
LIGAND/CONFORM/MINCONF).
Syntax: MOLINF <keywords>
Description: Displays some detailed information about the ligand, such as
information about rings, atoms, torsion angles, logp and contact type assignment.
<keywords> can be one or more of atoms, rings, torsion, logp, contact, misc, or the first
letter of a keyword.
<(a)toms> shows a table containing the most important information about all
atoms.
<(r)ings> information about the ring systems in the ligand.
<(t)orsion> information about the assigned torsion angles from the torsion
database (see section 11.12).
<(c)ontact> information about the assigned interactions from the contact database
(see section 11.11).
<(l)ogp> information about the assigned logp and refractivity parameters from
the logp database (see section 11.21).
<(m)isc> miscellaneous information such as the molecular mass or a SMILES
representation of the ligand.
Requirements: A ligand must be loaded.
7.5.7 Ligand manipulation (TRANSFORM)
Syntax: TRANSFORM <match pattern> <transform pattern>
Description: Applies a transformation rule, similar to Daylight SMIRKS, to the
currently loaded ligand. The match pattern specifies a subgraph description in
SMARTS™, and the transformation pattern is applied to the matched substructure
and allows atomic attributes, bond types etc. to be changed. Transformations do
not take care of changes in atom types or geometries, so it may be necessary to call
MINIMIZE and/or REINIT after complex transformations. We recommend that you
place transformation rules in transform.dat and apply them during ligand
initialization.
Note: The transform rules are processed from left to right. Thus, the order of
occurrence of the labels has to be identical on both sides of the rule, e.g.
[C:1](=O)[C:2] >> [C:1](=O).[C:2] # correct order
[C:1](=O)[C:2] >> [C:2].[C:1](=O) # wrong order
The rules have to be defined in such a way that the bonds are cut first and additional
atoms / linkers are added afterwards:
[C:1][N:2](-[C:3])[C:4] >> \
[C:1](-[1*]).[N:2](.[C:3](-[1*]))(-[2*])(-[2*])(-[2*]).[C:4](-[1*])
                   ^             ^
               cut bond     add linkers  => correct
[C:1][N:2](-[C:3])[C:4] >> \
[C:1](-[1*]).[N:2](-[2*])(-[2*])(-[2*]).[C:3](-[1*]).[C:4](-[1*])
                  ^                    ^
             add linkers           cut bond  => wrong
Furthermore, if a rule should match more than once, make sure that the SMARTS™
patterns do not overlap. For more details see section 11.16.
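The label-order requirement for transform rules can be checked mechanically before a rule is used. The Python sketch below compares the order of the atom labels (":1", ":2", ...) on both sides of ">>". It assumes that every label occurs on both sides and it does not check the cut-before-add ordering of linkers or any SMARTS syntax. The function name labels_in_same_order is hypothetical and not part of FlexX.

```python
import re

# Matches the numeric label in atom expressions such as "[C:1]".
_LABEL = re.compile(r":(\d+)\]")


def labels_in_same_order(rule: str) -> bool:
    """Return True if atom labels occur in the same order on both sides."""
    if ">>" not in rule:
        raise ValueError("rule must contain '>>'")
    match_side, transform_side = rule.split(">>", 1)
    return _LABEL.findall(match_side) == _LABEL.findall(transform_side)


if __name__ == "__main__":
    print(labels_in_same_order("[C:1](=O)[C:2] >> [C:1](=O).[C:2]"))
    print(labels_in_same_order("[C:1](=O)[C:2] >> [C:2].[C:1](=O)"))
```

Linker atoms such as "[1*]" carry no ":n" label and are therefore ignored by this check.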
7.5.8 Checking SMARTS™ patterns and subgraph occurrence (SMARTS)
Syntax: SMARTS <smarts pattern>
Description: Checks whether a substructure defined by a given SMARTS™
pattern can be found in a ligand or not. The matched atoms are output to the screen.
Additionally, the batch variable $(SMARTS_MATCH) contains the number of
occurrences of the substructure or <no_match> if the substructure was not found. Set the
VERBOSITY to 10 to get more information about the substructure generated from
the SMARTS™ pattern.
7.5.9 Writing (WRITE)
Syntax: WRITE <filename> <append> <multi file> <dock soln selection>
[<write particles>] <pvm merge> [<append $(PVM_ID)>]
Description: Writes a set of ligand placements to the file <filename>. The default
format is MOL2 [32]. Alternative formats can be selected by the appropriate
filename extension. Possible formats are MDL SD [31] (extension .sdf), MOL [32]
(extension .mol) or PDB [1] (extension .pdb). If <append> is set to "y", the molecule
is appended to any previously existing file. Otherwise any previously existing file
will be overwritten. If <multi file> is set to "y", all placements are written in one
file. Otherwise a set of files with filenames <filename>_<dock soln number> is
generated.
<dock soln selection> specifies the dock entries to write. It can be either a single
number, a list of numbers separated by blanks or ",", a list of intervals of the form
"a-b", or "all". Note that you must enclose the expression in quotation marks if it
contains blanks. If set to "q", the result list from the last query command (submenu
DOCKING, commands QUERY, LISTSOL, LISTRMS, etc.) is used. This option is
only shown if a former query exists. If dock entries are present, the output would
be <0-X>, where X is the number of the last docked solution. If there are no docked
solutions present, there is only the choice of the fixed ligand: <0-0>. If <filename>
is a mol2 file, several score values for each dock entry are printed as a comment line
(FLEXX_SCORE). If <filename> is an sdf file, the score values for each
dock entry are printed as data blocks in the sdf file.
Note that if you are writing to one file, the order of the dock entries is ascending
and does not correspond to the order of the entries in the entered selection. Only if
the last query result list is used are the placements written in the same order as in
the query list.
The default directory for this command is the path specified in the PREDICT entry.
The read-in (fixed) coordinates can be written by entering 0 for the <dock soln
selection>. This option can be used to extract a single molecule from a
combinatorial library and write it into a file (module CDOCK must be available). Note,
however, that the coordinates are not necessarily free of self-overlaps.
If FlexX is in place-particle mode (see the corresponding flag in section 10.1.4) and
<write particles> is answered yes, the locations of particles in a docking entry are
written to the mol2 file in the form of additional atoms. The name of the atom is
constructed from the particle type name and a unique internal particle number. The
atom type is specified in the AMINO static data file.
In FlexX-PVM, output files created by WRITE are automatically merged after
parallel script execution if the parameters <append> and <multi file> are set to yes.
This feature can be switched off by setting <pvm merge> to no.
The default value of <pvm merge> depends on the parameters <append>
and <multi file>. If both parameters are set to yes, then the default answer of
<pvm merge> is yes, too. Otherwise the default answer is no. The feature of
merging files can be switched on by setting <pvm merge> to yes.
There is one configuration in which FlexX-PVM cannot merge the written files:
<multi file> is set to no
<dock soln selection> contains at least two docking solutions
<pvm merge> is set to yes
Each PVM slave process writes a set of files with filenames
<filename>_<dock soln number>_$(PVM_ID). The PVM master process tries to merge
the files with filenames <filename>_$(PVM_ID). Therefore no files will be merged.
If <pvm merge> is set to no, you will be asked whether to append the built-in batch
variable $(PVM_ID) to the filename: <append $(PVM_ID)>. The batch variable
is automatically added to files that will be merged after the parallel script
execution.
Important note: For filename usage and file merging within scripts, please refer to
the PVM section on page 147.
7.5.10 Deleting (DELETE)
Syntax: DELETE
Description: Removes a ligand from FlexX's workspace. All data associated with
the ligand, such as conformational sets and placements, is removed automatically
too.
7.5.11 Reading reference coordinates (READREF)
Syntax: READREF <filename> <ignore hydrogen>
Description: To compare ligand placements proposed by FlexX with other
placements or reference coordinates (computation of RMSDs), a reference coordinate set
can be loaded with the command READREF from the file <filename>. The file must
be in mol2 file format [32] and the numbering of the atoms must be the same as in
the ligand file loaded with the READ command. The READREF command can only
be performed after the READ command. Finally, if <ignore hydrogen> is answered
yes, hydrogen atoms are ignored during loading.
Note that READREF can be executed before as well as after a docking computation.
If it is executed after a docking computation, the RMSD will be automatically
recomputed.
Requirements: READ must be performed first.
7.5.12 Assigning reference coordinates by subgraph matching (MAPREF)
Syntax: MAPREF <filename> <bond check> <atom check> <ignore hydrogen>
Description: The mol2 file <filename> is loaded and reference coordinates are
assigned on the basis of a subgraph matching. The matching process can be controlled
with two flags: if <bond check> is answered yes, the matching algorithm enforces
exact matching of bond types. Otherwise, bond types are ignored. If <atom check>
is answered yes, exact matching of SYBYL atom types is required; otherwise only
the element must match. Finally, if <ignore hydrogen> is answered yes, hydrogen
atoms are ignored during loading.
Executing MAPREF has two effects. During execution, the reference molecule is
mapped to the previously loaded molecule and reference coordinates are assigned.
If multiple matchings are found, the first matching found is used. The subgraph
together with the coordinates is then stored internally and used further during
base selection and placement (see SELBAS and PLACEBAS commands).
<filename> can be a multiple mol2 file. In this case, each molecule instance is
used to form a coordinate set. It is required, however, that the molecules themselves
(atom types, bond types, atom ordering, etc.) are identical. For the assignment
of reference coordinates, only the first molecule contained in the file is used. In
PLACEBAS, however, the manual placement is performed for each coordinate set
loaded.
Requirements: READ must be performed first.
Important notes: In the mapping process, only mappings with compatible
stereochemistry at 4-bonded atoms are created.
7.5.13 Setting reference coordinates (SETREF)
Syntax: SETREF <ignore hydrogen> [<placement ID>]
Description: If the docking predictions are to be compared with the coordinates
in the ligand file loaded with the READ command, the reference coordinates can
be set with the SETREF command. With the parameter <ignore hydrogen> you can
decide whether the hydrogen atoms should be taken into account in the comparison
or not.
If placements are already computed, SETREF can be used to compare the
placements with one specific placement by setting <placement ID> to the number of that
placement.
Requirements: READ must be performed first.
7.5.14 Calculating the ligand's solvent-accessible surface (SAS)
Syntax: SAS
Description: Calculates the ligand's SAS (solvent-accessible surface) with an
internal approximation algorithm and outputs it on the screen.
7.5.15 Selecting admin settings for drawing the ligand (SELADM)
Syntax: SELADM <ref. coords> <graphics object number> [<start fifo object>]
[<end fifo object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing ligands, and you can determine whether the graphics files are internal
temporary files used only by FlexX or saved for further use. For yes/no questions you
can enter either "y", "yes" or "1" for yes, and similarly "n", "no" or "0" for no.
<ref. coords> Yes/no answer:
    yes The following modifications concern the settings for drawing the ligand
        with the reference coordinates.
    no  The following modifications concern the settings for drawing the ligand
        with the input coordinates.
<graphics object number> Enter integer:
    (1-255) The graphics created with the DRAW command will be displayed in
        graphics object <graphics object number>.
    0   fifo mode: the graphics generated by subsequent DRAW commands will be
        sent to a range of graphics objects in a first-drawn-first-overwrite manner.
        You will be asked to enter two more parameters:
        <start fifo object> The start graphics object for the fifo range.
        <end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
    yes The graphics are written to temporary files and removed after quitting
        FlexX.
    no  The graphics are written to permanent files chosen by the user. You will be
        asked for a filename at the end of each DRAW command (see DRAW below
        for an example).
<append> Yes/no answer:
    yes Previous graphics files are not overwritten. Instead the current graphics
        are appended to the previous ones in the graphics file.
    no  The previous graphics file will be overwritten and all previous graphics
        made with the DRAW command in this menu will be lost.
7.5.16 Selecting graphics settings for drawing the ligand (SELGRA)
Syntax: SELGRA <ref. coords> <mol display mode> <hydro> <interact
geoms> <all contact types> [<contact type selection>] <all components>
[<component selection>] <surf>
Description: With SELGRA you can set specific default values for drawing ligands.
For yes/no questions you can enter either "y", "yes" or "1" for yes, and similarly "n",
"no" or "0" for no.
<ref. coords> Yes/no answer:
    yes The following modifications concern the settings for drawing the ligand
        with the reference coordinates.
    no  The following modifications concern the settings for drawing the ligand
        with the input coordinates.
<mol display mode> Specifies the drawing display mode for the molecule. Selection:
    1 Lines
    2 Sticks
    3 Balls & sticks
    4 Space-filled spheres
    (Aside: the default appearance of ligands drawn with DRAW in the DOCKING
    menu is sticks, independent of the setting here; this helps with visualization
    of the docking pose against the protein active site. Use the DRAW command
    here in the LIGAND menu (see below) to overcome this default setting.)
<hydro> Specifies whether and how hydrogens should be drawn on the ligand.
    Selection:
    0 Do not draw hydrogens.
    1 Draw all hydrogens.
    2 Draw only hydrogens bonded to hetero atoms (non-carbon atoms).
<interact geoms> Yes/no answer:
    yes Interaction geometries (interaction surfaces around potential interacting
        groups) are drawn.
    no  No interaction geometries are drawn.
<all contact types> Yes/no answer:
    yes Interaction geometries for all contact types (interaction types) are drawn
        if <interact geoms> is set to yes.
    no  Interaction geometries are drawn for a selection of contact types
        (interaction types). You will be asked to select types from a given list:
        <contact type selection> Choose a list of types represented by integers.
        Enter the list as separate integers or as integer ranges (format "a-b")
        separated by "," or blanks. Note that you need to enclose the expression
        in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9".
<all components> Yes/no answer:
    yes All components are drawn, i.e. the complete ligand is drawn.
        component: On loading a ligand, FlexX identifies all rotatable bonds
        (according to its internal definition of rotatable bonds) and splits the ligand
        into components at these bonds. Later, the components will be used to
        decide on a base fragment and will form the building blocks for the
        incremental reconstruction of the ligand in the active site.
    no  Only components from a selected list are drawn. You will be asked to select
        components from a given list:
        <component selection> Choose a list of components represented by
        integers. Enter the list as separate integers or as integer ranges (format
        "a-b") separated by "," or blanks. Note that you need to enclose the
        expression in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9".
<surf> Determines surface drawing:
    0 Draw no surface.
    1 Draw the molecular surface: If FlexV is used to visualize, the Connolly
      surface is drawn. Otherwise only concave patches are drawn as triangles.
    Note: Hydrogens are not considered when drawing the surface; even if hydrogens
    are to be drawn (<hydro>), they are ignored when drawing the surface.
Important notes: The Connolly surface is rendered by its analytically calculated
patches. This enables selection of the level of curvature approximation but makes
the rendering much more complicated. Therefore a few percent of the patches are
rendered incorrectly (we will try to reduce this rate). In addition, there is currently
only pairwise cusp trimming.
7.5.17 Selecting colors for drawing the ligand (SELCOL)
Syntax: SELCOL <ref. coords> <molecule color mode> <interact geoms color
mode> <surface color mode>
Description: With SELCOL you can set the color modes for the molecule,
interaction geometries and molecular surface. For each of these, a selection of color modes
is available:
<ref. coords> Yes/no answer:
    yes The following modifications concern the settings for drawing the ligand
        with the reference coordinates.
    no  The following modifications concern the settings for drawing the ligand
        with the input coordinates.
<molecule color mode> Choose the color mode for drawing the molecule. Color
    mode selection:
    INVISIBLE
    ATOM
    UNIQUE
    FRAGMENT
    ENERGY
<interact geoms color mode> Choose the color mode for drawing the
    interaction geometries if they are to be drawn. The interaction geometries consist
    of patches or surfaces that indicate the positions of interacting groups in the
    molecule. Color mode selection:
    INVISIBLE
    UNIQUE
    CONTACT
<surface color mode> Choose the color mode for coloring the molecular surface
    if it is to be drawn. Color mode selection:
    INVISIBLE
    UNIQUE
    SURF_ATOM
    CEN_DIST
    SURFPATCH
The possible color modes are explained below; for some color modes you are also
asked to enter some defining colors. Enter your chosen color as either an angle from
the color circle (0-360 degrees: 0 is invisible, 1-360 runs from dark blue through
red, yellow and green to blue), a color name (as defined in the GRAPHIC static data file),
or an RGB(A) value, i.e. 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
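The three ways of writing a color can be told apart mechanically. The following Python sketch is an illustration only, not part of FlexX: the function name parse_color is hypothetical, and it assumes that angles are given as whole numbers and that quotation marks merely group blank-separated values.

```python
def parse_color(spec):
    """Classify a color specification as an angle, an RGB(A) value or a name.

    Returns ("angle", int), ("rgb", tuple of floats) or ("name", str).
    """
    text = spec.strip().strip('"')
    parts = text.replace("/", " ").split()
    # RGB or RGBA: three or four floating-point numbers.
    if len(parts) in (3, 4):
        try:
            return ("rgb", tuple(float(p) for p in parts))
        except ValueError:
            pass  # not numeric, so treat it as a multi-word color name
    # Angle on the color circle: a single whole number from 0 to 360.
    if len(parts) == 1 and parts[0].isdigit() and 0 <= int(parts[0]) <= 360:
        return ("angle", int(parts[0]))
    # Anything else is taken to be a color name.
    return ("name", text)
```

With this sketch, "dark green" and green are classified as names, 0.0/0.8/0.1 as an RGB value and 220 as an angle.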
INVISIBLE The item drawn will be invisible.
ATOM The ligand will be colored according to the element types of the atoms. Each
atom is drawn in the color defined for its element type in the static data file
GRAPHIC, while each bond is drawn half and half in the colors of its two
atoms.
UNIQUE The object will be drawn in one user-defined color. You will be asked to
choose the unique color:
<color> Enter your chosen color.
FRAGMENT Color the ligand according to its fragments. You are required to
choose the fragmentation scheme and three colors for this color mode:
<fragmentation> Integer selection: enter the number that represents your
chosen fragmentation scheme.
<base color> Enter a color for the base fragment.
<first color> The remaining fragments will be alternately colored with two
colors: enter the first color here ...
<second color> ... and the second color here.
fragment: When the SELBAS command is called in the DOCKING menu, FlexX
calculates several fragmentation schemes for the ligand. Each scheme contains
a base fragment, which will be the first fragment of the ligand to be placed in
the active site, while the remainder of the ligand is split into further fragments,
which will be successively added to the base fragment during the incremental
reconstruction of the ligand in the active site. The fragments are based on the
components FlexX defined when the ligand was loaded (see, for example, the
SELGRA command for an explanation of component).
ENERGY Draw the ligand conformation in a color representative of its docking
solution score (energy). A color rainbow will be defined between two given
colors across the range of two given docking scores. You are required to enter:
<no. of intervals> Enter the number of intervals (integer) that the energy
range will be split into.
<min energy> Enter the minimum energy value (floating-point number) for
the start of the energy range. The default is the score of the best docking
solution.
<max energy> Enter the maximum energy value (floating-point number) for
the end of the energy range. The default is the score of the worst-scoring
docking solution.
<first color> Enter the first color of the color rainbow.
<second color> Enter the second (end) color of the color rainbow.
CONTACT The object will be drawn in a color representing its interaction (contact)
type. The colors for each type are defined in the GRAPHIC static data file.
SURF_ATOM Convex patches in the surface are colored by atom type (see color
mode ATOM). Any reentrant patches (i.e. saddle and concave patches) are
drawn in a user-defined color. You are required to enter the reentrant patch
color in this mode:
<reentrant color> Enter your chosen color.
CEN_DIST The surface is drawn in a rainbow of colors representing how far the
surface lies from the (geometric) center of the molecule. For this mode you are
required to enter the start and end colors of the rainbow plus the number of
intervals to be colored across the rainbow range:
<no of intervals> The number of intervals into which the range will be split.
<first color> Start color for the rainbow.
<second color> End color for the rainbow.
SURFPATCH The surface is colored according to the surface patch type. You are
required to enter colors for the various patch types in this mode:
<concave color> Enter your chosen color for concave patches.
<saddle color> Enter your chosen color for saddle patches.
<convex color> Enter your chosen color for convex patches.
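The ENERGY and CEN_DIST modes both split a value range into a number of intervals and assign one rainbow color per interval. How FlexX interpolates between the two colors is not described here; the Python sketch below simply assumes linear interpolation of the color-circle angle, and the function names are hypothetical.

```python
def interval_index(value, vmin, vmax, n_intervals):
    """Return the 0-based interval that value falls into, clamped to the range."""
    if n_intervals < 1 or vmax <= vmin:
        raise ValueError("need n_intervals >= 1 and vmax > vmin")
    fraction = (value - vmin) / (vmax - vmin)
    index = int(fraction * n_intervals)
    # Values outside [vmin, vmax] are mapped to the first or last interval.
    return max(0, min(n_intervals - 1, index))


def interval_color(value, vmin, vmax, n_intervals, first_angle, second_angle):
    """Assumed scheme: one color-circle angle per interval, linearly spaced."""
    index = interval_index(value, vmin, vmax, n_intervals)
    if n_intervals == 1:
        return float(first_angle)
    step = (second_angle - first_angle) / (n_intervals - 1)
    return first_angle + index * step
```

For example, with five intervals between the scores -30 and -10 and the colors 120 and 240, a score of -30 receives the angle 120 and a score of -10 the angle 240.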
7.5.18 Selecting labels for drawing the ligand (SELLAB)
Syntax: SELLAB <ref. coords> <atom name> <infile number> <SYBYL type>
<fragment number> <formal charge> <partial charge>
Description: When the ligand is drawn, FlexX stores information in labels for dis-
play in the graphic interface. You can choose what should appear in the label using
the SELLAB command. For yes/no questions you can enter either y, yes or 1 for
yes, and similarly n, no or 0 for no.
<ref. coords> Yes/no answer:
yes The following modifications concern the settings for drawing the ligand
with the reference coordinates.
no The following modifications concern the settings for drawing the ligand
with the input coordinates.
<atom name> Yes/no answer:
yes Include the atom name as taken from the input file in the label for atoms.
no Atom names will not be included in the label for atoms.
<infile number> Yes/no answer:
yes Include the number of the atom as taken from the input file in the label for
atoms.
no Infile numbers will not be included in the label for atoms.
<sybyl type> Yes/no answer:
yes Include the SYBYL atom types in the label for atoms.
no SYBYL atom types will not be included in the label for atoms.
<fragment number> Yes/no answer:
yes Include the fragment number in the label for atoms.
no Fragment numbers will not be included in the label for atoms.
Note: For an explanation of fragments see, for example, the LIGAND/SELCOL
command.
<formal charge> Yes/no answer:
yes Include the formal charges on the atoms in the label for atoms.
no Formal charges will not be included in the label for atoms.
<partial charge> Yes/no answer:
yes Include the partial charges on the atoms in the label for atoms.
no Partial charges will not be included in the label for atoms.
7.5.19 Drawing the ligand (DRAW)
Syntax: DRAW <coordinate set> [<rms_limit>] [<rank_limit>] [<filename>]
Description: DRAW generates a drawing of the ligand and sends it to a file ready to
be displayed in the graphics interface. For details about what exactly is drawn see
the SELGRA command.
<coordinate set> Several coordinate sets are stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked between position 1 and
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest-ranking
solution with an RMS deviation lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
Note: the reference coordinate set has its own graphics settings.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: Drawings are not displayed automatically. Use DISPLAY to out-
put the drawing on the graphics device.
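The selection rule of the rms coordinate set can be illustrated with a short Python sketch. It is not FlexX code: the function name pick_solution and the representation of docking solutions as (rank, RMS deviation) pairs are assumptions made for this example.

```python
def pick_solution(solutions, rms_limit, rank_limit):
    """Return the rank of the best-ranked solution that satisfies both limits.

    solutions is a list of (rank, rmsd) pairs with ranks starting at 1.
    Returns None if no solution qualifies.
    """
    for rank, rmsd in sorted(solutions):
        if rank > rank_limit:
            break  # all remaining solutions are ranked too low
        if rmsd < rms_limit:
            return rank
    return None
```

With the limits 1.0 and 10 from the example above, a list in which rank 1 has an RMS deviation of 2.3 and rank 2 has 0.8 yields rank 2.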
7.5.20 Drawing the ligand at multiple positions (MDRAW)
Syntax: MDRAW <placement selection> [<mul draw directory> <filename>]
Description: Generates multiple drawings of the ligand using coordinate sets
taken from a selection of placements (docking solutions).
<placement selection> Enter your selection as a list of integers or integer ranges
(format a-b) separated by commas or blanks. The integers are the ranks of the
docking solutions you want to draw.
[<multiple draw directory> <filename>] If the graphics are not to be stored in
temporary files (see SELADM), enter a directory for containing the graphics files
and a base filename for the graphics files here.
The graphics are appended one after the other within the selected graphics object
and appear in FlexV attached to a slider. You can scroll through the different draw-
ings by moving this slider (see FlexV manual).
Important notes: Drawings are not displayed automatically. Use DISPLAY to
output the drawings to FlexV.
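A placement selection is a list of ranks and rank ranges. The Python sketch below shows one way such a list could be expanded. It assumes that a range is written with a hyphen between its two ends (for example 7-9); the function name parse_selection is hypothetical and the actual FlexX parser may behave differently.

```python
import re


def parse_selection(text):
    """Expand a selection such as "1, 2, 4, 7-9" into a list of integers."""
    # Allow blanks around the range separator, e.g. "7 - 9".
    text = re.sub(r"\s*-\s*", "-", text)
    ranks = []
    for token in text.replace(",", " ").split():
        if "-" in token:
            start, end = (int(part) for part in token.split("-", 1))
            ranks.extend(range(start, end + 1))
        else:
            ranks.append(int(token))
    return ranks
```

Under these assumptions, the selection "1, 2, 4, 7-9" expands to the ranks 1, 2, 4, 7, 8 and 9.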
7.5.21 Listing the graphic items (GRAINF)
Syntax: GRAINF <ref. coords>
Description: Outputs a list of all current graphic settings for the ligand (either for
reference coordinates or for the read-in coordinates and placements).
7.5.22 Minimizing the ligand coordinates (MINIMIZE)
Syntax: MINIMIZE
Description: Minimizes the ligand fix coordinates, first using a Nelder-Mead
simplex approach with at most 200 steps and afterwards a BFGS minimization until
convergence or until 1000 steps are reached.
Important notes: This function is still in a highly experimental state.
7.5.23 *Working with ligand conformations (LIGAND/CONFORM submenu)
There are some commands for analyzing the conformational set of a ligand. They can be
useful for understanding differences between FlexX's predictions and X-ray data.
Finding the conformation with minimal RMSD to the reference conformation or
minimal energy (MINCONF)
Syntax: MINCONF <fragmentation> <by rms>
Description: With the MINCONF command, you can search the discrete
conformational set of the ligand for the conformation most similar to the
reference coordinates, or for a conformation with minimal energy. The
conformational set is defined by the ring conformations produced by the program SCA
and by discrete torsion angles assigned in FlexX from the torsion database
(torsion_standard.dat/torsion_fine.dat). With <fragmentation> you select the
fragmentation used for the minimization.
Requirements: SELBAS must be performed first.
Important notes: The algorithm used here is heuristic, so there is no guarantee that
the proposed conformation is really the one with the lowest RMSD from the reference
coordinates or with the lowest energy.
Writing one specific conformation (WRITONE)
Syntax: WRITONE <fragmentation> <superpose> <conf. string> <filename>
Description: Writes one specific conformation to a file. The conformation is
specified by the conformation string, which defines the conformation of each
fragment sequentially. Because this description is based on the internal representation
of conformations, you must take the conformation string from FlexX's output, for
example from the command MINCONF or from the placement tables (see section
7.8.13). If <superpose> is set to y, the defined conformation is superposed on the
reference coordinates. You can also write out partial ligand conformations by
terminating the conformation string with -1. The default directory for the output is
PREDICT.
Requirements: SELBAS must be performed first. If <superpose> is set to y,
reference coordinates must be loaded (READREF).
Writing a set of random conformations (WRITRAND)
Syntax: WRITRAND <filename> <number of conf.>
Description: Writes a set of approximately <number of conf.> randomly selected
conformations to a set of mol2 files <filename>. The default directory for the
output is PREDICT.
Requirements: SELBAS must be performed first.
Important notes: The actual number of written conformations can differ from
<number of conf.> if internal clashes occur.
Calculating the RMSD from read-in to reference coordinates (FIXRMSD)
Syntax: FIXRMSD
Description: Calculates the fixed-order and variable-order RMSD from the read-in
(fix) to the reference (ref) coordinates.
Requirements: A ligand (LIGAND/READ) and reference coordinates
(LIGAND/READREF) must be loaded.
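As an illustration of the fixed-order case, the following Python sketch computes an RMSD between two coordinate sets in which atom i of one set corresponds to atom i of the other. It assumes that no superposition is applied beforehand, it does not cover the variable-order variant, and the function name fixed_order_rmsd is hypothetical.

```python
import math


def fixed_order_rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, matched by index."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

Two identical coordinate sets give 0.0, and a set shifted by one unit along a single axis gives 1.0.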
Calculate an enantiomer (FLIPSTER)
Syntax: FLIPSTER <filename>
Description: Calculates an enantiomer of the currently loaded molecule by
flipping acyclic stereo centers. Only stereo centers which are permitted to change
according to the flag STEREO_MODE are flipped.
The resulting enantiomer is written to the file <filename>. The format (mol, mol2,
or pdb) is selected by the extension of <filename>. The default directory for this
command is the path specified in the environment variable PREDICT.
Requirements: A ligand (LIGAND/READ) must be loaded.
Important notes: Stereo centers in ring systems are not modified.
Internal clash test (CLASH)
Syntax: CLASH <clash factor> [<fragmentation>]
Description: Performs an internal clash test on the read-in (fix) coordinates for a
given <clash factor>. If base fragments have been determined with LIGAND/SELBAS,
you have to specify the corresponding fragmentation, <fragmentation>. Otherwise
a temporary fragmentation is generated. The internal clash test is performed
between atoms of different fragments. You therefore obtain the best results if
you have not used SELBAS beforehand.
Requirements: A ligand (LIGAND/READ) must be loaded.
7.6 Working with proteins (receptors) (RECEPTOR submenu)
7.6.1 Reading (READ)
Syntax: READ <filename> [<selection radius> <complete>]
Description: Reads the receptor from the file <filename>, which may be a receptor
description (RDF) file, a PDB file or a MOL2 file.
In the first case, the file must be in the FlexX-specific RDF file format, which is
explained in section 11.3. It must have the extension .rdf. In the second case, the
filename extension is .pdb and a generic RDF file is loaded in addition to the PDB
file. In the third case, the filename must have the extension .mol2 and no RDF file
is read; see section 6.5.7 for further details.
The rules explained at the beginning of section 11 apply to the filename. The
default directory for this command is the path specified in the environment variable
RECEPTOR.
If no active site file is defined in the RDF or MOL2 file, the active site can be
defined interactively by <selection radius>. In this case a ligand must be read
(LIGAND/READ) and reference coordinates must be defined (READREF, SETREF,
MAPREF) first. Then all protein atoms which are closer than <selection radius>
to a ligand atom at its reference position are taken to be the set of active site
atoms. If <complete> is answered yes, the selection is extended to complete amino
acids.
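The radius-based selection described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FlexX implementation: the function name select_active_site and the representation of protein atoms as dictionaries with the keys id, residue and xyz are invented for this example.

```python
import math


def select_active_site(protein_atoms, ligand_coords, radius, complete=False):
    """Return the ids of protein atoms closer than radius to any ligand atom.

    protein_atoms is a list of dicts with the keys "id", "residue" and "xyz".
    If complete is True, the selection is extended to whole residues.
    """
    selected = {
        atom["id"]
        for atom in protein_atoms
        if any(math.dist(atom["xyz"], lig) < radius for lig in ligand_coords)
    }
    if complete:
        # Extend the selection to every atom of each touched residue.
        residues = {a["residue"] for a in protein_atoms if a["id"] in selected}
        selected = {a["id"] for a in protein_atoms if a["residue"] in residues}
    return selected
```

Extending the selection to complete amino acids only adds atoms, so the completed selection always contains the radius-based one.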
The following operations are initiated in this order:
loading the .rdf file (not for a MOL2 receptor)
loading the PDB or MOL2 file
selecting or loading the active site atom selection
computing or loading the surface atom selection
adding polar hydrogens to the active site (not for a MOL2 receptor)
assigning interaction types and geometries to the active site
Important notes: One available docking dataset provided in MOL2 format is the
reference dataset for the GOLD docking program, the Astex dataset. It can be
downloaded at http://www.ccdc.cam.ac.uk/products/life_sciences/validate/astex.
Adapted rdf files for usage within FlexX can be obtained from
BioSolveIT.
7.6.2 Summarizing PDB contents (PDBINFO)
Syntax: PDBINFO <pdb-filename>
Description: The command reads the given pdb file and tries to find out what
components (peptide chains, ligands, ions) are in the protein structure. The
components are represented by patterns that can be used to read them in as a
ligand with LIGAND/FROMPDB.
Example
LEADIT/RECEPTOR> pdbinfo 1dwd
>> Reading PDB file 1dwd
>> structure 1dwd contains 2445 atoms in 343 residues.
1dwd LIGAND HOH-*-55-*, # (Unnamed (2 atoms))
1dwd LIGAND HOH-*-53-*, # (Unnamed (2 atoms))
1dwd LIGAND HOH-*-50-*, # (Unnamed (2 atoms))
1dwd LIGAND HOH-*-48-*, # (Unnamed (2 atoms))
1dwd LIGAND MID-*-1-*, # (NAPAP - SEE REMARK 13. 27 H31 N5 O4 S1)
1dwd PEPTIDE *-*-*-I # (Residues: 11)
1dwd PEPTIDE *-*-*-H # (Residues: 258)
1dwd PEPTIDE *-*-*-L # (Residues: 29)
7.6.3 ACTIVE
Syntax: ACTIVE <mode> [<radius>] [<file> [<filename>] [<x_coord>
<y_coord> <z_coord> <radius>] ] <complete_aa> <overwrite> [
<draw_sphere> ]
Description: Selects the atoms that belong to the active site.
If <mode> is set to 0, the active site will be defined by the reference ligand. All
protein atoms which are closer than <radius> to an atom of the reference ligand
are taken to be the set of active site atoms.
If <mode> is set to 1, the active site will be defined by a sphere. All protein atoms
which lie within the sphere are taken to be the set of active site atoms. If <file> is set
to y, the center and the radius of the sphere will be taken from the file <filename>.
FlexX searches for the keywords @radius and @origin in the file (see the example).
Example
@radius = 10.0
@origin = 1.25 -5.89 10.1
Otherwise, if <file> is set to n, FlexX will ask for the center (<x_coord>,
<y_coord>, <z_coord>) and the radius (<radius>) of the sphere. To draw the
sphere (only in mode 1), set <draw_sphere> to y. Then FlexV will be started and
the sphere will be drawn.
If <complete_aa> is set to y, the selection is extended to complete amino acids.
The new active site selection is stored internally in FlexX. If <overwrite> is set to
y, the internal selection will be overwritten. Otherwise a new pocket with the
above selection will be added to the internal representation.
Important notes: If the receptor was read from a MOL2 file or if it was
complemented by a molecule via an @hetero_files record of an RDF file (see
section 11.3.6), the RECEPTOR/ACTIVE command is currently not available.
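For mode 1, a sphere file with the @radius and @origin keywords, as in the example above, could be read and applied as in the following Python sketch. The function names are hypothetical, the parser accepts only the simple "keyword = value" layout shown in the example, and treating atoms exactly on the sphere boundary as inside is an assumption.

```python
import math


def parse_sphere_file(text):
    """Extract (origin, radius) from lines such as '@radius = 10.0'."""
    radius = None
    origin = None
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if not separator:
            continue  # skip lines without a keyword assignment
        key = key.strip()
        if key == "@radius":
            radius = float(value)
        elif key == "@origin":
            origin = tuple(float(v) for v in value.split())
    if radius is None or origin is None or len(origin) != 3:
        raise ValueError("file must define @radius and a 3-component @origin")
    return origin, radius


def atoms_in_sphere(atom_coords, origin, radius):
    """Return the indices of atoms whose distance to origin is at most radius."""
    return [i for i, xyz in enumerate(atom_coords)
            if math.dist(xyz, origin) <= radius]
```

Applied to the example file, parse_sphere_file returns the origin (1.25, -5.89, 10.1) and the radius 10.0.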
7.6.4 Printing site atom information (ATLIST)
Syntax: ATLIST
Description: The command prints a table that contains detailed information about
the selected binding site. This is especially interesting if hetero groups were in-
cluded and you want to check if interaction types, formal charges etc. were assigned
correctly. Explanation of shortcuts:
type - SYBYL atom type
q - atomic charge
Fq - formal charge
#H - number of attached hydrogens
#b - number of bonds
pol - is polar atom
srf - is exposed to surface (only these atoms have interaction types assigned)
ct - contact
cb - chain break
Interactions - interaction types assigned (defined in contact.dat)
(... ) - buried atoms
7.6.5 Writing (WRITE)
Syntax: WRITE <atom select> <filename> <hydrogens>
Description: Writes a protein to the file <filename>. You can save either the
active site atom selection (<atom select> equals a), the full protein (<atom
select> equals f), or the surface atom selection (<atom select> equals s).
Active site atom selections and the full protein are written in PDB or MOL2 format.
Surface atom selections are written in a FlexX-internal format with the extension
.sdf. Although this is an ASCII file, do not try to edit sdf files.
For active site and full protein files, if the suffix .mol2 is used, the file format is
MOL2. Otherwise the atoms are written in PDB format. Hydrogens are written if
<hydrogens> is answered yes. Note that only hydrogens contained in the original
input file or those added by FlexX to the active site are written. If an active site file
is to be referenced by an RDF file, hydrogens should not be included and the file
format must be PDB.
The default filenames are those defined in the rdf file. Thus, you can specify
filenames for the active site and the surface in the rdf file, let FlexX compute the
active site and surface atom selections and save them with the WRITE command.
If you specify the written files in your .rdf file, the surface atoms or active site
atoms are loaded the next time you load the protein. This is much faster than
recomputing the surface atoms each time you load the protein. The default directory
for this command is the path specified in the entry SITE or RECEPTOR, respectively.
Note: If you choose to write the full protein to a MOL2 file, and if the binding
site contains at least one non-hydrogen atom, the MOL2 file will contain an
@<TRIPOS>SET section listing all non-hydrogen atoms of the binding site. See
section 6.5.7 for further details.
Note: You cannot combine different <atom select> selections; only one type of
output can be written at a time.
Important note: The PDB file contains only the ATOM records. The format of the
surface files changed in FlexX version 1.4. Surface files written by earlier
FlexX versions can no longer be used.
Example
write a active_site
In the example, all atoms belonging to the active site are written to the file
active_site.pdb.
7.6.6 Deleting (DELETE)
Syntax: DELETE
Description: The DELETE command removes the protein from FlexX's workspace.
All data associated with the protein, such as the triangle hash table and docking
predictions, is removed automatically too.
7.6.7 Editing the receptor description file (EDIT)
Syntax: EDIT
Description: The EDIT command invokes the editor with the receptor description
file of the protein currently in FlexX's workspace.
7.6.8 Outputting the most important information about a receptor (INFO)
Syntax: INFO <table format>
Description: Displays the main characteristics of a receptor, such as the number
of atoms, number of atoms in the active site, number of interacting groups, etc. In
addition, a list of loaded amino acids is output. With respect to the verbosity level,
either all amino acids (level 5), only active site amino acids (level 3), or only non-
natural amino acids (level 2) are output.
If <table format> is set to y, the result will be output in a table, otherwise it is
output on one line. In the latter case, only non-natural amino acids are output.
7.6.9 *Building the receptor triangle hash table (TRIHASH)
Syntax: TRIHASH <pocket_id>
Description: For the second phase of the docking algorithm, a triangle hash table
must be generated. If the hash table is not available, it will be automatically gen-
erated by the docking algorithm. Manual usage of this command is therefore not
necessary under standard conditions.
If several subpockets are defined in the rdf file, the base placement phase can be
limited to a subpocket by specifying the pocket index <pocket_id>. Note that the
triangle hash table is valid until the receptor is deleted. Therefore, to change the
triangle hash table, TRIHASH must be called again (automatic generation of the
triangle hash table is not performed if a table was generated manually).
7.6.10 Forming a subpocket containing the deep part of the active site (DEEPSITE)
Syntax: DEEPSITE <min fraction> <min contact> <poc id>
Description: DEEPSITE identifies the more buried part of the active site and stores
it as a pocket with the pocket index <poc id>. The selected pocket is stored
internally in FlexX and not written to a file. Pockets must be used in conjunction
with the TRIHASH command. In this way pockets can be used to limit base
placements to certain parts of the active site. The more buried part of the active
site is identified as follows. For each surface atom of the active site, the number of
protein atoms within a radius of 10 Å is calculated; this is the contact number.
Then either the fraction <min fraction> of surface atoms with the largest contact
numbers is taken, or all atoms with at least <min contact> contacts, whichever
gives the larger set of atoms.
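The contact-number rule can be sketched in Python as follows. This is an illustration only: the function name deep_site is hypothetical, the radius of 10 is assumed to be in Å, the fraction is rounded up to a whole number of atoms, and an atom is counted as its own contact if it appears in both lists. None of these details is confirmed by the description above.

```python
import math


def deep_site(surface_atoms, protein_atoms, min_fraction, min_contact, radius=10.0):
    """Return the indices of surface atoms forming the more buried subpocket.

    surface_atoms and protein_atoms are lists of (x, y, z) tuples.
    """
    # Contact number: protein atoms within the radius of each surface atom.
    contacts = [
        sum(1 for p in protein_atoms if math.dist(s, p) <= radius)
        for s in surface_atoms
    ]
    # Candidate 1: the given fraction of atoms with the largest contact numbers.
    order = sorted(range(len(surface_atoms)), key=lambda i: contacts[i], reverse=True)
    n_top = math.ceil(min_fraction * len(surface_atoms))
    by_fraction = set(order[:n_top])
    # Candidate 2: all atoms with at least min_contact contacts.
    by_contact = {i for i, c in enumerate(contacts) if c >= min_contact}
    # Keep whichever candidate set is larger.
    return by_fraction if len(by_fraction) >= len(by_contact) else by_contact
```

A low <min contact> therefore tends to dominate the result, while a high value leaves the choice to <min fraction>.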
7.6.11 Calculating the protein's solvent-accessible surface (SAS)
Syntax: SAS
Description: Calculates the protein's SAS (solvent-accessible surface) with an
internal approximation algorithm and outputs it on the screen.
7.6.12 Selecting admin settings for drawing the receptor (SELADM)
Syntax: SELADM <graphics object number> [<start fifo object>] [<end fifo
object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing the receptor and you can determine whether the graphics files are internal
temporary files used only by FlexX or saved for further use. For yes/no questions
you can enter either y, yes or 1 for yes, and similarly n, no or 0 for no.
<graphics object number> Enter integer:
(1-255) The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 fifo mode: the graphics generated by subsequent DRAW commands will be
sent to a range of graphics objects in a first-drawn-first-overwritten manner.
You will be asked to enter two more parameters:
<start fifo object> The start graphics object for the fifo range.
<end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for an example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
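In fifo mode, successive DRAW commands cycle through a range of graphics objects, so that the object drawn first is the first to be overwritten once the range is used up. The Python sketch below models this behavior. The class name FifoObjects is hypothetical, and the valid object range of 1 to 255 is an assumption.

```python
class FifoObjects:
    """Hand out graphics object numbers in a first-drawn-first-overwritten cycle."""

    def __init__(self, start, end):
        if not (1 <= start <= end <= 255):
            raise ValueError("expected 1 <= start <= end <= 255")
        self.start = start
        self.end = end
        self._next = start

    def next_object(self):
        """Return the object number for the next drawing and advance the cycle."""
        current = self._next
        self._next = self.start if current == self.end else current + 1
        return current
```

With the range 3 to 5, five successive drawings go to the objects 3, 4, 5, 3 and 4, so the fourth drawing replaces the first.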
7.6.13 Selecting graphics settings for drawing the receptor (SELGRA)
Syntax: SELGRA <mol display mode> <hydro> <active site> <complete aa>
<aa selection> [<new aa selection>] <interact geoms> <ia points> <all contact
types> [<contact type selection>] <particles> <surf> <backbone>
Description: With SELGRA you can set specic default values for drawing recep-
tors. For yes/no questions you can enter either y, yes or 1 for yes, and similarly
n, no or 0 for no.
<mol display mode> Specifies the drawing display mode for the molecule. Selection:
1 Lines
2 Sticks
3 Balls & sticks
4 Space-filled spheres
<hydro> Specifies whether and how hydrogens should be drawn on the receptor.
Selection:
0 Do not draw hydrogens.
1 Draw all hydrogens.
2 Draw only hydrogens bonded to hetero atoms (non-carbon atoms).
<active site> Yes/no answer:
yes Draw only the active site and not the complete receptor.
no Draw the complete receptor.
<complete aa> Specifies how the amino acids should be drawn. Selection:
0 Do not draw complete amino acids: if just a few atoms of an
amino acid are included in the active site, only these atoms will be drawn.
1 Draw complete amino acids: if just a few atoms of an amino acid are
included in the active site, the complete amino acid will always be drawn.
2 BB mode: draw only the backbone atoms of the amino acids with atoms
included in the active site.
<aa selection> Yes/no answer:
yes Draw only the amino acids given in a selected list. You are required to
enter a list of amino acids:
<new aa selection> To define a selection you must enter
<aa> <aa no.> <chain> [<slot>]
for each amino acid to be drawn. For each of these parameters you
can also type the wildcard *. For example, trp * * * selects all
tryptophans. The individual amino acid selections must be separated with
;. For example * 57 * * ; * 227 * * selects amino acids no. 57 and 227.
As a shortcut you can also leave out all trailing wildcards. Thus, the
last example can also be written as * 57; * 227.
Aside: [<slot>] describes a selection option for the FlexE module (see
section 8.4). Slot 0 contains the united protein description and slots
1-30 contain the different ensemble member structures. If you are not
working with the FlexE module you can just choose the wildcard
character for <slot> or leave it out.
no Draw all amino acids.
<interact geoms> Yes/no answer:
yes Interaction geometries (interaction surfaces around potential interacting
groups) are drawn.
no No interaction geometries are drawn.
<ia points> Yes/no answer:
yes A set of points describing the interaction surface are drawn. (All calcula-
tions involving interaction surfaces in FlexX use these sets of points).
no No points are drawn.
<all contact types> Yes/no answer:
yes Interaction geometries for all contact types (interaction types) are drawn
if <interact geoms> is set to yes.
no Interaction geometries are drawn for a selection of contact types (interac-
tion types). You will be asked to select types from a given list:
<contact type selection> Choose a list of types represented by integers.
Enter the list as separate integers or as integer ranges (format a-b)
separated by commas or blanks. Note that you need to enclose the expression
in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9".
<particles> Yes/no answer:
yes Any particles used in docking are drawn (see the PLACE_PARTICLES flag
in 10.1.4 on how to use particles). Particles are colored by their particle
color defined in the corresponding entry in the AMINO static data file.
no Do not draw particles.
<surf> Determines surface drawing:
0 Draw no surface
1 Draw the molecular surface: If FlexV is used to visualize, the Connolly sur-
face is drawn. Otherwise only concave patches as triangles are drawn.
Note: Hydrogens are not considered when drawing the surface. If hydrogens
are to be drawn (<hydro>), they will be ignored when drawing the surface.
<backbone> Specifies whether and how the protein backbone should be drawn.
Selection:
0 Do not draw the backbone.
1 Draw the backbone as a line.
2 Draw the backbone as a tube.
3 Draw the backbone as a ribbon with rectangular cross-section.
4 Draw the backbone as a ribbon with elliptical cross-section.
5 Draw the backbone in cartoon mode to represent the secondary structure
(helices, sheets and loops). (Hint: you can select the coloring for these
different structural elements using the SELCOL command.)
Important notes: The Connolly surface is rendered from its analytically calculated
patches. This enables selection of the level of curvature approximation but makes
the rendering much more complicated. As a result, a few percent of the patches are
rendered incorrectly (we will try to reduce this rate). In addition, there is currently
only pairwise cusp trimming.
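The wildcard rules of <new aa selection> (patterns separated by ;, the wildcard * and optional trailing fields) can be illustrated with the following Python sketch. The function names are hypothetical, residues are represented as (name, number, chain, slot) tuples for this example only, and the case-insensitive comparison of residue names is an assumption.

```python
def parse_aa_selection(text):
    """Split a selection such as "* 57; * 227" into 4-field patterns."""
    patterns = []
    for part in text.split(";"):
        fields = part.split()
        if not fields:
            continue
        if len(fields) > 4:
            raise ValueError("at most four fields per amino acid selection")
        # Omitted trailing fields count as wildcards.
        fields += ["*"] * (4 - len(fields))
        patterns.append(tuple(fields))
    return patterns


def matches(residue, patterns):
    """Check a (name, number, chain, slot) tuple against the parsed patterns."""
    return any(
        all(p == "*" or p.lower() == str(value).lower()
            for p, value in zip(pattern, residue))
        for pattern in patterns
    )
```

With this sketch, the selection * 57; * 227 parses to the same patterns as * 57 * * ; * 227 * *.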
7.6.14 Selecting colors for drawing the receptor (SELCOL)
Syntax: SELCOL <receptor color mode> <interact geoms color mode> <surface
color mode> <backbone color mode>
Description: With SELCOL you can set the color modes for the receptor molecule,
interaction geometries, surfaces and the backbone. For each of these, a selection of
color modes is available:
<receptor color mode> Choose the color mode for drawing the receptor. Color
mode selection:
INVISIBLE
ATOM
UNIQUE
<interact geoms color mode> Choose the color mode for drawing the interac-
tion geometries if they are to be drawn. The interaction geometries consist
of patches or surfaces that indicate the positions of interacting groups in the
molecule. Color mode selection:
INVISIBLE
UNIQUE
CONTACT
ACCESS
<surface color mode> Choose the color mode for coloring the molecular surface
if it is to be drawn. Color mode selection:
INVISIBLE
UNIQUE
SURF_ATOM
CEN_DIST
SURFPATCH
<backbone color mode> Choose the color mode for coloring the protein backbone
if it is to be drawn. Color mode selection:
INVISIBLE
UNIQUE
SECSTR
The possible color modes are explained below. For some color modes you are also
asked to enter some defining colors. Enter your chosen color as either an angle from
the color circle (0-360 degrees: 0 is invisible, 1-360 runs from dark blue, through
red, yellow, green to blue), a color name (as defined in the GRAPHIC static data file),
or an RGB(A) value: 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
INVISIBLE The item drawn will be invisible.
ATOM The ligand will be colored according to the element types of the atoms. The
atoms are drawn in the color dened for its element type in the static data le
GRAPHIC, while the bonds are drawn half and half in the neighboring atom
colors.
UNIQUE The object will be drawn in one user-dened color. You will be asked to
choose the unique color:
<color> Enter your chosen color.
CONTACT The object will be drawn in a color representing its interaction (contact)
type. The colors for each type are dened in the GRAPHIC static data le.
SURF_ATOM Convex patches in the surface are colored by atom type (see color
mode ATOM). Any reentrant patches (i.e. saddle and concave patches) are
drawn in a user-defined color. You are required to enter the reentrant patch
color in this mode:
<reentrant color> Enter your chosen color.
CEN_DIST The surface is drawn in a rainbow of colors representing how far the
surface lies from the (geometric) center of the molecule. For this mode you are
required to enter the start and end colors of the rainbow plus the number of
intervals to be colored across the rainbow range:
<no of intervals> The number of intervals into which the range will be split.
<first color> Start color for the rainbow.
<second color> End color for the rainbow.
SURFPATCH The surface is colored according to the surface patch type. You are
required to enter colors for the various patch types in this mode:
<concave color> Enter your chosen color for concave patches.
<saddle color> Enter your chosen color for saddle patches.
<convex color> Enter your chosen color for convex patches.
ACCESS Color the object according to its accessibility. The accessibility gives an
indication of the buriedness in the active site. For more information see section
11.7.3. A color rainbow is used, starting with dark blue for deeply buried
interacting groups and ranging to red for less deeply buried ones.
SECSTR Color the backbone according to the secondary structure: alpha helix,
beta sheet or loop. You are required to enter:
<helix> Enter the color for alpha helices.
<sheets> Enter the color for beta sheets.
<turn> Enter the color for loop (turn) sections.
7.6.15 Selecting labels for drawing the receptor (SELLAB)
Syntax: SELLAB <atom name> <infile number> <formal charge> <partial
charge> <aa name> <aa number> <chain ID>
Description: When the receptor is drawn, FlexX stores information in labels for
display in the graphic interface. You can choose what should appear in the label
using the SELLAB command. For yes/no questions you can enter either y, yes or
1 for yes, and similarly n, no or 0 for no.
<atom name> Yes/no answer:
yes Include the atom name as taken from the input file in the label for atoms.
no Atom names will not be included in the label for atoms.
<infile number> Yes/no answer:
yes Include the number of the atom as taken from the input file in the label for
atoms.
no Infile numbers will not be included in the label for atoms.
<formal charge> Yes/no answer:
yes Include the formal charges on the atoms in the label for atoms.
no Formal charges will not be included in the label for atoms.
<partial charge> Yes/no answer:
yes Include the partial charges on the atoms in the label for atoms.
no Partial charges will not be included in the label for atoms.
<aa name> Yes/no answer:
yes Include the three letter amino acid code in the label.
no Do not include the amino acid code in the label.
<aa number> Yes/no answer:
yes Include the amino acid number as taken from the input file in the label.
no Do not include the amino acid number in the label.
<chain ID> Yes/no answer:
yes Include the protein chain ID (one letter) as taken from the input file in the
label.
no Do not include the protein chain ID in the label.
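Because SELLAB takes seven yes/no answers in a fixed order, a script that generates FlexX commands may want to build the line from named choices. The Python sketch below is a hypothetical helper (the names `parse_yes_no` and `sellab_command` are not part of FlexX); it assumes the accepted answers are exactly those listed above and, as an additional assumption, treats them case-insensitively.

```python
SELLAB_FIELDS = (
    "atom name", "infile number", "formal charge", "partial charge",
    "aa name", "aa number", "chain ID",
)


def parse_yes_no(answer: str) -> bool:
    """Interpret y/yes/1 as True and n/no/0 as False."""
    text = answer.strip().lower()  # case-insensitivity is an assumption
    if text in {"y", "yes", "1"}:
        return True
    if text in {"n", "no", "0"}:
        return False
    raise ValueError(f"not a yes/no answer: {answer!r}")


def sellab_command(flags) -> str:
    """Build a SELLAB line from seven booleans, in the order of SELLAB_FIELDS."""
    if len(flags) != len(SELLAB_FIELDS):
        raise ValueError(f"expected {len(SELLAB_FIELDS)} answers")
    return "SELLAB " + " ".join("y" if flag else "n" for flag in flags)
```

With these definitions, `sellab_command([True, False, False, False, True, True, True])` produces a line that labels atoms with their name, amino acid name, amino acid number, and chain ID.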
7.6.16 Drawing the receptor (DRAW)
Syntax: DRAW [<filename>]
Description: DRAW generates a drawing of the receptor and sends it to a file ready
to be displayed in the graphics interface. For details about what exactly is drawn
see the SELGRA command.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: Drawings are not displayed automatically. Use DISPLAY to out-
put the drawing on the graphics device.
7.6.17 Listing the graphic items (GRAINF)
Syntax: GRAINF
Description: Outputs a list of all current graphic settings for the receptor.
7.7 *Changing the static data (DATABASE submenu)
7.7.1 Adjusting the settings of the scoring function (SELSCO)
Syntax: SELSCO <parameter> <value> <factor final> <factor partial>
Description: Allows you to adjust the @scoring_parameters specified in
geometry.dat dynamically on the command line or in a script (refer to 11.7).
<parameter> is any of the parameters described in geometry.dat.
<value> is the value without any scaling.
<factor final> is the scaling factor for final scoring.
<factor partial> is the scaling factor for partial ligands during the incremental
build-up process.
7.7.2 Listing the settings of the scoring function (LISTSCO)
Syntax: LISTSCO
Description: Prints the current settings for the scoring function that have been
defined in the @scoring_parameters section of geometry.dat or set by SELSCO (refer
to 11.7). The status is printed as G_OFF if factor-final and factor-partial are both set to
zero.
Requirements: Protein and ligand must be loaded.
Example
parameter value final partial status
---------------------------------------------------
G_constant 5.400 1.000 1.000 G_ON
G_lipo_contacts -0.170 1.000 1.000 G_ON
G_rotbonds 1.400 1.000 1.000 G_ON
G_match 1.000 1.000 1.000 G_ON
G_ambig_contacts -0.170 1.000 1.000 G_ON
G_close_contacts -0.340 1.000 1.000 G_ON
G_overlap 0.000 0.000 0.000 G_OFF
G_plp_steric 0.400 0.000 0.000 G_OFF
G_plp_hbond 2.000 0.000 0.000 G_OFF
G_plp_rep 20.000 0.000 0.000 G_OFF
G_conf_torsion 5.000 0.000 0.000 G_OFF
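If LISTSCO output is captured in a log, the table can be read back and the status rule checked. The Python sketch below is illustrative only: `term_status` and `parse_listsco` are hypothetical names, and the parser assumes rows look exactly like the example above (five whitespace-separated fields starting with a `G_` parameter name).

```python
def term_status(factor_final: float, factor_partial: float) -> str:
    """Return G_OFF only if both scaling factors are zero, as described above."""
    if factor_final == 0.0 and factor_partial == 0.0:
        return "G_OFF"
    return "G_ON"


def parse_listsco(text: str) -> dict:
    """Parse LISTSCO-style rows into {parameter: settings}."""
    settings = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the separator line, and anything unexpected.
        if len(fields) != 5 or not fields[0].startswith("G_"):
            continue
        name, value, final, partial, status = fields
        settings[name] = {
            "value": float(value),
            "final": float(final),
            "partial": float(partial),
            "status": status,
        }
    return settings


SAMPLE = """\
parameter value final partial status
G_constant 5.400 1.000 1.000 G_ON
G_plp_steric 0.400 0.000 0.000 G_OFF
"""
```

Comparing `term_status(row["final"], row["partial"])` with `row["status"]` for each parsed row is a simple consistency check on a captured table.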
7.7.3 Decrypting static data files (DECRYPT)
Syntax: DECRYPT <directory>
Description: DECRYPT decrypts the static data files so that the user can modify
them. It always creates a copy in <directory>.
Important notes: This command is only available with a full license; it is not
available with an evaluation license. In order to use the modified static data
information you must import it into the configuration file using the GUI's tab at
File -> Global Preferences -> Parameters&Flags or adjust config.dat
in case you use a Pre-Release 3 configuration setup.
7.8 Docking (DOCKING submenu)
The docking algorithm of FlexX consists of three phases: selection of a set of base fragments,
placing the base fragments into the active site, and building up the complex incrementally,
beginning at the base fragments. The three phases are associated with the following three
commands.
7.8.1 Selecting the base fragments (SELBAS)
Syntax: SELBAS <mode> [<base atom list>]
Description: Defines the base fragment of the ligand. The following modes are
available:
automatic (a) In automatic mode, a set of base fragments is automatically selected
based on an internal scoring scheme. All previous selections are overwritten.
manual (m) In manual mode, a base fragment is manually defined; previous
definitions of base fragments are overwritten.
You receive a list of all ligand atom names, preceded by the number of the atom
in the ligand input file. You are asked for a set of atom numbers defining the
base fragment in the following way:
All atoms lying on a path between these atoms extended by all atoms which
are attached rigidly to them form the base fragment. The selection is a list of
integers or integer ranges (format a-b) separated by , or blanks or the keyword
all.
manual append (p) In this mode, a base fragment is manually selected and added
to the list of already selected base fragments.
reference (r) In reference mode, base fragments are selected via a reference struc-
ture loaded previously with the MAPREF command. If multiple mappings to
the reference structure are possible, up to the maximum number of base fragment
mappings are used. Be aware that due to the limitation of the number of
allowed conformations, too large reference structures may be rejected. Before
the selection is performed, the reference coordinates are extended to hydrogen
atoms which can be placed unambiguously.
freeze (f) Like reference mode, the base fragments are selected by mapping a pre-
viously loaded reference structure (see MAPREF). In freeze mode however, the
conformation of the base fragment is frozen to the conformation of the refer-
ence structure. Therefore, exactly the same placement as given in the reference
structure can be achieved. In freeze mode, arbitrarily large base fragments can
be selected.
After the base fragments are defined, the complete fragmentation is calculated and
the order in which fragments are added is determined. The order depends on several
features of the fragments, such as the kind of interactions which can be formed
and the number of fragments which still have to be added to complete this part of
the ligand.
In some situations it makes sense to manually control not only the first fragment to
be added but also the last, for example if the ligand is connected to an additional
linker. Last fragments can be controlled by a dummy atom at the end of the fragment
chain. FlexX adds fragment chains containing dummy atoms with the lowest
priority.
Requirements: A ligand must have been loaded with the LIGAND/READ com-
mand. For reference mode, a reference structure must have been loaded with the
LIGAND/MAPREF command.
Important notes: The selection of the base fragment is an important phase and the
results vary substantially for different base fragments. Manual selection should be
performed if you already have specific knowledge about the protein-ligand
complex.
Base fragments should have the following features:
only a small set of discrete conformations
enough interacting groups which are able to bind to the protein
If the number of discrete conformations is too large, FlexX prohibits the base selec-
tion and aborts the command.
Example
selbas a
selbas m "3 6"
selbas r
The first example performs an automatic selection; in the second example, all atoms lying
on a path between these atoms, numbers 3 and 6, extended by all atoms which are attached
rigidly to them form the base fragment; in the third example the base fragments are selected
via a reference structure.
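The selection syntax used for the manual base atom list (integers or ranges of the form a-b, separated by commas or blanks, or the keyword all) can be expanded as in the following Python sketch. It is not FlexX code: `parse_selection` is a hypothetical name, numbering is assumed to start at 1, and `n_items` (the number of selectable atoms) is a parameter introduced here to resolve `all`.

```python
import re


def parse_selection(spec: str, n_items: int) -> list:
    """Expand a selection such as '1-3, 7' or 'all' into a list of integers."""
    spec = spec.strip()
    if spec.lower() == "all":
        return list(range(1, n_items + 1))
    selected = []
    # Entries are separated by commas and/or blanks.
    for token in re.split(r"[,\s]+", spec):
        if not token:
            continue
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"empty range: {token!r}")
            selected.extend(range(start, end + 1))
        else:
            selected.append(int(token))
    return selected
```

For the second example above, `parse_selection("3 6", n_items)` yields the two atom numbers 3 and 6; FlexX then extends these to the atoms on the connecting path and the rigidly attached atoms, which this sketch does not attempt to reproduce.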
7.8.2 Placing base fragments (PLACEBAS)
Syntax: PLACEBAS <mode> [<lig1> <lig2> <rec aa> <rec1> <rec2>]
[<pocket_id>]
Description: Places the base fragment of each fragmentation in the active site.
<mode> selects the algorithm used:
c Covalent manual base fragment placing is performed. After specifying two lig-
and atoms <lig1> and <lig2> of the base fragment, an amino acid of the pro-
tein <rec aa> and two protein atoms of this amino acid <rec1> and <rec2>,
the base fragment is placed by positioning the ligand atoms onto the protein
atoms. The torsion angle of the bond between the two ligand atoms is sam-
pled in a 10
$< 10^{-8}$
The second criterion is a minimum step size (STEP):
\[ \max_{k=1,\ldots,n} \frac{|x_k^{i+1} - x_k^{i}|}{\max\{|x_k^{i}|,\, 1.0\}} < \mathrm{macheps}^{0.9} \]
macheps is the machine accuracy of the computer.
The third criterion is a minimum energy size (ENERGY):
\[ |E(x^{i+1}) - E(x^{i})| < \mathrm{macheps}^{0.9} \]
The last stop criterion is an upper bound for the number of iterations of the opti-
mization (MAXITER): at most <nof it> iterations will be performed.
The optimization stops with $x^{i}$ if the calculation of $x^{i+1}$ failed (ABORT).
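The STEP and ENERGY tests can be written down in a few lines. The Python sketch below is an interpretation, not the FlexX implementation: it reads the threshold as macheps raised to the power 0.9, takes macheps to be the double-precision machine epsilon, and uses hypothetical function names.

```python
import sys

# Assumption: macheps is the double-precision machine epsilon.
MACHEPS = sys.float_info.epsilon
TOLERANCE = MACHEPS ** 0.9


def step_converged(x_new, x_old, tol=TOLERANCE) -> bool:
    """STEP: largest relative change of any coordinate is below the tolerance."""
    largest = max(
        abs(new - old) / max(abs(old), 1.0)
        for new, old in zip(x_new, x_old)
    )
    return largest < tol


def energy_converged(e_new: float, e_old: float, tol=TOLERANCE) -> bool:
    """ENERGY: the score changed by less than the tolerance."""
    return abs(e_new - e_old) < tol
```

Dividing by max(|x|, 1.0) makes the step test relative for large coordinates and absolute for coordinates near zero, which avoids division by very small numbers.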
Requirements: A complex prediction must have been computed.
Important notes: After optimizing, if <sort> is set to n, the placements are no
longer sorted by score (the list has the original order). You can sort the list again
with the SORT command.
Example
No | #new matches | pre / post Score | termination
--------------------------------------------------------
1 | 0 | -25.559 -> -29.003 | STEP
2 | 0 | -25.544 -> -34.625 | MAXITER
3 | 0 | -25.122 -> -34.117 | MAXITER
4 | 0 | -25.107 -> -33.844 | ENERGY
5 | 0 | -24.901 -> -33.365 | MAXITER
6 | 0 | -24.880 -> -34.520 | GRAD
7 | 0 | -24.865 -> -33.825* | MAXITER
8 | 0 | -24.791 -> -33.749* | ABORT
9 | 0 | -24.784 -> -33.319 | MAXITER
10 | 0 | -24.674 -> -32.420* | ABORT
7.8.5 Interactive selection of solutions (SELECT)
Syntax: SELECT <placement selection>
Description: Enables the interactive selection of a restricted set of partial place-
ments. The selection is a list of integers or integer ranges (format a-b) representing
placement IDs separated by , or blanks.
Requirements: A complex prediction must have been computed.
Important notes: Once the SELECT command is finished, deleted placements can
only be recovered by a new computation. After the operation, the placements are
in the same order, the numbering of the placements is reset to values from 1 to the
number of remaining placements.
7.8.6 Clustering solutions (CLUSTER)
Syntax: CLUSTER <max rms> <max angle dev.> <max length dev.>
Description: Clusters the proposed placements with a complete linkage cluster
algorithm. The distance between two placements is defined to be the RMSD between
them. All placements in a remaining cluster have an RMSD below <max rms>. If
there are vectors to fragments not placed yet, two placements can only be clustered
if the following conditions are met for all pairs of vectors:
1. The distance of the endpoints of the vectors must be less than or equal to <max
length dev.>.
2. The enclosing angle between the vectors must be less than or equal to <max
angle dev.>
After the clustering, only the energetically highest placement of each cluster re-
mains in the set of solutions.
Requirements: A complex prediction must have been computed.
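To illustrate what complete linkage with an RMSD threshold means, the following Python sketch clusters items from a precomputed distance matrix. It is a generic, naive implementation under stated assumptions, not the FlexX algorithm: the function name is hypothetical, the additional vector conditions (<max angle dev.>, <max length dev.>) are ignored, and no representative placement is chosen per cluster.

```python
def complete_linkage(dist, max_rms):
    """Cluster indices 0..n-1 so that all pairs in a cluster lie below max_rms.

    dist is a symmetric matrix (list of lists) of pairwise RMSD values.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: cluster distance is the largest pairwise distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d >= max_rms:
            break  # merging further would violate the threshold
        clusters[p] = sorted(clusters[p] + clusters[q])
        del clusters[q]
    return clusters
```

Unlike single linkage, this scheme does not chain: three placements spaced 0.6 apart along a line are not merged into one cluster at a threshold of 1.0, because the outer two are 1.2 apart.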
7.8.7 Writing placements in pdf format (WRITE)
Syntax: WRITE <filename> <code transformations>
Description: Writes a placement in a FlexX-specific file format (.pdf format) to
disk. The default directory for this command is the path specified in the entry
PREDICT. The pdf format is based on ASCII and can therefore be read and edited
with standard tools. Because small changes in transformations can result in dif-
ferent solutions, transformation information should be coded by setting <code
transformations> to y.
Important notes: Coding works only on machines with specific floating-point
representations. Thus, it may be the case that coding cannot be used on your hardware
platform. Be careful when reading and writing coded pdf files on different
machines.
Because this format stores all the internal information, format changes are necessary
from time to time. Therefore pdf files may not be compatible between different
FlexX versions.
7.8.8 Reading placements in pdf format (READ)
Syntax: READ <filename>
Description: Reads a placement in a FlexX-specific file format (.pdf format) from
the file <filename>. The default directory for this command is the path specified in
the entry PREDICT.
Important notes: The placement information is based on the protein and ligand
molecule. Thus, the protein and ligand files in FlexX's main memory before
executing the READ command must be the same as the files which were in the main
memory during generation of the placements. Otherwise, FlexX ends up in an
inconsistent state which is not detected in every case.
Because this format stores all the internal information, format changes are necessary
from time to time. Therefore pdf files may not be compatible between different
FlexX versions.
7.8.9 Deleting a docking (DELETE)
Syntax: DELETE
Description: Deletes a complex prediction from FlexX's workspace.
Important notes: The base placement is destroyed during the complex building
phase. It is not possible to restore the base placements after the complex building
phase is performed.
7.8.10 Sorting the list of placements (SORT)
Syntax: SORT <quantity>
Description: Sorts the list of placements by the specified <quantity>. This is
normally necessary after optimizing the placements (OPTIMIZE). Possible quantities
are: E_TOTAL, RMS.
7.8.11 Outputting the most important quantities of a docking result (INFO)
Syntax: INFO <table format> [<output table>]
Description: Displays the main characteristics of the docking result, such as num-
ber of solutions, highest ranking score, etc. on the screen. If <table format> is set
to y, the result is output in a set of tables, otherwise one of the tables is output
on one line. The single-line option is very useful to summarize a docking run over
large data sets. All single lines start with receptor name, ligand name, and number of
solutions; the first three also end with the computation times for base placement
and complex construction. In between, the contents are:
standard Score, RMSD of solution at rank 1; score, RMSD, and rank of highest
ranking solution with RMSD less than 1.0 Å; score, RMSD, and rank of solution
with minimal RMSD.
rms Score and rank of highest ranking solution with minimal RMSD and then with
RMSD below the thresholds 1.0 Å, 1.5 Å, 2.0 Å, 2.5 Å.
rank Score and RMSD of solution with lowest RMSD within the first 1, 10, 20, 50,
100 solutions.
acc. rms Accumulated minimal RMSD of the first 1, 10, 20, 50, 100 solutions. If r(i)
is the RMSD of the solution at rank i, the accumulated minimal RMSD of the first
k solutions is $\frac{1}{k} \sum_{i=1}^{k} \min_{1 \le j \le i} r(j)$. The basic idea of the accumulated RMSD
is to define a quality measure which is independent of discrete RMSD or rank
thresholds.
rms history Number of fragments; RMSD and rank of the best
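The accumulated minimal RMSD reported in the acc. rms table averages, over the first k ranks, the best RMSD seen so far at each rank. A minimal Python sketch of that definition follows; the function name is hypothetical and the input is assumed to be the list of RMSD values in rank order.

```python
def accumulated_min_rmsd(rmsds, k: int) -> float:
    """Average over i = 1..k of the minimum RMSD among ranks 1..i."""
    if not 1 <= k <= len(rmsds):
        raise ValueError("k must lie between 1 and the number of solutions")
    running_min = float("inf")
    total = 0.0
    for r in rmsds[:k]:
        running_min = min(running_min, r)  # best RMSD up to this rank
        total += running_min
    return total / k
```

A good solution that appears early lowers every later term of the sum, so the measure rewards both a low RMSD and a high rank without fixing a threshold for either.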
7.8.12 Outputting a solution tab row (SOLTAB)
Syntax: SOLTAB <separator> <report> [<none tag>]
Description: Writes a formatted row for an overview solution table to the screen,
which can be redirected to a file. You can select the <separator> and decide whether
failed dockings are reported as well (<report>). In this case you have to choose a <none tag>
that is printed instead of the missing score. The columns are:
ligand name: as given within the file
top score: score of top ranking solution
true hit: if the ligand is a true hit (not yet implemented)
RMSD: RMSD of top ranking solution
rec pdb name: PDB code of receptor
lig file name: filename of ligand (short)
lig file name: filename of ligand (long with path)
rec file name: filename of receptor (short)
rec file name: filename of receptor (long with path)
7.8.13 Listing solutions (LISTSOL)
Syntax: LISTSOL <table length>
Description: Displays a table of <table length> solutions of the complex
prediction on the screen. (The amount output may depend on your setting of the flag
USER_MODE.) The table columns depend on the scoring function settings in
geometry.dat. The table consists of the following columns; the column identifiers are
shown in parentheses:
No. (SOL_NO) The number of the solution.
Total Score (E_TOTAL) Total score of the docking solution.
Match Score (E_MATCH) Contribution of the matched interacting groups.
Lipo Score (E_LIPO) Contribution of the lipophilic contact area.
Ambig Score (E_AMBIG) Contribution of the lipophilic-hydrophilic (ambiguous)
contact area.
Clash Score (E_CLASH) Contribution of the clash penalty.
Rot Score (E_ROT) Ligand conformational entropy score.
RMS Value (RMS) RMSD of coordinates from reference coordinates. If there are no
reference coordinates, this column contains the RMSD from the highest
ranking solution.
Similarity Index (SIM_IDX) Measure of similarity between solution coordinates
and reference coordinates. If there are no reference coordinates, all entries
of this column are 1.0. The similarity index is similar to the RMSD
value given in the previous column. The RMSD value, however, is strictly
restricted to the RMSD between the two coordinates assigned to one atom:
the docking solution coordinates and the reference coordinates. The similarity
index instead corresponds to a "fuzzy" similarity measure: it is based on an
RMSD between the docking coordinates of an atom and the reference coordinates
of the nearest atom of the same SYBYL atom type. For example, a symmetric
molecule docked back to front will have a bad (high) RMSD but can still
achieve a good (low) similarity index.
#Match (NOF_MATCH) Number of matches.
Avg. Volume (AVG_VOL) Average volume of protein/ligand overlap (for a de-
scription of the overlap test, see 11.4.1).
Max Volume (MAX_VOL) Maximum volume of protein/ligand overlap (for a de-
scription of the overlap test, see 11.4.1).
Fragmentation No. (FRAG_NO) Number of the fragmentation used for this pre-
diction.
Conf. String (CONF_STR) String displaying the conformation of the ligand (inter-
nal notation) (in DEBUG_MODE only).
Sol. String (SOL_NR_STR) String displaying the rank of the solution after each
build-up step (in DEBUG_MODE only).
Frag. String (FRAG_NR_STR) String displaying the fragment conformation num-
bers of the solution of the build-up step (in DEBUG_MODE only).
#Inst (NOF_INST) Number of instances contributing to this solution (valid for
FlexE only).
SAS protein (SAS_REC) SAS of the receptor (hydrophilic) (only with
QUERY_SASTAB=1)
SAS protein (lipo) (SAS_REC_LIPO) SAS of the receptor (lipophilic) (only with
QUERY_SASTAB=1)
SAS ligand (SAS_LIG) SAS of the ligand (hydrophilic) (only with
QUERY_SASTAB=1)
SAS ligand (lipo) (SAS_LIG_LIPO) SAS of the ligand (lipophilic) (only with
QUERY_SASTAB=1)
SAS protein ligand complex (plc) (SAS_PLC) SAS of the protein-ligand complex
(hydrophilic) (only with QUERY_SASTAB=1)
SAS plc (lipo) (SAS_PLC_LIPO) SAS of the protein-ligand complex (lipophilic)
(only with QUERY_SASTAB=1)
SAS protein in plc (SAS_REC_PLC) SAS of the receptor in the protein-ligand com-
plex (hydrophilic) (only with QUERY_SASTAB=1)
SAS protein in plc (lipo) (SAS_REC_PLC_LIPO) SAS of the receptor in the
protein-ligand complex (lipophilic) (only with QUERY_SASTAB=1)
SAS ligand in plc (SAS_LIG_PLC) SAS of the ligand in the protein-ligand complex
(hydrophilic) (only with QUERY_SASTAB=1)
SAS ligand in plc (lipo) (SAS_LIG_PLC_LIPO) SAS of the ligand in the protein-
ligand complex (lipophilic) (only with QUERY_SASTAB=1)
Buriedness (BURIEDNESS) Buriedness of active site (only with
QUERY_BURIEDNESS=1)
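The difference between the RMS Value and the Similarity Index columns can be illustrated with a small Python sketch. It only follows the verbal description above (nearest reference atom of the same SYBYL atom type); the exact normalization used by FlexX is not documented here, so the function names, the data layout (a list of `(atom_type, (x, y, z))` pairs), and the plain root-mean-square form are assumptions.

```python
import math


def squared_distance(a, b) -> float:
    return sum((p - q) ** 2 for p, q in zip(a, b))


def plain_rmsd(docked, reference) -> float:
    """RMSD between atoms paired by their position in the two lists."""
    total = sum(
        squared_distance(pos, ref_pos)
        for (_, pos), (_, ref_pos) in zip(docked, reference)
    )
    return math.sqrt(total / len(docked))


def fuzzy_rmsd(docked, reference) -> float:
    """RMSD to the nearest reference atom of the same atom type (assumed form)."""
    total = 0.0
    for atom_type, pos in docked:
        # Assumes every docked atom type also occurs in the reference.
        candidates = [p for t, p in reference if t == atom_type]
        total += min(squared_distance(pos, c) for c in candidates)
    return math.sqrt(total / len(docked))
```

For a two-atom symmetric fragment docked back to front, `plain_rmsd` reports the full displacement while `fuzzy_rmsd` is zero, which is the behaviour described for the similarity index.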
7.8.14 Listing solutions sorted by RMSD (LISTRMS)
Syntax: LISTRMS <table length>
Description: Displays a table of <table length> solutions of the complex predic-
tion on the screen, sorted by RMSD. The table has the same columns as listed above.
7.8.15 Listing the matches of all solutions (LISTMAT)
Syntax: LISTMAT
Description: Displays a table of all matches of the complex prediction on the
screen. Matches with zero score are automatically omitted from the table. The table
has the following columns:
No. (SOL_NO) The number of the corresponding solution.
Lig. Atom (LIA_ATOM) Ligand atom of the match.
Lig. ANo. (LIA_ATOM_NO) Number of interacting atom of ligand.
Ligand IA-Type (LIA_TYPE) Ligand interaction type.
Rec. Atom (RIA_ATOM) Receptor atom of the match.
Rec. AA (RIA_AA) Receptor amino acid.
Rec. Chain id (RIA_AA_CHAIN) Receptor chain identifier.
Rec. AANo (RIA_AANO) Number of the receptor amino acid.
Receptor IA-Type (RIA_TYPE) Receptor interaction type.
Opt. Energy (E_OPT) Optimal score (without geometry penalties) of the match.
Chg. (CHG) Product of formal charges of the interacting atoms.
Chg. fact. (CHG_FAC) Charge factor for the interaction.
LDev. (LDEV) Length deviation.
LDev. fact. (LDEV_FAC) Length deviation factor.
ADevL (ADEVL) Angle deviation on ligand site.
ADevL fact. (ADEVL_FAC) Angle deviation factor on ligand site.
ADevR (ADEVR) Angle deviation on receptor site.
ADevR fact. (ADEVR_FAC) Angle deviation factor on receptor site.
Res. Engy. (E_RES) Resulting match score (optimal score multiplied by the charge
factor and rescaled by the deviation factors).
Multip. fact. (MULTIP_FAC) Interaction multiplicity factor.
Ens. slot (ENS_SLOT) The Ensemble slot the match belongs to (valid for FlexE
only).
Sel. mat. (SELECTED) 1 if the matching instance is selected and contributed to the scoring,
0 if not (valid for FlexE only).
7.8.16 Listing all solutions and matches (LISTALL)
Syntax: LISTALL <table length>
Description: Displays a table of <table length> solutions and all matches of the
complex prediction on the screen. For a description of the table columns, see the
two sections above. When FlexE is used a third table is shown, which is explained
at the FlexE command LISTINST (8.4.6).
7.8.17 Listing one solution and the corresponding matches (LISTONE)
Syntax: LISTONE <solution number>
Description: Lists solution <solution number> and the corresponding matches
on the screen. For a description of the table columns, see the two sections above.
7.8.18 Performing specific queries on solutions and matches (QUERY)
In many real cases, the number of solutions and matches is very large. It is possible to select
specific information from the solution and matches tables. There are three ways of selecting
or rearranging the table information:
SELECT specific columns FROM the table(s).
Select specific rows of the table(s) WHERE a certain condition applies.
Output the information selected in this way SORTed BY some criteria.
An SQL-like language is provided for the user to tell FlexX what information to display.
Syntax: QUERY <field list> <table list> [<condition>] [<order list>] <table
length>
Description: <field list> is a list of field names, separated by colons. A field name
is an identifier for a column of one of the tables. A list of valid strings for field
names is output when you enter QUERY without parameters (see sections 7.8.13 and
7.8.15 for a list of valid strings). In the resulting output, only the table columns
represented by <field list> are listed. An asterisk * for <field list> is valid and
represents the complete list of field names. <table list> is a list of the names of the
tables you want to see, separated by colons. Valid table names are solutions and
matches.
An asterisk * for <table list> is also valid and stands for solutions, matches
(or, equivalently, matches, solutions). <table list> must contain all tables, the
columns of which have been selected by <field list>.
<condition> is optional and selects rows of the tables (whereas <field list> selects
columns). An atomic condition is a field name followed by an arithmetic operator
(=, >=, <=, >, < or !=) or the contains-string operator ([]), followed by an
appropriate constant. String constants must be enclosed in single quotes. Note that amino
acid numbers and atom names must follow the PDB nomenclature, i.e. leading and
trailing blanks are essential for the PDB encoding scheme and must be part of the
string constant (amino acid numbers are strings, not integers). Atomic conditions
can be joined by the binary Boolean operators and and or. Conditions can also be
nested with brackets (, ).
The semantics of conditions containing an and operator differ depending on which
tables are selected. If you have selected the solutions or the
matches table separately, the condition is checked for each row separately. For
example, this can be used to find all hydrogen bonds to a specific protein atom with
a score less than x kJ/mol.
If both tables are selected and a condition for match table entries is defined
containing an and operator, the condition is checked for the whole set of matches of one
solution in common. For example, this can be used to find all solutions forming
interactions to amino acid x and amino acid y (see examples below). Note that the
original meaning of and is lost in this context. Thus the following query is currently
impossible: "Show all solutions forming an interaction to amino acid x with score
higher than x_1 and to amino acid y with score higher than y_1".
<order list> is also optional and describes the order in which the selected rows are
to be displayed on the screen. An <order list> is a list of order specifications,
separated by colons. An order specification consists of one of the strings ascending
or descending, followed by a field name. The field name must be an element of
<field list>. The string ascending or descending is optional. If it is missing,
ascending is assumed.
Example
QUERY "sol_no, e_total, e_match, e_lipo, e_rot" solutions
QUERY * solutions "sol_no > 20 and (e_total<-10.0 or nof_match > 3)" ""
QUERY * * "" "descending e_total, ascending nof_match, descending e_res"
QUERY * * "ria_aano [] 53 and ria_aano [] 79" ""
The first query shows five columns (solution number and four energy values) of the
complete solutions table.
The second query shows all columns of the solutions table, but only those solutions (rows)
whose number is greater than 20 and whose total energy is either less than -10.0 or whose
number of matches is greater than 3.
The third query shows the complete solutions and matches tables, but reordered: the
solutions are sorted by decreasing total energies, those with equal total energies are sorted by
ascending number of matches. The matches of one solution will be sorted by decreasing
resulting energies.
The fourth query shows all solutions forming interactions to amino acids 53 and 79.
The pair of subsequent double quotes "" in example 2 (3) represents a missing optional
parameter <order list> (<condition>).
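The select / where / sort semantics of QUERY can be mimicked on plain Python data to check what a query is expected to return. This is a sketch, not the FlexX query engine: rows are dictionaries keyed by lower-case field names, conditions are Python callables rather than parsed strings, the contains-string operator is simplified to exact equality, and both function names are hypothetical.

```python
def run_query(rows, fields, condition=None, order=(), table_length=None):
    """Filter rows, sort them by an order list, and keep the chosen columns."""
    selected = [row for row in rows if condition is None or condition(row)]
    # Apply sort keys from last to first; stable sorting then gives the
    # first key the highest priority.
    for direction, field in reversed(list(order)):
        selected.sort(
            key=lambda row, f=field: row[f],
            reverse=(direction == "descending"),
        )
    if table_length is not None:
        selected = selected[:table_length]
    return [{f: row[f] for f in fields} for row in selected]


def solutions_with_all(matches, field, values):
    """Solutions whose set of matches covers every requested value.

    Models the case where both tables are selected and 'and' is evaluated
    over all matches of one solution together.
    """
    seen = {}
    for match in matches:
        seen.setdefault(match["sol_no"], set()).add(match[field])
    return sorted(sol for sol, found in seen.items() if set(values) <= found)
```

`solutions_with_all(matches, "ria_aano", ["53", "79"])` corresponds to the fourth example above: no single match row has both residue numbers, but a solution qualifies when its matches together cover both.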
7.8.19 Performing a specific query a second time (QHIST)
Syntax: QHIST <query no.>
Description: With QHIST you can perform a previous query again. After typing
the command, you will receive a list of the last ten query commands. You can choose
one of them by its number <query no.>.
Requirements: A query must have been previously performed using the com-
mand QUERY.
7.8.20 Writing solutions in a table (PRINTSOL)
Syntax: PRINTSOL <filename> <table length> <separator> <fillchar>
<append> [<pvm merge>]
Description: Writes a table of <table length> solutions of the docking to a file. The
name of the file is <filename>.log if <filename> does not contain any suffix;
otherwise the name is <filename>.
The table has the following columns. The first and the second columns contain
the ligand name and the index of the placement solution. The next columns are
the columns of the LISTSOL table. The table contains a column for all interaction
geometries. These columns contain the Res. Engy. value of the LISTMAT table
if they are matched. Each column is separated by <separator>. If a solution has
no result for a match column, this column then contains <fillchar>. If <append>
is n, <filename>.log will be created and the first row is a header row. Otherwise
the table will be appended to <filename>.log without a header row. If the receptor
does not change, the table has the same format for all ligands because each row has
the same columns. The default directory for this command is the path specified in
the PREDICT entry.
In FlexX-PVM, <filename>.log is automatically merged after parallel script
execution. This feature can be switched off by setting <pvm merge> to no.
If <pvm merge> is set to no, then the built-in batch variable $(PVM_ID) [3] is
automatically appended to the filename.
Notes: <filename> can be used by a spreadsheet program like EXCEL.
Important note: For filename usage and file merging within scripts, please refer to
the PVM section on page 147.
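The file naming and header rules of PRINTSOL can be summarized in a short Python sketch. It only models the behaviour described above and is not how FlexX writes its tables: the function names, the dictionary-based rows, and the example column names are hypothetical.

```python
import os


def output_name(filename: str, default_suffix: str = ".log") -> str:
    """Append the default suffix only if the filename has none."""
    _, suffix = os.path.splitext(filename)
    return filename if suffix else filename + default_suffix


def format_rows(columns, rows, separator, fillchar, append):
    """Return the table lines; a header row is written only when not appending."""
    lines = []
    if not append:
        lines.append(separator.join(columns))
    for row in rows:
        # Columns without a value (e.g. unmatched geometries) get the fill character.
        lines.append(separator.join(str(row.get(c, fillchar)) for c in columns))
    return lines
```

Because every row carries the same columns for a given receptor, rows for further ligands can be appended with `append=True` and the file still reads as one table.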
7.8.21 Writing all energy and matching scores to a csv file (EXPORT)
Syntax: EXPORT <filename> <sol. no> <add lig atom> <fillchar> <append>
[<pvm merge>]
[3] $(PVM_ID): please refer to the PVM section on page 147
Description: Writes all energy and matching scores of the selected docking
solutions to a comma-separated values (csv) file. The name of the file is
<filename>.csv if <filename> does not contain any suffix; otherwise the name
is <filename>. The option <sol. no> determines the selected solutions;
<sol. no> can be either a single number, a list of numbers separated by blanks
or ,, a list of intervals of the form a-b, or all.
The table has the following columns. The first and the second columns contain
the ligand name and the index of the placement solution. The next columns are
the columns of the LISTSOL table. The table contains a column for all interaction
geometries in the active site. These columns contain the Res. Engy. value of the
LISTMAT table if they are matched. If a receptor atom forms several matches,
then each matching score is printed in the corresponding column, separated by |.
If <add lig atom> is set to y, then the ligand atom info of the matching ligand
atoms is also written to the file. The ligand atom info consists of the atom name
and the infile number. The matching score, the atom name and the infile number
are separated by :. Otherwise only the matching score is written.
If a solution has no result for a match column, this column then contains <llchar>.
If <append>is n, <lename>.csv will be created and the rst rowis a header row.
Otherwise the output will be appended to <lename>.csv without a header row.
If the receptor does not change, the table has the same format for all ligands because
each row has the same columns. The default directory for this command is the path
specied in the PREDICT entry.
In FlexX-PVM, the csv le is automatically merged after parallel script execution.
This feature can be switched off by setting <pvm merge> to no.
Notes: <lename>.csv can be used by a spreadsheet program like EXCEL.
Important note: For lename usage and le merging within scripts, please refer to
the PVM section on page 147.
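For post-processing outside a spreadsheet, a match column cell can be split along the separators described above. The following Python sketch is an illustration only: the names Match and parse_match_cell and the sample cell values are hypothetical, and the cell layout (score:atom name:infile number, several matchings joined by |) is assumed from the description of this command.

```python
from typing import List, NamedTuple, Optional


class Match(NamedTuple):
    score: float
    atom_name: Optional[str]       # present only if <add lig atom> was y
    infile_number: Optional[int]   # present only if <add lig atom> was y


def parse_match_cell(cell: str, fillchar: str) -> List[Match]:
    """Split one match column cell into its individual matchings."""
    cell = cell.strip()
    if cell == "" or cell == fillchar:
        return []  # no result for this match column
    matches = []
    for part in cell.split("|"):           # several matchings per receptor atom
        fields = part.strip().split(":")   # score[:atom name:infile number]
        score = float(fields[0])
        if len(fields) >= 3:
            matches.append(Match(score, fields[1], int(fields[2])))
        else:
            matches.append(Match(score, None, None))
    return matches


# Hypothetical cell contents, not taken from a real FlexX run.
print(parse_match_cell("-4.7:O3:12|-2.1:N1:5", fillchar="-"))
print(parse_match_cell("-", fillchar="-"))
```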
7.8.22 Selecting the admin settings for drawing placements (SELADM)
Syntax: SELADM <graphics object number> [<start fifo object>] [<end fifo
object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing placements and you can determine whether the graphics files are internal
temporary files used only by FlexX or saved for further use. For yes/no questions
you can enter either y, yes or 1 for yes, and similarly n, no or 0 for no.
<graphics object number> Enter integer:
(1-255) The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 fifo mode: the graphics generated by subsequent DRAW commands will be
sent to a range of graphics objects in a first-drawn-first-overwrite manner.
You will be asked to enter two more parameters:
<start fifo object> The start graphics object for the fifo range.
<end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
yes The graphics are written to temporary files, which are removed when you quit
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
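The first-drawn-first-overwrite behavior of the fifo mode can be pictured as cycling through the object range. The Python sketch below is only a model of that idea, not FlexX code; the class name FifoObjectAllocator is hypothetical, and it assumes that the fifo range lies within the graphics object numbers 1 to 255.

```python
class FifoObjectAllocator:
    """Hand out graphics object numbers in first-drawn-first-overwrite order."""

    def __init__(self, start: int, end: int) -> None:
        # Assumption: the fifo range uses the same 1..255 object numbers.
        if not (1 <= start <= end <= 255):
            raise ValueError("expected 1 <= start <= end <= 255")
        self.start = start
        self.end = end
        self._next = start

    def next_object(self) -> int:
        """Return the object for the next drawing and advance the cycle."""
        obj = self._next
        # After the end object, wrap around and overwrite the oldest drawing.
        self._next = self.start if obj == self.end else obj + 1
        return obj


alloc = FifoObjectAllocator(start=10, end=12)
print([alloc.next_object() for _ in range(5)])
```

The fourth drawing in this example reuses object 10, so the oldest drawing is the first to be overwritten.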
7.8.23 Selecting graphics settings for drawing placements (SELGRA)
Syntax: SELGRA <include lig> <include rec> <overlap> <all contact types>
[<contact type selection>]
Description: With SELGRA, you can set specific values for drawings of placements
(docking solutions). The drawing of a docking solution optionally contains the ligand
in the predicted conformation and the receptor, plus a set of dashed lines connecting
the interacting groups where interactions are formed. The overlap can also be
drawn. The following choices are available. For yes/no questions you can enter
either y, yes or 1 for yes, and similarly n, no or 0 for no.
<include lig> Yes/no answer:
yes The ligand is included in the drawing. The drawing settings for the ligand
are taken from the settings in the LIGAND menu, except that the molecule
display mode is set to sticks for straightforward visualization with the re-
ceptor. The ligand will be drawn in the graphics object for docking (see
SELADM)
no The ligand is not included in the drawing
<include rec> Yes/no answer:
yes The receptor is included in the drawing. The drawing settings for the
receptor are taken from the settings in the RECEPTOR menu, except that the
molecule display mode is set to lines for straightforward visualization with
the ligand. The receptor is drawn to the graphics object set for the receptor.
In fact, three versions of the receptor are drawn into this graphics object:
the receptor as described above, the receptor as described above plus the
receptor surface and the receptor as described above plus the interaction
surfaces of the hydrogen donor and acceptor groups on the receptor. The
three versions are accessible through a slider in FlexV's Object Control
window.
no The receptor is not included in the drawing
<overlap> Yes/no answer:
yes Lines are drawn connecting pairs of heavy ligand and receptor atoms that have
a non-zero overlap volume (see 7.8.29 for more information on receptor-ligand
overlap)
no The overlap is not shown
<all contact types> Interactions are shown as dotted lines between the interacting
group on the receptor and the interacting group on the ligand. Yes/no answer:
yes All interaction (contact) types are shown
no Interactions from a selection of interaction (contact) types are drawn. You
will be asked to make a selection of types from a given list:
<contact type selection> Choose a list of types represented by integers.
Enter the list as separate integers or as integer ranges (format a-b)
separated by commas or blanks. Note that you must enclose the expression
in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9"
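A selection expression of this kind can be expanded into a plain list of integers. The Python sketch below illustrates the idea and does not reproduce the FlexX parser; the function name parse_selection is hypothetical, and the range separator is assumed to be a hyphen, as in the a-b interval form used by EXPORT.

```python
import re
from typing import List


def parse_selection(text: str) -> List[int]:
    """Expand a list such as "1, 2, 4, 7-9" into individual integers."""
    result: List[int] = []
    # Items are separated by commas and/or blanks.
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        if "-" in token:
            low_text, high_text = token.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"invalid range: {token}")
            result.extend(range(low, high + 1))  # ranges are inclusive
        else:
            result.append(int(token))
    return result


print(parse_selection("1, 2, 4, 7-9"))
```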
7.8.24 Selecting colors for drawing placements (SELCOL)
Syntax: SELCOL <interact color mode> <overlap color mode>
Description: With SELCOL you can set the color modes for drawing the interactions
between the ligand and receptor and for the overlap lines. For each of these, a
selection of color modes is available:
<interact color mode> Choose the color mode for drawing the dotted lines repre-
senting interactions between the ligand and receptor. Color mode selection:
INVISIBLE
UNIQUE
ENERGY
OPT_ENERGY
CONTACT
<overlap color mode> Choose the color mode for drawing the solid lines repre-
senting atom pairs in the ligand and receptor with non-zero overlap volume.
Color mode selection:
INVISIBLE
UNIQUE
Below, the possible color modes are explained. For some color modes you are also
asked to enter some defining colors. Enter your chosen color as either an angle from
the color circle (0-360 degrees: 0 is invisible, 1-360 runs from dark blue, through
red, yellow, green to blue), a color name (as defined in the GRAPHIC static data file),
or an RGB(A) value: 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
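The three ways of writing a color can be told apart mechanically. The following Python sketch classifies the color strings from the example above; it is an illustration under assumptions, not the FlexX implementation. The function name classify_color is hypothetical, and color names are not checked against the GRAPHIC static data file.

```python
from typing import Tuple


def classify_color(spec: str) -> Tuple[str, object]:
    """Classify a color string as color-circle angle, RGB(A) value, or name."""
    parts = spec.replace("/", " ").split()
    if len(parts) in (3, 4):
        try:
            values = tuple(float(p) for p in parts)
            return ("rgba" if len(parts) == 4 else "rgb", values)
        except ValueError:
            pass  # not numeric, so treat it as a multi-word color name
    if len(parts) == 1:
        try:
            angle = float(parts[0])
        except ValueError:
            return ("name", spec.strip())
        if 0 <= angle <= 360:
            return ("angle", angle)
        raise ValueError("angle must lie between 0 and 360 degrees")
    return ("name", spec.strip())


for spec in ["dark green", "green", "0.0 0.8 0.1", "0.0/0.8/0.1", "220"]:
    print(spec, "->", classify_color(spec))
```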
INVISIBLE The item drawn will be invisible.
UNIQUE The object will be drawn in one user-defined color. You will be asked to
choose the unique color:
<color> Enter your chosen color.
ENERGY Draw the dotted lines showing interactions in a color representative of
their energy. A color rainbow will be defined between two given colors across
the range of two given energies. You are required to enter:
<no. of intervals> Enter the number of intervals (integer) that the energy
range will be split into
<min energy> Enter the minimum energy value (floating-point number) for
the start of the energy range. The default offered is the best achievable
score for an interaction in FlexX (-8.3)
<max energy> Enter the maximum energy value (floating-point number) for
the end of the energy range. The default value of 0.0 is offered.
<first color> Enter the first color of the color rainbow
<second color> Enter the second (end) color of the color rainbow
OPT_ENERGY Draw the dotted lines showing interactions in a color representative
of the optimal energy achievable for this interaction. (The energy for the
interaction at the optimal geometry of the two interacting groups.) A color
rainbow will be defined between two given colors across the range of two
given energies. You are required to enter:
<no. of intervals> Enter the number of intervals (integer) that the energy
range will be split into
<min energy> Enter the minimum energy value (floating-point number) for
the start of the energy range. The default offered is the best achievable
score for an interaction in FlexX (-8.3)
<max energy> Enter the maximum energy value (floating-point number) for
the end of the energy range. The default value of 0.0 is offered.
<first color> Enter the first color of the color rainbow
<second color> Enter the second (end) color of the color rainbow
CONTACT The object will be drawn in a color representing its interaction (contact)
type. The colors for each type are defined in the GRAPHIC static data file.
Note: the colors are taken from the contact types on the protein side.
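How an energy value might be mapped onto such a color rainbow is sketched below in Python. The manual does not specify the mapping, so this is only one plausible reading: it assumes equal-width intervals, clamping of out-of-range energies, and linear interpolation between two color-circle angles. The function names energy_interval and rainbow_angle are hypothetical.

```python
def energy_interval(energy: float, min_energy: float = -8.3,
                    max_energy: float = 0.0, n_intervals: int = 10) -> int:
    """Return the index (0-based) of the interval an energy falls into.

    Assumption: equal-width intervals; energies outside the range are clamped.
    """
    if n_intervals < 1 or max_energy <= min_energy:
        raise ValueError("need n_intervals >= 1 and min_energy < max_energy")
    fraction = (energy - min_energy) / (max_energy - min_energy)
    index = int(fraction * n_intervals)
    return max(0, min(n_intervals - 1, index))


def rainbow_angle(index: int, n_intervals: int,
                  first_angle: float, second_angle: float) -> float:
    """Interpolate linearly between two color-circle angles (an assumption)."""
    if n_intervals == 1:
        return first_angle
    t = index / (n_intervals - 1)
    return first_angle + t * (second_angle - first_angle)


idx = energy_interval(-4.0, min_energy=-8.0, max_energy=0.0, n_intervals=4)
print(idx, rainbow_angle(idx, 4, first_angle=240.0, second_angle=120.0))
```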
7.8.25 Selecting labels for drawing the placements (SELLAB)
Syntax: SELLAB <ia type> <energy> <opt energy>
Description: When the placements are drawn, FlexX stores information about the
interactions between the ligand and receptor for display in the graphics interface.
You can choose what should appear in the label using the SELLAB command. For
yes/no questions you can enter either y, yes or 1 for yes, and similarly n, no
or 0 for no.
<ia type> Yes/no answer:
yes Include the type of the interaction in the label
no Do not include the type in the label
<energy> Yes/no answer:
yes Include the actual energy score of the interaction in the label
no Do not include the energy score in the label
<opt energy> Yes/no answer:
yes Include the optimal achievable energy score for this interaction in the label.
The optimal energy occurs when the positions of the two interacting
groups have an optimal geometry
no Do not include the optimal energy score in the label
7.8.26 Drawing placements (DRAW)
Syntax: DRAW <coordinate set> [<rms_limit>] [<rank_limit>] [<filename>]
Description: DRAW generates a drawing of a docking placement (prediction of the
ligand conformation in the active site) and sends it to a file ready to be displayed
in the graphics interface. For details about what exactly is drawn see the SELGRA
command.
<coordinate set> A choice of coordinates is stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked at a position from 1 to
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with RMS lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: Drawings are not displayed automatically. Use DISPLAY to out-
put the drawing on the graphics device.
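The rms selection rule described for <coordinate set> can be written down in a few lines. The Python sketch below is an illustration, not FlexX code; the names Solution and select_by_rms are hypothetical, and the RMS values are those of the first five placements in the SASTAB example table of Section 7.8.29.

```python
from typing import NamedTuple, Optional, Sequence


class Solution(NamedTuple):
    rank: int    # 1 is the best-scored docking solution
    rms: float   # RMS deviation of the placement


def select_by_rms(solutions: Sequence[Solution], rms_limit: float,
                  rank_limit: int) -> Optional[Solution]:
    """Return the highest-ranking solution with rms < rms_limit among the
    solutions ranked 1 to rank_limit, or None if there is no such solution."""
    for sol in sorted(solutions, key=lambda s: s.rank):
        if sol.rank > rank_limit:
            break
        if sol.rms < rms_limit:
            return sol
    return None


solutions = [Solution(1, 1.395), Solution(2, 1.421), Solution(3, 1.249),
             Solution(4, 0.733), Solution(5, 1.055)]
print(select_by_rms(solutions, rms_limit=1.0, rank_limit=10))
print(select_by_rms(solutions, rms_limit=1.0, rank_limit=3))
```

With <rank_limit> set to 3, no solution qualifies in this example, because the first placement with an RMS below 1.0 is ranked fourth.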
7.8.27 Drawing multiple placements (MDRAW)
Syntax: MDRAW <placement selection> [<multiple draw directory> <filename>]
Description: Generates drawings of a selected set of placements.
<placement selection> Enter your selection as a list of integers or integer ranges
(format a-b) separated by commas or blanks. The integers are the ranks of the
docking solutions you want to draw.
Alternatively, enter the letter q: the result list from the last query command
(submenu DOCKING, commands QUERY, LISTSOL, LISTRMS, etc.) is then used to
form the selection.
[<multiple draw directory> <filename>] If the graphics are not to be stored in
temporary files (see SELADM), enter a directory for containing the graphics files
and a base filename for the graphics files here.
The graphics are appended one after the other within the selected graphics object
and appear in FlexV attached to a slider. You can scroll through the different draw-
ings by moving this slider (see FlexV manual).
Important notes: Drawings are not displayed automatically. Use DISPLAY to output
the drawings to FlexV.
7.8.28 Listing the graphic items (GRAINF)
Syntax: GRAINF
Description: Outputs a list of all current graphic settings for drawing docking re-
sults.
7.8.29 *Special commands for analyzing docking results (ANALYZE)
This menu contains a collection of commands for analyzing the results of your docking run
or of a loaded complex (X-ray data, for example).
Evaluating the score of a protein-ligand complex (SCORE)
Syntax: SCORE <coordinate set> [<rms_limit>] [<rank_limit>] <table format>
<draw> <smiles> [<recalc>] [<recalc_hydro_pos>]
Description: Searches for interactions and computes an energy estimation for the
ligand placed on a given set of coordinates. For yes/no questions you can enter
either y, yes or 1 for yes, and similarly n, no or 0 for no.
<coordinate set> A choice of coordinates is stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked at a position from 1 to
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with RMS lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
If reference coordinates are used and hydrogens do not have coordinates, these
hydrogens are placed using the local geometry of the fix coordinates.
<table format> Yes/no answer:
yes Return a table containing all scoring function terms plus a table containing
the ligand-receptor overlap terms. With a higher verbosity setting, information
about interactions (at or above level 5) and contacting atoms (at or
above level 7) is also output (see explanation for <recalc> below).
no Return the scoring and overlap information in a single line format.
<draw> Yes/no answer:
yes Automatically generate a drawing of the ligand in the chosen conforma-
tion, the receptor and the interaction information etc. in the form of the
DOCKING/DRAW command. The graphics must be visualized using the
DISPLAY command.
no Do not generate a drawing.
<smiles> Yes/no answer:
yes Write out a SMILES string for the ligand in the output.
no Do not output the SMILES string.
<recalc> If coordinates from a docking solution are selected, a recalculation of the
scoring terms can be enforced using this parameter. Yes/no answer:
yes Force a recalculation of the scoring terms. This makes information about
interactions and contacting atoms available for output when coordinates
are taken from a docking solution.
no Use the original scores from the docking calculation. However, information
about interactions and contacting atoms will not be available.
<recalc_hydro_pos> If fix coordinates are selected, a recalculation of the hydrogen
coordinates can be enforced using this parameter. Yes/no answer:
yes The hydrogens are placed using the local geometry of the fix coordinates.
no The fix coordinates are taken.
Note: The calculated information together with the formed interactions cannot be
stored.
Important note: In rare cases the scores printed directly after a docking calculation
and the scores recalculated with ANALYZE/SCORE from a conformation read from file
are not exactly the same. During the complex buildup, already placed and scored
fragments may, during further buildup stages, experience a slight positional change
due to optimization. Thus, the final conformation can deviate from what would be the
matching position for this score. If this conformation is written to file and is read
in again, the scoring terms as calculated during the original docking are not
available, causing the deviations. As mentioned before, this scenario has been
observed only very rarely, and when it has, only negligible differences in score
occurred.
Evaluating the PLP score of a protein-ligand complex (PLP)
Syntax: PLP <coordinate set> [<rms_limit>] [<rank_limit>] <table format>
Description: Computes the PLP score for the ligand placed on a given set of co-
ordinates. For yes/no questions you can enter either y, yes or 1 for yes, and
similarly n, no or 0 for no.
<coordinate set> A choice of coordinates is stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked at a position from 1 to
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with RMS lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
<table format> Yes/no answer:
yes Print the output information in table format. With a higher verbosity setting,
information about interactions (at or above level 5) and contacting
atoms (at or above level 7) is also output (available for fix and ref coordinate
sets only).
no Print the information in single line format suitable for log file output.
Computing the overlap volume (OVERLAP)
Syntax: OVERLAP <coordinate set> [<rms_limit>] [<rank_limit>]
Description: Sometimes detailed information about the overlap volume of a pre-
dicted or a given placement is of interest. Type OVERLAP to output a list of all
overlapping ligand atoms. For each atom, the list of intersecting protein atoms and
the corresponding overlap volumes is output.
<coordinate set> A choice of coordinates is stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked at a position from 1 to
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with RMS lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
Requirements: Protein and ligand must be loaded.
Computing the lipophilic contact area (CONTACT)
Syntax: CONTACT <coordinate set> [<rms_limit>] [<rank_limit>]
Description: This command gives detailed information about the contact term of
the scoring function. For each ligand atom, all contributions divided into lipophilic,
ambiguous (one lipophilic, one hydrophilic atom), and clash are summarized.
<coordinate set> A choice of coordinates is stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked at a position from 1 to
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with RMS lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
Requirements: Protein and ligand must be loaded.
SAS (solvent-accessible surface) of a complex I
Syntax: SASTAB <table format> [<mode>] [<placements>] <fix coord> [<ref
coord>]
Description: Calculates the SAS (solvent-accessible surface) of the receptor,
ligand, the protein-ligand complex, the receptor as in the protein-ligand complex,
and the ligand as in the protein-ligand complex with an internal approximation
algorithm and outputs it to the screen.
If <table format> is set to y, the SAS values will be output in form of a table. The
following modes are available then:
(1) Total SAS values
(2) Lipophilic SAS values
(3) Hydrophilic SAS values
If <table format> is set to n, total and lipophilic SAS values are printed on a single
line for every selected placement.
<placements> corresponds to the selection of placements. It can either be a single
number, a comma-separated list of numbers, a list of intervals of the form a-b, or
all. For every placement selected, the SAS values are calculated.
If <fix coord> is set to y, the SAS values for the currently given ligand structure
are calculated. If <ref coord> is set to y, the SAS values for the currently loaded
reference ligand are calculated.
The following table contains the hydrophilic SAS values for the currently given ligand
(fix), the currently loaded reference ligand (ref), and the first ten docking
solutions (1-10).
The columns of the table:
Protein SAS the SAS of the receptor
Placement SAS the SAS of the ligand
Protein-Ligand Complex(PLC) the SAS of the protein-ligand complex
Protein-SAS in PLC the SAS of the receptor in the protein-ligand complex
Ligand-SAS in PLC the SAS of the ligand in the protein-ligand complex
Energy the total FlexX docking energy of the placement
RMS the rms value of the placement
Note: Both energy and rms value of the currently given ligand structure (and the
loaded reference ligand, respectively) are 0.0.
Example
>> Table of hydrophilic SAS:
No.|Protein |Placement |Protein-Ligand|Protein-SAS |Ligand-SAS |Energy | RMS
| SAS | SAS | Complex(PLC) | in PLC | in PLC | |
----+---------+----------+--------------+------------+-----------+---------+-------
fix| 3838.610| 331.749 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
ref| 3838.610| 338.369 | 3825.917 | 3747.233 | 78.684 | 0.000 | 0.000
1| 3838.610| 324.041 | 3768.605 | 3736.451 | 32.154 | -65.224 | 1.395
2| 3838.610| 331.424 | 3783.861 | 3735.579 | 48.282 | -64.391 | 1.421
3| 3838.610| 316.454 | 3777.157 | 3745.432 | 31.725 | -63.659 | 1.249
4| 3838.610| 342.482 | 3798.655 | 3734.981 | 63.674 | -63.503 | 0.733
5| 3838.610| 334.216 | 3778.072 | 3730.000 | 48.072 | -63.428 | 1.055
6| 3838.610| 328.230 | 3784.225 | 3731.497 | 52.727 | -62.716 | 0.916
7| 3838.610| 333.711 | 3794.373 | 3740.160 | 54.213 | -62.155 | 0.899
8| 3838.610| 348.079 | 3794.319 | 3725.670 | 68.649 | -61.805 | 1.251
9| 3838.610| 331.315 | 3777.869 | 3733.473 | 44.396 | -61.726 | 1.396
10| 3838.610| 335.720 | 3779.957 | 3735.183 | 44.774 | -60.543 | 1.201
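Rows of this table can be read back for further analysis. The Python sketch below parses one row and derives how much surface becomes inaccessible on complex formation. This is an illustration only: the function names parse_sas_row and buried_area and the dictionary keys are hypothetical, and the "buried" values are a derived quantity that SASTAB itself does not print.

```python
from typing import Dict, Tuple

COLUMNS = ("protein_sas", "placement_sas", "plc",
           "protein_in_plc", "ligand_in_plc", "energy", "rms")


def parse_sas_row(line: str) -> Tuple[str, Dict[str, float]]:
    """Split one |-separated table row into its label and numeric columns."""
    fields = [field.strip() for field in line.split("|")]
    label = fields[0]  # "fix", "ref", or the placement number
    values = [float(field) for field in fields[1:8]]
    return label, dict(zip(COLUMNS, values))


def buried_area(row: Dict[str, float]) -> Dict[str, float]:
    """Surface that is accessible in the free molecule but not in the complex."""
    return {
        "protein": row["protein_sas"] - row["protein_in_plc"],
        "ligand": row["placement_sas"] - row["ligand_in_plc"],
    }


label, row = parse_sas_row(
    "  1| 3838.610| 324.041 | 3768.605 | 3736.451 | 32.154 | -65.224 | 1.395")
print(label, buried_area(row))
```

In the example rows, the complex value equals the sum of the Protein-SAS in PLC and Ligand-SAS in PLC columns, which is a convenient consistency check when parsing.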
SAS (solvent-accessible surface) of a complex II
Syntax: SAS <placements>
Description: Calculates the SAS (solvent-accessible surface) of the protein-ligand
complex with an internal approximation algorithm and outputs it on the screen.
<placements> corresponds to the selection of placements. It can either be a single
number, a comma-separated list of numbers, a list of intervals of the form a-b, or
all. For every placement selected, the SAS values are calculated. The placement 0
refers to the currently given ligand structure.
Note: The command SAS has expired with Release 2.
Computing an RMSD histogram (RMSHIST)
Syntax: RMSHIST
Description: Computes all pairwise RMSDs between the computed placements.
The RMSDs are counted in bins with 1.0 width.
Requirements: A complex prediction must have been computed.
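The idea behind this histogram is sketched below in Python. The sketch is not the FlexX implementation and rests on assumptions: each placement is given as a list of atom coordinates in identical atom order, the RMSD is computed in place without superposition, and bin k covers the range from k to k+1 times the bin width. The function names rmsd and rms_histogram are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """In-place RMSD between two placements with identical atom order."""
    if len(a) != len(b) or not a:
        raise ValueError("placements must have the same, non-zero atom count")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(a, b))
    return math.sqrt(squared / len(a))


def rms_histogram(placements: List[Sequence[Coord]],
                  bin_width: float = 1.0) -> Dict[int, int]:
    """Count all pairwise RMSDs in bins of the given width."""
    counts: Counter = Counter()
    for a, b in combinations(placements, 2):
        counts[int(rmsd(a, b) // bin_width)] += 1
    return dict(sorted(counts.items()))


# Hypothetical one-atom placements, for illustration only.
placements = [[(0.0, 0.0, 0.0)], [(0.5, 0.0, 0.0)], [(2.5, 0.0, 0.0)]]
print(rms_histogram(placements))
```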
Comparing PDB water locations (WATER)
Syntax: WATER <search radius> <details> <draw> <dock no>
Description: The WATER command searches for water molecules in the PDB file of
the current protein; the resulting water molecules lie near the protein as well as
close to the reference ligand. All waters having a distance less than <search radius>
to both molecules are considered further.
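The distance criterion can be sketched in Python as follows. This is an illustration and not FlexX code; it assumes that the distance of a water to a molecule is the distance to the nearest atom of that molecule. The function names min_distance and select_waters and the coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def min_distance(point: Coord, atoms: Sequence[Coord]) -> float:
    """Distance from a point to the nearest atom of a molecule."""
    return min(math.dist(point, atom) for atom in atoms)


def select_waters(waters: Sequence[Coord], protein_atoms: Sequence[Coord],
                  ligand_atoms: Sequence[Coord],
                  search_radius: float) -> List[Coord]:
    """Keep waters closer than search_radius to both protein and ligand."""
    return [w for w in waters
            if min_distance(w, protein_atoms) < search_radius
            and min_distance(w, ligand_atoms) < search_radius]


# Hypothetical coordinates, for illustration only.
waters = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(select_waters(waters, protein_atoms=[(0.0, 0.0, 0.0)],
                    ligand_atoms=[(2.0, 0.0, 0.0)], search_radius=3.0))
```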
For yes/no questions you can enter either y, yes or 1 for yes, and similarly n,
no or 0 for no. If <details> is answered with yes, information about the B-factor,
the IDs of the closest particle (phantom) and the closest particle (*) that is actually
used in the selected placement as well as the protein and reference ligand contacts
are shown.
If <details> is answered with no, a single line per water molecule in the PDB file is
printed. The columns are:
W Water from PDB file
ID FlexX internal unique ID
name in PDB file
Chain ID from PDB file
number Hetero group number in PDB file
B-Factor from PDB file
# prot. contacts Number of contacts to protein atoms
# lig contacts Number of contacts to reference ligand atoms
ID phantom ID of closest not necessarily used particle
dist. phantom Distance to closest not necessarily used particle
ID used * ID of closest particle involved in the particular placement
dist. used * Distance to closest particle involved in the particular placement
In addition, a single line is output containing the following columns:
filename of the PDB file
score * of the particular placement
RMS * of the particular placement
rank * of the particular placement
# particles Total number of particles
# water The number of PDB waters found in the PDB le within the
<search radius>
# used * The number of particles involved in the particular placement
# part. <= 1.0 The total number of particles within a distance of 1.0 with
more than one protein contact
# part. <= 1.5 The total number of particles within a distance of 1.5 with
more than one protein contact
# part. > 1.5 The total number of particles within a distance larger than 1.5 with
more than one protein contact
# part. <= 1.0 The total number of particles within a distance of 1.0 with
exactly one protein contact
# part. <= 1.5 The total number of particles within a distance of 1.5 with exactly
one protein contact
# part. > 1.5 The total number of particles within a distance larger than
1.5 with exactly one protein contact
# used <= 1.0 * The number of particles involved in the particular place-
ment within a distance of 1.0 with more than one protein contact
# used <= 1.5 * The number of particles involved in the particular place-
ment within a distance of 1.5 with more than one protein contact
# used > 1.5 * The number of particles involved in the particular placement
within a distance larger than 1.5 with more than one protein contact
# used <= 1.0 * The number of particles involved in the particular placement
within a distance of 1.0 with exactly one protein contact
# used <= 1.5 * The number of particles involved in the particular place-
ment within a distance of 1.5 with exactly one protein contact
# used > 1.5 * The number of particles involved in the particular place-
ment within a distance larger than 1.5 with exactly one protein contact
If <draw> is switched on, the PDB water molecules found are drawn. Because PDB
water is not included in the graphics mechanism, the color and the object number
cannot be changed. Also, PDB water should be drawn after a first DISPLAY command
has been executed, for example to draw the active site or the ligand.
* Note: The ligand atom names and the computed distances always refer to the
reference ligand. Only the columns that are marked with * in the legend above refer
to a particular placement <dock no>!
Requirements: A ligand including reference coordinates and a protein must be
loaded. In order to get information about particles, parameter PLACE_PARTICLES
must be set to 1, 2 or 3 before docking.
Searching for cavities between protein and ligand (CAVITY)
Syntax: CAVITY <placement> <probe_radius> <contact_radius> <contacts>
<draw>
Description: Detects cavities between the selected <placement> and the protein.
The radius of the probe (<probe_radius>), which determines the size of the cavities
that are found, the search radius (<contact_radius>) and the number of
hydrophobic contacts to the protein (<contacts>) must be given. In addition, the
contacts are drawn if <draw> is set to y.
If more than a single contact to the protein is requested, the number of hydrophobic
contacts is listed in the shell.
Generating a Ligplot input file (2DPLOT)
Syntax: 2DPLOT <coordinate set> [<rms_limit>] [<rank_limit>] <filename>
Description: 2DPLOT generates a pdb file <filename> containing the protein and
the ligand molecule with fix coordinates, reference coordinates, coordinates from a
placement or coordinates selected by RMSD and rank. The ligand is given a separate
amino acid number so that the file can be easily processed with Ligplot [33].
<coordinate set> A choice of coordinates is stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked at a position from 1 to
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with RMS lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
<filename> Enter the filename for the 2D plot.
Requirements: A ligand and a protein must be loaded.
Computing and writing all energy and matching scores to a logfile (MATCHING)
Syntax: MATCHING <coordinate set> [<rms_limit>] [<rank_limit>] [<recalc>]
[<recalc_hydro_pos>] <filename> <extended output> [<add lig atom>]
[<separator>] <fillchar> <append> [<pvm merge>]
Description: Searches for interactions and computes an energy estimation for the
ligand placed on a given set of coordinates. Then writes all energy and matching
scores to a log le <filename>.
For yes/no questions you can enter either y, yes or 1 for yes, and similarly n,
no or 0 for no.
<coordinate set> Several coordinate sets are stored with the ligand. The choices
are (enter one of the following words, or an integer for <dock soln number>):
rms Available only after a docking calculation. For this choice, the coordinates
from the docking solution that meets the following two criteria will be used:
<rms_limit> The docking solution must have an RMS deviation less than
this limit (enter the limit as a floating-point number).
<rank_limit> The docking solution must be ranked between position 1 and
<rank_limit> (enter the limit as an integer).
For example, if <rms_limit> is 1.0 and <rank_limit> is 10, the highest
ranking solution with an RMS deviation lower than 1.0 among the first 10 solutions is
taken.
<dock soln number> Enter an integer greater than 0, corresponding to the
rank of the docking solution you want to use.
fix Coordinates are taken from the ligand input file.
ref Coordinates are taken from the reference coordinates (created via one of
the commands SETREF, READREF or MAPREF).
If reference coordinates are used and hydrogens do not have coordinates, these
hydrogens are placed using the local geometry of the fix coordinates.
<recalc> If coordinates from a docking solution are selected, a recalculation of the
scoring terms can be enforced using this parameter. Yes/no answer:
yes Force a recalculation of the scoring terms.
no Use the original scores from the docking calculation.
<recalc_hydro_pos> If fix coordinates are selected, a recalculation of the hydrogen
coordinates can be enforced using this parameter. Yes/no answer:
yes The hydrogens are placed using the local geometry of the fix coordinates.
no The fix coordinates are taken.
<filename> The name of the file is <filename>.log if <filename> does not
contain any suffix; otherwise the name is <filename>.
<extended output> Defines the output format. Yes/no answer:
yes Prints an output line to the file like the command EXPORT. For more details,
see 7.8.21.
no Prints an output line to the file like the command PRINTSOL. For more
details, see 7.8.20.
<add lig atom> Only if <extended output> is set to y. Yes/no answer:
yes The ligand atom info of the matching ligand atoms is also written to the
file. The ligand atom info consists of the atom name and the infile number.
The matching score, the atom name and the infile number are separated
by :.
no Only the matching score is written.
<separator> Only if <extended output> is set to n. Each column of the
PRINTSOL line is separated by <separator>.
<fillchar> If a solution has no result for a match column, this column then contains
<fillchar>.
<append> Whether to append the line to the file. Yes/no answer:
yes The line will be appended to <filename> without a header row. If the
receptor does not change, all lines have the same format for all ligands
because each row has the same columns.
no The file <filename> will be created and the first row is a header row.
<pvm merge> In FlexX-PVM, the csv file is automatically merged after parallel
script execution. This feature can be switched off by setting <pvm merge> to
no.
Notes: The calculated information together with the formed interactions cannot be
stored.
Important note: See the notes for the command SCORE. For filename usage and file
merging within scripts, please refer to the PVM section (Section 8.1.8).
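The naming rule for the MATCHING log file and the colon-separated ligand atom info can be illustrated with a short Python sketch. Both helper names are hypothetical, and two points are assumptions rather than documented behavior: what counts as a "suffix" (here: whatever os.path.splitext recognizes) and how the score is formatted as text.

```python
import os


def resolve_log_name(filename: str) -> str:
    """Append '.log' only if the name carries no suffix (assumed: splitext rule)."""
    _, suffix = os.path.splitext(filename)
    return filename if suffix else filename + ".log"


def format_match_entry(score: float, atom_name: str, infile_number: int) -> str:
    """Join matching score, atom name and infile number with ':' as described."""
    return f"{score}:{atom_name}:{infile_number}"


print(resolve_log_name("scores"))      # scores.log
print(resolve_log_name("scores.csv"))  # scores.csv
print(format_match_entry(-4.7, "O3", 12))  # -4.7:O3:12
```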
8 Additional modules for FlexX
8.1 Parallel Virtual Machine (PVM submenu)
The PVM module of FlexX allows you to execute a script on a Parallel Virtual Machine
(PVM). The PVM submenu is only present if the PVM interface is available and the PVM
module is activated in FlexX. The PVM module is a standard module of FlexX; no additional
license is necessary.
Setting up a parallel calculation is much easier if you understand what is
going on behind the scenes. The basic setup is as follows:
A master FlexX process runs on your workstation. This instance of FlexX reads a
prepared script which it can split into jobs (it does this by extracting the iterations of a loop).
The master process then starts slave FlexX processes on remote machines via PVM and a
remote login. These slave processes then receive jobs from the master process. When a
slave finishes one job it may receive another. When all jobs have been sent, the master waits
for them to finish and then tidies up and ends the calculation.
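The master/slave pattern described above can be mimicked with a small Python sketch. It is an analogy only: local threads stand in for remote FlexX slave processes, no PVM is involved, and run_job is a hypothetical placeholder for executing one loop iteration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(iteration_value: str) -> str:
    # Placeholder for one loop iteration executed by a slave process.
    return f"docked {iteration_value}"


def master(iterations, n_slaves: int):
    # The pool hands the next job to whichever worker becomes idle. Leaving the
    # 'with' block waits for all jobs to finish (the "tidy up" step).
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        return list(pool.map(run_job, iterations))


print(master(["lig_a", "lig_b", "lig_c"], n_slaves=2))
```

The results come back in the order of the loop iterations, independent of which worker processed which job.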
8.1.1 Preliminaries
There are several important points to get right in order to get parallel computing working
with FlexX. Here is a list of things to prepare:
PVM must be installed on all machines that will run master or slave processes.
You must be able to execute a remote login to all machines that will run processes
without having to enter a password. Make sure that your .rhosts or equivalent file
is correctly defined.
The environment variables PVM_ROOT, PVM_ARCH and PVM_RSH must be set. These
must point to the PVM installation, your platform architecture name and your remote
login type. Often the script $PVM_ROOT/lib/pvmgetarch can be used to find your
platform architecture. For example, the variables may look like this:
PVM_ROOT=/software/pvm/pvm
PVM_ARCH=LINUX
PVM_RSH=/usr/bin/ssh
You should have the following set in your path:
$PVM_ROOT/lib
$PVM_ROOT/lib/$PVM_ARCH
$PVM_ROOT/bin/$PVM_ARCH
All the machines must be able to access the same file system (NFS) for the data files.
All the machines must be able to access the same FlexX executable. This
executable or a link to a central executable must be found in either
$PVM_ROOT/bin/$PVM_ARCH or /home directory/pvm3/bin/$PVM_ARCH. For
example:
/home/user/pvm3/bin/LINUX/flexx -> /install/software/BioSolveIT/FLEXX/bin/flexx
The executable name must match the setting of the program flag FLEXX (see Section
10.1).
See the PVM manual for further information on parallel computing with PVM [11].
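Some of the preliminaries above can be checked before starting a run. The following Python sketch is a hypothetical helper, not part of FlexX or PVM; it only verifies that the three environment variables are set and that the three directories listed above appear in PATH.

```python
import os

REQUIRED_VARS = ("PVM_ROOT", "PVM_ARCH", "PVM_RSH")


def missing_pvm_settings(env):
    """Return a list of human-readable problems found in the given environment."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    root, arch = env.get("PVM_ROOT"), env.get("PVM_ARCH")
    if root and arch:
        expected = [f"{root}/lib", f"{root}/lib/{arch}", f"{root}/bin/{arch}"]
        path_entries = env.get("PATH", "").split(os.pathsep)
        problems += [f"{d} is not in PATH" for d in expected if d not in path_entries]
    return problems


# Check the environment of the current shell.
print(missing_pvm_settings(dict(os.environ)))
```

Passwordless remote login, the shared file system and the location of the FlexX executable are not covered by this check.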
8.1.2 Starting PVM
If no PVM daemon is running, typing pvm on the console starts the daemon. Alternatively, use
the FlexX command TOPVM (8.1.3) to start the PVM daemon. There is a message during
FlexX start-up saying whether a PVM daemon is detected or not. If the PVM environment
variables PVM_VMID and/or PVM_TMP are used, FlexX uses these environment variables,
too. For this purpose, the environment variables PVM_ROOT and PVM_RSH
must be set on all machines that will run master or slave processes.
8.1.3 Configuring PVM
The parallel machine is configured by FlexX itself. Under Linux only (!), in
File > Global Preferences,
the GUI has a tab with Parallel Computing configuration possibilities. The dialog expects
input which consists of a list of host names, each followed by a number specifying the maximum
number of FlexX processes allowed on this host and an optional nice value for all processes
on this host (see section 10.1.7). FlexX only uses the specified hosts regardless of the configuration
of PVM. The configuration of PVM can be modified or viewed with the GUI's Global
Preference Settings, or, alternatively, in Commandline Mode, with the following commands.
Outputting the PVM configuration (INFO)
Syntax: INFO
Description: Generates a list of all hosts with the corresponding maximum num-
ber of FlexX processes and outputs it on the screen. If the PVM daemon is not
running, a status message about the PVM daemon is output.
Modifying the PVM configuration (ADD)
Syntax: ADD <host name> <max processes> <nice val>
Description: Adds a new host to the PVM configuration. If a host named <host
name> is already contained in the host list, the maximum number of processes is
changed to <max processes>. If the maximum number of processes is set to 0, the
host remains unused during a parallel computation. Finally, a nice value can be
defined for FlexX processes on this host.
Important notes: ADD modifies only the internal list of hosts and does not actually
add the host to the PVM. This is done during start-up of a parallel execution. Therefore,
error messages about the availability of a host appear during start-up and not
after adding a host.
Removing a host from the PVM configuration (REMOVE)
Syntax: REMOVE <host name>
Description: Removes a host from the PVM configuration.
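The bookkeeping behind ADD and REMOVE can be pictured as a simple host table. The Python sketch below is a conceptual model with hypothetical names and host names; it assumes that re-adding a host also replaces its nice value, which the description above does not state explicitly.

```python
def add_host(hosts: dict, name: str, max_processes: int, nice: int = 0) -> None:
    # Adding an existing host changes its maximum number of processes
    # (assumption: the nice value is replaced as well).
    hosts[name] = (max_processes, nice)


def remove_host(hosts: dict, name: str) -> None:
    hosts.pop(name, None)


def usable_hosts(hosts: dict) -> list:
    # Hosts with a maximum of 0 processes stay in the list but remain unused.
    return [name for name, (max_processes, _) in hosts.items() if max_processes > 0]


hosts = {}
add_host(hosts, "node1", 2)
add_host(hosts, "node2", 4, nice=10)
add_host(hosts, "node1", 0)   # node1 is kept in the list but will not be used
print(usable_hosts(hosts))    # ['node2']
```

As in FlexX, nothing in this model contacts the hosts; availability problems would only show up when a parallel execution starts.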
Calling the PVM console (TOPVM)
Syntax: TOPVM
Description: Calls the PVM console. If no PVM daemon is running, a daemon
will be started. See the PVM manual for a list of console commands. Typing halt
terminates the daemon, while typing quit does not. Both commands terminate the
console and return to FlexX.
Important notes: Typing reset kills all processes running under the control of the
PVM daemon. This may leave behind temporary files which are not deleted by FlexX.
8.1.4 Executing parallel batch files
The execution of a parallel batch file is initiated by the SCRIPT command or by the command
line option -b. If all of the following conditions hold, the batch file is executed in parallel:
The FlexX program is a FlexX-PVM executable.
The PVM daemon is running on the machine on which the FlexX process is started.
The control flag USE_PVM_FEATURE is set to 1 (see section 10.1.4).
The current FlexX process is not a work process.
The FlexX configuration (cf. 10.1) contains a list of hosts with a maximum number of
processes greater than zero or a list of hosts is defined interactively with the PVM/ADD
command.
The batch file contains a FOR_EACH loop.
The batch file does not contain any of the following commands:
batch file/script commands: INPUT, SELINP, WAIT
FlexX commands: EDITCFG, DISPLAY, TOFLEXV
commands within the PVM submenu
commands within the DATABASE submenu
the command for changing the environment variable PREDICT (namely SET
PREDICT)
If one of the conditions does not hold, the batch file is executed sequentially. An easy way
to control whether a batch file is executed in parallel or not is to start the PVM daemon in
advance.
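The script-related conditions from the list above can be pre-checked with a small Python sketch. It is a hypothetical helper, not a FlexX feature, and covers only part of the list: it does not detect commands from the PVM or DATABASE submenus, nor any of the runtime conditions (daemon, control flag, host list). It assumes that command names are case-insensitive and that # starts a comment.

```python
FORBIDDEN_COMMANDS = {"INPUT", "SELINP", "WAIT", "EDITCFG", "DISPLAY", "TOFLEXV"}


def reasons_for_sequential_run(script_lines):
    """List the script properties that would prevent parallel execution."""
    reasons = []
    has_loop = False
    for line in script_lines:
        # Assumption: '#' starts a comment and commands are case-insensitive.
        tokens = line.split("#", 1)[0].upper().split()
        if not tokens:
            continue
        if tokens[0] == "FOR_EACH":
            has_loop = True
        if tokens[0] in FORBIDDEN_COMMANDS:
            reasons.append(f"forbidden command {tokens[0]}")
        if tokens[:2] == ["SET", "PREDICT"]:
            reasons.append("forbidden command SET PREDICT")
    if not has_loop:
        reasons.append("no FOR_EACH loop")
    return reasons


# Loop arguments are placeholders; the real FOR_EACH syntax is not shown here.
print(reasons_for_sequential_run(["FOR_EACH <loop arguments>", "WAIT", "END_FOR"]))
```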
With parallel execution, the current FlexX process becomes the master process, also called
the scheduler. The master process initiates work processes, schedules tasks to the work
processes, and collects and merges the resulting output files.
In a first step, the batch file is subdivided into three sections; the break points are the
FOR_EACH and END_FOR statements of the outermost loop which define the section to be
parallelized. The part before the FOR_EACH statement is called the init batch file, the part
between the FOR_EACH and END_FOR statement is called the loop batch file, and the remaining
part is called the post-processing batch file.
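The three-way split can be sketched in Python as follows. This is an illustration under assumptions, not the FlexX implementation: the script is given as a list of lines, statements are recognized by their first word, and the command names in the example are placeholders. Nested loops are not treated specially; depth counting is used only to find the END_FOR that closes the outermost FOR_EACH.

```python
def first_token(line: str) -> str:
    parts = line.split()
    return parts[0].upper() if parts else ""


def split_batch(lines):
    """Split a script into (init, loop, post) at the outermost FOR_EACH/END_FOR."""
    start = next((i for i, line in enumerate(lines)
                  if first_token(line) == "FOR_EACH"), None)
    if start is None:
        raise ValueError("no FOR_EACH loop found")
    depth = 0
    for end in range(start, len(lines)):
        token = first_token(lines[end])
        if token == "FOR_EACH":
            depth += 1
        elif token == "END_FOR":
            depth -= 1
            if depth == 0:
                return lines[:start], lines[start + 1:end], lines[end + 1:]
    raise ValueError("FOR_EACH without matching END_FOR")


script = ["INIT_STEP", "FOR_EACH <loop arguments>", "LOOP_STEP", "END_FOR", "POST_STEP"]
init_part, loop_part, post_part = split_batch(script)
print(init_part, loop_part, post_part)
```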
The parallel execution of nested loops within FlexX scripts is supported in cases where the
FOR_EACH and END_FOR statements directly follow one another (!) in a script. In the case
of nested loops the loop batch file adjusts to the section between the innermost FOR_EACH
and END_FOR statement.
The outermost loop must lie on the top level of FlexX's menu structure, that is, structures
such as
LIGAND
FOR_EACH ...
...
END_FOR
END
are forbidden, while
LIGAND
...
END
FOR_EACH ...
...
END_FOR
would be allowed.
The master then initiates FlexX work processes on all hosts according to the list of hosts
defined in the configuration file or interactively. Each work process executes the init batch
file. At this point, the master outputs the hosts on which a FlexX process is started on stdout
and starts the communication protocol.
After initiating the communication protocol, the master process sends the loop batch file
with corresponding input data to the work processes whenever one of them is idle. If a
work process is killed for any reason, the master automatically initiates a new work process
on the corresponding host and starts the next task on it. When all loop iterations have been
executed by the work processes, the master sends the post-processing batch file to them and
terminates the processes.
The output of the work processes is collected in files named pvm_outp_<tid>_<no>. Output
generated with LIST* or INFO commands preceded by SELOUTP with <pvm merge>
switched on is collected in temporary files and merged afterwards by the master into a single
file. These files do not differ from those generated by a sequential execution of the batch
script.
In addition, the master generates a log file named pvm_flexx-run_<no>. This file contains
a summary of all script execution events such as work process creation (including the task
ID of the process), output merging events, and commission execution events (including the
host where the corresponding loop instance was executed and the output of the last INFO
command contained in the loop). The log file contains information in chronological order
and is used for recovering partial calculations.
8.1.5 Aborting and recovering
The recommended way to start parallel execution is with the SCRIPT command from inside
FlexX (instead of a command line start with the -b option). The advantage is that a brief
communication protocol is output in the current window so that you always see how far
processing has already progressed. In this mode, the calculation can be aborted by pressing
any key and then entering abort as soon as the prompt appears. The scheduler stops sending new
commissions and waits until all processes are idle (roll-out). It will then merge the output
created so far and terminate such that FlexX switches back to interactive mode. Note that
this can take a while depending on the typical runtime of the individual computing tasks.
During this roll-out phase, another keypress followed by abort causes an immediate termination
without waiting for the results of running work processes. To immediately stop the
parallel execution, open a PVM console (by typing pvm in a different shell) and type reset.
If the parallel execution was aborted, the output files are already merged. If the scheduler
was interrupted for an immediate termination, the output files are not merged yet. In that case,
merging can be done offline with the OFMERGE command.
Offline merging of PVM output files (OFMERGE)
Syntax: OFMERGE <pvm log file>
Description: OFMERGE merges the output files from a parallel script execution.
Under standard conditions, output files are automatically merged. If the scheduler
is immediately terminated, the output files created up to this point can be manually
merged with OFMERGE afterwards.
Requirements: A valid log file and all temporary files from a previously aborted
calculation must be available.
Recovering an aborted or terminated parallel script execution (RECOVER)
Syntax: RECOVER <pvm log file>
Description: If a parallel script execution was aborted or immediately terminated,
FlexX is able to recover the calculation from the log file. RECOVER will analyze the
log file and create a list of already calculated commissions. The script must then
be restarted using the SCRIPT command. FlexX will skip the already calculated
commissions and continue its calculation, writing into the same log file.
Requirements: A valid log file and all temporary files from a previously aborted
calculation must be available.
8.1.6 Killing a single work process
If a parallel execution starts with the SCRIPT command from inside FlexX, a single work
process can be terminated by pressing any key and then kill as soon as the prompt ap-
pears. Then FlexX asks for the work process to be terminated, and sends a termination
instruction to this work process.
8.1.7 Working with parallel FlexX
It is advisable to start a parallel FlexX execution with an empty PREDICT directory. All
files except the local files containing the batch files (located in the PVM_TEMP directory)
are written into this directory (unless there are absolute filenames in the batch file). The
PREDICT directory must be modified in the FlexX configuration. Changing the PREDICT
directory interactively causes confusion between the master and the work processes.
The directories used during the FlexX run must be available and unique for all hosts. In a
standard installation, this is sometimes not the case for TEMP which can be a local directory.
Make sure that all FlexX processes access the same directories.
When a work process terminates during execution, a new work process is automatically
started by the master process. This process will have a new task ID and, in order to avoid
failure of the final merging of the output files, temporary files of the old work process are
renamed to the task ID of the new process. The new work process now executes the init
batch file before doing any computation. If the init batch file overwrites an output file,
data generated by the terminated process will be lost. This can be avoided by using
SELOUTP in append mode only.
8.1.8 Working with PVM
Working with PVM also has some pitfalls. The following list explains some of them:
Problems occur if your start-up file .cshrc or .profile produces output on stdout.
Make sure that this is not the case.
Hosts which are part of the Parallel Virtual Machine must be available for the user. In
particular, it must be possible to log in or execute an rsh command. Make sure that
your .rhosts file is defined appropriately.
PVM expects the user program to be located at $(PVM_ROOT)/bin/$(PVM_ARCH).
It is best to generate a link from there to the FlexX executable named flexx.
As mentioned previously (Section 8.1.1), it is crucial to employ the same FlexX executable for
both master and work nodes.
If you kill a FlexX master process, the PVM daemons on the client machines are not
terminated correctly and may still be running. This may cause problems if you start a new
job in parallel. Thus, we recommend (a) always interrupting a parallel computation
with abort (see Sec. 8.1.5) and (b) if you had to kill FlexX, also killing all PVM daemons and
removing temporary PVM lock files on all client machines manually before restarting another
job, i.e., execute killall pvmd3; rm -f /tmp/pvm* on all client machines.
Finally, for PVM usage there is a constraint on filenames if you want to employ them
within parallel scripts. One scenario might be that you'd like to do a parameterization
study and want to associate the filenames with the parameters, e.g.: SELOUTP
output_for_$(a_parameter)
In that case a file merging process at the end of PVM-parallelized jobs will work correctly
only if:
the respective final filename(s) remain either constant OR
only contain built-in, constant variables OR
the variables are set during the startup of the script.
If a filename itself contains variables set during the scripts (even outside the parallelized
loop), file merging will most probably not work properly due to operating system behavior.
To allow each slave process to write its own files, FlexX-PVM has a built-in batch
variable $(PVM_ID), which is built from the environment variable $PVM_VMID (Section 8.1.2), if it
is set, and the PVM task ID of the slave process:
$(PVM_ID) - _TEST_VMID_2883585, where $PVM_VMID is set to TEST_VMID
$(PVM_ID) - _2883585, where $PVM_VMID is not set
The batch variable is automatically added to files which will be merged at the end of PVM-parallelized
jobs.
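How $(PVM_ID) is composed can be reproduced with a one-line Python function. The function name is hypothetical; the composition rule is inferred from the two examples above and nothing more.

```python
from typing import Optional


def pvm_id(task_id: int, vmid: Optional[str] = None) -> str:
    """Compose the value of $(PVM_ID) from $PVM_VMID (if set) and the task ID."""
    return f"_{vmid}_{task_id}" if vmid else f"_{task_id}"


print(pvm_id(2883585, "TEST_VMID"))  # _TEST_VMID_2883585
print(pvm_id(2883585))               # _2883585
```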
8.2 Docking of combinatorial libraries
FlexX^c is an extension of FlexX which allows a more efficient docking of combinatorial libraries.
Since a combinatorial library is built from only a few building blocks with a well-defined
way of connecting them, information from previously docked molecules can be
reused to speed up the docking process. FlexX^c is more like a toolbox of algorithms for
combinatorial docking than a single method. The reason for this design is that often
information about the library is available, making the use of one or the other docking
scheme look more appropriate. For a detailed description of the data structures underlying
FlexX^c we refer to the scientific literature.
In order to use FlexX^c an additional license key is necessary.
FlexX^c is a combinatorial docking tool; it is not a combinatorial library design tool. The
difference is that in FlexX^c we expect a combinatorial library including all building blocks
as well as the rules for how to connect them as input, while a combinatorial library design
tool starts with fragments and creates a combinatorial library as output. FlexX^c can be useful
to prioritize among several libraries for screening or for determining preferable R-groups.
The user interface of FlexX^c contains two new menus, CLIB and CDOCK. The CLIB menu
contains all commands for handling combinatorial libraries and therefore replaces the
LIGAND menu in the classical docking. The CDOCK menu contains the commands for combinatorial
docking.
The following batch variables (see section 9.1.1) are available:
$(CLIB_READY) is set to TRUE, if a combinatorial library is available.
$(CORE_ID) contains the index of the current core group.
$(NOF_CORE_INST) contains the number of instances for the current core group.
$(NOF_RGROUPS) contains the number of R-groups (without the core group).
8.2.1 Generating combinatorial libraries from scaffolds (PERMUTE module)
As an extension introduced with Release 2 of FlexX, combilibs can be generated from scaffolds
based on a set of distinct rules. This new feature, which can be licensed separately
from FlexX^c, is called PERMUTE because it was initially intended for permutation of protonation
states or other close derivatives that can be generated from a ligand as a scaffold. If
a PERMUTE license is available but not the CLIB, you will be unable to read combilibs from
disk, and only the PERMUTE command can be used to access combilibs. Read the permute
tutorial 8.2.6 for further information on how to use PERMUTE.
8.2.2 Handling combinatorial libraries (CLIB submenu)
For FlexX^c, a combinatorial library consists of a core and up to nine additional R-groups,
usually numbered from 1 to 9. The core is a synonym for R-group number 0; there is no
difference in principle between the core and the R-groups. The alternative fragments for an
R-group, called R-group instances, must be stored in a single file (see section 8.2.2). If the file
has the (multi-)mol2 format then the connecting atoms of the R-group instances are defined
by their atom name. Each R-group instance (except core instances) must have exactly one
atom with a unique atom name (usually X), called the X-atom. In addition, each core or
R-group instance can have several so-called R-atoms which connect to other R-groups. R-atoms
must be marked with a unique atom name (usually R) followed by the number (1-9)
which designates the connecting R-group. X-atoms and R-atoms are required to be terminal
atoms, i.e. they have exactly one bond. The unique neighbor is called the X-neighbor or
R-neighbor atom. In order to connect two R-group instances, the corresponding X-atom
and R-atom are removed and a bond is formed between the X-neighbor and the R-neighbor
atom. The bond length is set to the average X-atom to X-neighbor bond length.
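The connection rule can be illustrated on a purely topological level. The Python sketch below uses a hypothetical, minimal representation (bonds as pairs of atom labels that are unique across both fragments); coordinates, atom types and the bond length adjustment are left out.

```python
def neighbor(bonds, atom):
    """Return the single bonding partner of a terminal atom."""
    partners = [b if a == atom else a for a, b in bonds if atom in (a, b)]
    if len(partners) != 1:
        raise ValueError(f"{atom} must be terminal (exactly one bond)")
    return partners[0]


def connect(core_bonds, r_atom, rgroup_bonds, x_atom):
    """Remove the R-atom and the X-atom and bond their former neighbors."""
    r_neighbor = neighbor(core_bonds, r_atom)
    x_neighbor = neighbor(rgroup_bonds, x_atom)
    kept = [b for b in core_bonds if r_atom not in b]
    kept += [b for b in rgroup_bonds if x_atom not in b]
    return kept + [(r_neighbor, x_neighbor)]


core = [("C1", "C2"), ("C2", "R1")]      # R1 marks the attachment point for R-group 1
rgroup = [("X", "N1"), ("N1", "H1")]     # X marks the attachment point of the instance
print(connect(core, "R1", rgroup, "X"))  # [('C1', 'C2'), ('N1', 'H1'), ('C2', 'N1')]
```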
Loading R-groups (READ)
Combinatorial libraries can either be loaded as a set of multi-mol2 files (one for each R-group)
or as a compact CSLN file. Both can be done with the command READ:
Syntax (mol2): READ <rgroup file> <r id> [<rgroup no> <x id>]
Syntax (CSLN): READ <csln file>
Description:
(multi-)mol2 files:
Reads an R-group/the core into FlexX^c main memory. All instances of the R-group
must be stored in a single multi-mol2 file <rgroup file>. All connecting
atoms to additional R-groups (R-atoms) must have the unique atom name <r
id> followed by a number 1-9. Each instance must have the same connecting
atoms. The first R-group to load must always be R0 (or the core). Subsequently
loaded R-groups are identified either by <rgroup no> or by the number in the
X-atom name. Each instance must have exactly one X-atom named <x id>,
optionally followed by a number. READ reads in the molecules and outputs
the list of found R-groups. Note that the R-atom type is compared with the X-atom
neighbor type. If the X-atom neighbor type is unique, the R-atom type is
set to it, otherwise it is set to Du (dummy type). The same atom type correction
is done for the X-atom by analyzing the R-atom neighbors.
CSLN files:
Reads a complete combinatorial library into FlexX^c main memory. The input
file must have CSLN format and have the extension .csln. Some restrictions
apply to the CSLN file:
mutual bonds must be of type single
multi-Markush atoms will be handled as independent R-groups
R-group instances may not be connected at two attachment points and
cause ring closures
the file size limit is currently 200 KB per input file (this cannot be changed
by the user!)
in order to create 3D coordinates FlexX passes the R-group instances
through the 3DGENERATOR, which therefore has to be defined in the configuration.
Note: Instances which contain only linker atoms are skipped during reading.
Important note (mol2): After the first execution of READ, the combinatorial library
has the OPEN status. After reading all R-groups, CLOSE must be executed.
Important note (CSLN): The filename must have the extension .csln, otherwise
FlexX will assume it is reading a (multi-)mol2 file.
Generating a combilib from scaffolds (PERMUTE)
Syntax: PERMUTE <use-orig> <core> <cleanup>
Description: This is the only new command for the PERMUTE license. Permute is
designed for generating and evaluating derivatives of a known scaffold as a combinatorial
library. One example is the permutation of protonation states. To use
permute, a ligand must be loaded and a base fragment selection (DOCK/SELBAS)
must be performed. Permute then tries to identify substructures defined in permute.dat.
Based on these substructures the molecule is cleaved and each fragment
is modified by one or more rules defined for each substructure, producing a number
of derivatives for each fragment. The result can be handled like any other conventional
combinatorial library. The combilib cannot be written to a file, but all
derivatives can be enumerated to a multi-mol2 file with CLIB/ENUM.
<use-orig> As a first step one substructure is identified in the scaffold molecule,
and all derivatives are generated from it by transforming it. Enter yes if you
want to use the original fragment as well.
<core> After identification of all fragments, one fragment must be selected as core
fragment; the default is -1 for automatic detection of the best core. The automatic
core detection mechanism uses the information that comes from the base fragment
selection and tries to find a fragment with the highest expected interaction
energy that does not overlap with multiple R-groups.
<cleanup> The transformation engine is quite powerful in modifying atomic
properties including the protonation of atoms, but for modifications that need
recalculation of coordinates, e.g. if atoms have been added or rings are closed or
opened, it is necessary to pass the fragments through a cleanup mechanism.
If you say yes here, all instances are passed through CORINA before putting
them into the combilib.
Important notes/Limitations: Permute has the same constraints as any other combilib;
combilibs with more than nine R-groups will be rejected. Multiple matches
of substructures in one ring system cannot be resolved by permute. Permute fails
if two or more substructures are to be applied to rings, because rings cannot be
cleaved and reassembled by the combilib mechanism. The user must define rules
that combine all the permutation rules for a ring in one single rule.
Finishing R-group loading (CLOSE)
Syntax: CLOSE
Description: Finishes the library load procedure. All open R-groups are substi-
tuted by hydrogens and physico-chemical data is assigned to each instance. After
executing CLOSE, the library is assigned the READY status and can be used for
docking calculations.
Requirements: The core and all R-groups must have been loaded with READ be-
fore.
Set/reset a filter expression for the library (SETFILTER)
Syntax: SETFILTER <expression>
Description: Sets or resets an <expression> for the combinatorial library. If an
<expression> is given, then the evaluation for the combinatorial library is activated.
Otherwise the evaluation is deactivated.
<expression> may be a basic molecular property or a complex logical expression
(see section 8.6.1).
If a filter expression is set before a combinatorial library is read, then only the instances
which fulfill the <expression> are loaded. Therefore the <expression>
should be chosen carefully.
If the evaluation is activated, then in each command where combinatorial
molecules are built up, the <expression> will be evaluated, i.e., in each step
where an instance is added to a combinatorial molecule, the extended molecule
has to fulfill the <expression> or the instance is removed from the combinatorial
molecule.
Requirements: In order to use the command SETFILTER, you need a license for
the SCREEN module!
Example
CLIB
setfilter "mass <= 500" # set a filter expression
read ... # each instance whose mass is
# greater than 500 is skipped
close
setfilter "%lipinski()" # set lipinski filter macro
END
CDOCK
selbas %
placec 3
extendmr n % # only combi. molecules which fulfill
# the lipinski filter macro will be
# docked
END
CLIB
setfilter # the evaluation is deactivated
END
Loading reference coordinates (READREF)
Syntax: READREF <rgroup no> <ref file> <ignore hydrogen>
Description: Reads reference coordinates for R-group <rgroup no> from a multi-mol2
file <ref file>. Reference coordinates are used for manual fragment placement
and RMSD calculations. There must be the same number of molecules in
<ref file> as there are instances in the R-group <rgroup no>. The i-th molecule
in <ref file> is mapped onto the i-th instance of R-group <rgroup no>. For each
molecule-instance pair, the assignment of atoms is based solely on the atom numbering.
Finally, if <ignore hydrogen> is answered yes, hydrogen atoms are disregarded
during loading.
Requirements: A library must be loaded and have READY status.
Setting reference coordinates (SETREF)
Syntax: SETREF <rgroup no> <ignore hydrogen>
Description: Copies the loaded coordinates to the reference coordinates for all in-
stances of R-group <rgroup no>. Reference coordinates are used for manual frag-
ment placement and RMSD calculations. You can decide whether the hydrogen
atoms should be taken into account or not in the comparison with the parameter
<ignore hydrogen>.
Requirements: A library must be loaded and have READY status.
Assigning reference coordinates by subgraph matching (MAPREF)
Syntax: MAPREF <rgroup no> <ref file> <check bonds> <ignore hydrogen>
Description: Loads the molecule from <ref file> as a reference subgraph for R-group
<rgroup no>. Reference coordinates are assigned to each instance of R-group
<rgroup no> based on subgraph matching. If multiple matchings are found,
the first arbitrary matching is used. The SYBYL atom types of atoms of
the input molecule and atoms in the subgraph must be identical in order to be
matched. If <check bonds> is answered yes, bond types must also match. If
<ignore hydrogen> is answered yes, hydrogens are excluded when loading the reference
structure. The subgraph together with the coordinates is stored internally
and further used during base selection and placement (see SELBAS and placement
routines PLACEC, PLACER, PLACESEQ).
Requirements: A library must be loaded and have READY status.
Important notes: Although only the first mapping is used for the initial assignment
of reference coordinates, multiple mappings are processed later in the placement
routines.
Deleting a combinatorial library (DELETE)
Syntax: DELETE
Description: Deletes a combinatorial library from FlexX^c main memory.
Outputting combinatorial library information (INFO)
Syntax: INFO
Description: INFO outputs summary information about the currently loaded library
on the screen. Besides the load status, the number of instances and the input
file are listed for each R-group. Finally, the total number of instances and the total
library size (number of library molecules) are output.
Activating subsets of R-group instances (SELECT)
Syntax: SELECT <rgroup no> <instance list> <select mode>
Description: With SELECT, the set of R-group instances can be reduced to a subset.
In all following calculations, only the subset is included; the subset is considered
activated. <rgroup no> is the number of the R-group whose instance set should be
reduced. <instance list> defines the set of remaining instances. <instance list>
is a list of integer ranges (format a-b) separated by commas or blanks (if separated
by blanks, the list must be enclosed in quotation marks). Depending on <select
mode>, previous selections are overwritten ('o', the default) or extended ('e'). In
the Python interface, if 'd' is specified, the selection for the R-group is cleared and
<instance list> is ignored. Non-activated instances are not deleted; they can
be (re)activated by a further execution of SELECT.
Requirements: A library must be loaded and have READY status.
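The <instance list> format and the overwrite/extend modes of SELECT can be illustrated with a short Python sketch. This is not FlexX code: the function names are hypothetical, and accepting a bare integer as a one-element range is an assumption. The 'd' mode is left out because its exact effect on the active set is not spelled out here.

```python
def parse_instance_list(text):
    """Expand an instance list such as "1-3, 7 9-10" into a sorted list of ints."""
    instances = set()
    # Commas and blanks are both accepted as separators.
    for token in text.replace(",", " ").split():
        if "-" in token:
            start, end = token.split("-", 1)
            instances.update(range(int(start), int(end) + 1))
        else:
            # Assumption: a bare integer counts as a one-element range.
            instances.add(int(token))
    return sorted(instances)


def apply_select(active, instance_list, mode="o"):
    """Return the active set after a SELECT-like call ('o' overwrite, 'e' extend)."""
    chosen = set(parse_instance_list(instance_list))
    if mode == "o":
        return chosen
    if mode == "e":
        return set(active) | chosen
    raise ValueError("only modes 'o' and 'e' are modelled in this sketch")
```

For example, apply_select({1, 2}, "4-5", "e") keeps instances 1 and 2 active and adds 4 and 5, while the default mode replaces the previous selection.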
Outputting R-group information (RGROUP)
Syntax: RGROUP <rgroup no>
Description: RGROUP outputs the R-group file and the list of all instances with the
internal number, a '+' indicating the active (subselected) instances, and the molecule
name of the instance.
Switching the core (SWITCH)
Syntax: SWITCH <rgroup no>
Description: During the docking calculations later on, the core plays a special role.
When the library is loaded, the first R-group loaded is always R-group 0, which
then becomes the core. For the later calculations, however, any R-group can play the
role of the core. With SWITCH, the R-group <rgroup no> is defined to be the core.
Requirements: A library must be loaded and have READY status.
Important notes: Switching the core is possible only before starting the docking
calculations and not during the calculations. If the dependency of a combinatorial
library on the pharmacophore constraints has been computed with the command CPHARM,
the command SWITCH is not available. In order to use SWITCH, you must first delete
the dependencies with the command DELPHARM.
Extending a core instance (EXTENDCORE)
Syntax: EXTENDCORE <core id> [<rgroup no> [<inst no>] ...]
Description: With the command EXTENDCORE, a single core instance can be
expanded with chosen R-group instances. The extended core is specified by the
instance number for the core <core id> and a sequence of <rgroup no> and <inst
no>. We have seen in several cases that using this command may drastically
improve the results; this especially applies to the correlation between serial and
combinatorial docking.
Requirements: A library must be loaded and have READY status. If a core was
expanded previously, it must be reset first using RESETCORE.
Important notes: If an extended core is defined, only the extended core can be
chosen. That means the other commands automatically take the expanded core
and do not ask for a core instance. Some commands are not available if a core
instance has been extended, e.g. SWITCH, ENUM, READREF, SETREF, MAPREF,
PLACER, PLACESEQ and SELECTR. To reset the extended core, use the command
RESETCORE.
Release an extended core (RELEXTCORE)
Syntax: RELEXTCORE
Description: Releases an extended core. All R-groups that currently expand the
core to an extended core are removed.
Requirements: A core must first have been extended with EXTENDCORE.
Resetting the combinatorial library (RESETCORE)
Syntax: RESETCORE
Description: The combinatorial library is reset to its initial status.
Requirements: If the dependency of a combinatorial library on the
pharmacophore constraints has been computed with the command CPHARM, the command
RESETCORE is not available. In order to use RESETCORE, you must first delete the
dependencies with the command DELPHARM.
Extracting a molecule from the library (EXTRACT)
Syntax: EXTRACT <core id> [<rgroup no> <inst no>]
Description: With the commands EXTRACT, EXTEND, and RELEASE, molecules
can be extracted from the combinatorial library and subjected to individual
molecule calculations in the same way as if they were loaded as a single molecule
in the LIGAND menu. All commands from LIGAND and DOCKING can be applied to
them.
The first parameter <core id> defines the instance of the core to be used. Additional
R-groups can be added. For each R-group, the R-group number <rgroup no> and
the instance <inst no> must be specified. Note that R-groups can only be added
in an order such that the resulting molecule is connected. To terminate the
selection, enter -1 as the final <rgroup no>.
The molecule is built by linking the corresponding R-groups and adding some
physico-chemical information at the newly formed bond; the molecule is not copied.
Therefore it is necessary to release the molecule to the library before another library
molecule is extracted.
Requirements: A library must be loaded and have READY status. If a library
molecule was extracted previously, it must be released first using RELEASE.
Releasing R-groups from an extracted molecule (RELEASE)
Syntax: RELEASE <nof rgroups>
Description: RELEASE releases the last <nof rgroups> added to the currently ex-
tracted library molecule. The combinatorial library molecule extraction works like
a stack. Therefore, only the last added fragments can be removed in opposite order.
The default value for <nof rgroups> is the total number of R-groups (including the
core) such that the complete molecule can be released by accepting the default. A
library molecule has to be released completely before a new one can be extracted.
Requirements: A molecule must have been extracted from the library.
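The stack-like behavior of EXTRACT, EXTEND and RELEASE can be pictured with a small Python model. This is only a sketch, not FlexX code: the class and method names are hypothetical, the core is assumed to be R-group 0, and the release default is simplified to "everything currently on the stack".

```python
class LibraryExtraction:
    """Toy model of the extraction stack (illustration only, not FlexX code)."""

    def __init__(self):
        # Entries are (rgroup_no, inst_no); the core is always the bottom entry.
        self.stack = []

    def extract(self, core_id):
        if self.stack:
            raise RuntimeError("release the current molecule completely first")
        self.stack.append((0, core_id))  # assumption: the core is R-group 0

    def extend(self, rgroup_no, inst_no):
        if not self.stack:
            raise RuntimeError("extract a core instance first")
        self.stack.append((rgroup_no, inst_no))

    def release(self, nof_rgroups=None):
        # Simplified default: release the complete molecule, core included.
        if nof_rgroups is None:
            nof_rgroups = len(self.stack)
        if nof_rgroups > len(self.stack):
            raise ValueError("cannot release more fragments than were added")
        # Fragments come off in the opposite order to which they were added.
        return [self.stack.pop() for _ in range(nof_rgroups)]
```

After extract(2), extend(1, 5) and extend(2, 0), a call to release(1) removes only the fragment added last, and a further release() returns the rest, ending with the core.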
Extending an extracted molecule (EXTEND)
Syntax: EXTEND [<rgroup no> <inst no>]
Description: EXTEND adds additional R-groups to a partially extracted library
molecule. This command works exactly like EXTRACT except that at least a core
instance must have been extracted before.
Requirements: EXTRACT must have been previously performed.
Enumerating a combinatorial library (ENUM)
Syntax: ENUM [<write to mol2 (y/n)> <filename>]
Description: ENUM enumerates the combinatorial library currently loaded,
starting from the currently extracted molecule. The molecules are only constructed and
released, without any further computations. This function is useful for testing
the molecule construction routine before using it in a more time-consuming
docking calculation. The generated molecules can optionally be written to a multi-mol2
file.
Requirements: EXTRACT has to be performed first.
Outputting information about a library molecule (MINFO)
Syntax: MINFO <extracted> [<rgroup no> <inst no>]
Description: MINFO outputs detailed information about a library molecule. If
<extracted> is answered yes, information is given for the extracted molecule. Oth-
erwise an instance is selected by the <rgroup no> and the instance ID <inst no>.
Requirements: A library must be loaded and have READY status.
Important notes:
Selecting admin settings for drawing the library (SELADM)
Syntax: SELADM <graphics object number> <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing the library and you can determine whether the graphics files are internal
temporary files used only by FlexX or are saved for further use. For yes/no
questions you can enter either 'y', 'yes' or '1' for yes, and similarly 'n', 'no' or '0' for
no.
<graphics object number> Enter integer:
(1-255) The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 The <graphics object number> will match the chosen R-group number.
For example, an R-group with number 5 will be drawn in
<graphics object number> 5. (The numbers are the same as the
parameter <rgroup no> used with the DRAW command.)
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for an example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead, the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
Selecting graphics settings for drawing the library (SELGRA)
Syntax: SELGRA <mol display mode> <hydro> <interact geoms> <all contact
types> [<contact type selection>] <surf> <R connect> <X connect>
Description: With SELGRA you can set specific default values for drawing
combinatorial libraries. For yes/no questions you can enter either 'y', 'yes' or '1' for yes,
and similarly 'n', 'no' or '0' for no.
<mol display mode> Specifies the drawing display mode for molecule selection:
1 Lines
2 Sticks
3 Balls & sticks
4 Space-filled spheres
<hydro> Specifies whether and how hydrogens should be drawn on the ligand
selection:
0 Do not draw hydrogens.
1 Draw all hydrogens.
2 Draw only hydrogens bonded to hetero atoms (non-carbon atoms).
<interact geoms> Yes/no answer:
yes Interaction geometries (interaction surfaces around potential interacting
groups) are drawn.
no No interaction geometries are drawn.
<all contact types> Yes/no answer:
yes Interaction geometries for all contact types (interaction types) are drawn
if <interact geoms> is set to yes.
no Interaction geometries are drawn for a selection of contact types (interac-
tion types). You will be asked to select types from a given list:
<contact type selection> Choose a list of types represented by integers.
Enter the list as separate integers or as integer ranges (format a-b)
separated by ',' or blanks. Note that you need to enclose the expression
in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9".
<surf> Determines surface drawing:
0 Draw no surface
1 Draw the molecular surface: If FlexV is used to visualize, the Connolly
surface is drawn. Otherwise only concave patches are drawn, as triangles.
Note: Hydrogens are not considered when drawing the surface. Even if hydrogens
are to be drawn (<hydro>), they are ignored when drawing the surface.
<R connect> Yes/no answer:
yes Vectors are drawn at the positions where further R-groups can be added
(R-atoms).
no No vectors are drawn.
<X connect> Yes/no answer:
yes Vectors are drawn at the positions where the current R-group is connected to
the parent R-group or core (X-atom).
no No vectors are drawn.
Important notes: The Connolly surface is rendered from its analytically calculated
patches. This enables selection of the level of curvature approximation but makes
the rendering much more complicated. As a result, a few percent of the patches are
rendered incorrectly (we will try to reduce this rate). In addition, there is currently
only pairwise cusp trimming.
Selecting colors for drawing the library (SELCOL)
Syntax: SELCOL <library color mode> <interact geoms color mode> <surface
color mode>
Description: With SELCOL you can set the color modes for the library molecules,
interaction geometries and molecular surfaces. For each of these, a selection of color
modes is available:
<ref. coords> Yes/no answer:
yes The following modifications concern the settings for drawing the ligand
with the reference coordinates.
no The following modifications concern the settings for drawing the ligand
with the input coordinates.
<library color mode> Choose the color scheme for drawing the library molecules.
Color mode selection:
INVISIBLE
ATOM
UNIQUE
FRAGMENT
ENERGY
<interact geoms color mode> Choose the color mode for drawing the interac-
tion geometries if they are to be drawn. The interaction geometries consist
of patches or surfaces that indicate the positions of interacting groups in the
molecule. Color mode selection:
INVISIBLE
UNIQUE
CONTACT
<surface color mode> Choose the color mode for coloring the molecular surface
if it is to be drawn. Color mode selection:
INVISIBLE
UNIQUE
SURF_ATOM
CEN_DIST
SURFPATCH
The possible color modes are explained below. For some color modes you are also
asked to enter some defining colors. Enter your chosen color as either an angle from
the color circle (0-360 degrees: 0 is invisible, 1-360 runs from dark blue, through
red, yellow, green to blue), a color name (as defined in the GRAPHIC static data file),
or an RGB(A) value: 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
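The three accepted color notations (color-circle angle, color name, RGB(A) value) can be told apart mechanically. The following Python sketch only classifies a color argument; it is not FlexX code, the function name is hypothetical, and it does not check whether a name actually exists in the GRAPHIC static data file.

```python
def classify_color(spec):
    """Classify a color argument as ('angle', int), ('rgba', tuple) or ('name', str)."""
    spec = spec.strip()
    # Blanks and slashes are both accepted as separators for RGB(A) values.
    parts = spec.replace("/", " ").split()
    try:
        numbers = [float(part) for part in parts]
    except ValueError:
        # Anything that is not purely numeric is treated as a color name.
        return ("name", spec)
    if len(numbers) == 1 and numbers[0].is_integer() and 0 <= numbers[0] <= 360:
        return ("angle", int(numbers[0]))
    if len(numbers) in (3, 4):
        return ("rgba", tuple(numbers))
    raise ValueError(f"cannot interpret color {spec!r}")
```

With this sketch, "dark green" is classified as a name, "0.0/0.8/0.1" and "0.0 0.8 0.1" as the same RGB triple, and "220" as an angle on the color circle.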
INVISIBLE The item drawn will be invisible.
ATOM The ligand will be colored according to the element types of the atoms. Each
atom is drawn in the color defined for its element type in the static data file
GRAPHIC, while each bond is drawn half and half in the colors of its two
atoms.
UNIQUE The object will be drawn in one user-defined color. You will be asked to
choose the unique color:
<color> Enter your chosen color.
FRAGMENT Color the ligand according to its fragments. You are required to
choose the fragmentation scheme and three colors for this color mode:
<fragmentation> Integer selection: enter the number that represents your
chosen fragmentation scheme.
<base color> Enter a color for the base fragment.
<first color> The remaining fragments will be alternately colored with two
colors: enter the first color here ...
<second color> ... and the second color here.
fragment: When the SELBAS command is called in the DOCKING menu, FlexX
calculates several fragmentation schemes for the ligand. Each scheme contains
a base fragment, which is the first fragment of the ligand to be placed in
the active site, while the remainder of the ligand is split into further fragments
which are successively added to the base fragment during the incremental
reconstruction of the ligand in the active site. The fragments are based on the
components FlexX defined when the ligand was loaded (see, for example, the
SELGRA command for an explanation of component).
ENERGY Draw the ligand conformation in a color representative of its docking
solution score (energy). A color rainbow will be dened between two given
colors across the range of two given docking scores. You are required to enter:
<no. of intervals> Enter the number of intervals (integer) that the energy
range will be split into.
<min energy> Enter the minimum energy value (floating-point number) for
the start of the energy range. The default is the score of the best docking
solution.
<max energy> Enter the maximum energy value (floating-point number) for
the end of the energy range. The default is the score of the worst scoring
docking solution.
<first color> Enter the first color of the color rainbow.
<second color> Enter the second (end) color of the color rainbow.
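How a score is mapped to one of the intervals of the ENERGY color mode can be sketched in Python as follows. This is an illustration under assumptions, not the FlexX implementation: the function names are hypothetical, scores outside the range are clamped to the nearest interval, and the rainbow is approximated by a linear blend of two RGB triples.

```python
def energy_interval(score, min_energy, max_energy, n_intervals):
    """Index (0 .. n_intervals - 1) of the interval that contains the score."""
    if max_energy <= min_energy or n_intervals < 1:
        raise ValueError("need min_energy < max_energy and at least one interval")
    fraction = (score - min_energy) / (max_energy - min_energy)
    index = int(fraction * n_intervals)
    # Assumption: scores outside the range are clamped to the first/last interval.
    return min(max(index, 0), n_intervals - 1)


def interval_color(index, n_intervals, first_rgb, second_rgb):
    """Color of an interval, assuming a linear blend between the two end colors."""
    t = 0.0 if n_intervals == 1 else index / (n_intervals - 1)
    return tuple(a + t * (b - a) for a, b in zip(first_rgb, second_rgb))
```

With four intervals between -30.0 and -10.0, a score of -30.0 falls into interval 0 and is drawn in the first color, while a score of -10.0 falls into interval 3 and is drawn in the second color.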
CONTACT The object will be drawn in a color representing its interaction (contact)
type. The colors for each type are defined in the GRAPHIC static data file.
SURF_ATOM Convex patches in the surface are colored by atom type (see color
mode ATOM). Any reentrant patches (i.e. saddle and concave patches) are
drawn in a user-defined color. You are required to enter the reentrant patch
color in this mode:
<reentrant color> Enter your chosen color.
CEN_DIST The surface is drawn in a rainbow of colors representing how far the
surface lies from the (geometric) center of the molecule. For this mode you are
required to enter the start and end colors of the rainbow plus the number of
intervals to be colored across the rainbow range:
<no of intervals> The number of intervals into which the range will be split.
<first color> Start color for the rainbow.
<second color> End color for the rainbow.
SURFPATCH The surface is colored according to the surface patch type. You are
required to enter colors for the various patch types in this mode:
<concave color> Enter your chosen color for concave patches.
<saddle color> Enter your chosen color for saddle patches.
<convex color> Enter your chosen color for convex patches.
Selecting labels for drawing the library (SELLAB)
Syntax: SELLAB <X name> <X id> <atom name> <infile number> <SYBYL
type> <fragment number>
Description: When the library molecules are drawn, FlexX stores information in
labels for display in the graphic interface. You can choose what should appear in
the label using the SELLAB command. For yes/no questions you can enter either
'y', 'yes' or '1' for yes, and similarly 'n', 'no' or '0' for no.
<X name> Yes/no answer:
yes Include the molecule name of the R-group instance at the X-atoms.
no Do not include the instance names.
<X id> Yes/no answer:
yes Include the ID of the R-group instance at the X-atoms.
no Do not include the instance IDs.
<atom name> Yes/no answer:
yes Include the atom name as taken from the input file in the label for atoms.
no Atom names will not be included in the label for atoms.
<infile number> Yes/no answer:
yes Include the number of the atom as taken from the input file in the label for
atoms.
no Infile numbers will not be included in the label for atoms.
<sybyl type> Yes/no answer:
yes Include the SYBYL atom types in the label for atoms.
no SYBYL atom types will not be included in the label for atoms.
<fragment number> Yes/no answer:
yes Include the fragment number in the label for atoms.
no Fragment numbers will not be included in the label for atoms.
Note: For an explanation of fragments see, for example, the LIGAND/SELCOL
command.
Drawing the combinatorial library (DRAW)
Syntax: DRAW <rgroup no> <active only> <all active> <transform> [<parent r
id>] [<selection>] [<mul draw directory> <filename>]
Description: DRAW generates a drawing of the R-group molecule and sends it to a
file ready to be displayed in the graphics interface. For details about what exactly
is drawn see the SELGRA command. For yes/no questions you can enter either 'y',
'yes' or '1' for yes, and similarly 'n', 'no' or '0' for no.
<rgroup no> Enter the R-group number of the R-group to be drawn.
<active only> Yes/no answer:
yes Draw only the instances of the R-group that are activated (see SELECT
command).
no Draw all instances of the R-group.
<all active> Yes/no answer:
yes Draw all the activated instances of the R-group (see SELECT command).
no Draw only a selection of the activated R-group instances. You must enter a
selection of instances; see the parameter <selection> below.
<transform> Yes/no answer:
yes Transform the instance molecules relative to a specied instance of the
parent R-group (i.e. arrange the instances in space so that they are visual-
ized aligned with the parent R-group). You must enter the parent R-group:
[<parent r id>] Enter the R-group number of the parent R-group to which
the instances should be transformed.
no Do not transform the instances.
[<selection>] This parameter is only required if you have chosen the following
settings:
<active only> yes
<all active> no
In this case, only a selection of activated R-group instances will be drawn.
Enter your selection as a list of integers or integer ranges (format a-b) separated
by ',' or blanks. The integers are the IDs of the instances.
[<multiple draw directory> <filename>] If the graphics are not to be stored in
temporary files (see SELADM), enter a directory to contain the graphics files
and a base filename for the graphics files here.
Requirements: A library must be loaded and have READY status.
Listing the graphic settings (GRAINF)
Syntax: GRAINF
Description: Outputs a list of all current graphic settings (the graphic context) for
the combinatorial library.
8.2.3 Pharmacophore constraints for combinatorial libraries (CLIB menu)
In order to use pharmacophore constraints for combinatorial libraries, a set of constraints
has to be loaded into FLEXC-PHARM's workspace with the READ command in the PHARM
menu (all information about pharmacophore constraints can be found in section 8.3). Only
the additional commands in the CLIB menu are described here.
After loading the constraint file, the dependency of the combinatorial library on the
pharmacophore constraints must be computed using the CPHARM command in the CLIB menu.
The constraints are then used as a filter either during the combinatorial docking or after
it.
The following commands use the constraints as a filter: PLACEC, EXTENDR, EXTENDMR and
PLACESEQ.
Computing the dependency on pharmacophore constraints (CPHARM)
Syntax: CPHARM
Description: Computes the combinatorial library dependency on pharmacophore
constraints and generates the R-group master list. If the length of the R-group
master list exceeds <CPHARM_MAX_LIST_LEN>, FLEXC-PHARM automatically
switches to the post-docking filter mode for checking against the constraints.
Important notes: A set of constraints must have been loaded. Once the
dependency is computed, the command SWITCH is not available. In order to use SWITCH,
you must first delete the dependencies with the command DELPHARM. If an
extended core molecule is used, only the post-docking filter mode is available.
Deleting the dependency on pharmacophore constraints (DELPHARM)
Syntax: DELPHARM
Description: Delete the combinatorial library dependency on pharmacophore con-
straints.
Static data in FLEXC-PHARM (configuration)
Name (type): <CPHARM_MAX_LIST_LEN> (integer)
Description: This is the maximum length of the R-group master list (see
CLIB/CPHARM output) that is allowed for the pharmacophore checks during
combinatorial docking. The master list contains the list of R-group countergroup
combinations that passed the pre-docking checks against the constraints.
Under certain circumstances a very long list leads to an extremely long
FLEXC-PHARM combinatorial docking time. If the length of the master list exceeds
<CPHARM_MAX_LIST_LEN>, FLEXC-PHARM automatically switches to the
post-docking filter mode for checking against the constraints.
Default value: 20000
Reasonable range: 5000 - 50000
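The rule for when the constraints are checked can be summarized in a few lines of Python. This is a sketch of the documented behavior only: the function name and the mode labels "during-docking" and "post-docking" are hypothetical, and the default of 20000 is the value quoted above.

```python
CPHARM_MAX_LIST_LEN = 20000  # default value quoted in the static data description


def pharm_check_mode(master_list_len, extended_core=False,
                     max_list_len=CPHARM_MAX_LIST_LEN):
    """Return which filter mode applies (labels are hypothetical)."""
    # With an extended core, only the post-docking filter mode is available.
    if extended_core:
        return "post-docking"
    # A master list longer than the limit also forces the post-docking filter.
    if master_list_len > max_list_len:
        return "post-docking"
    return "during-docking"
```

A master list of exactly 20000 entries still allows checks during docking, whereas 20001 entries trigger the switch to the post-docking filter.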
8.2.4 Docking combinatorial libraries (CDOCK submenu)
In a combinatorial docking run a huge amount of placement information can be generated
in a short time. It is therefore necessary to store this information in secondary memory and
control the amount of information stored.
Internally, FlexX^c only keeps the scores achieved for the k highest ranking placements.
The number of scores kept for each library molecule can be controlled by the parameter
<SCORE_TABLE_SIZE>.
During the calculation, placement files can be written. The number and content of
the files are controlled by three parameters: <STORE_PLACEMENT_THRESHOLD>,
<STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>:
<STORE_PLACEMENT_THRESHOLD> defines a threshold such that only molecules
having a score below this threshold are stored in placement files.
<STORE_PLACEMENT_MODE> defines the file format and the number of placements
stored. If set to -1, a pdf file containing the whole set of placements is stored. If set to
a value k > 0, a multi-mol2 file is written containing the first k placements. If set to 0,
no files are created.
<MAX_NOF_FILE_WRITE> controls the number of files created. If set to
-1, a file is created for each molecule (according to the settings of
<STORE_PLACEMENT_THRESHOLD> and <STORE_PLACEMENT_MODE>).
If set to 0, all placements are written to a single multi-mol2 file (provided that
<STORE_PLACEMENT_MODE> > 0). If set to k > 0, FlexX^c keeps only the
placement files of the k highest scoring library molecules.
<KEEP_ALL_SCORES_ACHIEVED> If <KEEP_ALL_SCORES_ACHIEVED> is set to
1, FlexX^c keeps the scores achieved for all placements (only for the placement
solutions of EXTENDMR and PLACESEQ).
For further information about the combinatorial docking parameters see also section 11.4.
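The interplay of <STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE> can be summarized in a small Python sketch. It only restates the rules given above and is not FlexX code: the function name and the returned labels are hypothetical, and the combination of pdf output with a single file is rejected because it is not described in the text.

```python
def storage_plan(store_placement_mode, max_nof_file_write):
    """Return (file_format, placements_per_molecule, file_policy)."""
    if store_placement_mode == 0:
        return (None, 0, "no files")
    if store_placement_mode == -1:
        file_format, placements = "pdf", "all"
    elif store_placement_mode > 0:
        file_format, placements = "multi-mol2", store_placement_mode
    else:
        raise ValueError("STORE_PLACEMENT_MODE must be -1, 0 or positive")

    if max_nof_file_write == -1:
        policy = "one file per molecule"
    elif max_nof_file_write == 0:
        # The text describes the single-file case only for multi-mol2 output.
        if file_format != "multi-mol2":
            raise ValueError("single-file output is only described for multi-mol2")
        policy = "single file"
    elif max_nof_file_write > 0:
        policy = f"files of the {max_nof_file_write} highest scoring molecules"
    else:
        raise ValueError("MAX_NOF_FILE_WRITE must be -1, 0 or positive")
    return (file_format, placements, policy)
```

For instance, storage_plan(5, 0) describes one multi-mol2 file holding the first five placements of every stored molecule.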
If a file is created for a single library molecule (either a mol2 or a pdf file), the filename is
constructed in the following way. The first part of the name is the constant string <filename>,
which is a parameter of various placement commands (see below). The second part
describes the library molecule: there is a string for each R-group contained in the library
molecule, with the format [C|R<rgroup no>]-<instance id>. These strings are
concatenated with the separator '_' in the order in which the R-groups are added by the construction
algorithm. If <filename> is a multi-mol2 file, several score values for each placement are
printed as a comment line (FLEXX_SCORE).
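The naming scheme for per-molecule files can be sketched in Python. Several details are assumptions rather than documented behavior: the core is written as C-<instance id>, each R-group as R<rgroup no>-<instance id>, the prefix is joined with the same '_' separator, and no file extension is added. The function name is hypothetical.

```python
def placement_filename(prefix, core_inst, rgroups):
    """Build a per-molecule file name from the core and the added R-groups.

    rgroups is a list of (rgroup_no, inst_no) in the order in which the
    R-groups are added by the construction algorithm.
    """
    parts = [f"C-{core_inst}"]  # assumption: the core is written as C-<id>
    parts += [f"R{rgroup_no}-{inst_no}" for rgroup_no, inst_no in rgroups]
    # Assumption: the prefix is joined with the same '_' separator.
    return "_".join([prefix] + parts)
```

Under these assumptions, the prefix run1 with core instance 4, R-group 1 instance 12 and R-group 2 instance 0 gives the name run1_C-4_R1-12_R2-0.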
Selecting base instances (SELBAS)
Syntax: SELBAS <nof base mol> <nof in core> [<nof in rgroup> ...]
Description: SELBAS selects the instances used for base placement. The user can
only control the number of selected instances, not the selection process itself. First,
the total number of instances <nof base mol> can be selected, then the maximum
number of instances for the core and each R-group can be specified.
For the combinatorial docking algorithms currently available, SELBAS is of
little practical use: in order to get a result for all library molecules, you must select all
instances of the core or the R-group (for R-group placement). SELBAS is basically
implemented for other combinatorial docking algorithms which are under
development. Note that SELBAS must be performed before any placement routine. Note
also that SELBAS is significantly different from DOCKING/SELBAS, which performs
a selection of base fragments.
Requirements: A library must be loaded and have READY status. Previously cal-
culated placements must be deleted.
Placing the core instances (PLACEC)
Syntax: PLACEC <mode>
Description: PLACEC docks the core instances into the protein active site using
FlexX's incremental construction algorithm. All three phases (selection of base
fragments, placing base fragments, incremental construction) are performed
automatically. <mode> controls the placement algorithm for base placement. The same
modes are allowed as for single molecule docking (manual (m), perturbate (p),
triangle (3), line (2)), except for covalent docking (see DOCKING/PLACEBAS,
section 7.8.1, for details).
Requirements: A library must be loaded and have READY status. Previously
calculated placements must be deleted and base instances must be selected using
SELBAS.
Note: PLACEC currently (release 2.1) generates placements for all active instances
of the core and keeps them in memory. Especially if you have switched the core
to an R-group, you may have many core instances. While standard FlexX^c is able
to handle this, FLEXC-PHARM might run out of memory here. To avoid
problems, loop over all core instances and set them active one at a time.
You can use the variable $(NOF_CORE_INST) to determine the
number of instances, e.g.:
for_each $(inst) fromto 0 $(NOF_CORE_INST)
    clib
        select $(CORE_ID) $(inst) # set instance $(inst) of core group as active
    end
    cdock
        placec 3 # only instance $(inst) of core group will be placed
        extendmr n ....
    end
end_for
Local optimization of core placements (OPTC)
Syntax: OPTC <core id> [<placements> <expand radius> <nof it> <sort>]+
Description: Optimizes the score of the core placements. <core id> is a selection
of core instances. It can be a single number, a list of numbers separated by ',',
a list of intervals of the form a-b, or 'all'.
For each selected core instance with calculated placements, the remaining
parameters are requested; they are the same as in DOCKING/OPTIMIZE.
Requirements: Core placements must have been computed.
Writing core placements in pdf format (WRITEC)
Syntax: WRITEC <core id> <base filename>
Description: Writes a set of core placements to disk in a FlexX-specific file format (.pdf
format). The default directory for this command is the path specified in
the entry PREDICT (cf. configuration). The pdf format is based on ASCII and can
therefore be read and edited with standard tools.
Reading core placements in pdf format (READC)
Syntax: READC <core id> <filename>
Description: Reads a set of placements for core instance <core id> in a
FlexX-specific file format (.pdf format) from the file <filename>. The default directory
for this command is the path specified in the entry PREDICT (cf. configuration).
Important Notes: The placement information is based on the receptor and core
instance. Thus, the receptor and core instance files loaded in FlexX's main memory
before executing the READC command must be the same as the files that were in
main memory during generation of the core placements. Otherwise, FlexX ends up
in an inconsistent state which is not detected in every case.
Placing R-group instances (PLACER)
Syntax: PLACER <rgroup no> <mode> <store> [<filename>]
Description: PLACER docks all instances of R-group <rgroup no> into the active
site using the base placement algorithm <mode> (see DOCKING/PLACEBAS for
a description of the modes). If <store> is answered 'y', placements are stored
in the file <filename> according to the settings in SETTINGS (see section 11.4)
(<STORE_PLACEMENT_THRESHOLD>, <STORE_PLACEMENT_MODE>, and
<MAX_NOF_FILE_WRITE>).
Sequential docking of the library (PLACESEQ)
Syntax: PLACESEQ <continue> [<sequence>] <rgroup no> [<inst no>]
[<rgroup no> [<inst no>]...] <mode> <store> [<filename>]
Description: PLACESEQ performs a sequential placement of all library molecules.
The molecules are created sequentially and then docked into the active site. No
information of previous placements is re-used. The docking results are therefore
identical to the results of a standard calculation of each individual library molecule.
The total amount of computing time is of course also the same.
You can restart aborted computations or start from a given sequence. For both of
these scenarios, the <continue> question must be answered with 'y'. If you answer
'n' here, FlexX will directly ask you for the R-group(s) and compute all
combinations between them.
In the restart scenario (i.e. you answered 'y' to <continue>), FlexX internally
checks whether an aborted sequence is still available. If so, it will propose the
latest sequence you computed and ask (<sequence>, 'y' or 'n') whether you want to
continue with the one following it. If not, it will ask for a start sequence and continue
the calculation from the sequence following this start sequence. The start sequence has
to be specified as a sequence of <rgroup no> and <inst no>.
<rgroup no> defines the R-groups to be added. <inst no> defines the instance
of R-group <rgroup no>. Use CLIB/SELECT in advance to restrict the R-group
instances. The following parameters are the same as in PLACER. <mode> defines the
placement mode for the base placement (see DOCKING/PLACEBAS for a description
of modes). If <store> is answered 'y', placements are stored in the file <filename>
according to the settings in SETTINGS (<STORE_PLACEMENT_THRESHOLD>,
<STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>).
Requirements: A library must be loaded and have READY status. Previously
calculated placements must be deleted and base instances must be selected using
SELBAS.
Notes: The calculation can be aborted by pressing any key and entering 'abort' as
soon as the prompt appears. Before the prompt appears, the build-up of the
complex for the current sequence is completed.
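The restart rule for PLACESEQ (continue with the sequence following a given start sequence) can be illustrated with a short Python sketch. It is not FlexX code: the function name is hypothetical, and the enumeration order, with the R-group added last varying fastest, is an assumption.

```python
from itertools import product


def sequences(active, start_after=None):
    """Yield sequences of (rgroup_no, inst_no) pairs for all combinations.

    active is a list of (rgroup_no, [active instance ids]) in the order in
    which the R-groups are added. If start_after is given, enumeration
    resumes with the sequence that follows it.
    """
    rgroups = [rgroup_no for rgroup_no, _ in active]
    skipping = start_after is not None
    # Assumption: the R-group added last varies fastest.
    for combo in product(*(instances for _, instances in active)):
        sequence = tuple(zip(rgroups, combo))
        if skipping:
            if sequence == tuple(start_after):
                skipping = False  # resume with the next sequence
            continue
        yield sequence
```

With active = [(0, [0, 1]), (1, [3, 4])], a full run yields four sequences; restarting after [(0, 0), (1, 4)] yields only the last two.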
Extending core placements by single R-groups (EXTENDR)
Syntax: EXTENDR <rgroup no> <store> [<filename>]
Description: EXTENDR adds all active instances of R-group <rgroup no>
to all core placements using the incremental construction algorithm. If
<store> is answered 'y', placements are stored in the file <filename>
according to the settings in SETTINGS (<STORE_PLACEMENT_THRESHOLD>,
<STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>).
Requirements: A library must be loaded and have READY status. Previously cal-
culated placements must be deleted and base instances must be selected and placed
using SELBAS and PLACEC.
Selecting R-group instances by score (SELECTR)
Syntax: SELECTR <rgroup list> <nof active>
Description: SELECTR divides the set of instances for R-groups in <rgroup list>
into active and inactive ones. The selection criterion is the score of a previous
docking run. <nof active> is the number of instances which should be defined as active
afterwards. If <nof active> is terminated with a percent sign, the value is
considered to be the percentage that should be active.
Requirements: A library must be loaded and have READY status. Previously
calculated placements must be deleted and R-group instances must be placed using
either SELBAS and PLACER or SELBAS, PLACEC and EXTENDR.
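As an illustration of how <nof active> might be interpreted, here is a minimal Python sketch. It is not FlexX code: the function name select_active, the rounding of fractional percentages, and the convention that a lower score is better are all assumptions made for the example.

```python
def select_active(scores, nof_active):
    """Return the ids of the instances that stay active.

    scores: dict mapping instance id -> docking score (lower = better, assumed)
    nof_active: "5" for an absolute count, "25%" for a percentage
    """
    spec = nof_active.strip()
    if spec.endswith("%"):
        # Rounding rule is an assumption: round half up, keep at least one.
        count = max(1, int(len(scores) * float(spec[:-1]) / 100.0 + 0.5))
    else:
        count = int(spec)
    ranked = sorted(scores, key=lambda inst: scores[inst])
    return set(ranked[:count])


# Hypothetical scores for four instances of one R-group.
example_scores = {0: -10.0, 1: -20.0, 2: -5.0, 3: -15.0}
top_two = select_active(example_scores, "2")
top_half = select_active(example_scores, "50%")
```

Both calls keep instances 1 and 3, the two best-scoring ones in the hypothetical data.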
Extending core placements by multiple R-groups (EXTENDMR)
Syntax: EXTENDMR <continue> [<sequence>] <rgroup no> [<inst no>]
[<rgroup no> [<inst no>] ...] <store> [<filename>] <compare> [<cmp file>]
Description: EXTENDMR adds multiple R-groups to already created core place-
ments in a recursive fashion (recursive combinatorial docking, see [23] for a de-
scription of the algorithm).
You can restart aborted computations or start from a given sequence. For both of
these scenarios, the <continue> question must be answered with y. If you answer
n here, FlexX will directly ask you for the R-group(s) and compute all combina-
tions between them.
In the restart scenario (i.e. you answered y to <continue>), FlexX internally
checks whether an aborted sequence is still available. If so, it will propose the latest
sequence you computed and ask (<sequence>, y or n) whether you want to continue
with it. If not, it will ask for a start sequence and continue the calculation from
this start sequence. The start sequence must be specified as a sequence of <rgroup
no> and <inst no>.
<rgroup no> defines the list and order of R-groups to be added. <inst no> defines
the instance of R-group <rgroup no>. The list of R-groups can be terminated with
-1. The R-groups must be selected in an order such that the core and all selected
R-groups form a connected molecule. Note that only active instances are placed in
the docking procedure (after loading, all instances are active; the set of actives can
then be limited with CLIB/SELECT or CDOCK/SELECTR).
If <store> is answered y, placements are stored in the file(s) <filename>
according to the settings in SETTINGS (<STORE_PLACEMENT_THRESHOLD>,
<STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>). If
<compare> is answered y, the placements created for each library molecule
are compared with the placement in <cmp file>_<mol id>, where <mol id> is
created according to the filename rules described above.
Requirements: A library must be loaded and have READY status. Previously
calculated placements must be deleted and core instances must be placed using SELBAS
and PLACEC.
Notes: The calculation can be aborted by pressing any key and entering abort as
soon as the prompt appears. Before the prompt appears, the complex build-up of
the current sequence will be finished.
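The restart behavior can be pictured with a small Python sketch. It assumes that sequences are enumerated with the last R-group varying fastest (as the sample EXTENDMR output in the PERMUTE tutorial of this section suggests) and uses a hypothetical helper named sequences. As described above, EXTENDMR resumes with the start sequence itself (inclusive=True), whereas PLACESEQ resumes with the sequence following it (inclusive=False).

```python
from itertools import product


def sequences(instances_per_group, start=None, inclusive=True):
    """Yield instance tuples (core, R1, R2, ...), last group varying fastest.

    instances_per_group: number of active instances for core and each R-group
    start: optional tuple at which an aborted run is resumed
    inclusive: True = resume with `start`, False = resume with its successor
    """
    started = start is None
    for seq in product(*(range(n) for n in instances_per_group)):
        if started:
            yield seq
        elif seq == tuple(start):
            started = True
            if inclusive:
                yield seq


# Core plus three R-groups with two instances each.
full_run = list(sequences([2, 2, 2, 2]))
resumed = list(sequences([2, 2, 2, 2], start=(1, 1, 0, 1)))
```

Here full_run has 16 entries, and resumed contains only the sequences from (1, 1, 0, 1) onwards.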
Deleting placement information (DELETE)
Syntax: DELETE
Description: DELETE deletes all placement information in the main memory. The
same command is used for all kinds of placement results. Note that files created during
a combinatorial docking run are not deleted.
Listing placement results (LISTP)
Syntax: LISTP <mode> [<active> <fill gaps> <nof placements> <sort>]
[<nof_sol>]
Description: LISTP creates a table of docking solutions, one row for each docked
library molecule. Depending on the performed calculations, one of four modes
must be selected with <mode>: c (core placements), s (single R-group placements),
m (multiple R-group placements) or a (all multiple R-group placements, only if
<KEEP_ALL_SCORES_ACHIEVED> is set to 1).
If <mode> is set to a, the placements will be sorted by energy and the best
<nof_sol> placement scores will be listed in the table, one row for each placement.
Otherwise the table can be restricted to active molecules by answering <active>
with y. If placement information is missing for some library molecules, the table
can be completed (filled up with 0s) by answering <fill gaps> with y. The number
of placement scores to be output is controlled by <nof placements>, and finally
<sort> defines the sort criterion, which is either 0 (by R-group instance numbers)
or 1 (by energy).
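The two sort criteria and the <nof_sol> cut-off can be sketched in a few lines of Python. This is only an illustration of the ordering, not the LISTP implementation: the function name list_placements, the row layout, and the score values are hypothetical, and a lower energy is assumed to be better.

```python
def list_placements(rows, sort=1, limit=None):
    """Order table rows as LISTP is described to do.

    rows: list of (instances, score); instances is a tuple (core, R1, R2, ...)
    sort: 0 = by R-group instance numbers, 1 = by energy (lowest first, assumed)
    limit: optional number of rows to keep (plays the role of <nof_sol>)
    """
    if sort == 0:
        ordered = sorted(rows, key=lambda row: row[0])
    elif sort == 1:
        ordered = sorted(rows, key=lambda row: row[1])
    else:
        raise ValueError("sort must be 0 or 1")
    return ordered if limit is None else ordered[:limit]


# Hypothetical placement scores for three library molecules.
rows = [((1, 0, 1, 0), -19.9), ((0, 0, 0, 0), -14.6), ((1, 1, 0, 0), -19.7)]
by_energy = list_placements(rows, sort=1, limit=2)
by_instance = list_placements(rows, sort=0)
```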
Extracting a library molecule and loading placement data (EXTRACT)
Syntax: EXTRACT <mode> <core id> [<r id>...]
Description: EXTRACT creates a library molecule and loads the corresponding
placement data. The library molecule is specified by the instance numbers for the
core <core id> and the R-groups <r id>. <mode> defines which kind of
placement data should be loaded. Allowed values are c (core placements), r (single
R-group placements), and m (multiple R-group placements).
Requirements: A combinatorial docking must have been performed and
placement data must have been written to a pdf file. Make sure that
<STORE_PLACEMENT_MODE> is set to -1 before the calculation and the
<store> parameter of the corresponding docking command is answered y.
Releasing a library molecule and deleting placement data (RELEASE)
Syntax: RELEASE
Description: RELEASE destroys placements previously loaded with EXTRACT and
releases the extracted library molecule back to the library. RELEASE is automatically
executed between EXTRACT commands.
Writing multiple R-group placements to a file (WRITESOL)
Syntax: WRITESOL <basename> <file_format> <append> <nof_sol>
Description: WRITESOL writes the best <nof_sol> multiple R-group placements
to a mol2 or sdf file called <basename>. The placement scores will be sorted by
energy.
If <file_format> is set to 1, WRITESOL writes the placement data to a mol2
file. Otherwise, if <file_format> is set to 0, the destination file is an sdf file. If
<append> is set to y, <basename> is a multi-mol2 or a multi-sdf file. Otherwise
the placement data will be written to single mol2 or sdf files. The names of these
files consist of <basename> and of a well-defined string which describes the library
molecule.
If a mol2 file is written, several score values for each dock entry are printed as a
comment line (FLEXX_SCORE). Otherwise the score values for each dock entry are
printed as data blocks in the sdf file.
WRITESOL reads the data of the best <nof_sol> placements from the pdf files
(see footnote 1).
Requirements: <KEEP_ALL_SCORES_ACHIEVED> must have been set to 1
and a combinatorial docking must have been performed with EXTENDMR or
PLACESEQ respectively. The placement data must have been written to a pdf file.
Make sure that <STORE_PLACEMENT_MODE> is set to -1 before the calculation
and the <store> parameter of the corresponding docking command is answered
y.
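The string that describes a library molecule can be illustrated with the names that appear in the sample output of the PERMUTE tutorial in this section (for example C-000_R1-000_R2-000_R3-001 and predict/cdock_<molecule name>.pdf). The following Python sketch reproduces that pattern. The helper names and the three-digit padding are inferred from those samples only, and the exact names WRITESOL gives to single mol2 or sdf files are not covered.

```python
def molecule_name(core, rgroups):
    """Build a library molecule name such as C-000_R1-000_R2-001.

    core: instance number of the core
    rgroups: instance numbers of R-group 1, 2, ... in order
    """
    parts = ["C-%03d" % core]
    parts += ["R%d-%03d" % (i, inst) for i, inst in enumerate(rgroups, start=1)]
    return "_".join(parts)


def pdf_path(basename, name, directory="predict"):
    """Placement file path as seen in the sample output (directory is assumed)."""
    return "%s/%s_%s.pdf" % (directory, basename, name)


name = molecule_name(0, [0, 0, 1])
path = pdf_path("cdock", name)
```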
Extracting the library molecule with the best alignment score and loading the corre-
sponding placement data (EXTRACTTOP)
Syntax: EXTRACTTOP
Description: EXTRACTTOP creates the library molecule with the best docking
energy score and loads the corresponding placement data (see footnote 2).
Requirements: <KEEP_ALL_SCORES_ACHIEVED> must have been set to 1
and a combinatorial docking must have been performed with EXTENDMR or
PLACESEQ respectively. The placement data must have been written to a pdf file.
Make sure that <STORE_PLACEMENT_MODE> is set to -1 before the calculation
and the <store> parameter of the corresponding docking command is answered
y.
8.2.5 Compatibility with other modules
Since FlexX Release 2, the FlexX^c module can be used in combination with the FlexE module
(section 8.4). A combinatorial library can be docked simply by loading the library in the
CLIB menu (see section 8.2) and switching to the united protein structure before docking in
the CDOCK menu.
Since release 2.1, the FlexX^c module can be used in combination with the FlexX-Pharm
module. For further information see section 8.2.3.
1 In rare cases, for numerical reasons, it may happen that you cannot read in a generated pdf file. Please
report such problems to support@biosolveit.de if possible.
2 In rare cases, for numerical reasons, it may happen that you cannot read in a generated pdf file. We'd kindly
ask you to report such problems to support@biosolveit.de if possible.
Figure 8.1: An imaginary sample compound for explaining the PERMUTE facility.
8.2.6 How to use PERMUTE; a tutorial
Intention and basic methodology
The PERMUTE mechanism is intended as a powerful way of testing all possible protonation
states of a ligand during the docking process.
The permutation engine provided by the PERMUTE command is not designed to take
physicochemical properties or general chemical knowledge into account; it is just a mechanism
to generate derivatives from an initial scaffold. The currently loaded ligand is used as the
scaffold. Based on some simple rules written in SMARTS™/SMILES, a small combinatorial
library that represents all possible derivatives is generated via CLIB/PERMUTE.
The main application, and an important example, is the permutation of protonation states.
Different protonation states and tautomers of a ligand are very similar to each other, but
in the docking process a change in the protonation of a ligand leads to several changes in
its chemical properties. If a carboxylic acid changes from the deprotonated state (which is
the default) to a protonated state, this affects the formal charges, the interaction behavior
of the site (H-acceptors become H-donors), and of course the steric and geometric properties.
You may object that a carboxylic acid should always be deprotonated. This is true in pure
water, but things may change in deeply buried active sites, where the environment is much
less polar than in the solvent.
A small set of protonation rules is provided with the download package of LeadIT. This set
is not comprehensive, but it should be usable as a basic set, and the user may extend it
according to special demands.
Writing permutation rules
So how does PERMUTE work? Figure 8.1 shows a chemical compound that contains four
protonatable sites: two carboxylic acids and two secondary amines, one of them located in a
ring system.
Rules to detect these sites are therefore necessary for the permutation process. For the
carboxylic acids we can write a SMARTS™ subgraph like C(O)O, and for the secondary amines
we can write [ND2]; that's all. We do not need to distinguish between secondary amines inside
and outside of rings: the handling is different, but the rules can be identical. Later we will
see that there are some special constraints in rings. Next, we need to decide what changes we
want to apply to the detected sites. For the carboxylic acids, we can write two forms, the
deprotonated form C(=O)[O-] and the protonated form C(=O)O. Accordingly, the rules for the
amine are simply NH and [NH2+] for the deprotonated and protonated forms, respectively.
These rules are located in the static data file permute.dat.
@SUBGRAPH 1 2 carboxylate
smarts C(=[OD1])[OD1]
data
C(=O)[OH]
C(=O)[O-]
end
@SUBGRAPH 1 2 sec_amine
smarts [$([$sec_amine;!$tripam]([*])[*])]
data
[NH]
[NH2+]
end
The @SUBGRAPH keyword initializes a new permutation rule. It is followed by a numerical
class (0 and 1 are possible) and a priority. The class tells PERMUTE to use the group as core (0)
or as R-group (1). The difference is that the first substructure detected in the ligand is treated
as the core fragment. All other sites are treated as R-groups. permute.dat currently has
no rules that force groups as core fragments. The priority is used like the priority in all
other substructure definition files. The rules are internally sorted by their priority, and if a
substructure matches some atoms in the compound, these atoms cannot be part of any other
substructure and will not be matched again. This is an important constraint, especially if
you expect a certain rule to match a substructure in the ligand and it doesn't. In
that case another pattern with a higher priority has matched one of the atoms of the
lower-priority pattern, and the atom is invisible to further matches. So select the priorities
carefully; it is usually a good idea to give the more specific pattern a higher priority. If
permutation rules are defined with the same priority, the order in which they occur in the
permute.dat file is decisive. The third argument is a short name for the subgraph that is
printed during the substructure identification process.
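The effect of priorities on matching can be illustrated with a toy Python model. It does not perform SMARTS matching; the matches are given as precomputed sets of atom indices, and the rule names and atom numbers are hypothetical. The sketch also assumes that a larger number means a higher priority, which the text above does not state explicitly.

```python
def assign_matches(rules, matches):
    """Accept pattern matches in priority order; atoms are claimed only once.

    rules: list of (name, priority) in file order
    matches: dict name -> list of atom-index sets matched by that pattern
    Returns the accepted (name, atoms) pairs in the order they were accepted.
    """
    claimed = set()
    accepted = []
    # sorted() is stable, so rules with equal priority keep their file order.
    for name, _priority in sorted(rules, key=lambda rule: -rule[1]):
        for atoms in matches.get(name, []):
            if claimed.isdisjoint(atoms):
                accepted.append((name, frozenset(atoms)))
                claimed |= set(atoms)
    return accepted


# The specific "amide" rule outranks the generic "amine" rule, so the amine
# match on atom 3 is hidden, while the amine match on atom 7 survives.
rules = [("amine", 1), ("amide", 2)]
matches = {"amine": [{3}, {7}], "amide": [{2, 3, 4}]}
result = assign_matches(rules, matches)
```

If the generic rule were given the higher priority instead, it would claim atom 3 first and the more specific rule would no longer match, which is the situation the paragraph above warns about.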
The next line contains the smarts keyword followed by a SMARTS™ pattern used to
identify the permutable sites. Then in the next lines a data/end block follows, and each line
in this block contains a transformation rule that describes what modifications should be
applied to the identified subgraph. (Refer to the SMARTS™ chapter and the LIGAND/TRANSFORM
command for further information.)
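To make the record layout concrete, here is a minimal Python reader for rule records of this shape. It is a sketch for inspecting a rule file, not the parser used by FlexX. It assumes that keywords are case-insensitive, that lines starting with # are comments, and that the block opener may be either data or begin (both spellings occur in the examples of this section).

```python
def parse_permute_rules(text):
    """Parse @SUBGRAPH records into a list of dictionaries."""
    rules, current, in_block = [], None, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        keyword = line.split()[0].lower()
        if keyword == "@subgraph":
            _, cls, priority, name = line.split(None, 3)
            current = {"class": int(cls), "priority": int(priority),
                       "name": name, "smarts": None, "transforms": []}
            rules.append(current)
            in_block = False
        elif in_block:
            if keyword == "end":
                in_block = False
            else:
                current["transforms"].append(line)
        elif keyword == "smarts":
            current["smarts"] = line.split(None, 1)[1]
        elif keyword in ("data", "begin"):
            in_block = True
    return rules


sample = """@SUBGRAPH 1 2 carboxylate
smarts C(=[OD1])[OD1]
data
C(=O)[OH]
C(=O)[O-]
end
"""
parsed = parse_permute_rules(sample)
```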
Combilib generation
In the very first step a core fragment must be identified. This can be forced by
substructures that have a class identifier of 0 or, and that is the recommended way, it is
chosen automatically. For docking of ligands, a powerful mechanism for selecting the base
fragment is implemented and accessible via the DOCKING/SELBAS command. It is necessary
to call the DOCKING/SELBAS command before calling the CLIB/PERMUTE command.
PERMUTE identifies all sites in the current ligand based on the rules in permute.dat. Before
discussing the further steps, let us get some practical experience with what has been covered so far.
Permute quick start
In the examples/lig directory, there is a ligand called permute_test.mol2. If examples
is your current working directory, simply start FlexX and read this molecule as the ligand;
your screen should look similar to the following. Make sure that no automatic modifications
are applied to the ligand; switch them off via SELINIT.
LEADIT> ligand
LEADIT/LIGAND> selinit !*
LEADIT/LIGAND> read permute_test
>> File permute_test.mol2 contains 1 compounds.
>> Type assignment check, OK.
>> Ligand NoName loaded from file ~/leadit/examples/lig/permute_test.mol2.
Current process size: 51508 kB
Now it is necessary to call the SELBAS command from the DOCKING menu, just use the
automatic base selection.
LEADIT/LIGAND> docking
LEADIT/DOCKING> selbas a
>> Base fragment selection
>> Automatic base selection
After this, change to the CLIB menu and call the PERMUTE command, please answer the
questions according to the following screen:
LEADIT/DOCKING> CLIB
LEADIT/CLIB> PERMUTE
Use original fragment [y] : n
Core fragment (-1 = automatic selection) <-1,3> [-1] : -1
Use external cleanup program [n] : n
The first question, Use original fragment, should be answered with no. Answering no removes
the original matched fragment from the combilib, which means that only the derivatives
generated from this fragment will be taken, but not the initial state in the ligand. The idea is
that for protonation states, the state that is found in the ligand is treated as undefined. We
do not care about it, but it is important to have all possible states later in the combilib.
Answering yes in this protonation example would therefore lead to three instances in each
R-group: two states come from the rules and one state is the initial state found in the ligand.
The second question allows for selection of a base fragment from one of the possible
fragmentations; simply select -1 here for automatic selection.
The third question asks whether an external cleanup program should be used to correct the
geometry of the generated R-group instances. For protonation it is usually not necessary, but
for rule sets that make further modifications to the groups it may be important. If unsure,
and if an external cleanup program is available and properly configured, answer yes here.
Now PERMUTE tries to match the substructures from permute.dat onto the ligand, and for
each match the name of the matched subgraph is printed on the screen as shown below. In our
small example four sites will be detected: the two carboxylic acids and the two secondary
amines.
>> Transformations finished.
>> Identifying R-groups.
R-Group 1: generated from pattern "sec_amine"
R-Group 2: generated from pattern "sec_amine"
R-Group 3: generated from pattern "carboxylate"
R-Group 4: generated from pattern "carboxylate"
Next, a core group must be identified. Based on the identified substructures and the base
fragment selection from SELBAS, PERMUTE tries to select a fragment which is a favorable
base fragment and does not overlap with more than one of the R-groups identified so far.
>> Selecting base fragment as core.
Frag. | Atoms | Ring | E max | Overlap with
------+-------+------+---------+-------------
0 | 7 | YES | -29.4 | 2 1
1 | 9 | YES | -28.4 | 3
2 | 3 | NO | -20.5 | 4
3 | 8 | YES | -29.3 | 4 2
------+-------+------+---------+-------------
>> Using base fragment from fragmentation 1 (R-group:3) as core.
The small table above shows the core fragments from all fragmentations detected by
SELBAS, the number of atoms, the expected maximum interaction energy and the over-
lapping R-groups. In this case the base fragment of fragmentation 1 that has atoms with
R-group 3 in common is selected. R-group 3 is now selected as R-group 0 (the core frag-
ment) and expanded by all atoms that are part of the base fragment.
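The selection shown in the table can be reproduced with a short Python sketch. The data are taken from the table above; the selection rule itself (discard fragments overlapping more than one R-group, then take the most negative E max) is an assumption that happens to agree with this one example, not a statement of the actual PERMUTE criterion.

```python
def choose_core(fragments):
    """Pick a core candidate from SELBAS fragmentations.

    fragments: list of dicts with keys 'id', 'e_max' and 'overlap'
               (the R-group numbers the base fragment overlaps with)
    Returns the chosen fragment dict, or None if no fragment qualifies.
    """
    candidates = [f for f in fragments if len(f["overlap"]) <= 1]
    if not candidates:
        return None
    return min(candidates, key=lambda f: f["e_max"])


# Values from the table above.
table = [
    {"id": 0, "e_max": -29.4, "overlap": [2, 1]},
    {"id": 1, "e_max": -28.4, "overlap": [3]},
    {"id": 2, "e_max": -20.5, "overlap": [4]},
    {"id": 3, "e_max": -29.3, "overlap": [4, 2]},
]
core = choose_core(table)
```

With these values the sketch picks fragmentation 1, which overlaps only with R-group 3, in line with the program output above.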
In a further expansion step, the core fragment is expanded by atoms that are directly con-
nected to it, until atoms belonging to another R-group are found. Then the adjacent R-
groups are expanded in the same way, until the complete molecule is subdivided into R-
groups.
In the next step, each R-group is duplicated and manipulated by one of the transformation
rules in the data section of the subgraph definition. The composition of the
different instances of the R-groups is printed as SMILES representations on the screen.
>> R-Group 0 carboxylate: O=C(O)c1cc(C1*)ccc1
0/1 : transform carboxylate >> C(=O)[OH] => O=C(C1CCCC(C[1*])C1)O
0/2 : transform carboxylate >> C(=O)[O-] => O=C(C1CCCC(C[1*])C1)[O-]
>> R-Group 1 sec_amine: [0*]N2*
1/1 : transform sec_amine >> [NH] => [0*]N[2*]
1/2 : transform sec_amine >> [NH2+] => [0*][NH2+][2*]
>> R-Group 2 sec_amine: [0*]C1CC(3*)CNC1
2/1 : transform sec_amine >> [NH] => [0*]C1CNCC([3*])C1
2/2 : transform sec_amine >> [NH2+] => [0*]C1C[NH2+]CC([3*])C1
>> R-Group 3 carboxylate: [0*]C(=O)O
3/1 : transform carboxylate >> C(=O)[OH] => [0*]C(O)=O
3/2 : transform carboxylate >> C(=O)[O-] => [0*]C([O-])=O
In the final step, the generated combilib is automatically initialized for further work.
>> Inserting R-group 0 into combilib
R-groups found: R1
Molecule 0 : 10->R1
Molecule 1 : 10->R1
>> Inserting R-group 1 into combilib
>> Inserting R-group 2 into combilib
>> Inserting R-group 3 into combilib
Core contains 2 instances.
R-Group 1 contains 2 instances.
R-Group 2 contains 2 instances.
R-Group 3 contains 2 instances.
>> combilib generation finished.
>> Closing combilib:
Combilib is closed.
>> Initiating combilib data assignment
>> Combilib data assignment finished, combilib ready for docking.
>> Combilib generation from ligand NoName successful.
Enumerating protonation states
The combilib generated by PERMUTE is now ready to go. To get a feeling for the contents
of the generated combilib, we can enumerate it to a multi-mol2 file that contains the ligand
in all possible protonation states. To do so, you must first call the CLIB/EXTRACT command;
just hit return for all questions, and you will see output like this.
LEADIT/CLIB> EXTRACT
Core instance <0,1> [0] :
Add R-group (-1=stop, R-1) <-1,9> [1] :
R-group instance <0,1> [0] :
Add R-group (-1=stop, R-2) <-1,9> [2] :
R-group instance <0,1> [0] :
Add R-group (-1=stop, R-3) <-1,9> [3] :
R-group instance <0,1> [0] :
>> Removing ligand molecule.
>> Molecule synthesis #atoms #bonds #rings
---------------------------------------------------------------
C-000 18 18 1
C-000_R1-000 20 20 1
C-000_R1-000_R2-000 35 36 2
C-000_R1-000_R2-000_R3-000 38 39 2
After this initial phase, the CLIB/ENUM command can be used to enumerate the
complete library and to write each molecule into a multi-mol2 file.
LEADIT/CLIB> ENUM
Write molecules to mol2 file: [n] : y
Filename : permute_enum.mol2
>> Current Clib molecule: C-000_R1-000_R2-000_R3-000
>> Clib molecule enumeration: 16 molecules
A new multi-mol2 file, permute_enum.mol2, has now been generated. You can visualize
it via FlexV, and use the slider control to check the different protonation states.
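The number of enumerated molecules follows directly from the number of instances per site: the core and each of the three R-groups have two instances, so 2 x 2 x 2 x 2 = 16. A small Python check makes this explicit. The helper library_size is hypothetical, and the value for Use original fragment = y is derived from the statement that each group would then have three instances.

```python
from math import prod


def library_size(states_per_site, use_original=False):
    """Number of molecules in the enumerated combilib.

    states_per_site: number of rule-generated states for core and R-groups
    use_original: True adds the state found in the ligand as an extra instance
    """
    extra = 1 if use_original else 0
    return prod(n + extra for n in states_per_site)


size_without_original = library_size([2, 2, 2, 2])
size_with_original = library_size([2, 2, 2, 2], use_original=True)
```

The first value is 16, matching the ENUM output above; the second would be 81.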
Docking with permute
Docking with combilibs generated by PERMUTE is identical to docking any other
combinatorial library. If you already have experience with FlexX^c, just skip this section.
One special issue is important when working with permute libraries. In a normal FlexX^c
docking, the molecules generated from a combinatorial library have no relation to each other.
In the libraries generated via PERMUTE, all generated derivatives are very closely related to
each other, and it is useful to analyze the results of all derivatives in one solution list. It is
possible to keep a list with all docking and alignment solutions in memory, including their
scores. Set the flag KEEP_ALL_SCORES_ACHIEVED = 1 to make this feature available. The
command CDOCK/LISTP allows you to list the composition of the top-scoring compounds
sorted by score. With the command CDOCK/WRITESOL it is possible to write the placements
out as multi-mol files.
So let's try that with our test ligand from the example above. In the examples directory,
there is a trypsin receptor 1dwd that we can use for docking. The example ligand is just
an artificial molecule and it is not known to be an active inhibitor, but FlexX finds solutions
with reasonable scores.
Here are the commands to be called. The combilib should already be generated and prepared;
the commands for preparation are repeated as comments.
# LIGAND
# READ permute_test.mol2
# DOCKING
# SELBAS a
# CLIB
# PERMUTE n -1 n
LEADIT> RECEPTOR
LEADIT/RECEPTOR> READ 1dwd
LEADIT/RECEPTOR> CDOCK
LEADIT/CDOCK> SET KEEP_ALL_SCORES_ACHIEVED 1
LEADIT/CDOCK> SELBAS # take defaults
Maximum number of base molecules <1,8> [8] :
Maximum number of base molecules in core <0,2> [2] :
Maximum number of base molecules in R-1 <0,2> [0] :
Maximum number of base molecules in R-2 <0,2> [0] :
Maximum number of base molecules in R-3 <0,2> [0] :
>> Selection of molecules for base placement
Id | Name | Score
---------------------------------------------------
Core: 2 (all)
R-1 : 0
R-2 : 0
R-3 : 0
LEADIT/CDOCK> PLACEC %
>> Core molecule 0 :
Base selection : 1 fragment(s)
Base placement : 400 placements
>> Core molecule 1 :
Base selection : 1 fragment(s)
Base placement : 426 placements
>> Total number of core placements: 241
There are two ways of docking a combilib: sequentially (PLACESEQ) or incrementally
(EXTENDMR). The difference is simple. PLACESEQ generates each ligand and performs a
complete docking run with each entity. EXTENDMR uses the incremental build-up of the
combilib: starting from the already placed core fragment, all possible R-group instances are
added to the core one after another. This is much faster than the sequential docking process.
In this example we only use EXTENDMR. Note that it is important to answer the question Store
placements with yes, because WRITESOL needs the .pdf file later to generate molecules from
the placements after the docking run.
LEADIT/CDOCK> EXTENDMR
Continue with EXTENDMR ? [n] : y
Select core instance <0,1> [0] :
Next R-group to add (-1=stop,1) [all] :
Select instance of R-group 1: <0,1> [0] :
Select instance of R-group 2: <0,1> [0] :
Select instance of R-group 3: <0,1> [0] :
Store placements [n] : y # say yes here !!!!
Placement base filename [cdock] : # cdock is ok
Compare placements [n] : n
>> Continue with the sequence: C0-000 R1-000 R2-000 R3-000
>> Mol-No || Core | R- 1 | R- 2 | R- 3 | # Pl | Score | Time | Molecule name
1 || 0 | 0 | 0 | 0 | 72 | -14.639 | 1.52 s | C-000_R1-000_R2-000_R3-000
2 || 0 | 0 | 0 | 1 | 52 | -13.474 | 0.61 s | C-000_R1-000_R2-000_R3-001
3 || 0 | 0 | 1 | 0 | 72 | -14.048 | 1.17 s | C-000_R1-000_R2-001_R3-000
: : : : : : : :
16 || 1 | 1 | 1 | 1 | 54 | -16.640 | 0.81 s | C-001_R1-001_R2-001_R3-001
>> The solutions are written to predict/cdock_<molecule name>.pdf.
Process time used: 17.07 s.
Just 17 seconds for docking 16 different protonation states; that's pretty fast, isn't it? Now
we want to see a list of the top solutions sorted by their score (here the top 10), so we call
the command CDOCK/LISTP. Afterwards we can use the CDOCK/WRITESOL command to write the
placements to a multi-mol2 file.
LEADIT/CDOCK> LISTP a 10
>> No || Core | R 1 | R 2 | R 3 |Sol No| Score
----------------------------------------------------------------------
1 || 1 | 0 | 1 | 0 | 1 | -19.909
2 || 1 | 1 | 0 | 0 | 1 | -19.751
3 || 1 | 1 | 1 | 0 | 1 | -19.682
4 || 1 | 0 | 1 | 0 | 2 | -19.486
5 || 1 | 0 | 1 | 1 | 1 | -19.462
6 || 1 | 1 | 1 | 0 | 2 | -18.991
7 || 1 | 1 | 0 | 0 | 2 | -18.587
8 || 1 | 1 | 1 | 0 | 3 | -18.587
9 || 1 | 1 | 0 | 0 | 3 | -18.459
10 || 1 | 1 | 1 | 0 | 4 | -18.457
# now write the top 100 solutions to a multi-mol2 file
LEADIT/CDOCK> writesol
Filename [base] : permute_1dwd
SDF file (0) / Mol2 file (1) <0,1> [1] : 1
Multi-mol2 file [y] : y
Number of placements <1,945> [100] : 100
8.2.7 Handling of ring systems with PERMUTE
Ring systems are a bit problematic for the application of permutation rules. We have seen
that a rule can be independent of whether it occurs inside or outside of a ring, but if a
substructure is found in a ring, the R-group will be automatically expanded to the complete
ring system.
The FlexX^c module is unable to close rings in the build-up process. This is why a ring can
match only one subgraph description and can therefore be part of only one R-group. So if a
ring has multiple titratable sites of identical priority, PERMUTE will generate an error because
it cannot decide which rule is to be applied.
A solution for handling ring systems is to write all possible protonation states for the complete
ring system. Here is an example for an imidazole as it occurs in histidines, with all possible
tautomers and protonation states.
@subgraph 1 2 imidazole
smarts N1-C=N-C=C1(-*)
begin
N1-C=N-C=C1(-*)
N1=C-N-C=C1(-*)
[NH]=C-N-C=C1(-*)
end
A similar example is piperazine, a six-membered aliphatic ring with two nitrogen
atoms. The secondary amine pattern from the first example would match both nitrogen
atoms and, because two matches in one ring are forbidden, we need a rule like the one
below to handle piperazine correctly. In this example only three different states are defined
because a doubly charged piperazine has to be excluded.
@subgraph 1 2 piperazin
smarts N1CCNCC1
begin
N1CCNCC1
[NH2+]1CCNCC1
N1CC[NH2+]CC1
end
The relation of PERMUTE and TRANSFORM
What PERMUTE can do depends mainly on what the command LIGAND/TRANSFORM is able
to do. This is because R-group building is an identify/copy/modify process, and modification
is limited to the capabilities of the transformation engine.
In principle there are many possibilities beyond protonation, such as any kind of terminal group
replacement, enlargement and shrinking of rings, removal of terminal groups, simulation of
enzymatic cleavages, etc.
Currently there are no predefined rule sets available to do those things directly, but it is
an easy task to write some essential rules in a user-defined permute.dat file and to test
whether the results fit the expectations.
8.3 Docking under pharmacophore type constraints
8.3.1 Pharmacophore type constraints
FlexX-Pharm allows the user to include constraints similar to pharmacophore type con-
straints in the docking calculation. FlexX-Pharm is not a standard module in FlexX. In
order to access the FlexX-Pharm functionality, an extra license key is required.
There are two main types of constraints. For the first type, a specific interacting group in
the receptor is selected and an interaction between the ligand (any part of the ligand) and
this interacting group must be seen in every docking solution. This is called an interaction
constraint. The second type of constraint is a spatial constraint. This takes the form of a
spherical volume in the active site, and an atom or a group of a defined type must lie within
the sphere in the docking solution.
In addition to this setup, spatial constraints may also be employed in the form of Gauss
spheres. Please refer to section 8.5.4 for more details on this.
Each constraint can be assigned to be "essential" or "optional", or can be assigned a label
which defines it as a logical term. If a constraint is assigned "essential", then it must
be matched in every docking solution. The assignment "optional" allows for
some flexibility in the form of partial matches. The number of optional constraints matched
in the final docking solution must lie between a specified minimum and maximum.
If the constraints have been defined as logical terms, then every docking solution must fulfill
a logical expression. This logical expression is a combination of logical terms and
logical interconnections, and is described in more detail below.
Once defined in the active site, the set of constraints constitutes the pharmacophore in
FlexX-Pharm.
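The acceptance rule for essential and optional constraints can be written down in a few lines. The following Python sketch is only an illustration of the rule as described above; the function and the constraint names are hypothetical, and constraints defined as logical terms are not covered.

```python
def satisfies(matched, essential, optional, min_optional, max_optional):
    """Check one docking solution against essential/optional constraints.

    matched: set of constraint names fulfilled by the docking solution
    essential: set of constraints that must all be matched
    optional: set of constraints of which between min_optional and
              max_optional (inclusive) must be matched
    """
    if not essential <= matched:
        return False
    n_optional = len(optional & matched)
    return min_optional <= n_optional <= max_optional


# Hypothetical pharmacophore: one essential and three optional constraints,
# of which at least one and at most two must be matched.
essential = {"hbond_asp189"}
optional = {"sphere_1", "sphere_2", "hbond_gly216"}
ok = satisfies({"hbond_asp189", "sphere_1"}, essential, optional, 1, 2)
missing_essential = satisfies({"sphere_1", "sphere_2"}, essential, optional, 1, 2)
```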
8.3.2 Running a FlexX-Pharm calculation
If there is a set of pharmacophore constraints present then a FlexX-Pharm docking calcula-
tion is carried out, otherwise a regular FlexX docking calculation takes place.
The constraints are loaded after the receptor and ligand. For each constraint, countergroups
are first identified in the ligand. A countergroup can be, for example, an atom that can
make an interaction with an interaction constraint on the receptor, or an atom of the element
type expected for a spatial constraint. The countergroups are checked against the geometry of
the constraints as defined in the active site before docking starts. This is done by comparing
maximum inter-molecular distances between countergroups in the ligand with the distances
between the constraints in the site. It may already be obvious at this stage that the ligand
cannot fit the constraints, in which case the docking calculation does not proceed.
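The idea of this pre-docking check can be sketched as a necessary condition: for every pair of constraints there must be a pair of countergroups whose maximum possible separation is at least the distance between the two constraints. The Python sketch below is a simplified model under that assumption. The data layout, the tolerance parameter, and the atom numbers are hypothetical, and sphere radii and interaction geometries are ignored.

```python
from itertools import combinations


def may_fit(constraint_dist, countergroups, max_dist, tol=0.0):
    """Return False if the ligand obviously cannot span the constraints.

    constraint_dist: dict (i, j) -> distance between constraints i < j in the site
    countergroups: dict constraint -> list of ligand atom ids that could match it
    max_dist: dict frozenset({a, b}) -> maximum distance between atoms a and b
              over all ligand conformations
    tol: distance tolerance
    """
    for i, j in combinations(sorted(countergroups), 2):
        needed = constraint_dist[(i, j)] - tol
        reachable = any(
            a != b and max_dist[frozenset((a, b))] >= needed
            for a in countergroups[i]
            for b in countergroups[j]
        )
        if not reachable:
            return False
    return True


# Two constraints 8.0 apart; the only countergroup pair can be at most 6.5 apart.
too_short = may_fit({("A", "B"): 8.0}, {"A": [1], "B": [2]},
                    {frozenset((1, 2)): 6.5})
long_enough = may_fit({("A", "B"): 8.0}, {"A": [1], "B": [2]},
                      {frozenset((1, 2)): 9.0})
```

A ligand failing such a test could be skipped before any placement is attempted; passing it does not guarantee that a valid docking solution exists.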
FlexX-Pharm does not change the docking algorithms used in FlexX but rather calls extra
routines after each of the FlexX docking steps. After each incremental construction step,
including the base placement, FlexX-Pharm carries out checks of the partially placed ligand
against the constraints. There are three checking methods: any docking solution which
fails one of these is deleted from the docking calculation, otherwise it progresses to the next
checking step. The three checking methods are not just filtering steps but are also
look-ahead checks which predictively eliminate docking solutions from the docking calculation.
The methods can be summarized as follows. The first is referred to as a logical check, where
docking solutions are eliminated according to the availability (or unavailability) of
countergroups to meet the constraints. The second and third are distance-based checking methods
and are referred to as distance checks and directed tweak checks. A summary of how many
solutions were rejected at each step can be seen in the base placement and complex
construction output information. As the docking proceeds, FlexX-Pharm filters out all solutions that
do not obey the constraints, leaving only those that do so in the final solutions set.
The set of constraints defined in FlexX-Pharm may also be more simply used as a filter
against the normal FlexX final docking solutions set, allowing one to identify which solutions
fit the constraints. This post-docking filtering mode may be switched on automatically in
FlexX-Pharm for timing reasons, for example during virtual screening runs (see section 8.3.6
below).
8.3.3 Configuring FlexX-Pharm
For details of the standard FlexX configuration, especially the relevant flags and parameters,
please see section 10.1. This section describes only the extra entries required in the
config.dat file for FlexX-Pharm.
@DIRECTORIES: Defining directory paths
PHARM Contains the path for the directory where FlexX-Pharm looks for the pharmacophore
constraints description files (with the extension .phm; the file extensions .dat and .cfg
are also supported) (see section 8.3.4 "Preparing the input data" below).
8.3.4 Preparing the input data
The constraint information has to be prepared in a separate input file. The two different
types of constraints, interaction and spatial, are each entered in their own record. The form
of the file is similar to that of other FlexX .dat files: comments start with #, record
identifiers start with the symbol @, and uppercase/lowercase and spacing are ignored.
Below follows a description of the pharmacophore constraints input file. For more detailed
information on the input file description, especially for examples of defining interaction
constraints on hetero atoms or crystallographic waters, please see the additional input
description README files included in the <example/pharm/ph_input_egs/> directory.
Interaction constraints
The interaction record begins with the @interact record identifier followed by a list
of interaction constraints. An interaction constraint is described by the atom on which the
interacting group is defined, plus the type of interaction and some additional information
depending on the type of interaction. Each interaction constraint is entered on a separate
line consisting of the following fields:
@interact
<essential/optional/logical term label> <atom name> <aa name> \
<aa chain ID> <aa num> <interaction type> [hydrogen ID] \
[interaction surface number]
<essential/optional/logical term label> Either the word essential or the word
optional or a label to define the constraint as a logical term. The logical term labels
can be a single character or a string (see example).
<atom name> The atom name as found in the PDB file.
<aa name> The 3-letter amino acid PDB file code.
<aa chain ID> The chain ID for the amino acid, as found in the PDB file. An underscore
_ denotes an empty chain ID.
<aa num> The number of the amino acid as in the PDB file.
<interaction type> The type of interaction required (e.g. h_acc). The various types of
interaction available are found in the contype.dat data file.
<hydrogen ID> For some h_don interactions (e.g. ARG) two interaction centers are available.
In this case the hydrogen defining the correct interaction center must be specified.
Hydrogen names are found in the amino.dat data file.
<interaction surface number> <0,1> For certain interactions two interaction surfaces are
available. The correct surface is chosen by entering 0 or 1. The order of the surfaces
(0 or 1) is defined in geometry.dat (but visualization with FlexV is probably the
simplest method of identifying the surfaces!).
Example
@interact
essential _ne2 his _ 297 h_don
optional _nh2 Arg _ 246 h_don _HH21
OPTIONAL _CG PHE E 223 phenyl_center 0
Or, alternatively with logical terms:
Example
@interact
A _OD2 ASP A 27 h_acc 1
B _OD1 ASP A 27 h_acc 1
Section 11.3 (.rdf file) describes in greater detail the required input format for atom names
and amino acids in FlexX.
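To make the field layout of an @interact line concrete, the following Python sketch splits one line into its parts. It is only an illustration of the format described above, not the parser used by FlexX-Pharm. The class and function names are hypothetical, and the rule used to tell the two optional fields apart (a bare 0 or 1 is a surface number, anything else a hydrogen ID) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InteractionConstraint:
    label: str                      # "essential", "optional" or a logical term label
    atom: str                       # atom name as in the PDB file
    residue: str                    # 3-letter amino acid code
    chain: str                      # "_" denotes an empty chain ID
    resnum: int
    itype: str                      # interaction type, e.g. h_acc
    hydrogen: Optional[str] = None  # optional hydrogen ID
    surface: Optional[int] = None   # optional interaction surface number (0 or 1)


def parse_interact_line(line: str) -> InteractionConstraint:
    """Split one line of an @interact record into its fields."""
    fields = line.split("#", 1)[0].split()   # drop a trailing comment
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}")
    label, atom, res, chain, num, itype = fields[:6]
    if label.lower() in ("essential", "optional"):
        label = label.lower()
    constraint = InteractionConstraint(
        label, atom.upper(), res.upper(), chain.upper(), int(num), itype.lower()
    )
    for extra in fields[6:]:
        if extra in ("0", "1"):              # assumption: bare 0/1 is a surface number
            constraint.surface = int(extra)
        else:
            constraint.hydrogen = extra.upper()
    return constraint


arg = parse_interact_line("optional _nh2 Arg _ 246 h_don _HH21")
phe = parse_interact_line("OPTIONAL _CG PHE E 223 phenyl_center 0")
```

Because uppercase/lowercase is ignored in the input file, the sketch normalizes atom and residue names to uppercase before storing them.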
Spatial constraints
The spatial record begins with the @spatial record identifier, followed by a list of spatial
constraints. Spatial constraints can be defined in one of two ways:
Method 1
@spatial
<essential/optional/logical term label> <x> <y> <z> \
<sphere radius> <element>
<essential/optional/logical term label> <x> <y> <z> \
<sphere radius> SMARTS <expression>
<essential/optional/logical term label> Either the word essential or the word
optional or a label to define the constraint as a logical term. The logical term label
can be a single character or a string (see example).
<x> <y> <z> Coordinates of the sphere center.
<sphere radius> The radius of the constraint sphere (Å).
<element> The element type of the ligand atom that must lie in the sphere. The element
name must be either the 2-letter element symbol (use an underscore _ to replace a
leading blank) or a PDB atom name in a FlexX recognizable format.
SMARTS The keyword SMARTS signals that the next expression is a SMARTS™ expression.
A list of allowed SMARTS™ expressions is given in section 11.13.
<expression> A SMARTS™ expression matching all atoms which must lie in the sphere
with <sphere radius>.
Method 2 This method defines a point in space relative to the active site. Two atoms are
selected in the site and the sphere is built at a certain distance from the first atom along the
imaginary line between the two atoms.
@spatial
<essential/optional/logical term label> \
<atom name 1> <aa name 1> <aa chain ID 1> <aa num 1> \
<atom name 2> <aa name 2> <aa chain ID 2> <aa num 2> \
<sphere position> <sphere radius> <element>
<essential/optional/logical term label> \
<atom name 1> <aa name 1> <aa chain ID 1> <aa num 1> \
<atom name 2> <aa name 2> <aa chain ID 2> <aa num 2> \
<sphere position> <sphere radius> SMARTS <expression>
<essential/optional/logical term label> Either the word essential or the word
optional or a label to define the constraint as a logical term. The logical term label
can be a single character or a string (see example).
<atom name 1> The name of the first atom as found in the PDB file.
<aa name 1> The 3-letter amino acid PDB file code of the first atom.
<aa chain ID 1> The chain ID for the amino acid of the first atom, as found in the PDB file.
An underscore _ denotes an empty chain ID.
<aa num 1> The number of the amino acid of the first atom as in the PDB file.
<atom name 2> The name of the second atom as found in the PDB file.
<aa name 2> The 3-letter amino acid PDB file code of the second atom.
<aa chain ID 2> The chain ID for the amino acid of the second atom, as found in the PDB
file. An underscore _ denotes an empty chain ID.
<aa num 2> The number of the amino acid of the second atom as in the PDB file.
<sphere position> Distance of the sphere center from the first atom.
<sphere radius> The radius of the constraint sphere (Å).
<element> The element type of the ligand atom that must lie in the sphere. The element
name must be either the 2-letter element symbol (use an underscore _ to replace a
leading blank) or a PDB atom name in a FlexX recognizable format.
SMARTS The keyword SMARTS signals that the next expression is a SMARTS™ expression.
A list of allowed SMARTS™ expressions is given in section 11.13.
<expression> A SMARTS™ expression matching all atoms which must lie in the sphere
with <sphere radius>.
Example
@spatial
ESSENTIAL 28.0 3.0 17.0 2.0 _O
essential _ca leu _ 131 _cd2 his _ 200 6.6 1.5 _C
Or, alternatively with SMARTS™:
Example
@spatial
C 23.0 58.0 24.0 1.0 SMARTS [n,c]
D 20.0 67.0 23.5 1.5 SMARTS [$(C(O)O)]
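The geometry behind Method 2 can be written down in a few lines of Python. This is a sketch of the construction as described (a point at the distance <sphere position> from the first atom on the line towards the second atom), not FlexX-Pharm code. The function names and the coordinates are hypothetical.

```python
import math


def sphere_center(atom1, atom2, position):
    """Point at distance `position` from atom1 along the line atom1 -> atom2."""
    d = math.dist(atom1, atom2)
    if d == 0.0:
        raise ValueError("the two atoms coincide; the direction is undefined")
    return tuple(a + position * (b - a) / d for a, b in zip(atom1, atom2))


def in_sphere(point, center, radius):
    """True if `point` lies inside or on the constraint sphere."""
    return math.dist(point, center) <= radius


# Made-up coordinates for the two site atoms, 10 units apart along x.
first_atom = (0.0, 0.0, 0.0)
second_atom = (10.0, 0.0, 0.0)
center = sphere_center(first_atom, second_atom, 6.6)

inside = in_sphere((6.0, 1.0, 0.0), center, 1.5)
outside = in_sphere((4.0, 0.0, 0.0), center, 1.5)
```

A ligand atom of the requested element (or one matching the SMARTS expression) would then meet the constraint if `in_sphere` holds for its coordinates. Method 1 corresponds to passing the sphere center directly.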
The existence and type of the third record in the constraints input file depends on whether
optional constraints have been defined (for partial matching) or whether the constraints
have been defined as logical terms (for a logical expression).
Note that logical terms cannot be used together with optional constraints: the two are
incompatible.
Partial matching: @partial record
The @partial record relates to the optional constraints that have been entered in the
@interact and @spatial records. The record consists of one line only.
@partial <minimum> <maximum>
<minimum> The minimum number of optional constraints that must be matched in the
docking solution.
<maximum> The maximum number of optional constraints that may be matched in the
docking solution.
Note that if the maximum number of optional constraints is defined to be zero then the
constraints act as negative constraints. This means a spatial constraint becomes an exclusion
volume (for the given element only, of course) and an interaction constraint becomes an
interaction that must not be seen in the docking solutions.
Example
@partial 1 2
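The effect of the @partial record can be illustrated with a small Python sketch. It assumes that every essential constraint must be matched and that the number of matched optional constraints must lie between <minimum> and <maximum> inclusive. The function name and the constraint labels are hypothetical.

```python
def satisfies_partial(matched, essential, optional, minimum, maximum):
    """Decide whether a docking solution passes the partial-matching rule.

    matched, essential and optional are sets of constraint labels;
    minimum and maximum are the two numbers from the @partial record.
    """
    if not essential <= matched:            # every essential constraint is needed
        return False
    n_optional = len(optional & matched)    # optional constraints actually met
    return minimum <= n_optional <= maximum


essential = {"his297"}
optional = {"arg246", "phe223"}

# "@partial 1 2": at least one and at most two optional constraints.
passes = satisfies_partial({"his297", "arg246"}, essential, optional, 1, 2)
fails = satisfies_partial({"his297"}, essential, optional, 1, 2)

# "@partial 0 0": the optional constraints act as negative constraints.
excluded = satisfies_partial({"his297", "phe223"}, essential, optional, 0, 0)
```

With a maximum of zero, any solution that matches an optional constraint is rejected, which is the exclusion behaviour described above.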
Logical expression: @logical_expression record
If all constraints are specified with logical terms, then a required record in the input file is
the @logical_expression record, which gives the Boolean combination of all constraints. The
logical expression follows on the next line after the @logical_expression keyword.
@logical_expression
<expression>
<expression> A combination of terms and logical interconnections (and, or, not).
Example
@logical_expression
(A or B) and (C or D)
Please note that in order for FlexX-Pharm to evaluate the logical expression correctly, brackets
have to be used for expressions connected by or. Otherwise, FlexX-Pharm would read the
expression simply from left to right, giving precedence to neither or nor and. The expressions
are only unambiguous with brackets.
Note that the memory usage and runtime depend only on the number of pharmacophore
constraints and the number of potential candidates for each constraint, and not on the number
of constraints used in the logical expression.
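How a fully bracketed logical expression combines the individual terms can be checked with the following Python sketch. It is not the evaluator used by FlexX-Pharm: it borrows Python's own expression parser, and it assumes that the operators are written in lowercase and that the term labels are valid Python identifiers. The function name `evaluate` is hypothetical.

```python
import ast


def evaluate(expression, matched):
    """Evaluate an and/or/not expression over constraint labels.

    `matched` is the set of labels whose constraints are met by a solution.
    """
    tree = ast.parse(expression, mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BoolOp):
            values = [walk(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if isinstance(node, ast.Name):
            return node.id in matched
        raise ValueError("unsupported element in logical expression")

    return walk(tree)


accepted = evaluate("(A or B) and (C or D)", {"A", "D"})
rejected = evaluate("(A or B) and (C or D)", {"A", "B"})
```

Without brackets the result depends on the reading order. Python itself binds and more tightly than or, whereas a strict left-to-right reading does not, so "A or B and C" with only A matched is true under Python's rules but false when read as "(A or B) and C". Writing the brackets explicitly removes this ambiguity.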
8.3.5 Menus and commands
The PHARM menu is only available if the PHARM module is activated with a valid license
key.
Typing the submenu name brings you to the submenu, typing END returns you to the parent
menu. You can type commands and menu names in uppercase or lowercase letters.
Working with constraints (the PHARM submenu)
Reading a set of constraints (READ)
Syntax: READ <filename>
Description: Reads the set of constraints into FlexX-Pharm's workspace. The file
<filename> must have the extension .phm (the extensions .dat and .cfg are also
accepted for compatibility with older versions and other tools). The constraint
information is output to the screen. If the ligand has already been entered then the
pre-docking checks on the ligand are carried out at this point. Information about
how many combinations of ligand countergroups were accepted and rejected at this point
is output to the screen. If a docking calculation has already taken place before
loading these constraints then the constraints can be used as a post-docking filter
(see the FILTER command).
Requirements: The receptor must have been previously loaded.
Interactive picking of constraints via FlexV (PICKPH)
Syntax: PICKPH [<constraint>]
Description: Launches FlexV with the pharm-control panel: an interactive constraint
generation facility. When FlexV starts, the active site of the receptor is visible
and the control panel is opened next to the main FlexV window. Using the control
panel the user can visualize various types of FlexX interaction surfaces in the active
site and select them to use as pharmacophore constraints. The spheres for spatial
constraints can also be easily positioned within the site. If constraints have already
been loaded and if <constraint> is set to y, the pharmacophore constraints are
displayed in the pharm-control panel. When the user has finished putting together
a set of constraints with the FlexV interface, the constraints input file (described
above) may be generated with one click of a button. The resulting file can then simply
be read using the READ command (see above). Please see the FlexV User Guide
for detailed information about this interface.
Requirements: The receptor must have been previously loaded.
Outputting information about constraints (INFO)
Syntax: INFO
Description: Outputs basic information about the constraints such as the number
of interaction, spatial, optional and essential constraints or logical terms, plus the
minimum and maximum number of optional constraints that must be met or the
logical expression.
Editing the constraints input file (EDIT)
Syntax: EDIT
Description: Brings up the current constraints input file in the editor.
Important notes: After editing the file there is NO automatic reload.
Deleting the constraints (DELETE)
Syntax: DELETE
Description: Removes the set of constraints from FlexX-Pharm's workspace.
Using the constraints to filter existing docking solutions (FILTER)
Syntax: FILTER <delete>
Description: Tests a set of existing docking solutions against the set of constraints.
A list is output containing each docking solution that matches the constraints. For
each solution a table is output showing which constraints were matched. At the
end of the list follows the percentage of total docking solutions that matched each
constraint. Solutions that do not obey the constraints can be permanently removed
from the docking solutions list by setting <delete> to y.
Requirements: A docking calculation must have taken place.
(Important note: This SELxxx scheme is similar to all SELxxx schemes used in FlexX. For
more details refer to the main FlexX manual.)
Setting administration defaults for drawing constraints (SELADM)
Syntax: SELADM <graphics object number> [<start fifo object>] [<end fifo
object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing constraints and you can determine whether the graphics files are internal
temporary files used only by FlexX or saved for further use. For yes/no questions
you can enter either y, yes or 1 for yes, and similarly n, no or 0 for no.
<graphics object number> Enter integer:
(1-255) The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 Fifo mode: the graphics generated by subsequent DRAW commands will be
sent to a range of graphics objects in a first-drawn-first-overwritten manner.
You will be asked to enter two more parameters:
<start fifo object> The start graphics object for the fifo range.
<end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
Setting default values for drawing constraints (SELGRA)
Syntax: SELGRA <spatials> <interactions> <interact geoms> <ia points> <all
contact types> [<contact type selection>]
Description: With SELGRA you can set specific default values for drawing constraints.
For yes/no questions you can enter either y, yes or 1 for yes, and similarly n, no
or 0 for no.
<spatials> Yes/no answer:
yes Include spatial constraints in the drawing.
no Do not include spatial constraints.
<interactions> Yes/no answer:
yes Include interaction constraints in the drawing.
no Do not include interaction constraints.
<interact geoms> Yes/no answer:
yes Interaction geometries (interaction surfaces around potential interacting
groups) are drawn.
no No interaction geometries are drawn.
<ia points> Yes/no answer:
yes A set of points describing the interaction surface is drawn. (All calculations
involving interaction surfaces in FlexX use these sets of points.)
no No points are drawn.
<all contact types> Yes/no answer:
yes Interaction geometries for all contact types (interaction types) are drawn
if <interact geoms> is set to yes.
no Interaction geometries are drawn for a selection of contact types (interac-
tion types). You will be asked to select types from a given list:
<contact type selection> Choose a list of types represented by integers.
Enter the list as separate integers or as integer ranges (format a-b)
separated by , or blanks. Note that you need to enclose the expression
in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9".
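As an illustration of the selection format, the following Python sketch expands such a list into individual integers. It assumes that ranges are written with a hyphen and without blanks around it (for example 7-9). The function name `parse_selection` is hypothetical and is not part of FlexX.

```python
import re


def parse_selection(text):
    """Expand a selection such as "1, 2, 4, 7-9" into a list of integers."""
    selected = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        if "-" in token:                     # assumed range notation a-b
            low, high = (int(part) for part in token.split("-", 1))
            selected.extend(range(low, high + 1))
        else:
            selected.append(int(token))
    return selected


contact_types = parse_selection("1, 2, 4, 7-9")
```

In this sketch the example selection expands to the types 1, 2, 4, 7, 8 and 9.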
Selecting the coloring mode for drawing constraints (SELCOL)
Syntax: SELCOL <spatial color mode> <interact geoms color mode>
Description: With SELCOL you can set the color modes for the spatial constraints
and for drawing interaction geometries for the interaction constraints. For each of
these, a selection of color modes is available:
<spatial color mode> Choose the color mode for drawing the spatial constraints.
Color mode selection:
INVISIBLE
UNIQUE
<interact geoms color mode> Choose the color mode for drawing the interac-
tion geometries if they are to be drawn. The interaction geometries consist
of patches or surfaces that indicate the positions of interacting groups in the
molecule. Color mode selection:
INVISIBLE
UNIQUE
CONTACT
The possible color modes are explained below. For some color modes you are also
asked to enter some defining colors. Enter your chosen color as either an angle from
the color circle (0-360 degrees: 0 is invisible, 1-360 runs from dark blue, through
red, yellow, green to blue), a color name (as defined in the GRAPHIC static data file),
or an RGB(A) value, i.e. 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
INVISIBLE The item drawn will be invisible.
UNIQUE The object will be drawn in one user-defined color. You will be asked to
choose the unique color:
<color> Enter your chosen color.
CONTACT The object will be drawn in a color representing its interaction (contact)
type. The colors for each type are defined in the GRAPHIC static data file.
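The three ways of writing a color (angle, name, RGB(A) value) can be told apart as in the following Python sketch. It is an illustration only and does not reproduce the FlexX color parser. The function name is hypothetical, the angle is assumed to be an integer between 0 and 360, and `named_colors` stands in for the names defined in the GRAPHIC static data file.

```python
def parse_color(spec, named_colors):
    """Classify a color given as an angle, a name, or an RGB(A) value."""
    text = spec.strip().strip('"')
    parts = text.replace("/", " ").split()

    # RGB(A): 3 (4) floating-point numbers separated by blanks or slashes.
    if len(parts) in (3, 4):
        try:
            values = tuple(float(part) for part in parts)
        except ValueError:
            pass
        else:
            return ("rgba" if len(parts) == 4 else "rgb", values)

    # Angle on the color circle; 0 means invisible (integer angle assumed).
    if len(parts) == 1:
        try:
            angle = int(parts[0])
        except ValueError:
            pass
        else:
            if not 0 <= angle <= 360:
                raise ValueError("angle must lie between 0 and 360")
            return ("invisible", None) if angle == 0 else ("angle", angle)

    # Otherwise the text must be a known color name.
    name = " ".join(parts).lower()
    if name in named_colors:
        return ("name", name)
    raise ValueError(f"unrecognized color: {spec!r}")


names = {"green", "dark green"}      # stand-in for the GRAPHIC data file
by_name = parse_color('"dark green"', names)
by_rgb = parse_color("0.0/0.8/0.1", names)
by_angle = parse_color("220", names)
```

The sketch tries the numeric forms first and falls back to a name lookup, so a multi-word name such as "dark green" is not mistaken for an RGB value.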
Drawing the constraints (DRAW)
Syntax: DRAW [<filename>]
Description: DRAW generates a drawing of the constraints and sends it to a file
ready to be displayed in the graphics interface. For details about what exactly is
drawn see the SELGRA command.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: Drawings are not displayed automatically. Use DISPLAY to out-
put the drawing on the graphics device.
Docking (the DOCKING submenu)
Additional information for some commands relating to the pharmacophore constraint
checks is found in the DOCKING submenu. The extra output is described in this section.
Placing the base fragment when docking with constraints (PLACEBAS)
Syntax: PLACEBAS For syntax see FlexX main manual.
Description: See FlexX main manual.
Important notes: At the end of the PLACEBAS output, FlexX-Pharm adds information
about how many base placements were deleted in the look-ahead checks.
The first number is the total number rejected, followed by three numbers in brackets
indicating how many of the total were deleted by the logical, distance and finally
directed tweak checks. If no constraints are present, these numbers are all zero.
Building up the complex when docking with constraints (COMPLEX)
Syntax: COMPLEX For syntax see FlexX main manual.
Description: See FlexX main manual.
Important notes: In the COMPLEX output summary, FlexX-Pharm adds information
about how many docking solutions were deleted in the look-ahead checks. The
first number is the total number rejected, followed by three numbers in brackets
indicating how many of the total were deleted by the logical, distance and finally
directed tweak checks. If no constraints are present, these numbers are all zero.
Outputting information about the docking with constraints (INFO)
Syntax: INFO For syntax see FlexX main manual.
Description: See FlexX main manual.
Important notes: In the table output format, FlexX-Pharm includes a pharma-
cophore constraints processing time in the timing summary. This refers to the time
taken for the pre-docking constraints/ligand preparation at the PHARM/READ com-
mand. In the non-table formats, this timing is included in the total time output. (The
time taken for constraint checking during docking is included in the base place-
ment and complex build-up timings.) Also in the non-table formats, an integer is
output at the end of the line. If the integer has the value 1, then FlexX-Pharm car-
ried out constraint checking during docking. If the integer has the value 2, then
FlexX-Pharm switched automatically to the post-docking filter mode to check the
constraints.
8.3.6 Static data in FlexX-Pharm
*Program parameters (configuration)
For details of the standard FlexX configuration and other static data please see
section 10.1. This section only describes the extra FlexX-Pharm parameters.
Name (type): <PHARM_OPTIMIZATION_TOL> (floating point)
Description: This is a tolerance (Å) in the distances calculated between placed
parts of the ligand and the constraints in the active site during the look-ahead
checks. The tolerance allows for movements in the position of placed atoms in
subsequent optimizations of the ligand placement.
Default value: 0.6
Reasonable range: 0.0 - 1.5
Name (type): <PHARM_INTERACTION_TOL> (floating point)
Description: In FlexX/FlexX-Pharm an interaction center does not necessarily
have to lie exactly on an interaction surface to make an interaction. This parameter
is a tolerance (Å) used in FlexX-Pharm distance calculations in the look-ahead
checks to account for the more complex rules FlexX applies in deciding whether or
not an interaction will be formed in subsequent fragment placements.
Default value: 1.0
Reasonable range: 0.0 - 2.5
Name (type): <PHARM_MAX_BONDS> (integer)
Description: This is the maximum number of rotatable bonds allowed in a path
through the molecule in a directed tweak calculation. If the number of rotatable
bonds exceeds this value, then the directed tweak look-ahead check is omitted. If
the value of this parameter is set to 0, then the directed tweak check is never used.
If it is set to -1, then the directed tweak check is always used.
Default value: 10
Reasonable range: -1 - 20
Name (type): <PHARM_MAX_LIST_LEN> (integer)
Description: This is the maximum length of the master list (see PHARM/READ output)
that is allowed in order for the constraint checks to be applied during docking.
The master list contains the list of ligand countergroup combinations that
passed the pre-docking checks against the constraints. In certain circumstances a
very long list leads to an overly long FlexX-Pharm docking time, which is undesirable
especially for screening calculations. If the length of the master list exceeds
<PHARM_MAX_LIST_LEN>, then FlexX-Pharm automatically switches to using
the post-docking filter mode for checking against the constraints. (Refer also to the
DOCKING/INFO command in section 8.3.5 above.)
Default value: 20000
Reasonable range: 5000 - 500000
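The decision rules attached to <PHARM_MAX_BONDS> and <PHARM_MAX_LIST_LEN> can be summarized in a short Python sketch. The two functions are hypothetical and only restate the rules given in the parameter descriptions above, with the documented default values. They are not taken from the FlexX-Pharm sources.

```python
def use_directed_tweak(n_rotatable_bonds, pharm_max_bonds=10):
    """Should the directed tweak look-ahead check run for this path?"""
    if pharm_max_bonds == -1:       # -1: the check is always used
        return True
    if pharm_max_bonds == 0:        # 0: the check is never used
        return False
    return n_rotatable_bonds <= pharm_max_bonds


def checking_mode(master_list_len, pharm_max_list_len=20000):
    """Choose between checking during docking and post-docking filtering."""
    if master_list_len > pharm_max_list_len:
        return "post-docking filter"
    return "during docking"


tweak_short_path = use_directed_tweak(8)
tweak_long_path = use_directed_tweak(14)
mode_small = checking_mode(1500)
mode_large = checking_mode(250000)
```

With the default values, a path with 14 rotatable bonds skips the directed tweak check, and a master list of 250000 combinations switches FlexX-Pharm to the post-docking filter mode.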
*Scoring terms (geometry.dat)
This section describes how to ensure interaction constraints can still be used when using
a customized scoring function. The discussion involves scoring terms found in the
geometry.dat file in the @scoring_parameters section. For more details about
geometry.dat please see section 11.7.
FlexX-Pharm assumes interactions between the ligand and receptor take place only if the
FlexX match energy term is not zero. If this term is not zero, an interaction exists and so
it will be possible to find the interaction constraint during docking with FlexX-Pharm. If a
scoring function is employed where the match energy term is not considered during scoring
(such as PLP scoring), FlexX-Pharm recognizes the situation and looks instead only at the
basic factor of the G_match term. So, to ensure interaction constraints can be found in
this situation, make sure that the G_match basic factor is set to be non-zero, while the two
scaling factors are set to zero. In this way, the scoring will not be influenced by the G_match
term, but FlexX-Pharm can still function fully.
Running FlexX-Pharm with particles
It is possible to use particles when using FlexX-Pharm. To place particles during docking
the <PLACE_PARTICLES> flag in the config.dat file should be set to 1 (see section 10.1).
However, there are a couple of points to note:
interaction constraints cannot be specified to lie on particles
particles cannot be defined to lie inside a spatial constraint volume
interaction constraints on the receptor may be fulfilled by an interaction from a particle
and not from the ligand itself.
8.3.7 Compatibility with other modules
From FlexX Release 2 onwards the FlexX-Pharm module can be used in combination with
the FlexE module (section 8.4). You can use the same pharmacophore constraints as for FlexX
and you need no extra step. The constraints will be automatically extended to a set of
alternative constraints for the particular ensemble structures as follows: interaction
constraints are assigned to all ensemble structures and only one of them is required. There
are two ways to define spatial constraints: if the coordinates are given directly, nothing is
changed and there will be only one resulting spatial constraint in the united protein structure.
However, if the position of the spatial constraint is given relative to two receptor atom
positions, the constraints are multiplied within each particular ensemble structure, but not
for each combination of ensemble structures (in order to keep the number of alternatives low).
So, for FlexE, the better way to define spatial constraints is to use coordinates directly.
FlexX-Pharm is currently (Release 2) not compatible with the FlexX^c module (section 8.2).
A loaded pharmacophore constraint will be ignored if you read or dock a combinatorial
library.
8.3.8 Comments
Reading and writing docking solutions
Unfortunately, the user cannot yet read and write docking solutions to or from FlexX-Pharm
(DOCK/READ, DOCK/WRITE). This is because each docking solution contains (and therefore
requires) its own individual information about the pharmacophore constraints. This
information is not yet output by the DOCK/WRITE command.
Another consequence is that docking solutions written from FlexX cannot be read into
FlexX-Pharm and vice versa.
8.4 Docking into ensembles of protein structures
FlexE is an extension of FlexX which enables protein structure variations and protein
flexibility to be taken into account during docking.
In order to run FlexE you need a special license key to activate the FlexE module!
The protein flexibility is represented by an ensemble of superimposed protein structures.
Similar parts of the structures are merged, whereas dissimilar areas are treated as separate
alternatives. We call this compact representation the united protein description. FlexE splits
up the protein structures into amino acids and further divides each amino acid into a backbone
and a side chain part. A particular protonation state or conformation of such a part is called
an instance. Instances from different ensemble structures may be recombined to create "new"
protein structures. FlexE selects suitable instances from the given set according to the
scoring function. However, FlexE cannot generate completely new instances from scratch!
Due to the recombination of instances, dependencies between instances occur. These are
caused by logical and geometrical exclusion principles resulting in the concept of
incompatibility between instances. Internally, incompatibility is represented as a graph. Valid
protein structures are independent sets of instances within this graph fulfilling certain
constraints. For more details about the underlying models and algorithms, we refer to the
respective publications [6, 7].
The ensemble of particular protein structures is defined in an ASCII file, the ensemble
description file (edf), which is quite similar to the receptor description file (rdf) for a
single protein structure. The user interface to FlexE mainly contains the additional menu
ENSEMBLE and a few further commands in the menus RECEPTOR and DOCKING. In addition there are
a few extra parameters for FlexE.
In the new ENSEMBLE menu you can load and handle an ensemble of protein structures and
build the united protein description, which is stored in a separate slot 0, whereas the
particular ensemble structures lie in slots 1-30. There is also an additional slot where the
normal FlexX receptor structure can be stored in parallel to the ensemble. In order to dock into
a particular ensemble structure or the united protein description, you must select the
structure with the command RECEPTOR/SELENS. Subsequently, you can dock into the selected
structure using the commands in the DOCKING menu (see 7.8). The standard FlexX docking
algorithms are applied for the particular ensemble structure. A modified FlexE docking
algorithm is used only for the united protein description (structure 0). FlexE computes the
docking score for each instance of the united protein description separately and selects the
set of instances with the best score for each placement. You can list the scores of the instances
with the new command DOCKING/LISTINST.
Before you start working with FlexE, we would remind you that FlexE is prototype soft-
ware. We are testing the program with a continuously growing set of proteins and lig-
ands, but we are sure that FlexE is not error-free. Please report errors or inconsistencies to
flexe-info@biosolveit.de.
FlexE is being developed as part of the RELIMO^3 project at the German National Research
Center for Information Technology (GMD), Institute for Algorithms and Scientific Computing
(SCAI).
^3 RELIMO is a German acronym for Receptor modeling and de-novo design of combinatorial
libraries. The project is funded by the German Federal Ministry for Education, Science,
Research and Technology (BMBF) and the participating industrial partners Boehringer Ingelheim
Pharma KG and Merck KGaA, Darmstadt under grant 0311 620.
8.4.1 The ensemble description file
As mentioned above, the ensemble description file (.edf) is quite similar to a receptor
description file (.rdf), which will be explained in detail in section 11.3. Here we will explain
only the differences to the .rdf file. The main differences are as follows:
The @pdb_files record allows the definition of several protein structures.
There are two additional records @ref_lig_files and @align.
All other records have an additional slot entry for the structure the rule refers to.
@pdb_files: Specifying the PDB files
The @pdb_files record specifies the .pdb files from which the ensemble structures are
read:
@pdb_files
<slot> <PDB filename>
:
With the advent of Release 2 it is also possible to use the keyword @protein_files instead
of @pdb_files and to read protein structures in MOL2 format. Ensemble structures in PDB
and MOL2 format can be mixed.
<slot> is a non-ambiguous reference to the structure in the range of 1-30. It is also called
ensemble slot. <PDB filename> is the name of the .pdb file. FlexX will look for the specified
PDB file in the PDB directory specified in the configuration (variable PDB).
A .pdb file must be given for slot 1. It is used as the reference for superimposing the
ensemble structures and clustering of instances. All other entries are optional (i.e. you can
define an ensemble of only one structure!). The entries need neither be sorted nor continuous.
If you want to exclude an ensemble structure temporarily it is sufficient to comment out the
pdb file here. All following corresponding rules will then be ignored.
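The rules for the @pdb_files record (slots in the range 1-30, slot 1 mandatory, entries unsorted and possibly with gaps) can be checked with the following Python sketch. It is an illustration, not the FlexE reader. The function name and the file names are hypothetical, and # is assumed to be the comment character, as in other FlexX data files.

```python
def parse_pdb_files(lines):
    """Collect slot -> filename entries from a @pdb_files record."""
    slots = {}
    in_record = False
    for raw in lines:
        line = raw.split("#", 1)[0].strip()     # assumed comment syntax
        if not line:
            continue
        if line.startswith("@"):
            in_record = line.lower() in ("@pdb_files", "@protein_files")
            continue
        if not in_record:
            continue
        slot_text, filename = line.split(None, 1)
        slot = int(slot_text)
        if not 1 <= slot <= 30:
            raise ValueError(f"slot {slot} is outside the range 1-30")
        if slot in slots:
            raise ValueError(f"slot {slot} is defined twice")
        slots[slot] = filename
    if 1 not in slots:
        raise ValueError("slot 1 (the reference structure) is required")
    return slots


example = [
    "@pdb_files",
    "3 structure_c.pdb",
    "1 structure_a.pdb",
    "# 2 structure_b.pdb   (temporarily excluded)",
    "@align",
    "* * * * * * * * *",
]
ensemble = parse_pdb_files(example)
```

The commented-out slot 2 is simply skipped, so this ensemble consists of the reference structure in slot 1 and one further structure in slot 3.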
@ref_lig_files: Specifying the reference ligand files
The @ref_lig_files record specifies the reference ligand files used to determine the
active site interactively, if no pocket files are given:
@ref_lig_files
<slot> <ref_lig file>
:
<slot> is the non-ambiguous reference to the structure defined in the @pdb_files record
(see above). <ref_lig_file> is the name of the reference ligand file (in mol2 format). FlexX
will look for the specified reference ligand file in the LIGAND directory specified in the
configuration. Coordinates of the reference ligand will also be transformed when the ensemble
structures are superimposed.
8.4. DOCKING INTO ENSEMBLES OF PROTEIN STRUCTURES 193
@align: Specifying the alignment
The @align record specifies which atoms of the particular ensemble structures are matched
onto which atoms of the reference structure (slot 1). This alignment is used for superimposing
the structures (if done with FlexX) and for the alignment of instances for the united
protein description:
@align
<slot> <atom> <aa> <chain> <aa_nr> <atom> <aa> <chain> <aa_nr> [<offset>]
The first five entries refer to the particular ensemble structure and the second five entries
refer to the reference structure (slot 1):
<slot> Ensemble slot (range 2-30, wildcard (*) and selection format (e.g. 1,2-5) are allowed)
<atom> Atom name in PDB format, i.e. four characters (use "_" instead of " ")
<aa> Amino acid name in three-letter code
<chain> Chain identifier (use "_" instead of " ")
<aa_nr> Amino acid number
<offset> An optional offset between the corresponding amino acid numbers of the particular
ensemble structure and the reference structure (slot 1). This currently only works
if there are no insertions or deletions.
Wildcards (*) are allowed at all positions; they refer to any atom, amino acid etc. In contrast,
the wildcard () can be used for the reference structure to refer to corresponding atom names,
amino acids etc.
Ensemble slot: Modification of all other rules
All other rules in the .edf file apply as in the .rdf file. They are explained in detail in
section 11.3. There is only one modification: each rule has an additional entry to define the
ensemble slot to which the rule refers. Wildcards (*) and selections (e.g. 1,2-5) can be used
here. The slot definition is usually the first or second entry of the rule. For the exact syntax,
please refer to the default .edf file in Appendix A.2.
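To make the selection format concrete, the following Python sketch expands a slot entry such as 1,2-5 or * into the list of slots it covers. It is an illustration only: the function name and the default range 1-30 are assumptions, and the real parser in FlexX may accept or reject inputs differently.

```python
def expand_slot_selection(selection, first=1, last=30):
    """Expand a selection such as "1,2-5" or "*" into a sorted list of slots."""
    selection = selection.strip()
    if selection == "*":
        return list(range(first, last + 1))  # wildcard: every slot
    slots = set()
    # Accept "," or blanks as separators between items.
    for item in selection.replace(",", " ").split():
        if "-" in item:
            low, high = (int(part) for part in item.split("-", 1))
            if low > high:
                raise ValueError(f"empty range: {item}")
            slots.update(range(low, high + 1))
        else:
            slots.add(int(item))
    outside = sorted(s for s in slots if not first <= s <= last)
    if outside:
        raise ValueError(f"slots outside {first}-{last}: {outside}")
    return sorted(slots)


print(expand_slot_selection("1,2-5"))
```

For the @align record, where slot 1 is the reference, the same function could be called with first=2.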
8.4.2 Handling ensembles (ENSEMBLE submenu)
READ
Syntax: READ <filename>
Description: Reads the ensemble description file <filename> from disk. The file
must be in the FlexE-specific EDF file format explained in section 8.4.1. It must have the
extension .edf. The rules set out at the beginning of section 11 apply to the
filename. The default directory for this command is the path specified in the variable
ENSEMBLE (cp. configuration).
All ensemble structures are read in and prepared in the same way as if they had been
read as separate .rdf files, i.e. the following operations are executed in this order:
loading the PDB file, selecting or loading the active site atom selection, computing
or loading the surface atom selection, adding polar hydrogens to the active site,
assigning interaction types and geometries to the active site.
WRITE
Syntax: WRITE <selection> <filename> [<hydrogens>]
Description: Writes an ensemble to the file <filename> or to a set of files. If a set of
files is written, you must give a generic <filename> that contains = and #, which
are replaced by the corresponding PDB file name (=) and the ensemble slot (#),
respectively.
Depending on <selection> you can either save the
Protein
e ensemble description file (.edf)
r receptor description files (.rdf)
f full proteins, or the
s surface atom selections
a active site atom selections
Ligand
l reference ligands
t transformed reference ligands
m minimized ligands
The .edf and .rdf files are written in their specific format. Ligands are written
in MOL2 format. Active site atom selections and the full proteins are written in PDB
format. Surface atom selections are written in a FlexX-internal format with the
extension .sdf. Although they are ASCII files, please do not edit .sdf files.
For active site and full protein files, hydrogens are written if <hydrogens> is
answered yes. Note that only hydrogens contained in the PDB file or those added
by FlexX in the active site are written. If an active site file is written for further use
in FlexX, hydrogens should not be included (for FlexX protonation rules see section
6.5.7).
If the united protein structure is written, alternative conformers are marked by
alternate location indicators. Conformers that belong to all input structures have no
such indicator. For the first nine slots the slot numbers are used as alternate location
indicators. Then capital letters are used.
The default filenames are those defined in the .edf file. Thus, you can specify
filenames for the active site and the surface in the .edf file, let FlexX compute the active
site and surface atom selection and save them with the WRITE command.
If you specify the written files in your .edf file, the surface atoms or active site
atoms are loaded the next time you load the protein. This is much faster than
recomputing the surface atoms each time you load the protein. The default directory
for this command is the path specified in the entry SITE or RECEPTOR, respectively
(cp. configuration).
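The replacement of = and # in a generic file name can be pictured with the Python sketch below. It is a hedged illustration, not the FlexX implementation: the function name is hypothetical, and it assumes that = stands for the PDB file name without directory and extension and that a template containing only one of the two placeholders is acceptable.

```python
from pathlib import Path


def expand_write_template(template, slot_files):
    """Return {slot: filename} for a generic WRITE file name.

    '=' becomes the PDB file name of the slot (here without directory and
    extension, an assumption) and '#' becomes the slot number.
    """
    if "=" not in template and "#" not in template:
        raise ValueError("a generic file name needs '=' or '#'")
    names = {}
    for slot, pdb_file in sorted(slot_files.items()):
        stem = Path(pdb_file).stem
        # Substitute character by character so that a '#' or '=' inside the
        # PDB file name itself is never replaced a second time.
        names[slot] = "".join(
            stem if ch == "=" else str(slot) if ch == "#" else ch
            for ch in template
        )
    return names


# Hypothetical ensemble with two slots.
print(expand_write_template("=_slot#.pdb", {1: "1abc.pdb", 2: "1xyz.pdb"}))
```

Including # in the template guarantees distinct output names even when two slots were read from PDB files with the same base name.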
READRDF
Syntax: READRDF <filename 1> <filename 2> ... <filename n>
Description: Reads a set of .rdf files as an ensemble. It is assumed that the alignment
of these ensemble structures can be done by a default rule that matches
everything on the corresponding amino acid name, chain identifier and amino acid
number without any offset (see @align record 8.4.1 for details).
DELETE
Syntax: DELETE
Description: Removes an ensemble from FlexX's workspace. All data associated
with the ensemble, such as the united protein description, the hash table, and
placements, will be removed automatically, too.
EDIT
Syntax: EDIT
Description: Calls the editor displaying the ensemble description file currently in
FlexX's main memory. The editor defined as (EDITOR) in your configuration
will be used.
INFOALIGN
Syntax: INFOALIGN <one_letter_code>
Description: Outputs the alignment of the amino acids of the particular ensemble
structures either as a one-letter code (default) if <one_letter_code> is set to true, or
as a three-letter code otherwise. NOTE: The alignment is matched onto the reference
structure from slot 1. Amino acids that are missing in structure 1 will not appear in
this alignment.
INFOENS
Syntax: INFOENS
Description: Displays the main characteristics of an ensemble, such as the .pdb
filenames, active site definition, loaded hetero group, surface and reference ligand
files, the probe location and whether the ensemble structure is transformed by
superimposing onto the reference structure.
For more details on particular ensemble structures, switch to the structure with the
command RECEPTOR/SELENS and use RECEPTOR/INFO.
INFORMSD
Syntax: INFORMSD <active> <part> <details>
Description: Computes the RMSD between the ensemble structures 2-30 and the
reference ensemble structure 1. If <active> is set to true, only atoms of the active
site will be considered. You can also select which <part> of the proteins is
compared: the complete structure (0), only backbone atoms (1), or only side chain atoms
(2). If <details> is set to true, the distance between each pair of corresponding
atoms will be output.
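For readers unfamiliar with the measure, the following Python sketch shows the standard RMSD formula over corresponding atom pairs, together with the per-pair distances that the <details> option refers to. It is a generic illustration under the assumption that the atom correspondence is already known and that the coordinates are used as stored, with no additional fitting; it does not reproduce FlexX's internal code.

```python
import math


def pair_distances(coords_a, coords_b):
    """Distance between each pair of corresponding atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two equally long, non-empty coordinate lists")
    return [math.dist(p, q) for p, q in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over corresponding atom pairs."""
    distances = pair_distances(coords_a, coords_b)
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Two toy "structures" with two corresponding atoms each (coordinates in Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
other = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(pair_distances(reference, other), rmsd(reference, other))
```

Restricting the comparison to backbone or side chain atoms amounts to filtering both coordinate lists in the same way before calling rmsd.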
RMSDMATRIX
Syntax: RMSDMATRIX <active> <part>
Description: Computes an RMSD matrix of all ensemble structures. If <active> is
set to true, only atoms of the active site will be considered. You can also select which
<part> of the proteins is compared: the complete structure (0), only backbone
atoms (1), or only side chain atoms (2).
SUPER
Syntax: SUPER <active> <part>
Description: Superimposes the ensemble structures 2-30 onto the reference
ensemble structure 1. If <active> is set to true, only atoms of the active site are considered.
You can also select which <part> of the proteins is superimposed: the complete
structure (0), only backbone atoms (1), or only side chain atoms (2).
Note: In order to generate the initial superposition you should overlay the
backbone of the complete protein structures: SUPER n 1
ACTIVE
Syntax: ACTIVE <radius> <complete_aa>
Description: Selects the atoms that belong to the active site. All protein atoms
which are closer than <radius> to a ligand atom of any reference ligand
of the ensemble are taken to be the set of active site atoms. If <complete_aa> is
answered with yes, the selection is extended to complete amino acids.
Note: Since the command selects all atoms around any ligand, make sure that
ensemble structures (and the reference ligands) are either far away from each other or
correctly superimposed (either externally or using the command SUPER). Otherwise
strange selections may occur.
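The selection rule can be sketched in a few lines of Python. This is an illustration of the idea only: the data layout (tuples of residue key, atom name and coordinates) and the function name are assumptions, and FlexX's own implementation is not shown here.

```python
import math


def select_active_site(protein_atoms, ligand_atoms, radius, complete_aa=False):
    """Select protein atoms closer than radius to any reference ligand atom.

    protein_atoms: list of (residue_key, atom_name, (x, y, z)) tuples.
    ligand_atoms:  list of (x, y, z) tuples pooled from all reference ligands.
    Returns a set of (residue_key, atom_name) pairs.
    """
    selected = {
        (res, name)
        for res, name, xyz in protein_atoms
        if any(math.dist(xyz, lig) < radius for lig in ligand_atoms)
    }
    if complete_aa:
        # Extend the selection to every atom of each residue that was hit.
        hit_residues = {res for res, _ in selected}
        selected = {
            (res, name) for res, name, _ in protein_atoms if res in hit_residues
        }
    return selected


# Toy data: one ligand atom near the CA of ALA 1 only.
protein = [("ALA1", "CA", (0.0, 0.0, 0.0)),
           ("ALA1", "CB", (10.0, 0.0, 0.0)),
           ("GLY2", "CA", (20.0, 0.0, 0.0))]
print(select_active_site(protein, [(1.0, 0.0, 0.0)], 2.0, complete_aa=True))
```

Because the ligand atoms of all slots are pooled, ligands that are not superimposed would each contribute their own region, which is the source of the unexpected selections mentioned in the note.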
BUILD
Syntax: BUILD
Description: Creates a united protein description from the ensemble structures.
The following operations are initiated in this order: aligning the respective instances
of the ensemble structures, clustering the aligned instances, connecting the clus-
tered instances, determining the compatibility of the instances and the connected
components in the incompatibility graph, assigning interaction types and geome-
tries to the united protein description.
Preconditions: The ensemble structures (and the reference ligands) must be super-
imposed (either externally or using the command SUPER) before the command can
be used.
Selecting admin settings for drawing the ensemble (SELADM)
Syntax: SELADM <ref_lig> <start graphics object number> <temp file>
<append>
Description: With SELADM you can specify the graphics object numbers used for
drawing the ensemble structures and you can determine whether the graphics files
are internal temporary files used only by FlexX or saved for further use. For yes/no
questions you can enter either y, yes or 1 for yes, and similarly n, no or 0 for
no.
<ref_lig> Yes/no answer:
yes The following modifications concern the settings for drawing the reference
ligands of the ensemble.
no The following modifications concern the settings for drawing the protein
structures of the ensemble.
<start graphics object number> Enter the number of a graphics object (1-255) as
the start of a range of objects into which the members of the ensemble will
be drawn. (There is no fifo mode here (see other SELADM commands for an
explanation of fifo mode).)
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
Selecting the graphics settings for drawing the ensemble (SELGRA)
Syntax: SELGRA <ref_lig> <. . . parameters. . . >
Description: With SELGRA you can set specific default values for drawing
ensemble structures. For yes/no questions you can enter either y, yes or 1 for yes, and
similarly n, no or 0 for no.
<ref_lig> Yes/no answer:
yes The following modifications concern the settings for drawing the reference
ligands of the ensemble.
no The following modifications concern the settings for drawing the protein
structures of the ensemble.
<. . . parameters. . . > For details of the remaining parameters see either
RECEPTOR/SELGRA (7.6.13) or LIGAND/SELGRA (7.5.16) depending on
whether the modifications are for the protein structures or the reference ligands.
Selecting the coloring for drawing the ensemble (SELCOL)
Syntax: SELCOL <ref_lig> <. . . color modes. . . >
Description: With SELCOL you can set the color modes for the ensemble
structures. For example, for the reference ligands you can select color modes for, among
other things, the molecules and molecular surfaces, and for the protein structures
you can select color modes for the backbone. For yes/no questions you can enter
either y, yes or 1 for yes, and similarly n, no or 0 for no.
<ref_lig> Yes/no answer:
yes The following modifications concern the settings for drawing the reference
ligands of the ensemble.
no The following modifications concern the settings for drawing the protein
structures of the ensemble.
<. . . color modes. . . > For details of the color modes see either RECEPTOR/SELCOL
(7.6.14) or LIGAND/SELCOL (7.5.17) depending on whether the modifications
are for the protein structures or the reference ligands.
In addition, however, you will notice one extra color mode that is not available
for the RECEPTOR/SELCOL and LIGAND/SELCOL commands. This color mode is
exclusive to ensemble drawing and is available for drawing the reference ligand
structures, protein structures and protein backbone:
POLYCOL Color the ensemble structures according to the ensemble slot. Each slot
is assigned a different color and everything belonging in one ensemble slot will
be drawn in that color.
Selecting the labels for drawing the ensemble (SELLAB)
Syntax: SELLAB <ref_lig> <. . . parameters. . . >
Description: When the ensemble structures are drawn, FlexE stores information in
labels for display in the graphic interface. You can choose what should appear in
the label using the SELLAB command. For yes/no questions you can enter either
y, yes or 1 for yes, and similarly n, no or 0 for no.
<ref_lig> Yes/no answer:
yes The following modifications concern the settings for drawing the reference
ligands of the ensemble.
no The following modifications concern the settings for drawing the protein
structures of the ensemble.
<. . . parameters. . . > For details of the remaining parameters see either
RECEPTOR/SELLAB (7.6.15) or LIGAND/SELLAB (7.5.18) depending on
whether the modifications are for the protein structures or the reference ligands.
(Note: charges are not available for labeling the reference ligand structures.)
Drawing the ensemble (DRAW)
Syntax: DRAW <ens_selection> [<filename>]
Description: DRAW generates a drawing of the ensemble and sends it to a file ready
to be displayed in the graphics interface. For details about what exactly is drawn
see the SELGRA command.
<ens_selection> Selection of the ensemble structures to be drawn (structure in
slot 0 is the united protein description). The selection is a list of integers or
integer ranges (format a-b) separated by "," or blanks; the integers represent
the ensemble slots.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: Drawings are not displayed automatically. Use DISPLAY to
output the drawing on the graphics device.
GRAINF
Syntax: GRAINF <ref_lig> ...
Description: Lists all current graphic settings for the protein structures (0) or the
reference ligands (1) depending on <ref_lig>. See RECEPTOR/GRAINF (7.6.17) and
LIGAND/GRAINF (7.5.21) for details.
8.4.3 Drawing the incompatibility graph (ENSEMBLE/GRAPH submenu)
Selecting admin settings for drawing the incompatibility graph (SELADM)
Syntax: SELADM <graphics object number> [<start fifo object>] [<end fifo
object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing the incompatibility graph and you can determine whether the graphics files
are internal temporary files used only by FlexX or saved for further use. For yes/no
questions you can enter either y, yes or 1 for yes, and similarly n, no or 0 for
no.
<graphics object number> Enter integer:
(1-255) The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 fifo mode: the graphics generated by subsequent DRAW commands will be
sent to a range of graphics objects in a first-drawn-first-overwrite manner.
You will be asked to enter two more parameters:
<start fifo object> The start graphics object for the fifo range.
<end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
Selecting graphics settings for drawing the incompatibility graph (SELGRA)
Syntax: SELGRA <all components> <compatibility edges> <size>
Description: With SELGRA you can set specific default values for drawing the
incompatibility graph. For yes/no questions you can enter either y, yes or 1 for
yes, and similarly n, no or 0 for no.
<all components> Yes/no answer:
yes All connected components of the incompatibility graph are drawn.
no Only connected components from a selected list are drawn. The available
components are those in the active site only. You will be asked to make a
selection of components:
<component selection> Choose a list of components represented by
integers. Enter the list as separate integers or as integer ranges (format
a-b) separated by "," or blanks. Note that you need to enclose the
expression in quotation marks if it contains blanks, e.g. "1, 2, 4, 7-9"
<compatibility edges> Yes/no answer:
yes Edges (lines) between compatible graph nodes (within connected
components) are drawn.
no Edges (lines) between incompatible nodes are drawn.
<size> Enter the radius (in Ångstroms) which will determine the size of the drawn
nodes. The size only has an effect if the ball and stick representation for the
incompatibility graph is selected in FlexV.
Selecting coloring for drawing the incompatibility graph (SELCOL)
Syntax: SELCOL <instance color mode> <edge color mode>
Description: With SELCOL you can set the color modes for the instances and the
connecting edges in the incompatibility graph. For each of these, a selection of color
modes is available:
<instance color mode> Choose the color mode for drawing the instances. Color
mode selection:
UNIQUE
COMPONENT
<edge color mode> Choose the color mode for drawing the edges. Color mode
selection:
UNIQUE
COMPONENT
The possible color modes are explained below. For some color modes you are also
asked to enter some defining colors. Enter your chosen color as either an angle from
the color circle (0-360 degrees: 0 is invisible, 1-360 runs from dark blue, through
red, yellow, green to blue), a color name (as defined in the GRAPHIC static data file),
or an RGB(A) value: 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
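The three accepted notations can be told apart mechanically, as the following Python sketch suggests. It is an illustration, not FlexX's parser: the function name and the returned tags are hypothetical, and no check is made against the color names actually defined in the GRAPHIC static data file.

```python
def parse_color(text):
    """Classify a color as ('angle', v), ('rgba', (...)) or ('name', str)."""
    # Blanks and slashes are both accepted as separators for RGB(A) values.
    parts = text.replace("/", " ").split()
    try:
        numbers = [float(part) for part in parts]
    except ValueError:
        return ("name", text.strip())  # not purely numeric: treat as a name
    if len(numbers) == 1 and 0 <= numbers[0] <= 360:
        return ("angle", numbers[0])  # position on the color circle
    if len(numbers) in (3, 4):
        return ("rgba", tuple(numbers))  # RGB or RGBA components
    raise ValueError(f"cannot interpret color: {text!r}")


for spec in ["dark green", "green", "0.0 0.8 0.1", "0.0/0.8/0.1", "220"]:
    print(spec, "->", parse_color(spec))
```

A single number is always read as an angle in this sketch, which is why RGB(A) values need all three or four components.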
UNIQUE The object will be drawn in one user-defined color. You will be asked to
choose the unique color:
<color> Enter your chosen color.
COMPONENT Each connected component is drawn in a different color.
Selecting the labels for drawing the incompatibility graph (SELLAB)
Syntax: SELLAB <component ID> <instance ID> <info>
Description: When the incompatibility graph is drawn, FlexX stores information
about the instances for display in the graphics interface. You can choose what
should appear in the label using the SELLAB command. For yes/no questions you
can enter either y, yes or 1 for yes, and similarly n, no or 0 for no.
<component ID> Yes/no answer:
yes Include the component ID in the label; the format is C<ID> (see below).
no Do not include the component ID in the label.
<instance ID> Yes/no answer:
yes Include the instance ID in the label; the format is I<ID> (see below).
no Do not include the instance ID in the label.
<info> Yes/no answer:
yes Include all information about the instance ID in the label. This setting
overrides the answers given to the above two questions. The information
included in the label takes the following order and format:
I<ID> Instance ID
<aa_name> Amino acid name
<chain_ID> Chain ID
<aa_number> Amino acid number
<ref_slot> Ensemble slot from which the conformer originates
<cur_slot> Current slot
<type> Type of conformer: BB = BackBone; SC = SideChain
<active> Part of active site: a = active site; n = NOT within active
site
<tmpl_nr> Template number from EDF file
<tors_code> Code for torsion type
C<ID> Component ID
no Do not include the complete instance information in the label; the answers
to the above two questions determine the label text.
Note If you are working with FlexV as the graphical interface, you may notice
problems when trying to click on overlapping objects in the picture, i.e. when
you try to click on the double nodes (i.e. nodes that are very close to each
other or rather those that are connected by incompatibility edges) to label them
you have to click a little way away in the background. If you click directly on
the double nodes you cannot label them.
Drawing the incompatibility graph (DRAW)
Syntax: DRAW [<filename>]
Description: DRAW generates a drawing of the incompatibility graph and sends it
to a file ready for display in the graphical interface. For details about what exactly is
drawn see the SELGRA command.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: Drawings are not displayed automatically. Use DISPLAY to
output the drawing on the graphics device.
GRAINF
Syntax: GRAINF
Description: Outputs a list of all current graphic settings (the graphic context) for
the incompatibility graph.
8.4.4 Generating ensembles (ENSEMBLE/GENRDF submenu)
All commands in this menu are in a very experimental state!
The GENRDF submenu will allow PDB files to be read directly without any manually
defined receptor or ensemble description file. The basic idea is to have a predefined generic
ensemble description file that contains all possible @template and @h_torsion entries and
takes into account different alternate locations in the PDB file.
Three different generic EDF files are provided in the static data directory. They are called
generic<level>.edf. The higher the <level>, the more parts of the protein structure are
varied. Please note that whenever you use such generic .edf files, you need to change the
AMINO and CHARGES entries in the configuration (see pages 312 and 267, resp.). The
respective counterparts to amino.dat and pcharges.dat for use with the GENRDF command are
amino_gen.dat and pcharges_gen.dat.
You can dock directly into this generic ensemble with FlexE or generate a single RDF
description of the most suitable @template and @h_torsion entries with respect to a reference
ligand in order to dock with FlexX. The optimized protein structure is written to slot 31,
which is the slot for the normal FlexX receptor structure. You can apply FlexX directly
in slot 31 or write out the resulting .rdf by switching to slot 31 and using the command
RECEPTOR/WRITE.
Currently, two algorithms are implemented for selecting the most suitable @template and
@h_torsion entries: GENRDF1 and GENRDF2. Both optimize receptor description files by
selecting the best amino acid template with respect to the scoring of the reference ligand.
GENRDF1 does this locally for each amino acid independently. GENRDF2 tries to find a
global optimum using the FlexE approach.
READPDB
Syntax: READPDB <pdb_filename> <lig_filename> <level> <radius>
<complete_aa>
Description: Reads a PDB file directly without a manually defined receptor or
ensemble description file. The ensemble information is stored in special .edf files
called generic<level>.edf. The higher the <level>, the more parts of the protein
structure are varied (see generic<level>.edf in static data for details). The active
site is determined automatically with the given <radius>. Complete amino acids
are selected if <complete_aa> is true.
Preconditions: The reference ligand must be given in MOL2 format. The name
of the file must be <pdb_filename>_cryst.mol2. If hetero groups are present, they
must all be given in a multi-mol2 file called <pdb_filename>_hf.mol2.
GENRDF1
Syntax: GENRDF1 <radius> <complete_aa>
Description: Generates an optimized receptor description file by locally selecting
the best amino acid template with respect to the scoring function. The optimized
protein structure is written to slot 31, which is the slot for the normal FlexX receptor
structure. The resulting .rdf can be written by switching to slot 31 and using the
command RECEPTOR/WRITE (7.6.5).
Preconditions: A reference ligand must be loaded (LIGAND/READ, 7.5.1,
LIGAND/READREF, 7.5.11).
GENRDF2
Syntax: GENRDF2 <radius> <complete_aa>
Description: Generates an optimized receptor description file by globally selecting
the best amino acid template with respect to the scoring function using the FlexE
scoring mechanism. The optimized protein structure is written to slot 31, which is
the slot for the normal FlexX receptor structure. The resulting .rdf can be written by
switching to slot 31 and using the command RECEPTOR/WRITE (7.6.5).
Preconditions: A reference ligand must be loaded (LIGAND/READ, 7.5.1,
LIGAND/READREF, 7.5.11).
INFOGEN
Syntax: INFOGEN
Description: Prints information about the generic ensemble. For each amino
acid/hetero atom and slot, the chosen template, torsion, alternate location and
assignment are shown.
EDITGEN
Syntax: EDITGEN <level>
Description: Calls the editor displaying the generic ensemble description file of
<level>. The editor defined as (EDITOR) in your configuration will be used.
CHECKPDB
Syntax: CHECKPDB
Description: Not yet implemented!
8.4.5 Additional receptor commands (RECEPTOR submenu)
SELENS
Syntax: SELENS <selection> [<overwrite>]
Description: Selects an ensemble structure as the current receptor. The structure in
slot 0 is the united protein description and structure 31 is an additional slot where
the normal FlexX receptor structure can be stored in parallel to the ensemble. The
additional information of the current receptor, e.g. the hash table, placements etc.,
will be removed automatically. If you leave slot 31, the normal FlexX receptor structure
will be forgotten. In this case FlexX will first ask whether to overwrite the structure
information. Answer <overwrite> with y to overwrite the structure information.
CLUSTERIA
Syntax: CLUSTERIA
Description: Clusters the interaction points. The result is a smaller hash table.
8.4.6 Additional docking commands (DOCKING submenu)
LISTINST
Syntax: LISTINST <table length>
Description: Displays a table of all instances that are touched by particular solu-
tions. The table has the following columns:
No. (SOL_NO) The number of the solution.
Inst. AA (INST_AA) Instance amino acid.
Inst. CH (INST_CHAIN) Instance chain identifier.
AA Nr. (INST_AA_NR) Number of the instance amino acid.
Inst. part (INST_PART) Backbone (BB) or side chain (SC) part.
ENS. slot (INST_SLOT) Slot that instance originally stems from.
Comp. ID (INST_COMP_ID) ID of connected component.
Comp. type (INST_COMP_TYP) Type of connected component (0: none, 1: single
(only one instance), 2: clique, 3: large).
Sel. inst. (INST_SELECT) Instance selected for docking (yes: 1).
Total Score (INST_E_TOTAL) Total score of the instance.
Match Score (INST_E_MATCH) Contribution of the matched interacting groups.
Lipo Score (INST_E_LIPO) Contribution of the lipophilic contact area.
Ambig Score (INST_E_AMBIG) Contribution of the lipophilic-hydrophilic
(ambiguous) contact area.
Clash Score (INST_E_CLASH) Contribution of the clash penalty.
PLP Score (INST_E_PLP) PLP (piecewise linear potential) atom score.
FFRL Score (INST_E_FFRL) Force field receptor-ligand score.
nof atoms (INST_NOFATM) Number of overlapping atoms (for a description of
the overlap test, see 11.4.1).
Ovlp. vol. (INST_VOLUME) Total protein/ligand overlap volume (sum) (for a
description of the overlap test, see 11.4.1).
Avg. vol. (INST_AVGVOL) Average volume of protein/ligand overlap (for a
description of the overlap test, see 11.4.1).
Max. vol. (INST_MAXVOL) Maximum volume of protein/ligand overlap (for a
description of the overlap test, see 11.4.1).
8.4.7 FlexE-specific parameters
Preparing the united protein description
Name: <INST_EXT_ACT_RADIUS> (floating point)
Description: Extra radius for extended active site. Instances within this radius
around the active site are considered while docking. Placements that interact with
instances further away will be rejected.
Default value: 2.0
Reasonable range: 0.0 - 20.0
Name: <INST_CLUSTER_DELTA> (floating point)
Description: Threshold for clustering instances.
Default value: 0.5
Reasonable range: 0.0 - 2.5
Name: <INST_OVERLAP_VOL> (floating point)
Description: Maximum overlap volume between two instances. Instances which
have a larger overlap volume are geometrically incompatible (for a description of
the overlap test, see 11.4.1).
Default value: 6.0 Å³
Reasonable range: 0.0 - 15.0 Å³
Name: <INST_BOND_TOLERANCE> (floating point)
Description: Atom bond length tolerance between two instances. If the difference
to the theoretical bond length is larger, the instances are not connected.
Default value: 1.0
Reasonable range: 0.0 - 2.5; should correspond to INST_CLUSTER_DELTA
Name: <INST_TORSION_TOLERANCE> (floating point)
Description: Torsion angle tolerance between two instances. If the difference to
the theoretical torsion angle (180.0°) of the peptide bond is larger, the instances are
not connected.
Default value: 30.0
Reasonable range: 0.0 - 60.0
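How the two tolerances could act together is sketched below in Python. This is a reading of the parameter descriptions, not FlexE source code: the function names are hypothetical, the comparison is assumed to be inclusive, and the ideal bond length of 1.33 Å used in the example is the textbook peptide C-N bond length rather than a value taken from this manual.

```python
def torsion_deviation(angle_deg, ideal_deg=180.0):
    """Smallest absolute difference between two angles, in degrees.

    The modulo handles the wrap-around, so -175 and 180 differ by 5 degrees.
    """
    diff = (angle_deg - ideal_deg) % 360.0
    return min(diff, 360.0 - diff)


def instances_connectable(bond_length, ideal_length, torsion_deg,
                          bond_tolerance=1.0, torsion_tolerance=30.0):
    """Apply INST_BOND_TOLERANCE and INST_TORSION_TOLERANCE style checks.

    The defaults mirror the documented default values of the two parameters.
    """
    bond_ok = abs(bond_length - ideal_length) <= bond_tolerance
    torsion_ok = torsion_deviation(torsion_deg) <= torsion_tolerance
    return bond_ok and torsion_ok


# Peptide bond between two instances: length 1.40 Å, omega torsion -170 degrees.
print(instances_connectable(1.40, 1.33, -170.0))
```

The wrap-around in torsion_deviation matters because torsion angles near the trans value are reported on either side of the +180/-180 boundary.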
Name: <RESTRICT_COMP_TO_STRUCTURE> (flag)
Description: If this is set to 1, only conformers of the same ensemble structure are
compatible. NOTE: This leads to one single connected component in the
incompatibility graph, so the search algorithm is very inefficient. The conformers can still
be clustered. Clustered conformers can then be used in all structures belonging to
the cluster. However, clustering makes clashes between conformers
more likely, which may lead to conflicts. This flag is therefore best used without
clustering conformers (i.e. INST_CLUSTER_DELTA = 0.0).
Default value: 0
Reasonable range: 0,1
Name: <INDIRECT_INCOMP> (flag)
Description: If this is set to 1, the indirect incompatibility between instances, i.e.
the incompatibility that results from indirect dependencies between instances, is
computed exhaustively beforehand. Since this new computation is still experimental,
it can be switched off for debugging, but we recommend leaving it switched
on.
Default value: 1
Reasonable range: 0,1
Docking
Name: <TAKE_FIRST_SET> (Boolean)
Description: If the parameter is true, FlexE takes the first independent set when
selecting instances; otherwise it searches through the whole search space, which
may take a considerable amount of time!
Default value: 1
Reasonable range: 0,1
Generic ensemble
Name: <ENS_CLASH_HYDROGEN> (floating point)
Description: Two hydrogens clash if they are closer to each other than this
threshold.
Default value: 1.8
Reasonable range: 1.0 - 2.5
Name: <ENS_CLASH_METAL> (floating point)
Description: A hydrogen clashes with a metal ion if they are closer to each other
than this threshold.
Default value: 1.5
Reasonable range: 1.0 - 2.5
Name: <ENS_CLASH_ADD> (floating point)
Description: Additional search radius to find heavy atoms that may have clashing
hydrogens.
Default value: 1.5
Reasonable range: 1.0 - 2.5
Clustering interaction points
Name: <IA_CLUSTER_DELTA> (floating point)
Description: Threshold for clustering of interaction points.
Default value: 0.5
Reasonable range: 0.1 - 1.5
8.4.8 Compatibility with other modules
Since FlexX Release 2 the FlexE module can be used in combination with the FlexX-Pharm
module (section 8.3) and with the FlexX^c module (section 8.2).
You can use the same pharmacophore constraints as for FlexX and you need no extra steps.
The interaction constraints will be automatically extended to a set of alternative constraints
for the particular ensemble structures. For more details see the FlexX-Pharm section (8.3).
A combinatorial library can be docked simply by loading the library in the CLIB menu
(section 8.2) and switching to the united protein structure before docking in the CDOCK
menu.
208 CHAPTER 8. ADDITIONAL MODULES FOR FLEXX
8.5 Lattice energies and grids
The rationale:
It may be useful to assess energies or other data from programs which write such data based
on grids. Field data can currently be read in from two sources: either data in
ACNT format, e.g. a DRUGSCORE energy field, or data in GRID format. To assess
coordinates other than the grid points themselves, the energy values of the respective
lattice points are smeared out by a Gaussian approximation scheme. The result will
therefore be a Gauss-approximated energy or data field.
8.5.1 Depth of the lattice points
As an alternative definition to standard energies one may think of a depth within a
protein. It may often hold that the more buried a ligand is, the better. To assess a depth
value, the points on an energy lattice as described above can help. In a first step, these
points can be labeled with distinct values for their depth as described in [27]. The possible
depth values lie in the range 0-14 (8 corners of a cube plus 6 centers of the faces):
a value of 0 means that the grid point is outside the protein. In contrast, a lattice point
of depth 14 is deeply buried in the cavity of an active site. Following the rationale that
energy values are negative, only the points with negative energy values (i.e. positive
depth values) will be labeled, because they are the ones that lie inside the protein (see
also the definition of ACNT and GRID lattices).
To ease the handling of such points and to reduce the amount of data to be processed, there
is a threshold for further processing: only lattice points with a depth value equal to or
greater than DEPTH_OF_LATTICE_POINTS (see section 8.5.6) will be considered. This affects
the COMPGA and DRAW commands.
8.5.2 Gauss function
Next, the energy values of the lattice as computed above will be approximated with
Gaussian functions (see command COMPGA). A Gaussian g is represented by
$$ g(x) = h_g \exp\left( -b_g \, |x - c_g|^2 \right) \tag{8.1} $$
where $h_g$, $b_g$ and $c_g$ are the height, width and the center of the Gaussian g. The
height $h_g$ will be deduced from the energy values of the lattice points. The width $b_g$
is taken from the parameter <DEFAULT_GAUSS_WIDTH> (see section 8.5.6). The center $c_g$
will be deduced from the coordinates of the lattice points.
It may be useful to compute the overlap volume between a deduced Gaussian and docked
ligand atoms. To this end, each ligand atom is assigned a generic Gaussian that is the
same for all ligand atoms. The height of this generic Gaussian is 10.0, and the width b is
taken from the parameter <DEFAULT_GAUSS_WIDTH>. Finally, the centers are again at
the coordinates of the respective ligand atoms.
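As an illustration of equation (8.1), the following Python lines evaluate a single Gaussian at an arbitrary point. This is only a sketch: the function name is hypothetical, the exponent is assumed to carry the usual negative sign, and the height and center are example values in the style of the LISTG sample output in section 8.5.3.

```python
import math

DEFAULT_GAUSS_WIDTH = 1.2  # default of <DEFAULT_GAUSS_WIDTH> (section 8.5.6)


def gauss_value(x, center, height, width=DEFAULT_GAUSS_WIDTH):
    """Evaluate g(x) = h_g * exp(-b_g * |x - c_g|^2), see equation (8.1)."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return height * math.exp(-width * dist_sq)


# Example values: a lattice point with energy -39.05 becomes the Gaussian center.
center = (-2.39, 34.19, 15.86)
print(gauss_value(center, center, -39.05))                 # value at the center
print(gauss_value((-1.39, 34.19, 15.86), center, -39.05))  # one unit away
```

At the center the function returns the height itself; away from the center the magnitude decays with the squared distance scaled by the width parameter.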
The overlap between a deduced Gauss function g and the Gauss function a representing the
atoms can now be calculated as:
$$ O(g, a) = h_g h_a \left( \frac{\pi}{b_g + b_a} \right)^{3/2} \exp\left( -\frac{b_g b_a}{b_g + b_a} \, d_{g,a}^2 \right) = 10.0 \, h_g \left( \frac{\pi}{2 b_g} \right)^{3/2} \exp\left( -\frac{b_g}{2} \, d_{g,a}^2 \right), \tag{8.2} $$
where $d_{g,a}$ is the distance between the deduced Gauss center and the ligand atom. The
overlap between the deduced Gaussians and an entire ligand will be calculated by
$$ \sum_{g_i} \sum_{a_j} O(g_i, a_j) \tag{8.3} $$
where $g_i$ are the deduced Gaussians and $a_j$ are the generic atom Gaussians. The commands
to perform all these calculations are described below.
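The overlap of one deduced Gaussian with one atom Gaussian, and the summed overlap with a whole ligand, can be sketched in Python as follows. The function names and the data layout are assumptions made for illustration. The sketch uses the standard overlap integral of two Gaussians (prefactor with pi) and the magnitude of the height, because the LISTG sample output in section 8.5.3 lists positive max.overlap values for negative energies; both points are inferences, not a statement about the FlexX implementation.

```python
import math

ATOM_HEIGHT = 10.0  # height of the generic atom Gaussian (section 8.5.2)
GAUSS_WIDTH = 1.2   # default of <DEFAULT_GAUSS_WIDTH> (section 8.5.6)


def overlap(h_g, center_g, atom_xyz, b_g=GAUSS_WIDTH, b_a=GAUSS_WIDTH,
            h_a=ATOM_HEIGHT):
    """Overlap O(g, a) of a deduced Gaussian with one generic atom Gaussian."""
    d_sq = sum((p - q) ** 2 for p, q in zip(center_g, atom_xyz))
    # Assumption: the magnitude of h_g is used (see the lead-in above).
    prefactor = abs(h_g) * h_a * (math.pi / (b_g + b_a)) ** 1.5
    return prefactor * math.exp(-b_g * b_a / (b_g + b_a) * d_sq)


def total_overlap(gaussians, atoms):
    """Sum O(g_i, a_j) over all (height, center) Gaussians and atom positions."""
    return sum(overlap(h, c, a) for h, c in gaussians for a in atoms)


center = (-2.39, 34.19, 15.86)
print(round(overlap(-39.05, center, center), 2))  # maximum overlap, d = 0
```

For a height of -39.05 and zero distance this gives a value close to the max.overlap of 584.77 listed for the first Gaussian in the LISTG sample output; the small difference is consistent with the energy being rounded to two decimals there.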
8.5.3 Working with grid-based energies (the RECEPTOR/GAUSS submenu)
ACNT
Syntax: ACNT <filename> <energy_type>
Description: Reads in an energy lattice from a file. The ASCII file <filename>
must be in ACNT file format. <energy_type> specifies the type of the energy lattice
(actually this is only a name for the lattice that is used within FlexX as a unique
reference). If you choose % for <energy_type>, the type will be taken from the file.
ACNT generates statistical output (see the example below).
Important notes: Make sure that the origin of both your lattice data and
protein/ligand coordinates match! (Visualize them in FlexV.)
Requirements: A receptor must be loaded.
Here is a sample output:
Example
Minimum interaction energy: -37441.047
Minimum interaction energy in the active site (min depth 10): -37441.047
Total number of grid points: 6480
Nof grid points with pos values: 4106
Nof grid points with neg values: 2306
>> table: min and max of the positive/negative values
grid points| min neg val | max neg val | min pos val | max pos vag |
all | -3.744e+04 | -1.100e+01 | 4.147e+01 | 1.403e+05 |
depth: 10 | -3.744e+04 | -2.800e+01 | --- | --- |
table: nof grid points with negative values between 0% and 100% of min neg val (all)
GRID | <10% | <20% | <30% | <40% | <50% | <60% | <70% | <80% | <90% | <100% | total
all | 480 | 478 | 427 | 300 | 235 | 186 | 103 | 44 | 39 | 14 | 2306
depth: 10 | 265 | 299 | 288 | 194 | 160 | 110 | 61 | 26 | 24 | 7 | 1434
GRID
Syntax: GRID <filename> <grid type>
Description: Reads in a GRID energy lattice from a file. The ASCII file <filename>
must be in GRID file format. <grid type> specifies the type of the energy lattice. As
above, this is only a name for the lattice to be used within FlexX as a unique reference.
GRID generates statistical output (see example).
Important notes: Make sure that the origin of both your lattice data and
protein/ligand coordinates match! (Visualize them in FlexV.)
Requirements: A receptor must be loaded.
SCALE
Syntax: SCALE <factor>
Description: Scales the values of the energy lattice with <factor>.
Requirements: An energy lattice must be loaded with ACNT or GRID.
COMPGA
Syntax: COMPGA <percent>
Description: Approximates all negative values of the energy lattice by a Gaussian
broadening. The approximation will be performed with respect to the depth label
of the lattice points. Only points exhibiting depth labels greater than or equal to
DEPTH_OF_LATTICE_POINTS will be considered.
If min is the minimum energy value with respect to DEPTH_OF_LATTICE_POINTS, then
only energy values in the range of [min; percent · min] will be approximated with
Gaussians. The range of <percent> is [0.0; 1.0].
In the end, the procedure yields for an energy $e_i$ at a grid point with coordinates $c_i$
a value of
$$ e_i = \sum_{g_j} g_j(c_i), $$
where $g_j$ are the deduced Gaussians (see (8.1) in section 8.5.2).
The type of the Gaussian matches the type of the energy lattice.
Important notes: The original energy lattice will not be destroyed by the approxi-
mation with Gaussians.
Requirements: An energy lattice must be loaded with ACNT or GRID.
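The selection rule used by COMPGA (depth threshold plus energy window) can be sketched as follows. The data layout, a list of (energy, depth) pairs, and the function name are hypothetical; the sketch only mirrors the rule as described above and makes no claim about how FlexX stores its lattice.

```python
DEPTH_OF_LATTICE_POINTS = 8  # default value from section 8.5.6


def select_for_approximation(points, percent,
                             min_depth=DEPTH_OF_LATTICE_POINTS):
    """Return the (energy, depth) pairs inside the window [min; percent * min]."""
    if not 0.0 <= percent <= 1.0:
        raise ValueError("percent must lie in [0.0; 1.0]")
    # Only negative energies at sufficiently deep points are considered.
    deep = [(e, d) for e, d in points if e < 0.0 and d >= min_depth]
    if not deep:
        return []
    e_min = min(e for e, _ in deep)
    upper = percent * e_min  # e_min is negative, so upper >= e_min
    return [(e, d) for e, d in deep if e_min <= e <= upper]


# Hypothetical lattice: (energy, depth) per grid point.
lattice = [(-40.0, 10), (-30.0, 12), (-10.0, 9), (-35.0, 5), (5.0, 14)]
print(select_for_approximation(lattice, 0.5))
```

In this example the point with energy -35.0 is dropped because its depth is below the threshold, and -10.0 is dropped because it lies outside the window [-40.0; -20.0]. A <percent> of 1.0 keeps only the minimum itself, while 0.0 keeps all negative values at sufficient depth.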
READG
Syntax: READG <filename>
Description: Reads Gaussians from a multi-mol2 file <filename>.
Important notes: If <filename> does not end in ".gauss2", this suffix will be
added automatically.
WRITEG
Syntax: WRITEG <filename>
Description: Writes all Gaussians of all types to the multi-mol2 file <filename>.
Important notes: If <filename> does not end in ".gauss2", this suffix will be
added automatically.
Requirements: Gaussians must have been computed with COMPGA or loaded with
READG.
DELETEG
Syntax: DELETEG <type_id>
Description: Deletes all Gaussians of type <type_id>.
Requirements: Gaussians must have been computed with COMPGA or loaded with
READG.
LISTG
Syntax: LISTG
Description: Lists the types of Gaussians (see example):
Requirements: Gaussians must be computed with COMPGA or loaded with READG.
Here is a sample output:
Example
>> List of gaussians: 12 types
Type id | type | g_id | energy | max.overlap | coord
---------+-------+------+--------+-------------+---------------------
1 | C_2 | 1 | -39.05 | 584.77 | -2.39 34.19 15.86
1 | C_2 | 2 | -36.32 | 543.88 | -5.19 38.39 15.16
1 | C_2 | 3 | -34.96 | 523.61 | -4.49 36.99 15.86
2 | C_3 | 1 | -39.30 | 588.57 | -2.39 34.19 15.86
2 | C_3 | 2 | -38.32 | 573.92 | -5.19 38.39 15.16
2 | C_3 | 3 | -36.21 | 542.30 | -6.59 39.79 15.16
3 | C_ar | 1 | -39.91 | 597.76 | -2.39 34.19 15.86
3 | C_ar | 2 | -38.71 | 579.67 | -4.49 36.99 15.86
3 | C_ar | 3 | -35.27 | 528.23 | -5.89 39.09 15.86
4 | C_cat | 1 | -33.97 | 508.71 | -5.89 39.09 15.86
5 | N_3 | 1 | -34.73 | 520.18 | -5.89 39.79 15.16
5 | N_3 | 2 | -32.22 | 482.47 | -2.39 33.49 16.56
6 | N_am | 1 | -33.46 | 501.06 | -2.39 34.19 15.86
6 | N_am | 2 | -30.22 | 452.65 | -5.89 39.79 15.86
7 | N_ar | 1 | -32.08 | 480.43 | -4.49 36.99 15.86
7 | N_ar | 2 | -31.40 | 470.20 | -1.69 34.19 15.86
7 | N_ar | 3 | -28.77 | 430.95 | -5.89 37.69 16.56
8 | N_pl3 | 1 | -32.16 | 481.72 | -6.59 39.79 15.16
8 | N_pl3 | 2 | -31.83 | 476.66 | -1.69 34.19 15.86
8 | N_pl3 | 3 | -28.02 | 419.71 | -3.09 35.59 15.86
9 | O_2 | 1 | -25.80 | 386.44 | -3.09 34.19 14.46
9 | O_2 | 2 | -21.77 | 325.97 | -1.69 33.49 15.16
9 | O_2 | 3 | -21.64 | 324.16 | -6.59 39.79 14.46
10 | O_3 | 1 | -36.43 | 545.58 | -1.69 34.19 17.26
10 | O_3 | 2 | -33.09 | 495.62 | -1.69 33.49 15.16
10 | O_3 | 3 | -32.27 | 483.28 | -6.59 39.79 14.46
11 | O_co2 | 1 | -33.98 | 508.84 | -1.69 33.49 15.16
11 | O_co2 | 2 | -28.53 | 427.29 | -1.69 34.19 16.56
11 | O_co2 | 3 | -27.21 | 407.54 | -3.09 34.19 14.46
12 | S_3 | 1 | -35.16 | 526.51 | -3.09 35.59 15.86
Legend:
Type id      Unique identifier for a type of Gaussians.
type         Lattice/type name (see <energy_type> for ACNT or GRID).
g_id         Unique identifier for the itemized Gaussians of one type.
energy       The height/amplitude of the Gaussian: $h_g$ (see (8.1) in section 8.5.2).
max.overlap  The overlap between the deduced Gaussian g and a generic atom Gaussian a:
             $O(g, a)$ with $d_{g,a} = 0$ (see (8.2) in section 8.5.2).
coord        The coordinates of the Gauss center $c_g$ (see (8.1) in section 8.5.2).
Selecting admin settings for drawing the energy lattice and Gaussians (SELADM)
Syntax: SELADM <graphics object number> [<start fifo object>] [<end fifo
object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing energy lattices and Gaussians, and you can determine whether the graphics
files are internal temporary files used only by FlexX or are saved for further use. For
yes/no questions you can enter either y, yes or 1 for yes, and similarly n, no
or 0 for no.
<graphics object number> Enter integer:
(1-255) The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 fifo mode: the graphics generated by subsequent DRAW commands will be
sent to a range of graphics objects in a first-drawn-first-overwritten manner.
You will be asked to enter two more parameters:
<start fifo object> The start graphics object for the fifo range.
<end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for an example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
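The fifo mode of SELADM described above can be pictured as a cyclic walk over the chosen range of graphics object numbers. The Python lines below are a minimal sketch of that idea; the function name is hypothetical, and the wrap-around behavior is an assumption derived from the description of the mode, not from the FlexX sources.

```python
from itertools import cycle, islice


def fifo_object_numbers(start, end):
    """Return an endless iterator over graphics objects start..end."""
    if not 1 <= start <= end <= 255:
        raise ValueError("expected 1 <= start <= end <= 255")
    return cycle(range(start, end + 1))


# Five successive DRAW commands with a fifo range of 10..12:
objects = fifo_object_numbers(10, 12)
print(list(islice(objects, 5)))  # [10, 11, 12, 10, 11]
```

The fourth drawing reuses object 10, i.e. the object that was drawn first is the first one to be overwritten.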
Selecting graphics settings for drawing the energy lattice and Gaussians (SELGRA)
Syntax: SELGRA <draw lattice> <draw gaussian>
Description: With SELGRA you can specify which lattice or Gaussians will be
drawn.
<draw lattice> If set to:
0 No lattice will be drawn
1 The energy lattice will be drawn. To color the lattice points according to
energy, the following colors will be used. The energy value of the lattice
point lies between x and y percent of the minimum energy value:
color x y
red 80 100
yellow 60 80
green 40 60
white 20 40
2 The lattice points that do not clash with the protein will be drawn. To color
the lattice points according to depth, the following colors will be used:
color       depth
black       14
dark blue   13
blue        12
dark red    11
red         10
dark green  9
yellow      >=8
gold        -1
Note: Lattice points in the protein have a positive energy and are assigned
a depth of -1.
3 All lattice points which have a positive energy will be drawn.
<draw gaussian> It is possible to draw two types of Gaussians: energy or overlap.
Gaussians have nonzero values even at infinite distances, so to draw a
Gaussian g, the radius $r_g$ of its visual representation has to be constrained. In
FlexX, the radius will be deduced from the Gauss level asked for by the DRAW
command. How each type of Gaussian is drawn is explained below.
If <draw gaussian> is set to:
0 No Gaussians will be drawn
1 Overlap Gaussians will be drawn.
To draw an overlap Gaussian, the radius $r_g$ will be calculated from the overlap
between the Gauss function g and the Gauss function a representing
the atoms (see equation (8.2) in section 8.5.2):
$$ p \cdot o_{max} = h_g h_a \left( \frac{\pi}{b_g + b_a} \right)^{3/2} \exp\left( -\frac{b_g b_a}{b_g + b_a} \, r_g^2 \right) \tag{8.4} $$
$$ r_g = \sqrt{ -\frac{(b_g + b_a) \, \ln\left( \dfrac{p \cdot o_{max}}{h_g h_a \left( \frac{\pi}{b_g + b_a} \right)^{3/2}} \right)}{b_g b_a} } $$
where $h_g$ and $b_g$ ($h_a$ and $b_a$, respectively) are the height and width of
a Gauss function (index g) or of the atom-based Gauss function (index a),
respectively. $o_{max}$ is the maximum overlap between a Gauss function g and an
atom-representing Gauss function a. $p$ is a fraction value: $p \in [0; 1.0]$.
Therefore, the overlap between an atom and a Gaussian is greater than
$p \cdot o_{max}$ if the distance between the atom and the Gauss center is smaller
than $r_g$.
2 Energy Gaussians will be drawn.
To draw energy Gaussians, the radius $r_g$ will be calculated from the Gauss
function (see equation (8.1) in section 8.5.2) as follows:
$$ v = |h_g| \exp\left( -b_g \, r_g^2 \right) \tag{8.5} $$
$$ r_g = \sqrt{ -\frac{\ln\left( v / |h_g| \right)}{b_g} } $$
where $h_g$ and $b_g$ are the height and width of a Gauss function g. $v$ is a
value greater than 0.
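The two drawing radii can be computed directly from the Gauss level. The following Python lines are a sketch under the assumptions that the exponents in (8.4) and (8.5) are negative and that the radius is the square root of the resulting expression; the function names are hypothetical. For the overlap radius, the maximum overlap cancels out, because it equals the prefactor of equation (8.2), so only the fraction p remains inside the logarithm.

```python
import math


def overlap_radius(p, b_g=1.2, b_a=1.2):
    """Radius at which the overlap has dropped to p * o_max (equation 8.4)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return math.sqrt(-(b_g + b_a) * math.log(p) / (b_g * b_a))


def energy_radius(v, h_g, b_g=1.2):
    """Radius at which |g| has dropped to the level v (equation 8.5)."""
    if not 0.0 < v <= abs(h_g):
        raise ValueError("v must lie in (0, |h_g|]")
    return math.sqrt(-math.log(v / abs(h_g)) / b_g)


print(overlap_radius(0.5))        # sphere enclosing at least 50% of o_max
print(energy_radius(1.0, -39.05))  # sphere where |g| is at least 1.0
```

Smaller Gauss levels give larger spheres. A fraction of exactly 0 would give an infinite radius, which is why the sketch rejects it.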
Selecting colors for drawing Gaussians (SELCOL)
Syntax: SELCOL <overlap color> <energy color>
Description: With SELCOL you can specify which colors to draw overlap or energy
Gaussians in. Enter your chosen color as either an angle from the color circle (0-360
degrees: 0 is invisible, 1-360 runs from dark blue, through red, yellow, green to
blue), a color name (as defined in the GRAPHIC static data file), or an RGB(A) value,
i.e. 3 (4) floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
<overlap color> Choose the color for overlap Gaussians.
<energy color> Choose the color for energy Gaussians.
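The three ways of entering a color for SELCOL (angle, name, RGB(A) value) can be told apart with a few lines of Python. This is only an illustrative sketch; the function name and the returned tuples are hypothetical, and the way FlexX itself parses color arguments is described in Sec. 11.22.1.

```python
def parse_color(spec):
    """Classify a color argument as ('angle', x), ('rgb'/'rgba', (...)) or ('name', s)."""
    parts = spec.replace("/", " ").split()
    try:
        numbers = [float(part) for part in parts]
    except ValueError:
        return ("name", spec.strip())  # e.g. "dark green"
    if len(numbers) == 1:
        return ("angle", numbers[0])   # position on the color circle
    if len(numbers) == 3:
        return ("rgb", tuple(numbers))
    if len(numbers) == 4:
        return ("rgba", tuple(numbers))
    raise ValueError("expected 1, 3 or 4 numbers or a color name")


for example in ("dark green", "0.0 0.8 0.1", "0.0/0.8/0.1", "220"):
    print(example, "->", parse_color(example))
```

Blanks and slashes are treated alike, so the two RGB notations from the example above yield the same result.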
Drawing the energy lattice and Gaussians (DRAW)
Syntax: DRAW [<min depth>] [<gauss type id> <gauss level>] [<filename>]
Description: DRAW generates a drawing of the lattice points and/or the Gaussians
and sends it to a file ready for display in the graphics interface. For details about what
exactly is drawn see the SELGRA command.
[<min depth>] (Parameter asked only when drawing lattice points, see SELGRA.)
Enter a depth value (integer); only lattice points with a depth label
greater than or equal to <min depth> will be drawn.
[<gauss type id>] (Parameter asked only when drawing Gaussians, see SELGRA;
if no Gaussians are present, this parameter will not be asked (see Important
notes below).)
Choose which type of Gaussians to draw. The Gaussian types can be seen for
example in the output shown with the LISTG command (see example above;
the type IDs are shown in the first column). Enter one integer to choose one
type.
[<gauss level>] (Parameter asked only when drawing Gaussians, see SELGRA;
if no Gaussians are present, this parameter will not be asked (see Important
notes below).)
Enter the Gauss level for drawing the Gaussians. The value entered for the
Gauss level depends on whether overlap or energy Gaussians are to be drawn
(selected with the SELGRA command):
overlap The value of <gauss level> must be a fraction in the range [0; 1.0] ($p$ in
(8.4)).
energy The value of <gauss level> must be greater than 0 ($v$ in (8.5)).
(See explanation of Gaussian visualization in the SELGRA command description.)
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Important notes: A lattice must be loaded. To draw Gaussians, Gaussians must
have been computed from the lattice (see COMPGA) or read from le (see READG).
Drawings are not displayed automatically. Use DISPLAY to output the drawing to
the graphics device. We recommend FlexV.
8.5.4 Using Gaussians for filter constraints
Besides the simple computation of overlap values etc., one can think of Gaussian-based
thresholds as filters during the docking. There are two options to use Gaussians as such
filter constraints. First, the Gaussian description can be taken directly as a Gauss constraint.
Alternatively, the Gaussian description can be taken as a so-called spatial pharmacophore
constraint for FlexX-Pharm.
Use Gaussians as Gauss constraints
The Gaussian description can directly be taken as a Gauss constraint during a docking
calculation, or after a docking calculation to delete solutions that fall below or exceed a
given overlap volume. The relevant atoms forming the overlap can be given as a list of SYBYL
atom types or defined by a SMARTS™ expression. If the Gaussian filter is active (see section
8.5.6), FlexX will check the partially placed ligand for compatibility with the Gauss
constraints. This is done after each incremental construction step, including the base
placement phase! Any docking solution which fails the check will be deleted from the docking
calculation. A summary of how many solutions were rejected at each step can be seen in the
base placement and complex construction output information.
Currently, certain restrictions apply to SMARTS™; a list of allowed SMARTS™
expressions is therefore given in section 11.13 on page 325. To use recursive SMARTS™
(see subsection 11.13.7) such as [$(C(O)O)] in a batch script, please use the following
expression: " [$(C(O)O)] ". Please note the blanks between the quotation marks and the
brackets!
SELGAUSS
Syntax: SELGAUSS <gauss_type> <g_index> <include> <overlap_vol>
<smarts> <expression>
Description: Selects a Gaussian as a Gauss constraint. <gauss_type> specifies the
Gaussian property and <g_index> specifies the Gaussians of the selected type.
If <include> is set to y, the type of the Gauss constraint is include. Otherwise the
constraint type is exclude. <overlap_vol> specifies the overlap volume: $O_{min}$ or
$O_{max}$.
If the Gauss constraint type is include, the added overlap volume of the relevant
atoms with the selected Gaussians must be equal to or greater than $O_{min}$ to comply.
This means:
$$ \sum_{g_i} \sum_{a_j} O(g_i, a_j) \geq O_{min}, \tag{8.6} $$
where $g_i$ are the selected Gaussians and $a_j$ are the generic atom Gaussians of the
relevant atoms (see section 8.5.2).
Otherwise the type is exclude; then the added overlap volume must be smaller
than $O_{max}$; this means
$$ \sum_{g_i} \sum_{a_j} O(g_i, a_j) < O_{max}. \tag{8.7} $$
If <smarts> is set to y, the relevant atoms are filtered by a SMARTS™ expression
to follow. Otherwise the atoms will be defined by a list of SYBYL atom types.
Requirements: Gaussians must have been computed with COMPGA or loaded with
READG.
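The include/exclude test for a Gauss constraint, as stated in (8.6) and (8.7), reduces to a comparison of the summed overlap with the selected threshold. The following Python function is a minimal sketch of that comparison; its name and signature are hypothetical, and the summed overlap is assumed to have been computed beforehand.

```python
def complies(constraint_type, summed_overlap, threshold):
    """Check a summed overlap against a Gauss constraint.

    include: the summed overlap must be >= O_min (equation 8.6).
    exclude: the summed overlap must be <  O_max (equation 8.7).
    """
    if constraint_type == "include":
        return summed_overlap >= threshold
    if constraint_type == "exclude":
        return summed_overlap < threshold
    raise ValueError("constraint_type must be 'include' or 'exclude'")


# Hypothetical summed overlaps tested against thresholds of 300.0 and 320.0.
print(complies("include", 412.5, 300.0))  # True: enough overlap
print(complies("exclude", 412.5, 320.0))  # False: too much overlap
```

Note the asymmetry at the threshold itself: a summed overlap equal to the threshold satisfies an include constraint but violates an exclude constraint.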
CAND
Syntax: CAND
Description: Lists all atoms which can satisfy one or more Gauss constraints (the
CANDidates).
Requirements: A Gauss constraint must have been selected with SELGAUSS.
LISTFG
Syntax: LISTFG
Description: Lists the selected Gauss constraints (see example).
Requirements: A Gauss constraint must have been selected with SELGAUSS.
Here is an example:
Example
>> List of gauss constraint:
Id |Property|Nof G| Type | Overlap | Atoms
-----+--------+-----+---------+---------+---------
1 | N_am | 1 | include | 300.00 | N
2 | O_3 | 1 | exclude | 320.00 | [s,c]
3 | C_3 | 3 | include | 800.00 | C.3
Legend:
Id        Unique identifier of the Gauss constraints.
Property  Name of type of the selected Gaussians (see LISTG).
Nof G     Number of selected Gaussians for the Gauss constraint.
Type      Type of the Gauss constraint (see SELGAUSS).
Overlap   $O_{min}$ or $O_{max}$ (see SELGAUSS).
Atoms     The description of the relevant atoms.
FILTER
Syntax: FILTER <detail_info> <delete>
Description: Tests a set of docking solutions against the Gauss constraints. For
each solution a table is printed out that shows what constraints have been matched.
If <detail_info> is set to y, the added overlap volume and the selected overlap
volume are given for each constraint. At the end of the list the percentage of total
docking solutions that matched each constraint is printed. Solutions that do not
obey the Gauss constraints can be permanently removed from the docking solutions
list by setting <delete> to y.
Requirements: A Gauss constraint must have been selected with SELGAUSS and
docking solutions must have been calculated.
DELETEFG
Syntax: DELETEFG <gauss const>
Description: Deletes selected Gauss constraints. <gauss const> specifies which
Gauss constraint is to be deleted. This can either be a single number, a list of
numbers separated by blanks or commas, a list of intervals of the form a-b, or simply
all.
Requirements: A Gauss constraint must have been selected with SELGAUSS.
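The selection syntax accepted by DELETEFG (single numbers, lists, intervals a-b, or all) can be illustrated by a small parser. The Python function below is a hypothetical sketch of how such an argument could be expanded into constraint ids; it is not the FlexX parser, and the handling of unknown ids (they are silently ignored here) is an assumption.

```python
def parse_selection(text, known_ids):
    """Expand a selection such as '1, 3-5' or 'all' into a sorted id list."""
    text = text.strip()
    if text.lower() == "all":
        return sorted(known_ids)
    selected = set()
    for token in text.replace(",", " ").split():
        if "-" in token:
            low, high = token.split("-", 1)
            selected.update(range(int(low), int(high) + 1))  # interval a-b
        else:
            selected.add(int(token))
    # Assumption: ids that do not exist are ignored.
    return sorted(selected & set(known_ids))


constraints = [1, 2, 3, 4, 5, 6]
print(parse_selection("1, 3-5", constraints))  # [1, 3, 4, 5]
print(parse_selection("all", constraints))     # [1, 2, 3, 4, 5, 6]
```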
DRAWFG
Syntax: DRAWFG
Description: Draws all selected Gauss constraints. Gauss spheres are labeled with
the following expression:
<gauss_prop> (<id>/<nof_g>) <constraint type> <atom selection>
<overlap1>/<overlap2>
where:
<gauss_prop> The property of the current constraint (= type of selected Gaus-
sian; see LISTFG).
<id> The index of the current Gaussian.
<nof_g> The number of selected Gaussians for the current constraint.
<constraint type> The type of the current Gauss constraint: include or exclude
(see SELGAUSS).
<atom selection> Description of the relevant ligand atoms.
<overlap1> This is the minimum contribution for an atom which lies in the Gauss
sphere. In other words: the overlap between the respective Gaussian and a
generic atom Gaussian is greater than or equal to this value (overlap1). (Please
also refer to equation (8.2) in section 8.5.2.)
<overlap2> $O_{min}$ or $O_{max}$ (see SELGAUSS).
Requirements: A Gauss constraint must have been selected with SELGAUSS.
Using Gaussians as spatial pharmacophore constraints
As mentioned above, a further alternative is to use the Gaussian description to define
spatial pharmacophore constraints within FlexX-Pharm. The relevant atoms can again be
given as an element type or filtered by a SMARTS™ expression.
A list of allowed SMARTS™ expressions is given in section 11.13 on page 325.
To use recursive SMARTS™ (see subsection 11.13.7) such as [$(C(O)O)] in a batch script,
please use the following expression: " [$(C(O)O)] ". Please note the blanks between the
quotation marks and the brackets!
SELPHARM
Syntax: SELPHARM <gauss_type> <g_index> <constraint_type>
[<logical_term>] <radius> <smarts> <expression>
Description: Selects a Gaussian as a spatial constraint. <gauss_type> specifies
the Gaussian property and <g_index> specifies the Gaussian of the selected
type. The center of the spatial constraint gets the coordinates of the selected Gaussian.
<constraint_type> specifies the constraint type. If it is set to
1, the type is essential;
2, the type is optional;
3, the type is logical.
If the constraint type is logical, then <logical_term> specifies the logical term.
<radius> specifies the radius of the spatial constraint.
If <smarts> is set to y, the relevant atoms will be specified with a SMARTS™
expression. Otherwise the atoms will be given through the element type.
Note: You cannot use logical and essential/optional constraints together.
Requirements: Gaussians must be computed with COMPGA or loaded with READG.
TOPHARM
Syntax: TOPHARM [<cont>] <keep_file> [<filename>] [<fulfill_opt>]
[<logical_expression>]
Description: Moves the selected spatial constraints to the PHARM module (see
section 8.3). If pharmacophores are loaded in the PHARM module, <cont> must be
set to y to proceed. To store the pharmacophore file, <keep_file> must be set
to y. If <keep_file> is set to y, <filename> specifies the pharmacophore file. If
<filename> does not have the suffix ".phm", this suffix will be added automatically.
If there is at least one spatial constraint of type optional, then <fulfill_opt>
specifies the number of optional constraints that docking solutions must
satisfy (see section Partial Matching in 8.3.4). If the type of the spatial constraints is
logical, then <logical_expression> specifies the logical expression that docking
solutions must satisfy.
Important notes: If pharmacophores are loaded in the PHARM module and
<cont> is set to y, these pharmacophores will be deleted.
Requirements: Spatial constraints must be selected with SELPHARM.
WRITESC
Syntax: WRITESC <filename> [<fulfill_opt>] [<logical_expression>]
Description: Writes the selected spatial constraints into a pharmacophore file with
name <filename>. If there is at least one spatial constraint of type optional, then
<fulfill_opt> specifies the number of optional constraints which the docking
solutions must satisfy (see section Partial matching in 8.3.4). If the type of the spatial
constraints is logical, then <logical_expression> specifies the logical expression
with which docking solutions must comply.
Important notes: If <filename> does not have the suffix ".phm", this suffix will
automatically be added.
Requirements: Spatial constraints must be selected with SELPHARM.
DRAWSC
Syntax: DRAWSC
Description: Draws all selected spatial constraints.
Requirements: A spatial constraint must have been selected with SELPHARM.
LISTSC
Syntax: LISTSC
Description: Lists the selected spatial constraints (see example).
Requirements: A spatial constraint must have been selected with SELPHARM.
Example
>> List spatial constraints derived from Gaussians:
essential -2.39 34.19 15.86 1.53 N
optional -3.09 34.19 14.46 1.46 [$(C(O)O)]
>> 2 spatial constraints (essential 1/optional 1)
DELETESC
Syntax: DELETESC
Description: Deletes all selected spatial constraints.
Requirements: A spatial constraint must have been selected with SELPHARM.
8.5.5 Buriedness of active site
As described above (see section 8.5.1), a grid-based procedure can help to determine the
most buried points within an active site. The grid points are labeled with a value for the
depth within an active site (see section 8.5.1). We further refer to these labels as lattice
energies and approximate more realistic and less discrete values by a Gaussian broadening
procedure.
CGRID (create grid)
Syntax: CGRID
Description: Creates a lattice for the active site; all lattice points are labeled with
the depth of the active site (see section 8.5.1).
Requirements: A receptor must have been loaded.
CCONSTR (create constraints)
Syntax: CCONSTR <percent>
Description: Turns the initial depth labels of the lattice points into formal
energy values by multiplication with -1. These calculated energy values
of the lattice are approximated with Gauss functions (see sections 8.5.2 and
8.5.3). The approximation will be performed with respect to the depth label
of the lattice points. Only points with a depth label equal to or greater than
DEPTH_OF_LATTICE_POINTS will be considered. If min is the minimum energy
value with respect to DEPTH_OF_LATTICE_POINTS, then only energy values in
the range of [min; percent · min] will be approximated with Gaussians. The range of
<percent> is [0.0; 1.0].
In the end, the procedure yields for an energy $e_i$ at a grid point with coordinates $c_i$
a value of
$$ e_i = \sum_{g_j} g_j(c_i), $$
where $g_j$ are the deduced Gaussians (see (8.1) in section 8.5.2).
The type of these Gaussians is site.
These Gaussians are further taken directly as a Gauss constraint (see section 8.5.4)
by internally performing "SELGAUSS % all y % y *" (see command SELGAUSS).
To select only a subset of all these Gaussians as a Gauss constraint, you should use
the command SELGAUSS.
Important notes: If no grid has been computed before, it will be created for the
active site.
Requirements: A receptor must have been loaded.
OVERLAP
Syntax: OVERLAP <gauss_constraint>
Description: For the selected Gauss constraint <gauss_constraint> a table is
printed which shows the added overlap volume (see equation (8.3) in section 8.5.2)
between the Gauss constraint and every docking solution.
Requirements: A Gauss constraint must have been selected with SELGAUSS or
defined with CCONSTR, and docking solutions must have been calculated.
8.5.6 Program parameters (configuration)
This section describes the required additional parameters in your configuration; the
respective values can be found and modified in your configuration (see Sec. 10.1; pre-Release 3
versions: see config.dat, below the commented title "# application GRID").
Name (type): <DEFAULT_GAUSS_WIDTH> (floating point)
Description: Determines the default width that is taken for all the Gaussians.
Default value: 1.2
Reasonable range: 0.3 - 2.0
Name (type): <DEPTH_OF_LATTICE_POINTS> (integer)
Description: Affects the computation of Gaussian-broadened depth data (see also
p. 208). The range of the initial depth labels is 0-14. A value of 0 means that the grid
point is considered to be outside the protein. In contrast, a lattice point of depth 14
is considered to lie deep in the cavity of the active site. We recommend that you
always visualize the results of changing this value. Note also that this parameter is
strongly target-dependent.
Default value: 8
Reasonable range: 0 - 14
Name (type): <FILTER_GAUSSIAN> (integer)
Description: A switch for the Gaussian filter. If this value is set to 1 the filter
function is active. Otherwise the filter function is inactive.
Default value: 0
Reasonable range: 0,1
8.6 FlexX-Screen
FlexX-Screen is an extension of FlexX which enables speedy structure-based virtual
screening. In order to run FlexX-Screen you need a special license key to activate the
FlexX-Screen module!
The FlexX-Screen module comprises three parts:
1. The evaluation of molecule properties (see Section 8.6.1),
2. Generation of interaction spots for the receptor (see Section 8.6.2),
3. Placebase caching (see Section 8.6.3).
8.6.1 Evaluating molecule properties
If you have a large library of molecules to be screened, you often do not want to dock all
of them against a particular target. Depending on the target it may make sense to rule out
molecules with unsuitable properties, i.e. compounds which are too large or too small for
the site, which are too flexible or which contain certain substructures.
In the FlexX-Screen module a couple of molecular properties can be evaluated directly or
combined to form more complex logical expressions. In addition, the FILTER submenu
provides the necessary commands to open a multi-molecule file, scan through the file
according to property constraints and load only those molecules into the FlexX workspace
that fulfill the constraints. In addition there are a couple of predefined macros for more
complex filtering. Among these are a Lipinski-like filter rule, a toxicity filter, and a filter
that rules out reactive groups.
The following table contains the currently available molecule properties that can be calcu-
lated for a ligand:
Syntax                Return type  Description
MOLECULAR PROPERTIES
nof_atoms             numeric      number of atoms
nof_heavy_atoms       numeric      number of non-hydrogen atoms
nof_bonds             numeric      number of bonds
rot_bonds             numeric      number of rotatable bonds
nof_rings             numeric      number of ring systems
max_ring_size         numeric      size of the largest ring
min_ring_size         numeric      size of the smallest ring
nof_hdon              numeric      number of hydrogen donors (#OH + #NH) (according to Lipinski)
nof_hacc              numeric      number of hydrogen acceptors (#oxygen + #nitrogen) (according
                                   to Lipinski)
ex_ia(<type>)         numeric      number of interactions of contact type <type>. A list of allowed
                                   contact types is given in contype.dat.
nof_components        numeric      number of components
charge                numeric      calculates the total formal charge of the molecule
logp                  numeric      logp value of the molecule
mass                  numeric      calculates the molecule mass
smarts(<expression>)  numeric      gives the number of independent substructures in the molecule
                                   matching the given <expression>. A list of allowed SMARTS™
                                   expressions is given in Section 11.13.
sas(<type>)           numeric      calculates the solvent accessible surface (SAS) of <type> for the
                                   molecule. <type> can be: lipo (for lipophilic SAS), hydro (for
                                   hydrophilic SAS) or total (for total SAS).
name(<sub>)           bool         checks if <sub> matches the molecule name of a compound; the
                                   allowed wildcard is *
Most of these properties return a numeric value. A property can be turned into a logical
expression with a relation such as >, <, >= or <=. Properties and logical expressions
can be combined with and, or and not to build more complex expressions. The following table
lists the available operators.
Syntax            Return type  Description
LOGICAL EXPRESSIONS
A or B            bool         TRUE if A or B is TRUE
A and B           bool         TRUE if A and B are TRUE
not A             bool         TRUE if A is not TRUE
COMPARING VALUES
A < B             bool         TRUE if A is less than B
A <= B            bool         TRUE if A is less than or equal to B
A > B             bool         TRUE if A is greater than B
A >= B            bool         TRUE if A is greater than or equal to B
A = B | A == B    bool         TRUE if A equals B
A != B            bool         TRUE if A is different from B
ARITHMETIC EXPRESSIONS
A + B             numeric      sum of A and B
A - B             numeric      difference between A and B
A * B             numeric      product of A and B
A / B             numeric      quotient of A and B
The following examples contain various logical expressions that can be evaluated:
Example
expression A: "mass > 200"
expression B: "smarts(c1ccccc1) > 0"
expression C: "(mass > 100) and (smarts(c1ccccc1) > 0)"
expression D: "(smarts(c1ccccc1) < 2) or (smarts(C(=O)C) > 1)"
expression E: "(mass > 200) and (mass < 400) and (smarts(NH1) < 1)"
expression F: "(mass > 200) and ((smarts(c1ccccc1) > 0) or \
((smarts(C(=O)C) > 0) and (smarts(NH1) < 1)))"
Expression A is true if the molecule mass is greater than 200.0.
Expression B is true if the molecule contains at least one benzene ring.
Expression C is true if the molecule mass is greater than 100.0 and the molecule contains at
least one benzene ring.
Expression D is true if the molecule contains at most one benzene ring or at least two
carbonyl groups.
Expression E is true if the molecule mass is between 200.0 and 400.0 and the molecule does
not contain an amide group.
Expression F is true if the molecule mass is greater than 200.0 and the molecule contains
either at least one benzene ring, or at least one carbonyl group and no amide group.
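The logic of such expressions can be mirrored outside FlexX as ordinary predicates. The following Python sketch is illustrative only and is not FlexX code: it assumes that the property values (mass and SMARTS match counts) have already been computed and stored in a dictionary whose keys are hypothetical labels. "No amide group" is modeled as a match count of zero.

```python
# Illustrative only: property values are assumed to be precomputed elsewhere.
# Keys such as "smarts(c1ccccc1)" are hypothetical labels, not a FlexX API.

def expr_a(p):
    return p["mass"] > 200

def expr_c(p):
    return p["mass"] > 100 and p["smarts(c1ccccc1)"] > 0

def expr_d(p):
    return p["smarts(c1ccccc1)"] < 2 or p["smarts(C(=O)C)"] > 1

def expr_f(p):
    # The mass test is combined with the whole bracketed alternative,
    # following the parentheses of expression F.
    return p["mass"] > 200 and (
        p["smarts(c1ccccc1)"] > 0
        or (p["smarts(C(=O)C)"] > 0 and p["smarts(NH1)"] == 0)
    )

# A made-up molecule: no benzene ring, one carbonyl group, no amide group.
props = {"mass": 250.0, "smarts(c1ccccc1)": 0,
         "smarts(C(=O)C)": 1, "smarts(NH1)": 0}

results = {name: test(props) for name, test in
           [("A", expr_a), ("C", expr_c), ("D", expr_d), ("F", expr_f)]}
```

For this made-up molecule, A, D and F hold, while C fails because there is no benzene ring.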
8.6.1.1 Additional ligand commands (LIGAND submenu)
Evaluate if the ligand fulfills a given expression (EVAL)
Syntax: EVAL <expression>
Description: Evaluates the given <expression> for the currently loaded molecule.
<expression> may be a basic molecular property or a complex logical expression
(see above). EVAL prints the result of the evaluation on the screen and also
stores it in the batch variable $(EVAL).
If <expression> is a molecular property, $(EVAL) is the numeric value of the
property.
If <expression> is a (complex) logical expression, $(EVAL) is either 0 or 1, depending
on whether the expression is false or true, respectively.
Requirements: A ligand must have been previously loaded. In order to use the
command EVAL, you need a license for the SCREEN module!
Example
for_each $(0) fromto 1 33742
ligand
read multi.mol2 $(0)
eval "((mass > 100) and (mass < 500) and (smarts(C(=O)O) > 0))"
output $(EVAL)
end
if $(EVAL) == 1
docking
selbas a
placebas 3
.....
end
endif
end_for
This script works in parallel with PVM.
8.6.1.2 Batch loop keyword INLIBRARY with molecular properties
Instead of an explicit loop with the command EVAL, you can also constrain the new
FOR_EACH loop keyword INLIBRARY with a logical expression.
Example
FOR_EACH $(idx) INLIBRARY multi.mol2 [<expression>]
LIGAND
read multi.mol2 $(idx)
END
DOCKING
...
END
END_FOR
For more details see Section 9.1.3. In order to use the batch loop keyword INLIBRARY
with a logical expression, you need a license for the SCREEN module! This script works
in parallel with PVM.
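Conceptually, a constrained INLIBRARY loop visits only those library entries that satisfy the expression. The following Python sketch models that behaviour under stated assumptions: the library is a plain list of precomputed property dictionaries, indices are 1-based, and `in_library` is a hypothetical helper name, not a FlexX function.

```python
def in_library(library, predicate=None):
    """Yield 1-based indices of library entries that satisfy `predicate`.

    Without a predicate every entry is visited, which mirrors an
    INLIBRARY loop without the optional expression.
    """
    for idx, props in enumerate(library, start=1):
        if predicate is None or predicate(props):
            yield idx

# Made-up property records standing in for the molecules of multi.mol2.
library = [{"mass": 90.0}, {"mass": 320.5}, {"mass": 610.2}, {"mass": 450.0}]

# Counterpart of the expression "(mass > 100) and (mass < 500)".
selected = list(in_library(library, lambda p: 100 < p["mass"] < 500))
```

Only the second and fourth entries would reach the loop body in this example.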
8.6.1.3 Menus and commands (FILTER)
The FILTER menu, a submenu of the LIGAND menu, is only available if the SCREEN module
is activated with a valid license key.
Typing the submenu name brings you to the submenu, typing END returns you to the parent
menu. You can type commands and menu names in uppercase or lowercase letters.
The basic idea of the FILTER submenu is to go through a multi-molecule file and skip all
ligands that do not pass the given filter: you open a file, loop over the ligands within this
file and load only those ligands into the FlexX workspace that fulfill the given expression.
The following example shows a script that works like this. The menu commands are
explained directly afterwards.
The batch script loads all ligands with a molecule mass greater than 200.0 which
contain either at least one benzene ring, or at least one carbonyl group and no amide group.
Example
LIGAND
FILTER
open multi.mol2
END
END
while( $(SCAN_INDEX) != -1 )
LIGAND
FILTER
SCAN "(mass > 200) and ((smarts(c1ccccc1) > 0) or \
((smarts(C(=O)C) > 0) and (smarts(NH1) < 1)))"
if $(SCAN_INDEX) == -1 break
get
END
END
DOCKING
.....
end_for
LIGAND
FILTER
close
END
END
Please note that this script does not currently work in parallel with PVM.
Opening a multi-molecule file (OPEN)
Syntax: OPEN <filename>
Description: Opens a multi-molecule file. An index file is created for the molecule
file in the TEMP directory (see Section 10.1.1) and the internal molecule index
is set to 1. The internal molecule index is available through the batch variable
$(SCAN_INDEX).
Scanning in a multi-molecule file (SCAN)
Syntax: SCAN <expression>
Description: Scans the currently opened multi-molecule file for the next ligand
that fulfills <expression>. <expression> must be a logical expression (see command
EVAL in 8.6.1.1). The search starts with the ligand which corresponds to the
internal molecule index. If a ligand that fulfills <expression> is found, the internal
molecule index is set to the corresponding index. Otherwise the internal molecule
index is set to -1. The internal molecule index is available through the batch variable
$(SCAN_INDEX).
Requirements: A multi-molecule file must have been opened.
Getting a ligand from a multi-molecule file (GET)
Syntax: GET
Description: Loads the ligand which corresponds to the internal molecule index into
FlexX's workspace. The internal molecule index is then incremented by 1.
The following operations are performed in this order:
1. Read the molecule from the file
2. Identify ring systems
3. Ring conformer generation (e.g. by CORINA)
4. Molecule initialization (see below)
5. Stereo descriptor and atom equivalence class computation
6. Torsion angle analysis for all acyclic single bonds
7. Interaction type and interaction geometry assignment
The molecule initialization comprises a preprocessing step, formal charge assignment,
an aromaticity analysis and much more. The initialization can
be fully configured with the rules defined in the transform.dat static data file.
Please refer to Section 11.18 for details.
Note: If you use a verbosity level of 5 or higher, FlexX lists an overview of compo-
nents in its output (see Section 7.5.1).
Requirements: A multi-molecule file must have been opened.
Closing a multi-molecule file (CLOSE)
Syntax: CLOSE
Description: Closes the multi-molecule file which was opened with OPEN.
Requirements: A multi-molecule file must have been opened.
Setting the molecule index (SEEK)
Syntax: SEEK <index>
Description: Sets the internal molecule index to <index>.
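The interplay of OPEN, SCAN, GET and SEEK with the internal molecule index can be summarized in a small model. The Python sketch below is a toy illustration, not FlexX code: the class name `FilterFile`, the use of a plain list instead of a molecule file, and the behaviour of `get` for an invalid index are all assumptions made for the example.

```python
class FilterFile:
    """Toy model of the FILTER index semantics (hypothetical, not FlexX code)."""

    def __init__(self, molecules):
        # OPEN: the internal molecule index starts at 1.
        self.molecules = molecules
        self.scan_index = 1

    def scan(self, predicate):
        # SCAN: search from the current index onwards; -1 if nothing matches.
        if self.scan_index != -1:
            for idx in range(self.scan_index, len(self.molecules) + 1):
                if predicate(self.molecules[idx - 1]):
                    self.scan_index = idx
                    return idx
        self.scan_index = -1
        return -1

    def get(self):
        # GET: load the molecule at the current index, then increment the index.
        if not 1 <= self.scan_index <= len(self.molecules):
            raise IndexError("no molecule at the current index")
        molecule = self.molecules[self.scan_index - 1]
        self.scan_index += 1
        return molecule

    def seek(self, index):
        # SEEK: set the internal molecule index explicitly.
        self.scan_index = index


masses = [150.0, 260.0, 90.0, 410.0]   # made-up stand-ins for ligands
f = FilterFile(masses)
loaded = []
while f.scan(lambda m: m > 200) != -1:
    loaded.append(f.get())
```

The loop has the same shape as the batch script above: SCAN positions the index on the next match, GET loads that ligand and advances, and the loop ends once SCAN reports -1.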
Reading filter macro definitions from a file (READMACRO)
Syntax: READMACRO <filename>
Description: Reads filter macro definitions from a file. A filter macro allows complex
logical expressions of molecular properties to be predefined. They are stored
in a simple ASCII file. You can find examples of filter macros below. There are two
ways of defining macros.
The first way is a general one, where all available molecule properties and relations
can be used. Each macro definition starts with the keyword @macro, followed by
the name of the macro and the expression to be evaluated. The macro name starts
with % and ends with parentheses which may contain variables. If the parentheses
contain any variable(s), the expression must reference each variable in the form
{variable}. You may use previously defined macros to define further filter macros.
The second way is a special case for long exclude lists of molecular subgroups. Each
definition starts with the keyword @excl_smarts and the name of the macro, which
starts with % and ends with parentheses. The following lines contain the subgroups to
be excluded. Each line starts with the keyword smarts followed by a SMARTS expression
for the molecular subgroup. The keyword end closes the definition.
Each loaded filter macro is available for the commands EVAL (see 8.6.1.1) and SCAN
and may be used as an expression.
Note: The static data file filter_macros.dat contains a set of filter macro definitions.
These include a Lipinski-like filter rule, a toxicity filter, and a filter that
rules out reactive groups. In order to activate these macros, the file must be read
with this command.
Requirements: The suffix of the macro file must be .dat.
Example of a file with filter macro definitions:
Example
@macro %MASS() "mass"
@macro %NOFSUBSTR(A) "smarts({A})"
@macro %HSAS() "sas(hydro)"
@macro %LSAS() "sas(lipo)"
@macro %SUBSTR(A) "%NOFSUBSTR({A}) > 0"
@macro %AMIDE1() "%SUBSTR(NH1)"
@macro %AMIDE2() "%SUBSTR([ND1H2])"
@macro %BENZOL() "%SUBSTR(c1ccccc1)"
@macro %CARBONYL() "%SUBSTR(C(=O)C)"
@macro %MINM_SUB(A,B) "%MASS() > {A} and %SUBSTR({B})"
@macro %MAXM_SUB(A,B) "%MASS() < {A} and %SUBSTR({B})"
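To see how nested macros resolve to a plain property expression, the substitution can be traced with a few lines of Python. This is a simplified sketch and not the FlexX macro processor: the dictionary layout and the helper names `split_args` and `expand` are hypothetical, and the sketch assumes well-formed input with balanced parentheses.

```python
def split_args(args):
    """Split a macro argument string at top-level commas only."""
    parts, depth, current = [], 0, ""
    for ch in args:
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            depth += {"(": 1, "[": 1, ")": -1, "]": -1}.get(ch, 0)
            current += ch
    parts.append(current.strip())
    return parts


def expand(expr, macros):
    """Recursively replace %NAME(args) calls by the macro bodies."""
    out, i = [], 0
    while i < len(expr):
        if expr[i] != "%":
            out.append(expr[i])
            i += 1
            continue
        open_paren = expr.index("(", i)
        name = expr[i + 1:open_paren]
        depth, j = 1, open_paren + 1
        while depth:                       # find the matching ")"
            depth += {"(": 1, ")": -1}.get(expr[j], 0)
            j += 1
        params, body = macros[name]
        values = split_args(expr[open_paren + 1:j - 1]) if params else []
        for param, value in zip(params, values):
            body = body.replace("{" + param + "}", value)
        out.append(expand(body, macros))
        i = j
    return "".join(out)


# A subset of the macro file above, as (parameter list, body) pairs.
macros = {
    "MASS": ([], "mass"),
    "NOFSUBSTR": (["A"], "smarts({A})"),
    "SUBSTR": (["A"], "%NOFSUBSTR({A}) > 0"),
    "CARBONYL": ([], "%SUBSTR(C(=O)C)"),
    "MINM_SUB": (["A", "B"], "%MASS() > {A} and %SUBSTR({B})"),
}

expanded = expand("%MINM_SUB(200,C(=O)C)", macros)
```

Under these assumptions, `%MINM_SUB(200,C(=O)C)` resolves to `mass > 200 and smarts(C(=O)C) > 0`.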
Example of a file with an exclude list:
Example
@excl_smarts %excl_tox()
smarts a[N;X2]=O # aromatic nitroso
smarts CO[N;X2]=O # alkyl nitrite
...
end
The macro %excl_tox() is equal to the logical expression
NOT( (smarts(a[N;X2]=O)>0) or (smarts(CO[N;X2]=O)>0) or ...)
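The equivalence between an exclude list and its logical expression can be reproduced mechanically. The following Python sketch is an illustration only: the function names are hypothetical, and the parsing rule (the SMARTS pattern is the first whitespace-delimited token after the keyword smarts, everything after it is a comment) is an assumption derived from the example above.

```python
def parse_exclude_block(lines):
    """Collect SMARTS patterns from 'smarts <pattern> # comment' lines."""
    patterns = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[0] == "end":
            break
        # Take the first token after the keyword; '#' may occur inside
        # SMARTS (triple bond), so the line is not split at '#'.
        if len(tokens) >= 2 and tokens[0] == "smarts":
            patterns.append(tokens[1])
    return patterns


def excl_smarts_expression(patterns):
    """Build the NOT( ... or ... ) expression for a list of patterns."""
    terms = ["(smarts({})>0)".format(p) for p in patterns]
    return "NOT( " + " or ".join(terms) + ")"


block = [
    "smarts a[N;X2]=O # aromatic nitroso",
    "smarts CO[N;X2]=O # alkyl nitrite",
    "end",
]
expression = excl_smarts_expression(parse_exclude_block(block))
```

For the two patterns shown, the result is `NOT( (smarts(a[N;X2]=O)>0) or (smarts(CO[N;X2]=O)>0))`.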
After reading a file with filter macro definitions, you can use the macros with EVAL and SCAN:
LEADIT/LIGAND/FILTER> readmacro macro.dat
Current process size: 49616 kB
LEADIT/LIGAND/FILTER> open bionet.mol2
>> Molecule file bionet.mol2 with 33762 compounds loaded .
Current process size: 50936 kB
LEADIT/LIGAND/FILTER> scan "(%MASS() > 200) and %CARBONYL()"
>> Compound #17 (bionet_17) matches filter constraints.
>> Ligand bionet_17 (index 17) fulfills the required property:
(%MASS() > 200) and %CARBONYL() .
Current process size: 50972 kB
LEADIT/LIGAND/FILTER> get
>> Applying transformation levels:
...
>> Ligand bionet_17 loaded from file bionet.mol2.
LEADIT/LIGAND> eval "%MINM_SUB(200,C(=O)C)"
>> Ligand bionet_17 fulfills property %MINM_SUB(200,C(=O)C)
Listing all filter macro definitions (LISTMACRO)
Syntax: LISTMACRO
Description: Lists all currently loaded filter macro definitions.
8.6.2 Generating interaction spots
The FlexX-Scan approach [26] was especially designed to meet the requirements of high-
throughput structure-based virtual screening. Based on the incremental construction dock-
ing tool FlexX [22], a compact descriptor for representing favorable protein interaction spots
within the protein binding site has been developed. The descriptor is calculated using
special-purpose clustering techniques applied to the usual interaction points created by
FlexX. The algorithm automatically detects a small set of interaction spots in the binding
site for positioning ligand functional groups. The parameterizations of the base placement
and incremental construction algorithms have been adapted to the new interaction model.
8.6.2.1 Alternative FlexX-Scan docking approaches
The coarse-grained interaction model of FlexX-Scan requires a modification of the base
fragment placement algorithm of FlexX. For each base fragment, the algorithm searches for
possible placements by superimposing two or three of the base fragment functional groups
with interaction dots of complementary protein functional groups. A pair of ligand
functional groups can be superimposed with a pair of dots of protein functional groups if the
distance between the protein interaction dots does not deviate by more than 0.9 Å from the
distance between the ligand functional groups. In FlexX-Scan, protein interaction surfaces are
represented by far fewer interaction dots: the usual inter-dot distance on an interaction
surface is 2 - 3 Å in FlexX-Scan but only 1.2 - 1.5 Å in standard FlexX. For this reason, we
increased the distance tolerance from 0.9 Å to 1.2 Å.
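The pair-matching criterion amounts to a simple distance comparison. The Python sketch below illustrates it with made-up coordinates; the function name `pair_matches` is hypothetical, and only the two tolerance values are taken from the text.

```python
import math

def pair_matches(lig_a, lig_b, dot_a, dot_b, tolerance):
    """True if the ligand group distance and the protein dot distance
    differ by no more than `tolerance`."""
    return abs(math.dist(lig_a, lig_b) - math.dist(dot_a, dot_b)) <= tolerance

FLEXX_TOLERANCE = 0.9   # standard FlexX, value quoted in the text
SCAN_TOLERANCE = 1.2    # FlexX-Scan, value quoted in the text

# Made-up coordinates: ligand groups 4.0 apart, protein dots 5.0 apart.
ligand = ((0.0, 0.0, 0.0), (4.0, 0.0, 0.0))
protein = ((1.0, 1.0, 0.0), (6.0, 1.0, 0.0))

strict = pair_matches(*ligand, *protein, FLEXX_TOLERANCE)
relaxed = pair_matches(*ligand, *protein, SCAN_TOLERANCE)
```

With a deviation of 1.0, the pair is rejected under the standard tolerance but accepted under the relaxed FlexX-Scan tolerance, which is what allows the sparser spot model to still produce placements.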
We implemented and investigated three different parameterizations and docking schemes
for working with FlexX-Scan. The three modes are:
1. Standard complex build-up
2. Fast complex build-up
3. Hybrid complex build-up
In standard complex build-up mode, FlexX-Scan uses the coarse-grained interaction spot
model and performs the same incremental construction method as in standard FlexX. This
includes the base fragment placement phase and the complex build-up phase. The faster
speed of this scan mode compared with standard FlexX is the result of a much faster base
fragment placement.
The fast complex build-up mode works the same way but speeds up the complex build-up
phase by using fewer intermediate placement solutions.
The hybrid complex build-up mode builds on the fast complex build-up mode. If
the estimated best achievable docking score is better than a user-defined threshold
<MIN_ENORM_BASEFRAG>, a fast complex build-up phase is performed. The value
<MIN_ENORM_BASEFRAG> is highly target-specific and must be set by the user when
working in the hybrid complex build-up mode.
In order to keep the handling of FlexX-Scan as simple as possible, we implemented a central
switching command SCANMODE in the docking submenu which can be used at any time to
switch the scan mode. (Note that this command is only available if you have a FlexX-Screen
license.)
In addition there are a couple of commands that allow you to generate and visualize the
interaction spots manually. They can all be found in the SPOTS submenu of the RECEPTOR
menu. The FlexX-Scan specific parameters are defined in chemapar.dat. They are listed at
the end of this section.
Important note: If you want to change the FlexX-Scan specific parameters, you must do
so before the scan mode is selected.
8.6.2.2 Additional or modified docking commands (DOCKING submenu)
SCANMODE
Syntax: SCANMODE <mode>
Description: Activates the parameterization for FlexX-Scan or the standard
FlexX parameterization. <mode> may be
0 Standard FlexX parameterization, use of FlexX interaction dots.
1 FlexX-Scan parameterization for standard complex build-up, use of FlexX-Scan
interaction spots. The parameters <SOLUTIONS_PER_IT> and
<SOLUTIONS_PER_FRAG> are used for base fragment placement and the
complex build-up phase.
2 FlexX-Scan parameterization for fast complex build-up, use
of FlexX-Scan interaction spots. Instead of the parameters
<SOLUTIONS_PER_IT> and <SOLUTIONS_PER_FRAG>, the parameters
<SCAN_SOLUTIONS_PER_IT> and <SCAN_SOLUTIONS_PER_FRAG>
are used for base fragment placement and the complex build-up phase.
3 FlexX-Scan parameterization for hybrid complex build-up, use
of FlexX-Scan interaction spots. Instead of the parameters
<SOLUTIONS_PER_IT> and <SOLUTIONS_PER_FRAG>, the parameters
<SCAN_SOLUTIONS_PER_IT> and <SCAN_SOLUTIONS_PER_FRAG>
are used for base fragment placement and the complex build-up
phase. In this mode only base fragment solutions with an
estimated score below <MIN_ENORM_BASEFRAG>, or containing less than the
fraction <MIN_FRACTION_BASEFRAG> of the ligand interaction centers,
enter the complex build-up phase.
If the mode is 0, the standard receptor interaction dots are used for
base fragment placement and the complex build-up phase. Otherwise
the receptor interaction spots are used (see GENERATE in 8.6.2.3).
In order to generate the triangle hash table (see TRIHASH in Section
7.6.9), the parameter <SCAN_TRIANGLE_BUCKET_SIZE> is used instead
of <TRIANGLE_BUCKET_SIZE> in modes 1-3. Likewise, the parameters
<SCAN_TRIMATCH_D_EPS> and <SCAN_TRIMATCH_A_EPS> are
used during the triangle matching instead of <TRIMATCH_D_EPS> and
<TRIMATCH_A_EPS>.
Important notes: FlexX starts with mode 0. If the mode is not equal to 0 and the
receptor interaction spots have not been generated yet, they are generated when the
mode is selected. If the mode is changed from 0 to another mode or from another mode to 0,
an existing triangle hash table is deleted. In order to use the command
SCANMODE, you need a license for the SCREEN module!
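The mode-dependent choice of parameters and the hybrid-mode filter can be summarized as follows. This Python sketch is one reading of the description above, not FlexX code: the function names are hypothetical, the values for SOLUTIONS_PER_IT and SOLUTIONS_PER_FRAG are placeholders, and it is assumed that mode 0 uses the same two standard parameters as mode 1. The remaining values are the defaults listed in Section 8.6.2.4.

```python
def solution_limits(mode, params):
    """Return (solutions per iteration, solutions per fragmentation)."""
    if mode in (0, 1):
        # Assumption: mode 0 uses the same standard parameters as mode 1.
        return params["SOLUTIONS_PER_IT"], params["SOLUTIONS_PER_FRAG"]
    if mode in (2, 3):
        return (params["SCAN_SOLUTIONS_PER_IT"],
                params["SCAN_SOLUTIONS_PER_FRAG"])
    raise ValueError("unknown scan mode: {}".format(mode))


def enters_build_up(score, fraction, params):
    """Mode 3 filter: keep a base fragment solution if its estimated score
    is below the threshold, or if it covers only a small fraction of the
    ligand interaction centers."""
    return (score < params["MIN_ENORM_BASEFRAG"]
            or fraction < params["MIN_FRACTION_BASEFRAG"])


params = {
    "SOLUTIONS_PER_IT": 500,        # placeholder, not a documented default
    "SOLUTIONS_PER_FRAG": 100,      # placeholder, not a documented default
    "SCAN_SOLUTIONS_PER_IT": 200,   # default from Section 8.6.2.4
    "SCAN_SOLUTIONS_PER_FRAG": 75,  # default from Section 8.6.2.4
    "MIN_ENORM_BASEFRAG": -40.0,    # default from Section 8.6.2.4
    "MIN_FRACTION_BASEFRAG": 0.2,   # default from Section 8.6.2.4
}

fast_limits = solution_limits(2, params)
good_score = enters_build_up(-45.0, 0.5, params)
poor_score_large_fragment = enters_build_up(-10.0, 0.5, params)
poor_score_small_fragment = enters_build_up(-10.0, 0.1, params)
```

A poorly scoring base fragment is thus only discarded when it already accounts for a substantial part of the ligand's interaction centers.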
INFO
Syntax: INFO
Description: In table format, an additional line gives the base fragment score of
the docking run. In row format, there is a new option 5 for getting details on the
base fragment scoring results. The first columns are the same as with other options
(receptor name, ligand name, number of base placement solutions). This is followed
by the base fragment score as used for virtual screening experiments, the number
of fragmentations, and the number of fragmentations with at least one solution.
Then four columns with detailed information for the highest scoring solution of
each fragmentation, ordered by interaction energy, follow. Each column contains
the number of the fragmentation, the interaction energy, the normalized interaction
energy, and the fraction of ligand interaction centers of this base fragment.
8.6.2.3 Menus and commands (SPOTS)
The SPOTS menu, a submenu of the RECEPTOR menu, is only available if the SCREEN module
is activated with a valid license key.
Typing the submenu name brings you to the submenu, typing END returns you to the parent
menu. You can type commands and menu names in uppercase or lowercase letters.
Generating receptor interaction spots (GENERATE)
Syntax: GENERATE
Description: Calculates FlexX-Scan receptor interaction spots for the active site.
Requirements: The receptor must have been previously loaded.
Printing a table about the interaction spots (INFO)
Syntax: INFO
Description: Prints a table about the interaction spots. The columns of the table are:
Spot ID        The index of the interaction spot
Atom no        The number of the site atom to which the spot belongs
Atom name      The site atom name
Spot type      The type of interaction spot
Nof clusters   Number of clusters for this interaction spot
Nof dots       Number of dots for this interaction spot
Accessibility  Solvent exposure of the interaction spot (see
               <ACCESSIBILITY_PROBE_RADIUS> in Section 8.6.2.4)
Requirements: The interaction spots must have been calculated.
Example
Site interaction spots:
======================
Spot | Atom | Atom | Spot | Nof | Nof | Access- |
ID | nr | name | type | clusters | dots | ibility |
-----+------+------+-------+----------+------+---------+
0 | 34 | O | hacc | 1 | 12 | 0.13 |
1 | 44 | N | hdon | 1 | 69 | 0.11 |
2 | 47 | O | hacc | 3 | 63 | 0.17 |
3 | 160 | NE1 | hdon | 1 | 50 | 0.32 |
4 | 185 | O | hacc | 1 | 18 | 0.46 |
5 | 200 | OD1 | hacc | 1 | 21 | 0.08 |
6 | 201 | OD2 | hacc | 2 | 31 | 0.17 |
7 | 205 | O | hacc | 1 | 16 | 0.22 |
8 | 213 | O | hacc | 1 | 13 | 0.35 |
9 | 248 | NZ | hdon | 1 | 54 | 0.75 |
10 | 248 | NZ | hdon | 1 | 54 | 0.53 |
11 | 248 | NZ | hdon | 1 | 54 | 0.79 |
12 | 348 | O | hacc | 1 | 12 | 0.27 |
13 | 360 | OG1 | hacc | 3 | 71 | 0.19 |
14 | 388 | O | hacc | 5 | 160 | 0.57 |
15 | 390 | OG | hacc | 3 | 76 | 0.37 |
16 | 390 | OG | hdon | 1 | 91 | 0.25 |
17 | 394 | O | hacc | 4 | 117 | 0.55 |
18 | 412 | NH1 | hdon | 1 | 103 | 0.70 |
19 | 412 | NH1 | hdon | 1 | 73 | 0.58 |
20 | 413 | NH2 | hdon | 1 | 103 | 0.68 |
21 | 413 | NH2 | hdon | 1 | 91 | 0.44 |
22 | 424 | O | hacc | 1 | 22 | 0.33 |
23 | 449 | NH1 | hdon | 1 | 51 | 0.15 |
24 | 450 | NH2 | hdon | 1 | 70 | 0.23 |
25 | 728 | O | hacc | 3 | 102 | 0.20 |
26 | 774 | OH | hdon | 1 | 89 | 0.15 |
27 | 1317 | O | hacc | 3 | 74 | 0.29 |
28 | 1317 | O | hdon | 1 | 39 | 0.33 |
29 | 1319 | O | hacc | 2 | 19 | 0.06 |
30 | -- | -- | hphob | 1 | 18 | 0.51 |
31 | -- | -- | hphob | 1 | 16 | 0.59 |
32 | -- | -- | hphob | 1 | 18 | 0.19 |
33 | -- | -- | hphob | 1 | 16 | 0.59 |
34 | -- | -- | hphob | 1 | 40 | 0.52 |
35 | -- | -- | hphob | 1 | 18 | 0.20 |
36 | -- | -- | hphob | 1 | 16 | 0.44 |
37 | -- | -- | hphob | 1 | 14 | 0.71 |
38 | -- | -- | hphob | 1 | 10 | 0.79 |
39 | -- | -- | hphob | 1 | 14 | 0.16 |
40 | -- | -- | hphob | 1 | 34 | 0.29 |
...
Number of interaction spots of site 4dfr:
H-Donor: 14 spots
H-Acceptor: 35 spots
Metal: 0 spots
Hydrophobic: 50 spots
Setting administration defaults for drawing ia-spots (SELADM)
Syntax: SELADM <graphics object number> [<start fifo object>] [<end fifo
object>] <temp file> <append>
Description: With SELADM you can specify the graphics object numbers used for
drawing interaction spots and determine whether the graphics files are
internal temporary files used only by FlexX or are saved for further use. For yes/no
questions you can enter y, yes or 1 for yes, and n, no or 0 for no.
<graphics object number> Enter an integer:
1-255 The graphics created with the DRAW command will be displayed in
graphics object <graphics object number>.
0 fifo mode: the graphics generated by subsequent DRAW commands will be
sent to a range of graphics objects in a first-drawn-first-overwritten manner.
You will be asked to enter two more parameters:
<start fifo object> The start graphics object for the fifo range.
<end fifo object> The end graphics object for the fifo range.
<temp file> Yes/no answer:
yes The graphics are written to temporary files and removed after quitting
FlexX.
no The graphics are written to permanent files chosen by the user. You will be
asked for a filename at the end of each DRAW command (see DRAW below
for an example).
<append> Yes/no answer:
yes Previous graphics files are not overwritten. Instead the current graphics
are appended to the previous ones in the graphics file.
no The previous graphics file will be overwritten and all previous graphics
made with the DRAW command in this menu will be lost.
Setting default values for drawing ia-spots (SELGRA)
Syntax: SELGRA <all ia spot types> [<spot type selection>] <dots> <spots>
Description: With SELGRA you can set specific default values for interaction spot
visualization. You can define which types of interaction spots are visualized, and
whether interaction spots and/or dots are drawn. For yes/no questions you can
enter y, yes or 1 for yes, and n, no or 0 for no.
<all ia spot types> Yes/no answer:
yes Spots or dots are drawn for all interaction spot types.
no Spots or dots are drawn for a selection of interaction spot types. You will
be asked to select types from a given list:
<spot type selection> Choose a list of types represented by integers. Enter
the list as separate integers or as integer ranges (format a-b), separated
by commas or blanks. Note that you need to enclose the expression in
quotation marks if it contains blanks, e.g. "0, 2".
<dots> Yes/no answer:
yes Draw the dots of the interaction spot.
no Do not draw dots.
<spots> Yes/no answer:
yes Draw the interaction spots.
no Do not draw spots.
Selecting the coloring mode for drawing interaction spots (SELCOL)
Syntax: SELCOL <hphob spot scaling> <hphil spot radius> <hdonor spots>
<haccept spots> <metal spots> <hphob spots>
Description: With SELCOL you can set the color modes and the size of sphere for
interaction spot drawing.
<hphob spot scaling> Scaling the spheres of hydrophobic spots. The radius of
spheres representing hydrophobic interaction spots corresponds to the num-
ber of hydrophobic interaction dots clustered together at this spot. This factor
times the number of clustered interaction dots gives the radius for the sphere.
<hphil spot radius> Radius of spheres of hydrophilic interaction spots.
The possible color modes are explained below.
<hdonor spots> Choose the color mode for drawing spots or dots of hydrogen
donors.
<haccept spots> Choose the color mode for drawing spots or dots of hydrogen
acceptors.
<metal spots> Choose the color mode for drawing metal spots or dots.
<hphob spots> Choose the color mode for drawing hydrophobic spots or dots.
Enter your chosen color as either an angle from the color circle (0 - 360 degrees: 0
is invisible, 1 - 360 runs from dark blue, through red, yellow, green to blue), a color
name (as defined in the GRAPHIC static data file), or an RGB(A) value, i.e. 3 (4)
floating-point numbers separated by blanks or slashes:
Example
selcol .... "dark green" ....
selcol .... green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
(For a full explanation of how to define colors see Sec. 11.22.1.)
Drawing the interaction spots (DRAW)
Syntax: DRAW [<filename>]
Description: DRAW generates a drawing of the interaction spots and sends it to a
file ready to be displayed on the graphics interface. For details about what exactly
is drawn see the SELGRA command.
[<filename>] If the graphics are not to be stored in a temporary file (see SELADM),
enter the filename for storing the graphics here.
Requirements: The interaction spots must have been calculated.
Important notes: Drawings are not displayed automatically. Use DISPLAY to out-
put the drawing on the graphics device.
8.6.2.4 FlexX-Scan specific parameters (configuration)
Important note: If you want to change the FlexX-Scan specific parameters, you must do
so before the scan mode is selected (see Section 8.6.2.2).
Generating interaction spots
Name: <MAX_HPHOB_IA_DOT_DIST> (floating point)
Description: FlexX-Scan identifies hydrophobic spots by first detecting intersections
of spheres around different hydrophobic receptor groups. Spheres are sampled
as sets of interaction dots. If two dots of two neighboring spheres are closer
than MAX_HPHOB_IA_DOT_DIST, they represent a part of the sphere intersection.
The centroid of the dot pair is retained for later clustering.
Default value: 1.2
Reasonable range: 0.0 - 10.0
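The detection of sphere intersections from sampled dots can be sketched in a few lines. The Python example below is a minimal illustration with made-up dot coordinates; the function name `close_pair_centroids` is hypothetical, and the brute-force pairwise loop is chosen for clarity rather than speed.

```python
import math

def close_pair_centroids(dots_a, dots_b, max_dist):
    """Centroids of all dot pairs (one dot from each sphere) that are
    closer together than `max_dist`."""
    centroids = []
    for a in dots_a:
        for b in dots_b:
            if math.dist(a, b) < max_dist:
                centroids.append(tuple((x + y) / 2 for x, y in zip(a, b)))
    return centroids

# Made-up dots sampled on the spheres of two neighboring receptor groups.
sphere_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
sphere_b = [(2.0, 0.0, 0.0), (5.0, 0.0, 0.0)]

# 1.2 is the default value of MAX_HPHOB_IA_DOT_DIST given above.
centroids = close_pair_centroids(sphere_a, sphere_b, max_dist=1.2)
```

Only the dots at x = 1.0 and x = 2.0 are close enough, so a single centroid at x = 1.5 is kept for the later clustering step.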
Name: <MIN_HPHOB_HPHIL_DOT_DIST> (floating point)
Description: If this distance threshold is set to a value greater than zero, all
hydrophobic spots in close proximity to hydrophilic spots will be discarded.
Default value: 0.0 (deactivated)
Reasonable range: 0.0 - 10.0
Name: <HPHOB_SPOT_DIST> (floating point)
Description: After detecting intersections of spheres around hydrophobic receptor
groups, the resulting dot pair centroids (cf. variable MAX_HPHOB_IA_DOT_DIST)
will be clustered by means of a complete linkage clustering algorithm. This parameter
represents the minimum inter-cluster distance to be reached. It corresponds to
the desired minimum distance of hydrophobic spots.
Default value: 3.0
Reasonable range: 0.0 - 10.0
Name: <HPHOB_SPOT_MIN_NOF_DOTS> (integer)
Description: If a cluster of dot pair centroids (cf. variable
MAX_HPHOB_IA_DOT_DIST) contains fewer dot pair centroids than specified
here, it will be discarded.
Default value: 8
Reasonable range: 1 - 100
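The clustering and the subsequent size filter can be illustrated with a naive agglomerative procedure. The Python sketch below is one possible reading of the description and not the FlexX implementation: clusters are merged as long as the largest pairwise distance between two clusters stays below the distance threshold, and small clusters are dropped afterwards. The function names and the example points are hypothetical.

```python
import math
from itertools import combinations

def complete_linkage(points, min_cluster_dist):
    """Merge clusters until every pair of clusters is at least
    `min_cluster_dist` apart in the complete-linkage sense."""
    clusters = [[p] for p in points]

    def linkage(c1, c2):
        # Complete linkage: the largest pairwise distance between clusters.
        return max(math.dist(a, b) for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) >= min_cluster_dist:
            break
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


def drop_small_clusters(clusters, min_nof_dots):
    """Discard clusters with fewer members than `min_nof_dots`."""
    return [c for c in clusters if len(c) >= min_nof_dots]


# Made-up dot pair centroids lying on a line.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (30.0, 0.0, 0.0)]

clusters = complete_linkage(points, min_cluster_dist=3.0)  # HPHOB_SPOT_DIST
# A small minimum is used here; the documented default is 8.
kept = drop_small_clusters(clusters, min_nof_dots=2)
```

The six points collapse into three clusters, and the isolated point is removed by the size filter.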
Name: <HPHIL_SPOT_MIN_NOF_DOTS> (integer)
Description: If a hydrophilic spot (donor or acceptor) represents fewer interaction
dots than specified here, it will be discarded.
Default value: 3
Reasonable range: 1 - 100
Name: <SAREA_SPOT_MAX_RADIUS> (floating point)
Description: Interaction dots of spherical rectangular hydrogen acceptor groups
are divided into between zero and usually three clusters, depending on how many
clusters are necessary for each cluster to have a radius lower than
specified by this variable.
Default value: 0.7
Reasonable range: 0.0 - 10.0
Name: <HACC_CENTER_SPOT_MAX_RADIUS> (floating point)
Description: All interaction dots deviating less from the main interaction direction
of a hydrogen acceptor cone geometry than specified here are grouped together and
represent the center spot of this hydrogen acceptor group.
Default value: 1.2
Reasonable range: 0.0 - 10.0
Name: <HACC_OUTER_SPOT_MAX_RADIUS> (floating point)
Description: After grouping all central interaction dots of a hydrogen acceptor
with cone geometry, the remaining dots are clustered into between zero and usually
four sets. The radius of each cluster must not exceed the value specified here.
Default value: 1.6
Reasonable range: 0.0 - 10.0
Name: <METAL_SPOT_MIN_DIAMETER> (floating point)
Description: The interaction dots of interaction spheres of receptor metals are
clustered by means of a complete linkage clustering algorithm. This variable represents
the minimum distance of the resulting dot clusters.
Default value: 2.8
Reasonable range: 0.0 - 10.0
Name: <ACCESSIBILITY_PROBE_RADIUS> (floating point)
Description: A probe with the specified radius is placed on each interaction dot
and spot for assessing its solvent exposure. The solvent exposure corresponds to
the solvent accessible fraction of the probe sphere surface.
Default value: 6.0
Reasonable range: 0.0 - 10.0
Name: <HPHOB_MAX_ACCESSIBILITY> (floating point)
Description: All hydrophobic interaction dots and spots with a solvent
exposure exceeding this threshold will be discarded.
Default value: 1.0 (deactivated)
Reasonable range: 0.0 - 10.0
Name: <HPHIL_METAL_MAX_ACCESSIBILITY> (floating point)
Description: All hydrophilic and metal interaction dots and spots with a solvent
exposure exceeding this threshold will be discarded.
Default value: 1.0 (deactivated)
Reasonable range: 0.0 - 10.0
Docking
Name: <SUFF_NOF_SOLUTIONS> (integer)
Description: Minimum number of base fragment placement solutions when using
the triangle matching algorithm. If there are fewer solutions, FlexX automatically
performs the line matching algorithm.
Default value: 8
Reasonable range: 1 - 300
Name: <SCAN_SUFF_NOF_SOLUTIONS> (integer)
Description: Minimum number of base fragment placement solutions when using
the triangle matching algorithm with FlexX-Scan. If there are fewer solutions,
FlexX-Scan automatically performs the line matching algorithm (scan modes 1 through 3,
see Section 8.6.2.2).
Default value: 200
Reasonable range: 1 - 500
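The role of these two thresholds is that of a fallback trigger. The following Python sketch illustrates the control flow only: the two matching algorithms are replaced by stub functions, the name `place_base_fragment` is hypothetical, and the sketch assumes that line matching results are added to the triangle matching results, which the text does not state.

```python
def place_base_fragment(triangle_match, line_match, suff_nof_solutions):
    """Run triangle matching; fall back to line matching if it yields
    fewer than `suff_nof_solutions` placements."""
    solutions = list(triangle_match())
    used_line_matching = len(solutions) < suff_nof_solutions
    if used_line_matching:
        # Assumption: the line matching results extend the existing list.
        solutions += list(line_match())
    return solutions, used_line_matching


# Stubs standing in for the real matching algorithms.
def triangle_stub():
    return ["t1", "t2"]

def line_stub():
    return ["l1", "l2", "l3"]

# 8 is the default of SUFF_NOF_SOLUTIONS given above.
solutions, fell_back = place_base_fragment(triangle_stub, line_stub, 8)
```

With only two triangle matching placements and a threshold of 8, the fallback is triggered.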
Name: <SCAN_TRIANGLE_BUCKET_SIZE> (floating point)
Description: FlexX-Scan uses a larger binning scheme for the triangle hash table
(scan modes 1 through 3, see Section 8.6.2.2).
Default value: 1.2
Reasonable range: 0.0 - 3.0
Name: <SCAN_TRIMATCH_D_EPS> (floating point)
Description: FlexX-Scan allows larger distance deviations from the ideal
interaction geometries.
Default value: 1.0
Reasonable range: 0.0 - 3.0
Name: <SCAN_TRIMATCH_A_EPS> (floating point)
Description: FlexX-Scan allows larger angle deviations from the ideal interaction
geometries.
Default value: 0.2 rad
Reasonable range: 0.0 - 0.5 rad
Name: <SCAN_SOLUTIONS_PER_IT> (integer)
Description: Number of overall best intermediate solutions retained during fast
complex build-up with FlexX-Scan (scan modes 2 and 3, see command SCANMODE in
Section 8.6.2.2).
Default value: 200
Reasonable range: 1 - 1000
Name: <SCAN_SOLUTIONS_PER_FRAG> (integer)
Description: Number of fragmentation-specific best intermediate solutions
retained during fast complex build-up with FlexX-Scan (scan modes 2 and 3, see
Section 8.6.2.2).
Default value: 75
Reasonable range: 1 - 1000
Name: <MIN_ENORM_BASEFRAG> (floating point)
Description: Threshold value for estimated binding energy after base fragment
placement. This value is highly target-specific and must be set by the user when
working in scan mode 3 (see Section 8.6.2.2).
Default value: -40.0 (on scale of used scoring function)
Reasonable range:
Name: <MIN_FRACTION_BASEFRAG> (floating point)
Description: Base fragment solutions with an insufficient estimated score are still
retained if the base fragment contains a very low fraction of the compound
interaction centers. This threshold specifies the required minimal fraction of compound
interaction centers for a badly scoring base fragment to be discarded.
Default value: 0.2
Reasonable range: 0.0 - 1.0
8.6.3 Placebase caching
Depending on the library of molecules, more than half of the complete docking time is used
for the initial placement of the base fragments. On the other hand, it is quite likely that
base fragments occur multiple times in a large library of molecules. So, the basic idea of
the placebase caching algorithm (PBC) is to store the poses of all placed base fragments and
reuse them if the same base fragments recur during screening.
PBC uses a client-server architecture for caching the placements. The server and the clients
use the Transmission Control Protocol (TCP) for the communication. One or more FlexX
instance(s) serve as server and store all base fragment poses. Before a docking client starts to
place a new base fragment it asks the server if there is already a list of poses available. Only
if this is not the case does the docking client compute the base fragment poses and pass them
to the server before building up the whole complex.
The PBC module distinguishes the base fragments by generating their USMILES taking into
account the chirality of the molecules. All comparisons are therefore performed using the
corresponding USMILES descriptor.
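The lookup-then-compute flow described above can be sketched as follows. This is only an illustration of the caching idea, not the PBC implementation: the class `PoseCache`, the function `place_base_fragment`, and the `Pose` tuple are hypothetical names, and an in-memory dictionary stands in for the TCP server.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical placeholder for a pose record; the real pose format is not shown here.
Pose = Tuple[float, float, float]


class PoseCache:
    """In-memory stand-in for a PBC server: maps a USMILES key to stored poses."""

    def __init__(self) -> None:
        self._poses: Dict[str, List[Pose]] = {}

    def lookup(self, usmiles: str) -> Optional[List[Pose]]:
        # None means "no list of poses available yet" for this base fragment.
        return self._poses.get(usmiles)

    def store(self, usmiles: str, poses: List[Pose]) -> None:
        self._poses[usmiles] = list(poses)


def place_base_fragment(
    cache: PoseCache,
    usmiles: str,
    compute_poses: Callable[[str], List[Pose]],
) -> List[Pose]:
    """Ask the cache first; compute and hand over the poses only on a miss."""
    cached = cache.lookup(usmiles)
    if cached is not None:
        return cached
    poses = compute_poses(usmiles)
    cache.store(usmiles, poses)
    return poses
```

Because the key is the USMILES string, two library molecules that share a base fragment trigger only one placement computation in this sketch.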
Note that the PBC module is only available if you have a FlexX-Screen license.
8.6.3.1 Configuring PBC
At present, this is only available for Linux.
First, you must define a server (or a list of servers) in the configuration (cp. Sec. 10.1)
using File -> Global Preferences -> UltraHTS Configuration or the keyword
@PBC_SERVER in Parameters & Flags (see Section 10.1.6).
Each line contains the definition of one server. It consists of the host name (or IP address)
where the server will be executed, and the port which will be used for client-server
communication. Ports in the range from 3000 to 65000 may be used.
Example
# -------------------------------------------------------------------
@PBC_SERVER
# -------------------------------------------------------------------
# DESCRIPTION:
# This section is for PBC module only. It describes where the pbc servers
# are executed. After each host name, the port which will be used for
# communication follows.
# -------------------------------------------------------------------
alpha 10000 # internal index 0
beta 10000 # internal index 1
gamma 10000 # internal index 2
Please note that you must not change the order of the servers in your configuration if you
want to reuse the cached base placements for another docking run, as otherwise FlexX will
not be able to recover the cached base placements. For details of why this is so, please see
below. The servers are internally numbered serially according to their order in your
configuration, starting with zero. This number is referred to as the server index below.
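To make the numbering rule concrete, here is a small sketch that reads a @PBC_SERVER block and assigns the server index by position. The parser is hypothetical (`parse_pbc_servers` and `PbcServer` are illustrative names); it assumes that `#` starts a comment and that any line beginning with `@` opens a new section, which matches the example above but is not taken from the FlexX sources.

```python
from typing import List, NamedTuple


class PbcServer(NamedTuple):
    index: int  # internal server index, assigned by position in the list
    host: str
    port: int


def parse_pbc_servers(text: str) -> List[PbcServer]:
    """Collect host/port pairs from the @PBC_SERVER section, in file order."""
    servers: List[PbcServer] = []
    in_section = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (assumed syntax)
        if not line:
            continue
        if line.startswith("@"):
            in_section = line == "@PBC_SERVER"
            continue
        if in_section:
            host, port = line.split()
            servers.append(PbcServer(len(servers), host, int(port)))
    return servers
```

Swapping two lines in the section changes the indices in this sketch, which illustrates why the order must stay fixed when cached placements are to be reused.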
In addition, FlexX needs a remote login shell in order to be able to start the server on
another machine. This must also be defined in your configuration (keyword SERVER_RSH, see
Sections 10.1.3 and 10.1). For example, we at BioSolveIT use the secure shell (ssh).
Finally, you must define a directory where FlexX should put the data files containing the
cached base fragments. The path is specified in the CACHING entry in your configuration.
Please note that these output files may become very large and that the server needs fast
access to the files. We therefore recommend using a local path on the server machine for
CACHING. Different servers do not share these files!
After configuring FlexX like this, you can use the placebase caching algorithm (PBC) very
easily: you only need to set the flag PLACEBAS_CACHING to 1 in your configuration, start
the PBC server (see Section 8.6.3.2), and use FlexX as before. Everything else will be done
automatically. You do not need to change your batch scripts.
Please note the following:
1. The flag PLACEBAS_CACHING can only be set to 1 if a FlexX-Screen license is available.
2. For consistency reasons the flag PLACEBAS_CACHING is constant, i.e., it cannot be
changed interactively at the command line or in a script. You must modify your
configuration and restart FlexX.
3. You cannot use (read/write) placement description files (see Section 7.8.7) in PBC mode
(PLACEBAS_CACHING = 1).
The database with the base fragment poses is dumped to the file system and you can reuse
it for another docking into the same target.
The PBC server creates database files for storing the base fragment poses in the CACHING
directory (see Section 10.1.1). The names of these data files have the following structure:
CACHEDB.<hostname>_<port>_<internal_server_index>.<no>
A single database file has a maximum size of 1.1 GB. If this size limit is exceeded, another
database file is automatically created and <no> is increased by one. If a PBC server is shut
down, it dumps the hash table with accession indexes into an ASCII file. The name of this
index file has the following structure:
hashtable_<hostname>_<port>_<internal_server_index>.idx
If a PBC server is started, it searches the directory CACHING for these hash index files
and loads them, if they are present, into its main memory. This allows the dumped base
fragment poses to be reused.
Please note the following: Again for consistency reasons, FlexX is very strict when reloading
dumped base fragment poses. The files must be loaded with the same servers in the same
order, so do not change the servers in your configuration if you want to resume a docking.
So what should you do if one of the former hosts is no longer available? Imagine, for
example, that the host beta is not reachable, but you want to reuse the data that was dumped
by the PBC server on beta on another machine delta. In this case you must replace the
server beta by delta in your configuration like this (note that we changed the name AND
port for demonstration purposes!):
Example
# --------------------------------------------------------------------
@PBC_SERVER # this in the Ultra HTS Configuration tab!
# --------------------------------------------------------------------
alpha 10000 # internal index 0
delta 20000 # internal index 1
gamma 10000 # internal index 2
Then you must copy (or rename) the hash index file in the CACHING directory from
hashtable_beta_10000_1.idx to hashtable_delta_20000_1.idx. (If you use a
local path for CACHING, you must copy the db files and the hash index file from beta to
delta.) Now the server on delta can read the data that originally stemmed from beta.
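The renaming step follows directly from the naming scheme of the hash index file. As a hedged sketch (the helper names `hash_index_name` and `index_rename` are made up for illustration; only the file name pattern comes from the text), the old and new names can be derived like this:

```python
from typing import Tuple


def hash_index_name(host: str, port: int, index: int) -> str:
    """Build the hash index file name: hashtable_<hostname>_<port>_<index>.idx"""
    return f"hashtable_{host}_{port}_{index}.idx"


def index_rename(
    old: Tuple[str, int], new: Tuple[str, int], index: int
) -> Tuple[str, str]:
    """Return (old_name, new_name) for a server whose host/port changed.

    The internal server index stays the same because the position in the
    @PBC_SERVER list is unchanged.
    """
    return hash_index_name(*old, index), hash_index_name(*new, index)


# Host beta (port 10000) is replaced by delta (port 20000) at index 1.
old_name, new_name = index_rename(("beta", 10000), ("delta", 20000), 1)
print(old_name, "->", new_name)
```

Only the host and port parts of the name change; the index is kept, which is why the replacement server has to take the same position in the list.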
If you do not want to reuse the base fragment poses or if the target has changed, you must
delete the database files and the hash index file in the CACHING directory.
Note that you cannot reuse base fragment poses for a different target, because the position
in space will be wrong! If FlexX does not find any matches to the protein, it will skip the
pose and place the base fragment directly. Thus, a wrong target will not usually produce
incorrect results, but the speed gain will be lost.
For special situations there are also some additional menu commands; these can all be found
in the PBC submenu.
8.6.3.2 Starting a PBC server
There are three ways to start a PBC server:
1. If you execute the command PBC/SERVER, the current FlexX instance will become a PBC
server. You must specify the port. The default is 3768. In order to use this server,
you must add the host name and the port to the @PBC_SERVER section of the
configuration used by the docking clients. This server uses the internal index 0. It must
therefore be the first entry in the list.
2. With the command PBC/STARTSERV you can manually start the servers specified in the
@PBC_SERVER section of your configuration.
3. If the flag PLACEBAS_CACHING is set to 1 and you are running FlexX in PVM mode (i.e.
PVM is running), then FlexX will start all servers from the @PBC_SERVER entries in
your configuration automatically at the beginning of the PVM run. FlexX will wait
until all servers return the status ready. If not all servers return the status ready
before a given timeout, the PVM run will be aborted; otherwise the PVM run will start.
The timeout can be set with PBC_WAIT_ITER. After finishing, the PVM run will
also automatically shut down the servers. This of course requires correct configuration
of the PBC module (see Section 8.6.3.1).
If a PBC server is started, it searches the directory CACHING (see Sections 10.1.1 and
8.6.3.1) for its respective hash index file and loads this file into its working space. If you
start a server with the PBC/STARTSERV command or if it is started automatically in PVM
mode, the PBC server creates a log file in the TEMP directory (see Section 10.1.1). The name
of the log file is generated as follows:
out_caching_server_<hostname>_port_<port>_ind_<internal_server_index>_<date>_<time>_<process-id>.log
You can control the amount of output in the log file with the verbosity flag in the
Parameters & Flags tab of the configuration dialog. Log files that are older than 7 days
are automatically removed when a new server is started.
8.6.3.3 Shutting down a PBC server
Servers that were started with the commands SERVER or STARTSERV must be
stopped manually with the command SERVCOM and the parameters <internal index>
and @ShutDownServer. You can shut down all servers defined in the @PBC_SERVER
section at once with this command:
LEADIT/PBC> SERVCOM all @ShutDownServer
8.6.3.4 Menus and commands (PBC)
The PBC menu is only available if the SCREEN module is activated with a valid license key.
Typing the menu name brings you to the menu, typing END returns you to the parent menu.
You can type commands and menu names in uppercase or lowercase letters.
Using FlexX as a PBC server (SERVER)
Syntax: SERVER <id>
Description: This command enables the current FlexX instance to act as a server.
The server will communicate with clients one by one. It will save the poses of placed
base fragments and return those if requested. It also keeps track of the consistency
of the database by ensuring that only the server has permission to save and read
from it. You must specify the port <id>. The default value is 3768.
Requirements:
Important notes: Clients require the host name or the IP address of the server.
While the server is starting, several addresses are printed, including the name and the IP
address of the server. Make sure you take the name given in this output and enter it,
together with the port <id>, as the corresponding @PBC_SERVER entry of
your configuration (see Section 10.1.6).
Sending a command to one or more PBC servers (SERVCOM)
Syntax: SERVCOM [<id>] [<hostname> <port>] <COMMAND> [<indexfile>]
Description: SERVCOM sends a command to a (set of) server(s). If servers are
defined in the @PBC_SERVER configuration, you can specify the server with <id>.
If <id> is set to all, <COMMAND> is sent to all servers. Otherwise, if <id> is set
to 0 or no servers are defined in @PBC_SERVER, you have to specify <hostname>
and <port>.
The following <COMMAND>s are available:
@ShutDownServer Shut down a server
@ServerReadHT Load a hash table from <indexfile>
@ServerSaveHT Save the current hash table to <indexfile>
Requirements: None.
Important notes: The @ShutDownServer argument is the only way to shut the
server down correctly.
Starting all PBC servers from the list (STARTSERV)
Syntax: STARTSERV
Description: The command STARTSERV starts all servers listed in the
@PBC_SERVER configuration (see Sections 10.1 and 10.1.6).
Requirements: A list with servers must be defined in the @PBC_SERVER
configuration.
Checking all PBC servers from the list (PING)
Syntax: PING
Description: The command PING checks if all servers listed in the @PBC_SERVER
configuration (see Section 10.1.6) are running.
Requirements: A list with servers must be defined in the @PBC_SERVER
configuration.
8.6.3.5 PBC-specific parameters (configuration)
Name: <LEN_TCP_LISTEN_QUEUE> (integer)
Description: <LEN_TCP_LISTEN_QUEUE> is defined as the maximum value
for the sum of the two connection queues of the TCP protocol.
Default value: 100
Reasonable range: 1 - 100
Name: <PBC_WAIT_ITER> (integer)
Description: <PBC_WAIT_ITER> is the maximum number of steps FlexX will
wait before timing out (see Section 8.6.3.2). In each step FlexX waits two seconds.
So if <PBC_WAIT_ITER> is set to 150, FlexX will wait overall up to 5 minutes to
verify that the PBC servers are running.
Default value: 300
Reasonable range: 100 - 1000
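The timeout arithmetic for PBC_WAIT_ITER is simple enough to check by hand. The snippet below only restates the rule of two seconds per step; the function name `pbc_wait_seconds` is illustrative and not part of FlexX.

```python
SECONDS_PER_STEP = 2  # FlexX waits two seconds in each step (see description above)


def pbc_wait_seconds(wait_iter: int) -> int:
    """Maximum time in seconds FlexX waits for the PBC servers to become ready."""
    return wait_iter * SECONDS_PER_STEP


# 150 steps correspond to 300 s (5 minutes); the default of 300 steps to 600 s (10 minutes).
for steps in (150, 300):
    print(steps, "steps ->", pbc_wait_seconds(steps) / 60, "minutes")
```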
8.6.4 Compatibility with other modules
FlexX-Screen is compatible with the other FlexX modules such as FlexE, FlexX^c and
FlexX-Pharm.
8.7 PPI - The Pair Potential Interface
The Pair Potential Interface is a generic interface enabling users to plug in any scoring func-
tion which is built up by scoring pairwise interactions. A prominent example of this type of
scoring is the DrugScore family of potentials.
The PPI does not require an additional license. Depending on the scoring function you
would like to use, a separate, third-party license may be necessary; e.g., if you would like to
use DrugScore^CSD, a CSD license is necessary. You can tell whether the PPI is compiled
into your executable by checking whether FlexX prints [PPI] on the start-up screen in your
console.
To use the PPI, you will need four files, which are described in the following sections.
Note: The definitions in ppi_rec_templ_types.dat currently refer only to standard
amino acid residues. For hetero groups and non-standard residues the definitions in
ppi_rec_substr_types.dat are used. If the protein is read from a mol2 file generated
by FlexX, the PPI types defined in the @<TRIPOS>ALT_TYPE section of the mol2 file
are used. Otherwise no PPI types are used!
8.7.1 File 1: PPI atom types for the receptor
Make an entry PPI_REC_TEMPL_TYPES in your config file pointing to the dat file, e.g.:
PPI_REC_TEMPL_TYPES static_data/amino_ppi.dat
The PPI types will be assigned to the respective atoms using templates. Therefore, every
PPI_REC_TEMPL_TYPES file needs a list of templates (see amino.dat for up-to-date
template definitions). For every template, the non-H atoms are listed and assigned to a
PPI type. Please note that only PPI types which are also defined in the
PPI_POTENTIAL file (cp. below) may be used here.
Here is an example using the (Sybyl) atom types as PPI types:
@template alan+
_n N.am
_ca C.3
_c C.2
_o O.2
_cb C.3
For water particles you have to set the template
@particle water O.1
To make setting up this file easier, the menu DATABASE has a command AMINO4PPI
(cp. Section 8.7.5). This command generates a file listing the templates currently
loaded in FlexX together with their non-H atoms. As an option, you can select whether FlexX
should write the element type, the (Sybyl) atom type, the atom name, or simply nothing as
the second entry in the list.
8.7.2 File 2: PPI types for small molecules, PPI_LIG_TYPES
This file is specifically for the ligands you would like to dock. Make an entry
PPI_LIG_TYPES in your config file pointing to the dat file, e.g.
PPI_LIG_TYPES static_data/ligand_ppi.dat
This file contains the PPI atom types for the ligands. The PPI types will now be assigned to
the molecules using subgraphs (in contrast to the pre-defined templates above). Therefore,
the PPI_LIG_TYPES file contains a list of subgraphs (cp. Sec. 11.10.2). As above, only PPI
types which are defined in the file specified by PPI_POTENTIAL (cp. below) may be
employed. Here are example entries, again using Sybyl atom types as PPI type names:
@subgraph 0 10 N.3
atom 1 N.3
data
N.3
end
or
@subgraph 0 40 amidino_arginine_nitrogen
atom 1 N.*
atom 2 C.cat
bond 1 2 un
data
N.pl3
end
8.7.3 File 3: PPI types for hetero groups and non-standard-residues,
PPI_REC_SUBSTR_TYPES
This file is specifically for the hetero groups and non-standard residues without templates in
amino.dat. Make an entry PPI_REC_SUBSTR_TYPES in your config file pointing to the
dat file, e.g.
PPI_REC_SUBSTR_TYPES static_data/hetero_ppi.dat
The file has the same structure as the PPI_LIG_TYPES file.
8.7.4 File 4: The PPI potential file itself
This file is specified in your config file using an entry PPI_POTENTIAL, e.g.:
PPI_POTENTIAL static_data/pot_ppi.dat
The PPI potential file may have four sections:
Section No.1 ("@ppi_potential_name") contains the name of the potential. The
name will be used as an identifier and as a consistency check for the PPI types in a MOL2
receptor file:
# Name of the potential will be used as identifier and consistency
# check for the PPI types in a MOL2 receptor file.
@ppi_potential_name <insert_your_potential_name_here>
Section No.2 ("@ppi_atom_types") contains a list of all valid PPI types. In
amino_ppi.dat, hetero_ppi.dat or ligand_ppi.dat, respectively, only types
defined here may be used:
@ppi_atom_types
C.3 C.2 C.1 C.ar C.cat
...
Section No.3 ("@ppi_pair_potentials") contains the pair potentials to be employed
in PPI scoring. They are defined using two types and associated values (cp.
below).
The three values following the keyword @ppi_pair_potentials are:
minimum value, maximum value, and bin width. The number of sampling points
for which values are expected is (max - min) / width.
The following lines contain, besides the ligand and receptor PPI types, a value for every
sampling point. The very first value specifies the energy attributed to distances
between min and min + bin width. Here is an example:
# PPI pair potentials: The first three parameters are
# minimum value, maximum value, and bin width.
# Therefore, (max - min)/width values are expected,
# the first one is for distances between min and
# min+width. All potential values must be written
# in a single line preceded by the two ppi atom types.
# If there is no pair potential for an atom type defined,
# no contribution is expected without warning.
@ppi_pair_potentials 1 6 0.1
# LIG # REC
C.3 C.3 0 20.276 ... # (and 48 further values)
...
So, this example describes a C.3 C.3 interaction; it starts at 1, ends at 6, and goes
in steps of 0.1, giving (6 - 1) / 0.1 = 50 values. The value for distances between 1 and 1.1
is 0, the value for distances between 1.1 and 1.2 is 20.276, and so forth.
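The binning rule can be illustrated with a short lookup sketch. It assumes that the i-th value covers distances from min + i * width up to (but not including) min + (i + 1) * width, and it returns None outside the tabulated range; how FlexX actually treats out-of-range distances is not stated here, so that part is an assumption. The function names are illustrative.

```python
from typing import Optional, Sequence


def expected_value_count(r_min: float, r_max: float, width: float) -> int:
    """Number of sampling points: (max - min) / width."""
    return round((r_max - r_min) / width)


def lookup_pair_potential(
    values: Sequence[float],
    r_min: float,
    r_max: float,
    width: float,
    distance: float,
) -> Optional[float]:
    """Return the tabulated value for the bin containing `distance`.

    Bin i covers [r_min + i * width, r_min + (i + 1) * width).
    Outside [r_min, r_max) this sketch returns None (assumed behaviour).
    """
    if distance < r_min or distance >= r_max:
        return None
    # The small epsilon guards against floating-point error at bin edges.
    index = int((distance - r_min) / width + 1e-9)
    index = min(index, len(values) - 1)
    return values[index]


# First two values from the example line above; the rest are dummy zeros.
c3_c3 = [0.0, 20.276] + [0.0] * 48
print(lookup_pair_potential(c3_c3, 1.0, 6.0, 0.1, 1.15))
```

With the header `@ppi_pair_potentials 1 6 0.1`, a distance of 1.15 falls into the second bin (1.1 to 1.2) and therefore picks up the second tabulated value.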
To activate PPI scoring, the scoring parameter G_ppi_pairs needs to be activated in
geometry.dat.
Section No.4: A scoring scheme which employs SAS weighting, the so-called PPI-SAS
Scoring, needs the section @ppi_single_potentials. The latter contains the necessary
potential specification. The syntax is analogous to the above: The three values after
@ppi_single_potentials are: minimum value, maximum value, and bin width.
As above, the expected number of values is (max - min) / width. The following lines
contain ligand or receptor PPI atom types. A dash (-) is used as the void notation.
For every sample point, a value is expected. Here is an example:
# PPI single potentials: The first three parameters are
# minimum value, maximum value, and bin width.
# Therefore, (max - min)/width values are expected, the
# first is for distances between min and min+width.
# All potential values must be written in a single
# line preceded by a PPI atom type and a -.
# If there is no single potential for an atom type,
# no contribution is expected without warning.
@ppi_single_potentials 0 40 1
# Lig Rec
C.3 - -4124.39 ... # (and 39 further values)
- C.3 -3735.98 ...
To activate the PPI-SAS Scoring, the scoring parameter G_ppi_sas in geometry.dat needs to
be activated.
8.7.5 Generating generic amino PPI static data files (AMINO4PPI)
Syntax: AMINO4PPI <filename> <mode>
Description: Generates a generic amino PPI static data file. The generated file will
contain a list of all currently known templates with their non-hydrogen atoms.
<filename> specifies the name of the file to be generated. The default output path
is the path specified as PREDICT in the configuration file (config*.dat).
<mode> specifies the second entry for each atom, which can be one of:
(1) use short element name
(2) use the atom type from template
(3) use atom template name
(4) No second entry is used
Note: This command helps to generate a template amino static data file for the
specification of a PPI_REC_TEMPL_TYPES file.
8.7.6 Activating PMF/DrugScore Support with Old Files
Several years ago, FlexX was shipped with DrugScore/PMF static data files. You can
technically re-activate the respective scoring functions within the PPI framework; however,
please note that BioSolveIT does not take any responsibility for cases in which you do not have
a potentially necessary third-party permission to do so. For example, to use DrugScore^CSD,
you may need a CSD license. Technically, here is what you need to do:
1. Create a counterpart to amino_pmf:
Technically, you would be able to read the old amino_pmf.dat file; however,
you will get many warnings and error messages, and there is no guarantee that all
receptor atoms actually receive a valid PPI type (due to missing templates in the old
amino_pmf.dat).
Instead, we suggest you use DATABASE/AMINO4PPI to create a new amino PPI
type file using Sybyl atom types as the second entry. This command will dump
the currently valid amino templates in the necessary syntax; with slight adjustments
thereafter, this serves as a good starting point:
DATABASE/AMINO4PPI amino_ppi.dat 2
We have now created a file amino_ppi.dat which will need some further adjustments:
To re-create a DrugScore environment, edit the created file such that
a) all metal templates have the type Met
b) the type for non-metals in all templates of co-factors (NAD, HEM, NAP, ...) is Du,
and the type for iron in HEM is Met, too.
2. Adapt config file:
To read the old/modified files into PPI, add the following lines to your config file
(config*.dat):
PPI_REC_TEMPL_TYPES static_data/amino_ppi.dat
PPI_REC_SUBSTR_TYPES static_data/ligand_pmf.dat
PPI_LIG_TYPES static_data/ligand_pmf.dat
PPI_POTENTIAL static_data/pot_pmf.dat
This points your configuration to the new static data file and the actual energy
files. Next, we will have to make minor syntax adjustments.
3. Adapt section names in pot_pmf.dat:
Add @ppi_potential_name DrugScore to the le.
Finally, please modify
@pmf_atom_types to @ppi_atom_types
@pmf_pair_potentials to @ppi_pair_potentials
@pmf_single_potentials to @ppi_single_potentials
4. Activate PPI scoring:
Finally, in your GEOMETRY file (geometry*.dat), please activate the scoring parameters
G_ppi_pairs (previously: G_pmf_pairs) and
G_ppi_sas (previously: G_sas).
Please see page 303 for details on how to do this.
Restarting FlexX will lead to using the traditional DrugScore scoring function.
8.7.7 Interesting PPI Potentials
DrugScore The original Gohlke DrugScore function is no longer available from
BioSolveIT. If you still want to access DrugScore in this form, you should work with
the old files as mentioned above.
The original function had been set up using a pair potential and a SAS term. Both terms
were weighted 1:1 in geometry*.dat.
From today's perspective, we do not recommend using this function as a default, because,
compared to the newer DrugScore variants, it was derived from a comparatively thin
database at that time. Also, the SAS term has not shown significant improvements in the
past.
DrugScore^CSD as by Velec et al. This potential function was originally derived from
the CSD by Velec during his doctorate at Marburg University. Because it relies on 50 bins
only, we do not recommend using this version in FlexX docking studies, but instead:
DrugScore^CSD fine This potential corresponds to the original DrugScore^CSD version by
Velec with more bins. We can provide this function to customers who hold a valid CSD
license. It is recommended especially in combination with (post-)optimization
routines but has also shown good performance as a rescoring function.
DrugScore^CSD soft This potential corresponds to the former one but has been softened
using Gaussians so that the pose is more tolerant to slight deviations from the perfect
minimum. We recommend this potential function for pure docking and/or rescoring scenarios
in which an optimization is not part of the workflow.
None of the three DrugScore^CSD variants incorporates SAS terms as mentioned above. The
versions fine and soft have terms which prevent atom collapse. The repulsive wall is
defined such that the first maximum without correction is the longer-distance end point of
a linear function:
\[ E_{\mathrm{currentbin}} = E_{\max} \cdot \left[ (\mathrm{max} - \mathrm{currentbin}) \cdot \mathrm{binsize} + 1 \right] \tag{8.8} \]
9 Scripting
9.1 *Scripts in FlexX's Language
Roughly speaking, a script consists of a sequence of FlexX commands together with their pa-
rameters (one command plus parameter list per line). Nevertheless, there are two important
features which give scripts more expressive power than simple sequences of commands.
The first feature is the use of variables, the second and more important is the use of loops.
Examples of scripts are given in Appendix B.
Scripts are started by entering the command SCRIPT in the main menu (LEADIT) or by
starting LeadIT in batch mode (see Section 6.3.3).
A simple way to generate a script is to start FlexX at the command line and use the log mode
(option -l, see Sec. 6.3.8). Then all the commands you type are stored in a log file, which in
turn can be used as a script.
The SCRIPT command itself is not allowed in scripts, so nested scripts are not possible.
9.1.1 Variables
Variables in scripts are either $(alphanum-string) or $0, $1, . . ., $9. The latter format was
the only one until FlexX 1.7.0 and is kept for compatibility reasons. Variables can replace
any parameter of a command but not a command itself. They can either be initialized by
a loop (see below), by interactive user commands INPUT or SELINP, or by the assignment
statement SETVAR (see also below).
Some variables are predefined within FlexX and can be used in scripts in a read-only
manner. Currently, these are:
The time the program was called $(START_TIME), the format is a single string, e.g.
20031017_1107 for 11:07 hours, October 17th, 2003
The best score of all dockings $(BEST_SCORE)
The best RMSD of all dockings $(BEST_RMSD)
All directories defined in the configuration file, for example $(LIGAND)
All flags defined in the configuration file, for example $(RING_MODE)
All database filenames defined in the configuration file, for example $(AMINO)
All executables defined in the configuration file, for example $(RCGENERATOR)
All integer and double parameters defined in the SETTINGS database file, for example
$(CLASH_FACTOR)
The name of the currently loaded ligand $(LIG_NAME) and the currently loaded re-
ceptor $(REC_NAME)
The number of selected base fragments $(NOF_BASE_FRAGS)
The current number of generated placements $(NOF_PLACEMENTS). Here, a place-
ment may also refer to an incompletely built up ligand.
The number of components for currently loaded ligand $(NOF_COMPONENTS).
The USMILES of the currently loaded ligand $(USMILES).
$(EVAL) contains the result of the command EVAL (see 8.6.1.1).
$(CLIB_READY) is set to TRUE, if a combinatorial library is available.
The index of the current core group $(CORE_ID).
The number of instances for the current core group $(NOF_CORE_INST).
The number of rgroups (without the core group) $(NOF_RGROUPS).
The status of the last executed command $(STATUS), this is either OK or ERROR.
The menu control commands MAIN and END do not change the value of $(STATUS).
The output of the last unix shell command executed with EXEC, $(UNIX_OUTP)
The torsion status of the currently loaded test ligand $(TORSION_STATUS):
value -1: the status is not defined.
value 0: torsion angles around the bonds applied according to either
torsion_standard.dat or torsion_fine.dat (see section 11.12).
value 1: torsion angles around the bonds applied according to either
torsion_standard.dat or torsion_fine.dat; if not found, the torsional profile is calculated
(see section 11.12).
value 2: no torsion angles found in either torsion_standard.dat or torsion_fine.dat for
at least one rotatable bond. The program applies a 30 degree grid with arbitrary
reference atoms.
$(PVM_ID) is set to $PVM_VMID_<pvm task id> in FlexX-PVM (please refer to the
PVM section on page 147).
If a variable is used in a script for the first time for reading rather than for writing, it is
set to an empty string.
If the string assigned to a variable contains blanks, it must be written in double quotes to
avoid breaking the string into separate tokens. A variable is replaced in a string even if
the string is quoted. Variable replacement can be avoided by writing the variable in single
quotes. An empty string can be written as "".
Example
setvar $(number) 5
setvar $(protein) prot$(number)
output "The value of '$(protein)' is: $(protein)."
outputs: The value of $(protein) is: prot5.
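The substitution rules of this section can be mimicked in a few lines. The following is a rough sketch, not the FlexX parser: `expand` is a hypothetical helper, unset variables expand to an empty string, variables in single quotes are left untouched, and the single quotes themselves are dropped from the result (an assumption based on the example output).

```python
import re
from typing import Dict

# Matches $(name) as well as the legacy forms $0 ... $9.
_VAR = re.compile(r"\$\(([A-Za-z0-9_]+)\)|\$([0-9])")


def expand(text: str, variables: Dict[str, str]) -> str:
    """Replace script variables in `text`, except inside single quotes."""
    out = []
    # The capturing group keeps the single-quoted chunks in the split result.
    for chunk in re.split(r"('[^']*')", text):
        if len(chunk) >= 2 and chunk.startswith("'") and chunk.endswith("'"):
            out.append(chunk[1:-1])  # protected: keep literally, drop quotes
        else:
            out.append(
                _VAR.sub(
                    lambda m: variables.get(m.group(1) or m.group(2), ""),
                    chunk,
                )
            )
    return "".join(out)


variables = {"number": "5"}
variables["protein"] = expand("prot$(number)", variables)
print(expand("The value of '$(protein)' is: $(protein).", variables))
```

In this sketch the legacy variables are stored under the keys "0" to "9", and a name that has never been set simply disappears from the expanded string.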
9.1.2 Script parameter lists
A script can be executed with a parameter list. The parameter list consists of a list of script
variables with assigned values. Before execution of the first script command, the variables
from the parameter list will be initialized with the given values. A parameter list can be
entered with the command line option -a or as the second parameter of the SCRIPT com-
mand.
Because the parameter list itself must be handled as a command parameter, there are some
syntactic constraints:
Each entry of the parameter list is separated by a semicolon (;); however, the parame-
ter list entry itself is not allowed to contain a semicolon (;).
The variable name must directly be followed by an equals sign (=); the string between
the = and either ; or the end of string is interpreted as the value for the preceding
variable.
The following example shows a FlexX call in batch mode with two parameters. The
script dock_one is given in Appendix B. The script loads a receptor description le named
4dfr.rdf and a ligand file named inhibit1.mol2, docks the ligand, and writes the ten
highest ranking solutions to a multi-mol2 file.
Example
leadit -b dock_one -a '$(protein)=4dfr;$(ligand)=4dfr_min;$(nof_write)=10'
Under Windows, please use double quotes (") instead of single quotes ('):
Example
leadit -b dock_one -a "$(protein)=4dfr;$(ligand)=4dfr_min;$(nof_write)=10"
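The syntactic constraints on the parameter list (entries separated by semicolons, name directly followed by an equals sign) amount to a very simple parsing rule. The sketch below only illustrates that rule; `parse_param_list` is a made-up helper, and details such as how FlexX reports malformed entries are assumptions.

```python
from typing import Dict


def parse_param_list(param_list: str) -> Dict[str, str]:
    """Split 'name=value;name=value;...' into a dictionary.

    Entries are separated by ';'. Within an entry, everything after the
    first '=' up to the next ';' (or the end of the string) is the value.
    """
    result: Dict[str, str] = {}
    for entry in param_list.split(";"):
        if not entry:
            continue  # tolerate a trailing semicolon (assumed behaviour)
        name, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in parameter list entry: {entry!r}")
        result[name] = value
    return result


params = parse_param_list("$(protein)=4dfr;$(ligand)=4dfr_min;$(nof_write)=10")
print(params["$(nof_write)"])
```

Since the semicolon is the separator, a value can never contain one, which is the constraint stated in the list above.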
9.1.3 Loops: FOR_EACH/END_FOR, WHILE or FOREVER
A loop is a sequence of commands, braced by a pair of FOR_EACH and END_FOR instruc-
tions. Loops can be nested.
Syntax: FOR_EACH <n1> [<n2> [...]] IN "<filename>" [FROMTO <from> <to>]
<sequence of commands>
END_FOR
FOR_EACH <n1> FROMTO <from> <to>
<sequence of commands>
END_FOR
FOR_EACH <n1> INLIBRARY "<filename>" ("<expression>")
<sequence of commands>
END_FOR
WHILE <condition>
<sequence of commands>
END_FOR
Description: In the first variant, FlexX opens the specified file <filename>. If
<filename> is an absolute path (first character is a slash /), it is taken as it is. If
it is a relative path (first character anything other than a slash), it is taken relative to
the current directory.
After opening the file, FlexX reads a line at a time (comment lines and empty lines
will be skipped). FlexX expects to find at least as many tokens (a token is any
sequence of non-whitespace characters or a string enclosed in "") in that line as there
are loop variables in the FOR_EACH line. FlexX assigns the tokens from left to right
to the respective loop variables, and then it executes the <sequence of commands>
between FOR_EACH and END_FOR. Optionally, a subsection of the file can also be
specified with FROMTO. <from> and <to> define the lines to be used (omitting
empty lines and comment lines). An example is given at the end of the command
description.
In the second variant, the loop variable successively gets the values <from>,
<from>+1, . . ., <to>, see also the example at the end of this description.
In the third variant, FlexX opens the specified multi-molecule file <filename>. If
<filename> is an absolute path (first character is a slash /), it is taken as it is. If
it is a relative path (first character anything other than a slash), it is taken relative
to the current directory. The loop variable successively gets the indices for each
molecule in <filename>. In order to evaluate the molecules from <filename>,
<expression> may be a basic molecular property or a complex logical expression
(see section 8.6.1 and 8.6.1.2). If a molecule fulfills <expression>, the loop
variable gets its index; otherwise the index is skipped. In order to use the keyword
INLIBRARY with molecular properties, you need a license for the SCREEN module.
Finally, WHILE allows the definition of conditional loops. See the next section on
branches for how to define a condition.
When END_FOR is encountered, FlexX jumps to the corresponding FOR_EACH and
reads the next line from the specified file. This process is repeated until no more
lines are found. After that, execution continues following END_FOR. <n1>, <n2>,
... are script variables following the rules summarized in section 9.1.1.
If the first command of a batch file is FOREVER, the whole batch file is executed
repeatedly. This is (until now) only used for software demonstrations.
The commands BREAK and CONTINUE may be used to alter the sequence of com-
mands executed in a loop. On BREAK the execution jumps to after the END_FOR
command of the innermost loop containing this BREAK command. On CONTINUE
the execution jumps to the FOR_EACH command of the innermost loop containing
this CONTINUE command.
Example
FOR_EACH $(var1) $(var2) IN "variable.list"
...
END_FOR
The variables $(var1) and $(var2) are taken from the first two columns in the file variable.list:
variable.list:
# molecule file ; nof compounds ; ...
testset.mol2 1000 ...
testset2.sdf 3000 ...
...
Example
FOR_EACH $(count) FROMTO 2 10
...
END_FOR
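How the first FOR_EACH variant consumes its input file can be sketched in Python as follows. This only mirrors the description above and is not FlexX code: the name iter_loop_rows is hypothetical, the sketch assumes that comment lines start with # (as in the variable.list example), and it assumes that <from> and <to> are 1-based, inclusive, and counted over usable lines only. Python's shlex is used for tokenizing, which also honors single quotes and backslashes, unlike the token rule stated above.

```python
import shlex


def iter_loop_rows(lines, n_vars, from_to=None):
    """Yield one tuple of n_vars tokens per usable line.

    Empty lines and comment lines are skipped and do not count towards
    the FROMTO range (assumed 1-based and inclusive).
    """
    usable = 0
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip empty lines and comment lines
        usable += 1
        if from_to is not None and not (from_to[0] <= usable <= from_to[1]):
            continue  # outside the requested subsection of the file
        tokens = shlex.split(stripped)  # whitespace tokens, "..." kept together
        if len(tokens) < n_vars:
            raise ValueError(f"line {stripped!r} has fewer than {n_vars} tokens")
        yield tuple(tokens[:n_vars])  # tokens are assigned from left to right


variable_list = [
    "# molecule file ; nof compounds ; ...",
    "testset.mol2 1000 ...",
    "testset2.sdf 3000 ...",
]
for var1, var2 in iter_loop_rows(variable_list, 2):
    print(var1, var2)
```

Surplus tokens on a line are simply ignored here, which matches the requirement of "at least as many tokens" as loop variables.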
9.1.4 Branches: IF/ELSE/ENDIF
Testing operators enable the sequence of commands executed in a batch file to be altered.
Syntax: IF <condition>
<1st sequence of commands>
[ELSE
<2nd sequence of commands>]
ENDIF
Description: If <condition> evaluates as TRUE, the 1st sequence of commands is
executed. If not, the 2nd sequence of commands is executed instead. <condition>
allows variables or constants to be tested using a variety of operators. The operator
decides whether the operands are interpreted as values or strings. The numerical
operators <, <= (≤), == (=), => (≥), >, and != (≠) may be used for comparison.
The string operators eq and [] (contains operator) may also be used.
Valid conditions are e.g.:
<condition>    1st seq. executed if           2nd seq. executed if          Comment
$(val) < 5     $(val) < 5                     $(val) ≥ 5                    Numerical comparison
$(val) != 5    $(val) ≠ 5                     $(val) = 5                    Numerical comparison
$(str) eq no   $(str) = no                    $(str) ≠ no                   String comparison
$(str) [] no   $(str) contains the string no  $(str) does not contain no    String comparison
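The rule that the operator decides how the operands are interpreted can be sketched in Python. This is only an illustration of the rule, not the FlexX parser: the function name evaluate is hypothetical, numerical operands are assumed to be parsed as floating-point numbers, and the operator spellings are copied from the list above.

```python
import operator

# Numerical operators: both operands are interpreted as numbers.
NUMERIC_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "=>": operator.ge,
    ">": operator.gt,
    "!=": operator.ne,
}

# String operators: both operands are compared as strings.
STRING_OPS = {
    "eq": operator.eq,
    "[]": lambda text, part: part in text,  # "contains" operator
}


def evaluate(left, op, right):
    """Return True if the 1st sequence of commands would be executed."""
    if op in NUMERIC_OPS:
        return NUMERIC_OPS[op](float(left), float(right))
    if op in STRING_OPS:
        return STRING_OPS[op](left, right)
    raise ValueError(f"unknown operator {op!r}")


print(evaluate("3", "<", "5"))
print(evaluate("nothing", "[]", "no"))
```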
9.1.5 One-of-n selection: SELINP
Syntax: SELINP <n1> [<n2> [...]] IN "<filename>" <no>
Description: The SELINP command can be used instead of a loop. Instead of all,
only row <no> is used to initialize the variables and no iteration takes place. The
meaning of <n1>, etc. is the same as for FOR_EACH.
If <no> is missing, the user will be prompted for it interactively (assuming that the
batch file is started with the SCRIPT command). SELINP can therefore be used to
start a batch file with alternative parameter settings.
9.1.6 Special script command: SETVAR
Syntax: SETVAR <x> <value>
Description: Assigns the string <value> to the batch variable <x>.
9.1.7 Special batch file command: INPUT
Syntax: INPUT <n> <info text>
Description: Displays the <info text> on the screen and reads a string from the
keyboard (standard input). The string is then assigned to batch variable <n>.
9.1.8 Special script command: INCR
Syntax: INCR <var> <inc val>
Description: Increments the value of <var> by <inc val>. If <inc val> and the
former value of <var> are integers, the result will also be an integer; otherwise <var>
will contain the string representation of a floating-point number.
9.1.9 Special batch file command: OUTPUT and OUTERR
Syntax: OUTPUT <info text> [<info text> ...]
Description: Displays all <info text> parameters on the screen. <info text> can
be a string enclosed in "" or a variable. OUTERR outputs <info text> to standard
error instead of standard output.
9.1.10 Special batch file command: TIMER
Syntax: TIMER <command>
Description: Sends a <command> to the internal timer. <command> is either
start or stop. start resets the internal timer and starts counting, stop stops
counting and outputs the elapsed time.
9.1.11 Special batch file command: PROCSIZE
Syntax: PROCSIZE
Description: Displays the current program size on the screen. The used memory
of the machine is output in specified units.
9.1.12 Special batch file command: WAIT
Syntax: WAIT [<delay>]
Description: The WAIT command is designed for demonstrating FlexX. It delays
execution for <delay> seconds. Without the optional delay parameter, execution is
halted until the user presses the RETURN key.
During the time delay of the wait command, the complete batch file can be aborted
by pressing
Enable (1) or disable (0) the placebas caching module (see section
8.6.3).
PRINT_TIMES The processor time used to execute one command is output after each exe-
cution of one command (if equals 1).
PRINT_SIZE The current process size is output after each execution of one command (if
equals 1).
QUERY_SASTAB Enable (1) or disable (0) implicit sastab computation when querying
internal docking database, e.g. DOCKING/LISTSOL, or calling method ddb() in
PyFlexX. Default setting: disabled (0).
QUERY_BURIEDNESS Enable (1) or disable (0) implicit computation of buriedness when
querying internal docking database, e.g. DOCKING/LISTSOL, or calling method
ddb() in PyFlexX. Default setting: disabled (0).
RIGID_TORSIONS Flexible bonds are fixed (if equals 1) to the nearest torsion angle in a
30 degree grid.
RING_MODE If equals 0, rings are considered rigid, only the loaded conformation is taken
into account. If equals 1, ring conformations are computed by CORINA and if equals
2, conformations are computed by CONFORT. Finally, if equals 3, the conformations
are computed by the built-in CORINA, and if equals 4, conformations are computed
by MOE. The default value is 3.
SDF_MOL_ID_NUM determines the field from which the molecule ID in an sdf file is
taken, if SDF_MOL_ID_TYPE is set to 3. If equals x, molecule ID is read from the
x-th property line of type > <..>
SDF_MOL_ID_TYPE determines the field from which the molecule ID of an sdf file is
taken for further processing:
Mode Meaning
0 First line of the header block
1 Property line with the name given by SDF_MOL_ID_STRING
2 Property line starting/ending with <ID.. / <id.. / ..ID> / ..id>
3 The SDF_MOL_ID_NUM-th field in the data section
SECONDARY_TORSION_MODE This parameter controls the program behavior in cases
for which it was not possible to create any test points with the standard procedures
(controlled by TORSION_MODE, see above). If equals 0, a default grid is applied for
torsional energies (see above). If equals 1 and no torsion angles for a rotatable bond
are defined in the torsion database, the torsional potential is calculated from a force
field (see Figure 10.1).
[Figure 10.1: plots of torsional energy against torsion angle. TORSION_MINIMA_CUTOFF:
a threshold is applied, and all points on a 5° grid which are below the threshold are test
points (x). SECONDARY_TORSION_MODE: with 0, a default grid at 30° width is applied; with 1,
the shape of the torsional energy is sampled with selected terms from the Sybyl force field.]
Figure 10.1: The effects and meanings of the parameters SECONDARY_TORSION_MODE
and TORSION_MINIMA_CUTOFF.
SIZE_LIMIT After each (non-global) command, FlexX checks its own memory require-
ment. If this is higher than SIZE_LIMIT and SIZE_LIMIT is greater than zero, FlexX
automatically terminates. SIZE_LIMIT is measured in kilobytes. Memory checking is
currently only available under the Solaris OS.
STEREO_MODE FlexX is able to automatically flip stereo centers during docking
calculations. Therefore only one docking run is necessary even if the enantiomer of the ligand
is unknown. The flag has values from 0 to 7 allowing different kinds of stereo centers
to change. We distinguish between three classes:
Class Meaning
1 Change non-planar nitrogens (Pseudo-R/S)
2 Change E/Z stereo centers at double bonds
3 Change R/S stereo centers
The flag value represents a combination of the three classes:
Mode Pseudo-R/S Z/E R/S
0 no no no
1 yes no no
2 no yes no
3 yes yes no
4 no no yes
5 yes no yes
6 no yes yes
7 yes yes yes
Inside FlexX, only acyclic stereo centers are modified. In ring systems,
Pseudo-R/S is supported, provided that CORINA is used for creating ring conformers (see
RING_MODE flag).
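The mode table for STEREO_MODE is consistent with a three-bit flag, one bit per class. The following Python sketch decodes a mode value under that reading; the function and key names are hypothetical and chosen only for illustration.

```python
# Bit values read off the mode table: each class contributes one bit.
PSEUDO_RS = 1  # class 1: non-planar nitrogens (Pseudo-R/S)
E_Z = 2        # class 2: E/Z stereo centers at double bonds
R_S = 4        # class 3: R/S stereo centers


def decode_stereo_mode(mode):
    """Return which classes of stereo centers may change for a mode 0..7."""
    if not 0 <= mode <= 7:
        raise ValueError("STEREO_MODE must be between 0 and 7")
    return {
        "pseudo_rs": bool(mode & PSEUDO_RS),
        "e_z": bool(mode & E_Z),
        "r_s": bool(mode & R_S),
    }


for mode in range(8):
    print(mode, decode_stereo_mode(mode))
```

For example, mode 5 is the sum of 1 and 4 and therefore enables Pseudo-R/S and R/S changes but leaves E/Z centers untouched, as in the table.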
TORSION_MODE
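\[
\begin{cases}
0.0 & \text{if } e < e_0 \\
\texttt{<G\_conf\_torsion>} \cdot \dfrac{e - e_0}{e_1 - e_0} & \text{if } e_0 \le e \le e_1 \\
\texttt{<G\_conf\_torsion>} & \text{if } e > e_1
\end{cases}
\]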
With <entropy_term>, the type of the ligand entropy term can be selected. With boehm, the
term from the Böhm function is used, with chemscore a term closely related to the
ChemScore function is used. <hydrophobic_definition> defines what atoms should be
considered as hydrophobic. Currently, there are three different models:
NO_hydrophil Nitrogens and oxygens are hydrophilic, all others are hydrophobic.
C_hydrophob All carbon atoms are hydrophobic, all others are hydrophilic.
chemscore Definition closely related to the ChemScore function.
The following parameter is either surface or all and defines whether only surface atoms
or all atoms should contribute to the contact score.
[Figure 11.2: plot of the scaling function against the atom-atom distance d(l,r), with the
breakpoints d_0, d_1, d_2, d_3 on the x-axis, the values 1 and -1 on the y-axis, and the
labels G_plp_rep and G_plp_steric / G_plp_hbond.]
Figure 11.2: Piecewise linear scaling function for PLP scoring. The x-value is the
atom-atom distance. For distances less than d_0, <G_plp_rep> is used as a scaling factor,
<G_plp_steric> or <G_plp_hbond> otherwise.
Finally, the <sas_radius> defines the radius of the water sphere used for calculating the
solvent accessible surface. As reference and as examples we added a set of different
geometry_*.dat files which are parameterized for different scoring functions:
geometry.dat The original FlexX score which is closely related to the Böhm function [3].
geometry_chemscore.dat This parameterization incorporates the ChemScore [8].
geometry_plp_score.dat This is an example for the PLP score [12, 10].
geometry_force_field.dat Tripos force field [4] terms can also be used for scoring.
geometry_screenscore.dat The ScreenScore was derived by Stahl & Rarey [28]. In this
example ScreenScore is used for both docking and final scoring.
geometry_post_screenscore.dat In the original paper ScreenScore [28] was derived and
used for post-scoring. Docking was done with the original FlexX score. This example
shows this combination of the two scores. The FlexX score is used for partial solutions
and ScreenScore for the final scoring.
11.7.2 Associating interaction geometries with molecular groups
How interaction geometries are associated with molecular groups is essential
for understanding the records defining the interaction geometries. We will therefore explain
this first in this section.
The assignment of an interaction geometry to a molecular group contains the list of up
to four atoms, named a_0, ..., a_3 in the following. These atoms have positions in space
c_0, ..., c_3, which will be used to derive two different bases (local coordinate systems) (see
also Figure 11.3):
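[Figure 11.3: a carboxylate group with atoms 0, 1, 2 and the basis vectors e_1, e_2, e_3
and b_1, b_2, b_3.]
Figure 11.3: Definition of the e- and b-bases for a carboxylate group.
\[
\begin{array}{ll}
\text{Base } e & \text{Base } b \\
e_1 = c_1 - c_0 & b_1 = e_1 \\
e_2 = c_2 - c_0 & b_2 = e_1 \times e_2 \\
e_3 = c_3 - c_0 & b_3 = b_1 \times b_2 \\
e_3 = e_1 \times e_2 \ \text{ if } c_3 \text{ is undefined} &
\end{array}
\]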
The origin of both bases is c_0 but can be redefined by the center entry during the definition
of the interaction geometry. Note that only base b is guaranteed to be orthogonal by
construction. Furthermore, none of these 6 vectors are normalized in general. The three atoms
must not lie on a straight line. If only two atoms are specified, only e_1 and b_1 are defined. If
only one atom is specified, only the origin is defined.
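As an illustration, the two bases can be computed from the atom positions with a few lines of Python. The sketch assumes the reading e_i = c_i - c_0, e_3 = e_1 × e_2 if c_3 is undefined, b_1 = e_1, b_2 = e_1 × e_2 and b_3 = b_1 × b_2; the function names are hypothetical, and the case of fewer than three atoms is not handled.

```python
def sub(u, v):
    """Component-wise difference u - v of two 3-vectors."""
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    """Cross product u x v of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_bases(c0, c1, c2, c3=None, eps=1e-12):
    """Return (e1, e2, e3) and (b1, b2, b3); no vector is normalized."""
    e1 = sub(c1, c0)
    e2 = sub(c2, c0)
    normal = cross(e1, e2)
    if dot(normal, normal) < eps:
        raise ValueError("the three atoms must not lie on a straight line")
    e3 = sub(c3, c0) if c3 is not None else normal
    b1 = e1
    b2 = normal          # assumed reading: b2 = e1 x e2
    b3 = cross(b1, b2)   # assumed reading: b3 = b1 x b2
    return (e1, e2, e3), (b1, b2, b3)


e_base, b_base = build_bases((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
print("e:", e_base)
print("b:", b_base)
```

In this example e_1 and e_2 are not perpendicular, whereas the three b vectors are pairwise orthogonal, which reflects the remark that only base b is orthogonal by construction.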
11.7.3 Defining interaction geometries
An interaction geometry is always a part of a spherical surface. It can be constructed from
cones, cone sections, or spherical rectangles:
@geometry <geom_name>
radius <rad>
delta <polar incr> <azimuth incr>
[center <center vector>]
<geom specifier>
[<geom specifier> ...]
[exclude_from_clash <atom no>]
[surface_mode <mode>]
energy <opt energy>
[charge_scaling <c factor> <c threshold>]
[distance_scaling <d threshold1> <d threshold2>]
[angle_scaling <a threshold1> <a threshold2> <a threshold3>]
[access_scaling <threshold1> <threshold2>]
A <geom specifier> can be one of the following lines:
sphere
cone <basis> <vec1> <polar1> <polar2>
s_area <basis> <vec1> <vec2> <azimuth1> <azimuth2> <polar1> <polar2>
The name of the geometry <geom name> can be an arbitrary string and is used for
referencing the geometry in the amino.dat and contact.dat file (see section 11.8 and 11.11).
The entries of the @geometry record are:
radius defines the radius of the sphere where the interaction surfaces are located.
delta defines the step size for approximation of the surfaces by discrete point sets. The
values are the maximum arc lengths between two consecutive points in the polar
direction and azimuth direction respectively, measured in Å. The computation of the
discrete point set for an interaction surface works as follows: first the polar angle interval
is divided by an odd number of circles having a distance of at most <polar incr>
on the spherical surface. Each circle (circle section in the case of spherical rectangles)
is divided by an odd number of points having a distance of at most <azimuth incr>
on the spherical surface. The distance from the border line of the interaction surface is
less than or equal to half of <polar incr> or <azimuth incr>, respectively.
center defines the center of the sphere. If not specified, the origin of the local coordinate
system is assumed to be the center. For the center definition the coordinate system e
with origin c_0 is used.
sphere defines the whole sphere to be the interaction surface. The <geom specifier>s
defined in a @geometry record should describe disjoint surface parts of the sphere.
Thus, if the whole sphere is chosen, no other <geom specifier> should occur in this
@geometry record.
cone defines a cone (-section) as the interaction surface. The axis of the cone is given by the
vector <vec1> in the local coordinate system <basis>. The cone section is delimited
by the angles <polar1> < <polar2>. Setting <polar1> to 0 yields a closed cone (see
Figure 11.4).
s_area defines a spherical rectangle as the interaction surface. <vec1>, <polar1>,
<polar2> define a cone (-section) as above. <vec2> specifies the zero direction for
the azimuth and must not be collinear with <vec1>. The part of the cone section
lying between <azimuth1> < <azimuth2> in a right-handed rotation around <vec1>
starting at <vec2> is the defined spherical rectangle (see Figure 11.4).
exclude_from_clash defines one of the base defining atoms which is not tested for overlap
with the exclude_from_clash atom of the countergroup.
surface_mode defines whether the interaction should be considered only at atoms on the
surface of the molecule. The default mode is 1, i.e. only at the surface. If set to 0, the
interaction is considered even if the atom is not at the surface.
energy defines the energy contribution of this interacting group to the interaction geometry
in the ideal case. Note that the interaction energy for a match is distributed among the
two interaction partners.
charge_scaling defines the factor <c factor> by which the energy is multiplied if the
product of the formal charges is less than the charge threshold <c threshold> (see section
11.7.4).
distance_scaling defines the two threshold values for the scaling factor for distance
deviations (see section 11.7.4).
angle_scaling defines the two threshold values for the scaling factor for angle deviations
and an optional interaction surface scaling factor (see section 11.7.4).
access_scaling The accessibility gives an indication of the buriedness in the active site. The
cavity definition algorithm defines the accessibility for a point P, around which 45
points are equally distributed on a unit sphere. A line running through one
of the sphere points and ending at P defines the direction along which an access to P
is either possible (1: totally exposed) or hindered (0: completely buried) by the protein;
a full explanation can be found in [29]. The parameters describe how an interaction
is scaled with regard to accessibility. Setting e.g. the parameters to 0.3 and 0.7
(default values) results in the following scoring scheme: 0.0 to 0.3 full scoring, 0.3 to 0.7
linearly decreasing scoring contribution, from 0.7 no scoring contribution. From
verbosity level 5 on, the accessibility values are displayed upon loading of the protein.
Note that lengths must be given in Å, angles in degrees, and energies in kJ/mol.
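The scoring scheme described for access_scaling can be written as a small piecewise linear function. The following Python sketch only restates that scheme; the function name access_scale is hypothetical, and the behavior exactly at the two thresholds (full scoring at the first, none at the second) is an assumption.

```python
def access_scale(accessibility, threshold1=0.3, threshold2=0.7):
    """Scaling factor for an interaction, given an accessibility in [0, 1].

    Up to threshold1 the interaction scores fully, from threshold2 on it
    does not contribute, and in between the factor decreases linearly.
    """
    if accessibility <= threshold1:
        return 1.0
    if accessibility >= threshold2:
        return 0.0
    return (threshold2 - accessibility) / (threshold2 - threshold1)


for value in (0.0, 0.3, 0.4, 0.5, 0.6, 0.7, 1.0):
    print(value, round(access_scale(value), 3))
```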
Figure 11.4: Definition of cones and spherical rectangles. The main direction corresponds
to <vec1>, the zero direction to <vec2>, the angles to <polar>, and the angles to
<azimuth>. In the right figure, the main direction is perpendicular to the drawing plane.
Example
@geometry coo- # Hydrogen acceptor geometry on a carboxylate oxygen
# c1=o, c2=c, c3=o
radius 1.8
delta 1.2 1.2
s_area b -1 0 0 0 1 0 -50 50 40 80
s_area b -1 0 0 0 1 0 130 230 40 80
exclude_from_clash 1
energy -2.35
charge_scaling 1.766 -0.15
distance_scaling 0.3 0.7
angle_scaling 20 60 1.0
access_scaling 0.3 0.7
@geometry o/nh # Hydrogen donor geometry
# c1=o/n, c2=h
radius 1.9
delta 1.2 1.2
center 1.0 0.0 0.0 # centered in H
cone b 1 0 0 0 50
exclude_from_clash 1
energy -2.35
charge_scaling 1.766 -0.15
distance_scaling 0.3 0.7
angle_scaling 0 50 0.6
access_scaling 0.3 0.7
11.7.4 Computing the energy contributions of matched interaction groups
The energy contribution of a match between two interaction geometries g_1 and g_2 is
computed as follows:
The ideal energy of the match is the sum of the <opt energy> values defined in the entry
energy of g_1 and g_2. This value is scaled with 4 factors: a charge factor, a length deviation
factor, and two angle deviation factors. Let c_1, c_2 be the centers of the interaction groups in
the global coordinate system.
If the product of the formal charges of the first matched atoms a_1 in g_1 and g_2,
respectively, is less than or equal to the charge threshold <c threshold>, the energy is
multiplied by the charge factor <c factor>.
The distance deviation is computed as
\[
d_{dev} = \left|\, \lVert c_1 - c_2 \rVert - (g_1.\mathrm{radius} + g_2.\mathrm{radius})/2 \,\right|
\]
The scaling factor is
\[
f_d = \begin{cases}
1 & : d_{dev} \le d_1 \\
1 - \dfrac{d_{dev} - d_1}{d_2 - d_1} & : d_1 < d_{dev} < d_2 \\
0 & : d_{dev} \ge d_2
\end{cases}
\]
where d_1 and d_2 are the distance thresholds <d threshold1>, <d threshold2> defined
in the entry distance_scaling of g_1. The distance thresholds in g_1 and g_2 should be
the same.
The angle deviation is computed at each of the two interacting groups. Here the
formulas are given for geometry g_1. cp is the point on the interaction surface closest to
the matched interaction center. Note that this is significantly different from the original
Böhm function where scoring was performed by calculating the angle deviation from
a single main direction of interaction. The angle deviation is then defined to be the
angle between cp and the interaction center c_2. With this model, all points lying on
the interaction surface get an angle deviation of zero. In order to differentiate in score
between these points, the interaction surface can be shrunk using the scaling factor
a_scale optionally defined as the third parameter of angle_scaling of g_1. A value of 0
shrinks the interaction surface to a single point emulating the behavior of the original
Böhm function.
The scaling factor itself is then computed as above:
\[
f_a = \begin{cases}
1 & : a_{dev} \le a_1 \\
1 - \dfrac{a_{dev} - a_1}{a_2 - a_1} & : a_1 < a_{dev} < a_2 \\
0 & : a_{dev} \ge a_2
\end{cases}
\]
where a_1, a_2 are the angle thresholds defined in the entry angle_scaling of g_1.
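The computation described in this section can be summarized in a short Python sketch. It is not the FlexX implementation: the function names are hypothetical, the four factors are assumed to be combined by multiplication, the angle deviations are taken as given inputs (finding the closest surface point cp is not shown), and the sample values come from the coo- and o/nh records of the example in section 11.7.3.

```python
def linear_falloff(deviation, threshold1, threshold2):
    """1 up to threshold1, 0 from threshold2 on, linear in between."""
    if deviation <= threshold1:
        return 1.0
    if deviation >= threshold2:
        return 0.0
    return 1.0 - (deviation - threshold1) / (threshold2 - threshold1)


def distance_deviation(center_distance, radius1, radius2):
    """d_dev = | |c1 - c2| - (g1.radius + g2.radius) / 2 |."""
    return abs(center_distance - (radius1 + radius2) / 2.0)


def match_energy(opt_energy1, opt_energy2, charge_product, c_factor, c_threshold,
                 center_distance, radius1, radius2, d_thresholds,
                 angle_dev1, a_thresholds1, angle_dev2, a_thresholds2):
    """Ideal energy scaled by a charge, a distance and two angle factors."""
    energy = opt_energy1 + opt_energy2  # ideal energy of the match
    if charge_product <= c_threshold:
        energy *= c_factor  # charge factor
    d_dev = distance_deviation(center_distance, radius1, radius2)
    energy *= linear_falloff(d_dev, *d_thresholds)       # f_d
    energy *= linear_falloff(angle_dev1, *a_thresholds1)  # f_a at g_1
    energy *= linear_falloff(angle_dev2, *a_thresholds2)  # f_a at g_2
    return energy


# Ideal, uncharged contact between a coo- acceptor and an o/nh donor.
print(match_energy(-2.35, -2.35, 0.0, 1.766, -0.15,
                   1.85, 1.8, 1.9, (0.3, 0.7),
                   0.0, (20.0, 60.0), 0.0, (0.0, 50.0)))
```

With a distance deviation between the two thresholds, the energy is reduced linearly; once either an angle or the distance deviation reaches its second threshold, the match contributes nothing.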
11.8 *Amino data (static data file AMINO)
The static data file stored in the AMINO environment variable describes the characteristics
of the amino acids, depending on their specific appearance in the protein, e.g. the number
of atoms, bonds and interaction sites may depend on where the amino acid is located inside
the chain (terminal or nonterminal) and on whether it is protonated or not (for FlexX
protonation rules on the protein side see section 6.5.7). We have supplied you with another
amino file, namely amino_gen.dat, which is to be used in the context of FlexE (section 8.4)
whenever generic edf files (section 8.4.4) are to be used. This file contains the definition of
several further templates which are then used by the supplied generic edf files.
Each such specific appearance of an amino acid is handled by a separate record, and each
record defines a so-called template for the appearance. A new record in the amino.dat file
is started by the keyword @template, followed by a template name:
@template <template name>
The <template name> has the following syntax:
<xxx>[<list of modifiers>]
<xxx> is the three-letter code of the amino acid (lowercase letters!). Valid modifiers and
their meaning:
1|2 # if <xxx> = his: tautomer of histidine
h # if <xxx> = cys: cysteine not involved in s-s bond
+|- # charged within side chain
n[+] # n-terminal, possibly charged at terminal amino group
c[-] # c-terminal, possibly charged at terminal carboxylate group
~ # if <xxx> = ser, tyr, thr: the terminal OH group is free
# rotatable
The modifiers must be added in this order.
Examples of valid <template names> are:
Example
arg+n+ # means positively charged arginine at a positively
# charged N-terminus.
ala- # means negatively charged alanine (nonterminal)
alan+ # means an uncharged alanine at a positively charged
# N-terminus.
ser~ # means no specific torsion angle for the terminal OH group
# within serine
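The fixed order of the modifiers makes template names easy to check with a regular expression. The following Python sketch is one possible reading of the syntax above; the group names and the function parse_template_name are hypothetical, and the residue-specific restrictions (e.g. 1|2 only for his, h only for cys) are deliberately not enforced.

```python
import re

# One optional group per modifier, in the order required by the syntax.
TEMPLATE_NAME = re.compile(
    r"^(?P<residue>[a-z]{3})"       # three-letter code, lowercase
    r"(?P<tautomer>[12])?"          # histidine tautomer
    r"(?P<free_thiol>h)?"           # cysteine not involved in s-s bond
    r"(?P<side_chain_charge>[+-])?"  # charged within side chain
    r"(?P<n_terminal>n\+?)?"        # n-terminal, possibly charged
    r"(?P<c_terminal>c-?)?"         # c-terminal, possibly charged
    r"(?P<free_oh>~)?$"             # freely rotatable terminal OH group
)


def parse_template_name(name):
    """Return the residue and the modifiers present in a template name."""
    match = TEMPLATE_NAME.match(name)
    if match is None:
        raise ValueError(f"{name!r} is not a valid template name")
    return {key: value for key, value in match.groupdict().items() if value is not None}


for name in ("arg+n+", "ala-", "alan+", "ser~"):
    print(name, parse_template_name(name))
```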
Each record consists of a sequence of data lines. There are atom lines, bond lines and inter-
action lines. Each type of line has its own syntax:
atom <atom name> <SYBYL type> [<formal charge> <nof hydrogens>]
bond <atom name1> <atom name2> <SYBYL bond type>
iact <atom name1> [<atom name2>] [<atom name3>] [<atom name4>] <contact type> <IA geometry>
void
atom line The <atom name> parameter refers to the PDB file specified in the @pdb_file
record. For each atom, the correct SYBYL atom type must be defined. If the atom has a
non-zero formal charge, the formal charge itself and the number of bonded hydrogens
must be defined.
bond line The atom names <atom name1/2> must be one of the atom names specified
in the atom lines of the current template record. <SYBYL bond type> is one of the
SYBYL bond types (see SYBYL manual [32]). <atom name1> or <atom name2> (not
both together) may be one of the strings n(next) or c(prev), which describe the
backbone N-atom of the next amino acid and the backbone C-atom of the previous
amino acid in the chain, respectively.
Bonds need not be defined in both directions, except bonds containing n(next) or
c(prev) as atom names.
iact line The atom names <atom name1/2/3> must be one of the atom names specified in
the atom lines of the current template record. The given atoms define the basis with
respect to which the interaction geometry is defined (see section 11.7.2). If <atom
name2> and/or <atom name3> in an iact line is not specified, the corresponding
entry must be - instead. The <contact type> is one of the interaction types defined
in the static data file contype.dat (see section 11.6). The <interaction geometry>
is one of the interaction geometries defined in the static data file geometry.dat (see
section 11.7). The interaction is associated with the first atom <atom name1>.
Here is an example of a complete section in the amino.dat file:
Example
@template gly
atom _n 1 trigonal p
atom _ca 2 tetrahedral a
atom _c
atom _o
atom _h
bond _n _c(prev) am
bond _n _ca 1
bond _ca _c 1
bond _c _o 2
bond _c _n(next) am
bond _n _h 1
iact _o _c - h_acc co
iact _n _h - h_don o/nh
iact _ca - - ch2 sphere1
iact _c _n(next) _o amide amide
In addition, the AMINO data file contains the definition of particle types. The most
important particle type is of course a water molecule (for which this construct is made). A possible
definition is shown in the example.
Example
@particle water
pdb_name _O HOH
sybyl_type O.spc
vdw_radius OXYGEN
iact h_don 0 water_don 1.0
iact h_acc 1 water_acc
iact metal_acc 1 metal_sp
max_contacts 2 2
delta 0.8
angle_bounds 70 170
receptor_contacts 2
hydrophobicity hydrophilic
energy 1.0 -1.0
angle_scaling 110 40 80
charge 0.0
pp_distance_factor 1.0 1.0 1.0
The syntax of a particle definition is as follows; the order of the lines is arbitrary:
@particle <particle name>
pdb_name <PDB atom identifier> <PDB group identifier>
sybyl_type <SYBYL type>
vdw_radius <radius def>
iact <contact type> <contact grp> <interaction geometry> <distance offset>
max_contacts <for grp 0> <for grp 1>
angle_bounds <min angle> <max angle>
receptor_contacts <min nof receptor contacts>
hydrophobicity <hydrophob def>
angle_scaling <opt angle> <scale from> <scale to>
energy <missing ia penalty> <placement attraction energy>
delta <surface approximation delta>
charge <formal charge>
pp_distance_factor <factor neg> <factor zero> <factor pos>
pdb_name The PDB atom name and group name are used only for comparing predicted
particle locations with those found in the receptor file.
sybyl_type The SYBYL type of the particle is used only for writing placements with particle
locations to mol2 files. The default type is Du.
vdw_radius Particles are involved in overlap computations. The radius used is defined by
<radius def>. <radius def> can be a name of an element or a floating point defining
the radius directly. If the definition is made by specifying the name of an element,
the van der Waals radius of the element as defined in the static data file CHEMPAR is
used.
iact The particle is able to make a maximum of four different types of interactions, which
are each defined in one <iact> entry. It contains the contact type (<contact type>
as defined in the static data file CONTYPE), a contact group which must be 0 or 1,
the interaction geometry (<interaction geometry> as defined in the static data file
GEOMETRY), and a so-called <distance offset>. The interaction geometries must be
spherical. The distance offset can be used to shift the interaction center out of the
center of the particle. The contact group is used to restrict the number of interactions
of a specific set of interactions.
max_contacts This entry defines the maximum number of interactions for the two groups 0
and 1.
angle_bounds This entry defines an angular interval [<min angle>, <max angle>] in
which all pairwise enclosed angles between two interactions must lie. This enables
the definition of an approximately tetrahedral geometry.
receptor_contacts Particle positions are determined by clustering. All particle positions
with fewer than <min nof receptor contacts> are omitted after clustering.
hydrophobicity This entry defines how a contact between a ligand atom and the particle is
considered during ligand placement. It should be either hydrophobic or hydrophilic.
angle_scaling Because particle interactions are spherical, the angular penalty at the
particle side must be defined differently. For a specific ligand-particle interaction, the
minimum angle between the interaction and a receptor-particle interaction of the same
particle is calculated. This angle is compared with the <opt angle>. The angles <scale
from>, <scale to> have the same meaning as in the angle scaling definition in the
geometry database.
energy This entry defines two energy related parameters. The first parameter <missing
ia penalty> is a penalty for each interaction possible but not formed at a particle.
The second parameter <placement attraction energy> is used only for placing the
ligand molecule. It is the weight of the interaction in the superposition of the interacting
groups. Normally this is the energy contribution of the interaction itself. Because
particles are only roughly placed, the influence of ligand-particle interactions on the ligand
placement can be reduced by an independently set <placement attraction energy>.
delta This entry defines the point density on interaction surfaces during the generation of
particle positions.
charge This entry defines the formal charge of a particle of this type. It is especially
important if metal ions are to be modeled.
pp_distance_factor All particles placed in a docking solution are free of overlap. In some
cases, there should be an increased minimum distance between two particles, for
example in the case of metal ions. With this item, the minimum distance which is defined
by the van der Waals radii of the particle types can be modified by a factor depending
on whether the product of formal charges is negative, zero, or positive.
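The effect of pp_distance_factor can be illustrated with a short Python sketch. It assumes that the unmodified minimum distance between two particles is the sum of their van der Waals radii, which the text above does not state explicitly; the function name and the sample radii are hypothetical.

```python
def min_particle_distance(radius_a, radius_b, charge_a, charge_b,
                          factors=(1.0, 1.0, 1.0)):
    """Minimum allowed distance between two particles.

    factors = (<factor neg>, <factor zero>, <factor pos>), selected by the
    sign of the product of the two formal charges.
    Assumption: the base distance is the sum of the van der Waals radii.
    """
    factor_neg, factor_zero, factor_pos = factors
    product = charge_a * charge_b
    if product < 0:
        factor = factor_neg
    elif product > 0:
        factor = factor_pos
    else:
        factor = factor_zero
    return factor * (radius_a + radius_b)


# Two equally charged particles are kept further apart than neutral ones.
print(min_particle_distance(1.5, 1.5, 2.0, 2.0, factors=(1.0, 1.0, 1.5)))
print(min_particle_distance(1.5, 1.5, 0.0, 0.0, factors=(1.0, 1.0, 1.5)))
```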
11.9 *Charges of receptor atoms (static data file CHARGES)
For each atom in an amino acid template (see description of static data file AMINO
above) the user can define an atomic, partial charge value. Formal charges are defined
in amino_pcharges.dat, which has the same syntax. The CHARGES file is structured in
records exactly as the static data file AMINO (see there). Each record consists of a sequence
of lines. The line format is as follows:
<atom name> <charge>
The <atom name> must be in the list of atom names specified by the atom lines of the
AMINO file record for the template with the same name (it must exist). The <charge>
should be a reasonable floating-point value.
Here is an example of a complete section in the static data file stored in the CHARGES
environment variable; the template ala must exist in the AMINO file and must contain
atom lines for the atoms _n, _ca etc.:
Example
@template ala
_n -0.261
_ca 0.128
_c 0.203
_o -0.394
_cb -0.024
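A record structure like this is straightforward to read programmatically. The following Python sketch parses the text of a CHARGES file into a dictionary; it is an illustration only, the function name parse_charges is hypothetical, and treating # as the start of a comment is an assumption carried over from the other static data files.

```python
def parse_charges(text):
    """Parse '@template <name>' records with '<atom name> <charge>' lines."""
    charges = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        if line.startswith("@template"):
            parts = line.split()
            if len(parts) != 2:
                raise ValueError(f"malformed record header: {raw!r}")
            current = charges.setdefault(parts[1], {})
            continue
        if current is None:
            raise ValueError("charge line found before any @template record")
        atom_name, charge = line.split()
        current[atom_name] = float(charge)
    return charges


example = """@template ala
_n -0.261
_ca 0.128
_c 0.203
_o -0.394
_cb -0.024
"""
ala = parse_charges(example)["ala"]
print(ala)
print("sum of partial charges:", round(sum(ala.values()), 3))
```

Checking that each atom name also occurs in the matching AMINO template would require parsing that file as well and is not shown here.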
11.10 *Assigning data to the ligand: the subgraph data files
In contrast to the protein, which is constructed from a small set of building blocks (the
amino acids), the ligand can be any organic molecule. Assigning additional
physicochemical data to the ligand is therefore more complicated. In FlexX, a subgraph matcher is
provided for this. The patterns that can be matched to the ligand are defined in subgraph
data files.
There are different kinds of information which will be assigned by this mechanism:
formal charges to atoms (section 11.19), interaction types and geometries to interacting groups
(section 11.11), and torsion angle patterns to rotatable single bonds (section 11.12).
Starting from Release 2, additional extended cheminformatics features for rule-based structure
checking and modification of ligand structures are available, which will be described in the
SMARTS™ support section (11.13). Older versions use a FlexX-specific mechanism to
define subgraphs which is highly dependent on SYBYL atom types.
This section contains a description of the definition of subgraphs which is independent of
the data assigned to them.
11.10.1 Defining groups of atoms
With the @defgroup records at the beginning of a subgraph file, you can combine atom
types to form groups.
@defgroup <grp name> <atom type 1> [<atom type 2> ...]
An atom type can be an element of more than one group. The group names used must be
unique. The atom types follow the SYBYL notation [32]. Three groups are predefined: the
group R contains all atom types except hydrogen; the group RH represents an arbitrary
atom type; and the group RX contains all atom types except hydrogen and carbon. Note
that in subgraph files no wildcards are allowed, but you can use the asterisk character in the
definition of groups. Recursive definition of groups is not possible. The group mechanism
does not work for bonds.
Example
@defgroup N.* N.1 N.2 N.3 N.ar N.am N.pl3 N.4
@defgroup N2ar N.2 N.ar N.pl3
# hydrogen donor atom types
@defgroup DON O.2 O.3 N.1 N.2 N.3 N.ar N.am N.pl3 N.4
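To make the rules for @defgroup records concrete (unique group names, no recursive definitions), here is a small Python sketch that reads such records into a lookup table. It is an illustration only, not FlexX code; the function name parse_defgroups is hypothetical, and the predefined groups R, RH and RX are deliberately left out because they depend on the complete atom type list.

```python
def parse_defgroups(lines):
    """Read '@defgroup <name> <type> [<type> ...]' records into {name: set(types)}.

    Illustrative sketch: enforces unique group names and rejects members that
    refer to a previously defined group (no recursive definitions).
    """
    groups = {}
    for raw in lines:
        # '#' starts a comment, as in the example above.
        line = raw.split("#", 1)[0].strip()
        if not line.startswith("@defgroup"):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"incomplete @defgroup record: {raw!r}")
        name, members = fields[1], fields[2:]
        if name in groups:
            raise ValueError(f"group name {name!r} is not unique")
        nested = [m for m in members if m in groups]
        if nested:
            raise ValueError(f"group {name!r} refers to groups {nested}")
        groups[name] = set(members)
    return groups


EXAMPLE = [
    "@defgroup N.* N.1 N.2 N.3 N.ar N.am N.pl3 N.4",
    "@defgroup N2ar N.2 N.ar N.pl3",
    "# hydrogen donor atom types",
    "@defgroup DON O.2 O.3 N.1 N.2 N.3 N.ar N.am N.pl3 N.4",
]

groups = parse_defgroups(EXAMPLE)
```

Because an atom type may belong to several groups, a membership test is simply `atom_type in groups[name]`.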
11.10.2 Defining subgraphs
The second part of a subgraph file contains the subgraph definitions themselves. Each such
definition starts with the keyword @subgraph followed by three parameters. The record
consists of a list of atoms, a list of bonds and the data part:
@subgraph <class> <priority> <subgraph name>
<atom specification>
[<atom specification> ...]
<bond specification>
[<bond specification> ...]
data
<data area>
end
The parameter <class> allows you to divide the defined subgraphs into a set of classes.
Defining classes can significantly reduce the number of subgraph matches you need
to perform. Usage of the classification must be supported by the program part which is
responsible for the evaluation of the data part. The classification scheme is therefore ex-
plained in the following sections, where the two instances of subgraph files in FlexX are
described.
To use SMARTS™ in your subgraph definitions, just use the SMARTS™ pattern follow-
ing the keyword smarts instead of SYBYL-type atom and bond specifications. The order
of atoms in the SMARTS™ patterns is simply read from left to right, but note that a recur-
sive SMARTS™ specifies only one single atom (recursive, one atom: [$(C(=O)N)]; ordinary,
three atoms: C(=O)N). For recursive SMARTS™ see section 11.13.7.
@subgraph <class> <priority> <subgraph name>
smarts <SMARTS pattern>
data
<data area>
end
Subgraph definitions need not be disjoint, i.e. a subgraph can be a subgraph of another sub-
graph. This is an important feature, because it enables you to define more global subgraphs
as default rules, together with specializations of them. The subgraph matcher tries to match
every defined subgraph, even after the first hit has occurred. This can be avoided by assigning
priority numbers in the <priority> parameter. If a subgraph with a high priority is found,
subgraphs with a lower priority will not be matched anymore. Priority numbers are non-
negative values; the priority increases with the value, i.e. priority 0 is the lowest possible
priority.
The <subgraph name> is an arbitrary string (without newlines, blanks, tabs) which will be
written to FlexX's control output.
An atom specification consists of the keyword atom, a consecutive atom number starting
with 1, a type specifier and additional optional specifiers.
atom <atom no> <type specifier> [charge <op> <charge>] \
[nof_bonds <op> <nof bonds>] [excl_match]
The atom number is used to reference the atom in the following bond specifications and in
the data area. The type specifier can be a group, previously defined by a @defgroup record,
or a SYBYL atom type [32].
There are three types of optional additional specifiers. The charge specifier enables you to
restrict the matched atoms to atoms with a specific formal charge; <op> is one of ==,
>=, <=, >, <, or != and compares the formal charge with the given value. With the nof_bonds
specifier you can restrict the set of matched atoms to atoms with the specified number of
bonds. If excl_match (exclusive match) is specified, the atom is blocked for subgraph
matchings with the same subgraph.
The bonds are specied as follows:
bond <source atom no> <dest atom no> <bond type> [<dest atom no> <bond type>...]
The <source atom no> is the number of the first atom to which the bond is attached. A list
of destination atom numbers alternated with bond types then follows. Thus, a set of bonds
starting at atom <source atom no> can be defined in one bond specification. The bond
types follow the SYBYL notation [32]; no groups for bonds are allowed. The bond type un
can be matched to arbitrary bonds. Each bond must only be defined once, from atom a to
atom b or vice versa.
Example
@subgraph 0 1 Acceptor_N
atom 1 N2ar CHARGE 0.0 NOF_BONDS 2
atom 2 R
atom 3 R
bond 1 2 UN 3 UN
data
...
end
Example
@subgraph 0 1 Phenyl_group
atom 1 C.ar
atom 2 C.ar EXCL_MATCH
atom 3 C.ar EXCL_MATCH
atom 4 C.ar EXCL_MATCH
atom 5 C.ar EXCL_MATCH
atom 6 C.ar EXCL_MATCH
bond 1 2 ar 6 ar
bond 3 2 ar 4 ar
bond 5 4 ar 6 ar
data
...
end
The contents of the <data area> depend on the instance of the subgraph file and will be
explained in the following two sections.
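The compact bond notation used in the examples above (one source atom followed by alternating destination atoms and bond types) can be expanded into individual bonds as in the following Python sketch. It is an illustration only, not FlexX code: the function names are hypothetical, and treating keywords and bond types as case-insensitive is an assumption based on the examples, which use both UN and un.

```python
def expand_bond_line(line):
    """Expand 'bond <src> <dst> <type> [<dst> <type> ...]' into (src, dst, type) tuples."""
    tokens = line.split("#", 1)[0].split()  # drop trailing comments
    if len(tokens) < 4 or tokens[0].lower() != "bond" or len(tokens) % 2 != 0:
        raise ValueError(f"malformed bond specification: {line!r}")
    source = int(tokens[1])
    rest = tokens[2:]
    return [(source, int(rest[i]), rest[i + 1].lower()) for i in range(0, len(rest), 2)]


def collect_bonds(lines):
    """Expand all bond lines and check that each bond is defined only once,
    regardless of direction (a-b or b-a)."""
    seen = set()
    bonds = []
    for line in lines:
        for a, b, bond_type in expand_bond_line(line):
            key = frozenset((a, b))
            if key in seen:
                raise ValueError(f"bond between atoms {a} and {b} defined twice")
            seen.add(key)
            bonds.append((a, b, bond_type))
    return bonds


# Bond lines of the Phenyl_group example
phenyl_bonds = collect_bonds([
    "bond 1 2 ar 6 ar",
    "bond 3 2 ar 4 ar",
    "bond 5 4 ar 6 ar",
])
```

For the phenyl example this yields six aromatic bonds that close the ring, with every atom taking part in exactly two of them.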
11.11 *Ligand interaction groups (contact.dat)
The first application of the subgraph matcher is the detection of interaction groups in the
ligand. The subgraphs for this task are defined in the subgraph file contact.dat.
The data area in the subgraph definition consists of a list of interactions similar to the iact
entries in the amino database file amino.dat.
iact <atom no1> [<atom no2>] [<atom no3>] [<atom no4>] <contact type> \
<ia geometry>
The atom numbers refer to the matched atoms from the subgraph in question. The
<contact type> is one of the contact types defined in the static data file contype.dat.
The <ia geometry> is one of the interaction geometries defined in the static data file
geometry.dat. If the second or third atom is not defined, a - character must be typed
instead. If atoms should be explicitly excluded from any interaction, the keyword void can
be used instead of iact. Those atoms are matched, but no interaction is assigned, and they
cannot be part of any further interactions.
Whether the interaction geometry is matched by a particular ligand can be checked, e.g., in
the MOLINF output (see section 7.5.6).
11.12 *Ligand torsion database (torsion_standard.dat)
The second subgraph data file in FlexX is the torsion_standard.dat or the
torsion_fine.dat file. torsion_standard.dat contains subgraphs for the assign-
ment of energetically favorable torsion angles to acyclic single bonds and is based on the
old MIMUMBA model. torsion_fine.dat contains subgraphs for the assignment of 10
degree energy grids to acyclic single bonds and is based on the new MIMUMBA model
without preselection of torsion angles.
For your convenience, torsion_fine.dat is in the examples/ directory. You can paste
the contents of this file into the editor of your configuration dialog (cp. Sec. 10.1).
The data area in the subgraph definitions of this file looks as follows:
tangle/te <angle> <energy>
[ tangle/te <angle> <energy> ...]
period <period>
symmetry <sym>
The first part is a list of torsion angles combined with energy values; each pair <angle>,
<energy> is preceded by the keyword tangle/te. Then the periodicity of the torsion angles
(keyword period) and the symmetry (keyword symmetry) are defined. The set of torsion
angles is extended in the following way. First, all angles are mirrored at the symmetry
value. Then the angles are copied 360/<period> - 1 times; in the i-th copy, an offset of
i * <period> is added to each torsion angle. An example can be found in Figure 11.5. All
explicitly defined torsion angles (tangle/te) must be less than or equal to the symmetry value,
and the period value must be a multiple of the symmetry value. Both values must be defined.
Note that the subgraph matcher cannot distinguish between different stereoisomers.
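The expansion rule described above (mirror at the symmetry value, then repeat with the period) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FlexX implementation: the function name expand_torsions is hypothetical, mirroring is taken to mean a -> 2 * symmetry - a, angles are reduced modulo 360, and the period is assumed to divide 360.

```python
def expand_torsions(entries, period, symmetry):
    """Expand (angle, energy) pairs into the full torsion grid.

    entries  -- list of (angle, energy) pairs from the tangle/te lines
    period   -- value of the 'period' keyword (assumed to divide 360)
    symmetry -- value of the 'symmetry' keyword
    Returns a list of (angle, energy) pairs sorted by angle.
    """
    if period % symmetry != 0:
        raise ValueError("period must be a multiple of the symmetry value")
    if any(angle > symmetry for angle, _ in entries):
        raise ValueError("explicit angles must not exceed the symmetry value")
    grid = {}
    for angle, energy in entries:
        # Step 1: mirror the angle at the symmetry value.
        for mirrored in (angle, 2 * symmetry - angle):
            # Step 2: the original (i = 0) plus 360/period - 1 shifted copies.
            for i in range(360 // period):
                grid[(mirrored + i * period) % 360] = energy
    return sorted(grid.items())


# Entry of Figure 11.5: mirrored at 45 degrees, repeated every 90 degrees
figure_grid = expand_torsions([(10, 5), (15, 3), (20, 0)], period=90, symmetry=45)

# A near-planar setting: 0/5/175/180 degrees with symmetry 180 and period 360
planar_grid = expand_torsions(
    [(0, 0.0), (5, 1.0), (175, 1.0), (180, 0.0)], period=360, symmetry=180
)
```

With the second set of values the sketch yields the grid points 0, 5, 175, 180, 185 and 355, which agrees with the grid FlexX reports for the sample case in section 11.12.2.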
11.12.1 Constraining amides to planarity
For your convenience, with Release 2 we have predefined (but NOT activated) the
enforcement of planar amides, i.e. the torsional angle R-(NH)-(CO)-R' will be constrained to
0° or 180°. To activate the respective settings, you must uncomment the respective
lines in your currently active torsion data in your configuration. To find out what you
have loaded, please go to File -> Global Preferences -> Parameters & Flags
and search for TORSION.
We restricted the definition to those amides which carry an H at the amide nitrogen atom.
You will find the definitions at the end of the torsional database files. Note that if you
would like to extend or alter the definitions, you could as well use the SMARTS™ subgraph
language. Here, the first three bonds of the SMARTS™ pattern define the torsion angle.
11.12.2 Fixing torsional angles at specified values: a sample case
Let us clarify this by means of a simple example.
We assume we want to constrain the docking of a diketo compound flanked by an aromatic
atom to planarity (see footnote 1).
Footnote 1: Since our case is a special extension of fragments already covered by the existing
subgraph definitions in torsion_standard.dat, we did not include it in the distribution.
Figure 11.5: Example for a torsion.dat entry: This plot shows the potential for the following
entry. The original values are plotted in black. These values are mirrored at 45 degrees
(outlined boxes) and the resulting potential is repeated every 90 degrees (grey boxes).
tangle/te 10 5
tangle/te 15 3
tangle/te 20 0
period 90
symmetry 45
Figure 11.6: A simple diketo compound in mixed tautomeric form.
One way of doing this is to define a suitable subgraph including the assignment of a few
low-energy grid points on the grid of allowed torsion angles. In the following, we will
assume we want to work with the following settings: TORSION_MODE should equal 0, and
the value of TORSION (set in your configuration) should accordingly point to the contents
of the counterpart of a traditional torsion_standard.dat static data set; please contact us
for more details. The idea is to tighten certain rotatable bonds.
It is a good idea to construct a sample compound. We have done so starting from a simple
SMILES string (O=C(C=C(O)c1ccccc1)c2ccncc2).
In Fig. 11.6 we show what the compound looks like after reading in with FlexX and calling
a drawing command with FlexV.
Our task is to constrain the rotatable bonds C6-C4 and C2-C12. We can trace our progress
by reading in the ligand with verbosity set to 10. The output contains lines such as:
>> Identification of rotatable bonds
...
bond C2|2 --- C12|12
/ 179 \
ref. C3|3 C17|17
-----------------------------------------------------------------------------
--> (period 180) : 0 30 60 120 150
energy : 0.0 1.0 9.0 9.0 1.0 0.0 1.0 9.0 9.0 1.0
<-- (period 360) : 0 30 60 120 150 180 210 240 300 330
...
indicating that the bond between C2 and C12 read from the input file is currently associ-
ated with a torsional angle of 179 degrees. Further, the currently valid torsion angle grid
points including the listed energies are given. Bear in mind that if there were no subgraph
matchings onto our fragment of interest, the default procedure would have been applied:
the torsion angle grid points would be distributed on a 30 degree grid. In that case, the
output at verbosity 10 would look like this:
>> WARNING: Empty list of torsion angles at bond XX|X --> YY|Y .
>> WARNING: No torsions, taking 30 degree grid with arbitrary reference atoms.
bond C4|4 --- O5|5
/ 143 \
ref. C6|6 H19|19
-------------------------------------------------------------------------------
--> (period 360) : 0 30 60 90 120 150 180 210 240 270 300 330
energy : 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<-- (period 360) : 0 30 60 90 120 150 180 210 240 270 300 330
Now we must define a subgraph in such a way that the specified atoms match the desired
rotatable bond and insert tangle/te values to which the fragment build-up procedure will
be constrained. One possible solution is a subgraph weak enough to match both the keto
and the enol side of a compound such as the one depicted in Fig. 11.6. The first four atoms
define the surrounding of the bond to be restricted to a planar environment. Some @defgroup
statements allow for compressing the definition.
@defgroup C2 C.2 C.ar
@defgroup Aratm C.2 C.ar N.ar N.2 S.2 O.2 N.pl3
@defgroup Oketoenol O.co2 O.2 O.3
@defgroup Ca_ketoenol C.2 C.3 C.ar
#
@subgraph special 6 diketo_compound_flanked_by_aromatics
atom 1 Aratm
atom 2 Aratm
atom 3 C2 # a C bonded to keto-O
atom 4 Ca_ketoenol # the C in between two CO groups. might be CH2, or CH
atom 5 Oketoenol # either an hydroxyl-O or carbonyl-O
atom 6 C2 # bonded to enol-O
atom 7 Oketoenol # either an hydroxyl-O or carbonyl-O
bond 1 2 ar
bond 2 3 1 # the rotatable bond
bond 3 4 un # 1 or ar
bond 3 5 un # 2 or ar
bond 4 6 un # 1 or ar
bond 6 7 un # 1 or 2 or ar, depending on keto-enol tautomer.
data
tangle/te 0 0
tangle/te 5 1
tangle/te 175 1
tangle/te 180 0
period 360
symmetry 180
end
The priority of the subgraph must be high enough to overrule existing subgraph definitions,
which is why we put a value of 6 in the @subgraph line (see section 11.10.2). Checking with
FlexX reveals that the subgraph has been assigned properly:
bond C2|2 --> C12|12
Subgraph: diketo_compound_flanked_by_aromatics
Direction:-b Number of matches (different sets of torsion angles): 2 ( 2)
1 2 3 4
Match 1: < C17|17 > < C12|12 > < C2|2 > < C3|3 >
Match 2: < C13|13 > < C12|12 > < C2|2 > < C3|3 >
Figure 11.7: Allowed torsional angles and associated energies for the subgraph example.
The respective points are coloured in red; grid points above 180 degrees are generated due
to the symmetry entry in the subgraph definition.
bond C2|2 --- C12|12
/ 0 \
ref. C3|3 C13|13
----------------------------------------------------------------------------
--> (period 180) : 0 5 175 180 185 355
energy : 0.0 1.0 1.0 0.0 1.0 1.0
<-- (period 360) : 0 5 175 180 185 355
...
bond C4|4 --> C6|6
Subgraph: diketo_compound_flanked_by_aromatics
Direction:-b Number of matches (different sets of torsion angles): 2 ( 2)
1 2 3 4
Match 1: < C11|11 > < C6|6 > < C4|4 > < C3|3 >
Match 2: < C7|7 > < C6|6 > < C4|4 > < C3|3 >
bond C4|4 --- C6|6
/ 0 \
ref. C3|3 C7|7
----------------------------------------------------------------------------
--> (period 180) : 0 5 175 180 185 355
energy : 0.0 1.0 1.0 0.0 1.0 1.0
<-- (period 360) : 0 5 175 180 185 355
The subgraph definition in fact matches from both sides and assigns energy values and
torsion angle grid points as desired. We have plotted the allowed torsional grid values in
Fig. 11.7. We also left a little flexibility of 5 degrees, which in practice has produced better
results and is probably also more realistic.
11.13 SMARTS™ support
Another mechanism to define subgraphs is available from Release 2 of FlexX and is
based on the SMARTS™ syntax introduced by Daylight Chemical Information Systems Inc.
(www.daylight.com). Similar to SMILES, which can be used to define small molecules using
a line notation, SMARTS™ is the corresponding code to define subgraphs. As an example,
let's say [OH]c1ccccc1 defines phenol; the identical pattern used in SMARTS™ identifies
a phenol group in a molecule. In principle, SMARTS™ is an extension of SMILES.
Whereas a SMILES expression defines one specific molecule, SMARTS™ defines patterns
that match groups or families of molecules. This is achieved by a set of descriptors that
must be matched by a substructure to be recognized. In contrast to the few rules defined
in the classical FlexX subgraph definition language, which is additionally highly dependent
on SYBYL's atom type specification, SMARTS™ is more chemistry based and has many
more rules to define subgraphs of high complexity. At the moment, the implementation
of the SMARTS™ matching mechanism is incomplete in terms of chirality and directional
bonds. On the other hand, there are some extensions to the standard SMARTS™ rules. In
the following sections, a complete overview of the actual SMARTS™ implementation in
FlexX is given. It is described how classical subgraph definitions may be redefined using
SMARTS™ and how those patterns are used for atom type assignment, structure correction
and aromaticity perception.
11.13.1 Atomic primitives
The SMARTS™ definition used in FlexX is based on SMARTS™ version 4.83 and should
comply with most existing implementations in other software packages.
The following table gives an overview of the available atomic properties.
Symbol    Symbol name                      Default   Atomic property requirements
*         wildcard                                   Any atom (including hydrogen)
#<n>      atomic number                    none      Element with number <n>, e.g. [#6] means any carbon
{x}       atom type                        [O.co2]   Explicit SYBYL atom type
a         aromatic                                   Aromatic (see 11.13.3)
A         aliphatic                                  Aliphatic (see 11.13.3)
D<n>      degree                           1         <n> explicit connections, including explicit hydrogens
H<n>      total H count                    1         <n> attached hydrogens (see 11.13.4)
h<n>      implicit H count                 1         <n> implicit hydrogens (see 11.13.4)
R<n>      ring membership                  any       In <n> SSSR rings (see 11.13.2)
r<n>      smallest SSSR ring has size <n>  any       In smallest SSSR ring of size <n> (see 11.13.2)
v<n>      valence                          1         Total bond order (see 11.13.4)
X<n>      <n> total connections            1         Number of bonds including bonds to implicit hydrogens (see 11.13.4)
-<n>      formal charge == <n>             -1        See 11.13.4
+<n>      formal charge == <n>             +1        See 11.13.4
@         chirality                                  Anticlockwise (NOT IMPLEMENTED)
@@        chirality                                  Clockwise (NOT IMPLEMENTED)
@<c><n>   chirality                                  Chiral class <c> chirality <n> (NOT IMPLEMENTED)
@<c><n>?  chirality                                  Chirality <c><n> or unspecified (NOT IMPLEMENTED)
<n>       atomic mass                                Explicit atomic mass (NOT IMPLEMENTED)
^<x>      hybridization state              [^2]      Explicit hybridization state out of (s,1,2,3,ar) (SMARTS™ extension)
All atomic primitives except * must be written in square brackets []. Element symbols are
usually written without square brackets, but if they are part of logical expressions the
brackets are necessary as well.
There is no special rule for grouping several properties, but if a rule is composed of several
properties like [Cv3X2AD3-2], the description of the basic element must be the first state-
ment. This means that [C+] is OK, but [+C] is not allowed. This is to prevent things like
[CD3X2N]. Here we have a carbon as principal element of the atom; the nitrogen at the end
is not allowed in this context and will be omitted. If no symbol is defined, as in the pattern
[D1], then the rule [*,H;D1] is used. Elementary operators are element symbols (C,N,O,...), *,
SYBYL types ({N.3},{O.co2},...) and element specifications ([#n]).
11.13.2 Ring perception
Ring systems are written as defined in SMILES, where atoms get labels for ring closure,
so C1CCCCC1 means a cyclohexane ring and N1=CCc2ccccc21 defines an indole system.
SMARTS™ uses identical rules, but any atom can be replaced by any SMARTS™ atom
description. The simple and identical rules [R] and [r] mean that the desired atom must be
part of a ring system. The rule [R2] will match any atom that is part of two rings, and the rule
[r5] means that the smallest possible ring that the atom belongs to has a size of five atoms.
These primitive rules can be combined to [R2r5], which would match two atoms in indole, for
example.
11.13.3 Aromaticity perception and hybridization states
Aromaticity is a more problematic atomic property, due to different existing definitions.
FlexX traditionally follows the SYBYL definition of aromaticity, where only six-membered
rings that consist of nitrogen and carbon atoms only are accepted as aromatic if they
are planar with six aromatic bonds or alternating single and double bonds. Atoms in such
systems get the SYBYL atom type N.ar or C.ar.
Additionally, many other ring systems must be treated as aromatic, not only for SMARTS™
but from the chemical point of view as well. A few examples are thiophene or furan and
other five-membered and adjacent ring systems such as indole. FlexX treats all ring systems
that fit Hueckel's rule (4n+2 pi-electrons in a planar ring system) as aromatic as well. In
addition, planar ring systems that miss Hueckel's rule are checked for special subgraphs
which should be treated as aromatic as well (refer to transform.dat).
To achieve this behavior, the aromatic property is represented by two possible attributes.
On the one hand, an atom can be simply aromatic irrespective of its assigned hybridization
state or SYBYL atom type. On the other hand, SYBYL atom types assign a hybridization
state to each atom, and if it is of type N.ar or C.ar the atom is aromatic as well.
Hybridization states are highly related to the atom types because a SYBYL atom type con-
sists in principle of an element and a hybridization state (refer to chempar.dat, where the
atom types are defined). SMARTS™ basically does not know anything about hybridization
states, but the information is sometimes useful, so in FlexX it is possible to change or check
the hybridization state of an atom using the [^<type>] expression. This is a feature that is
not originally defined by Daylight, but occurs in other SMARTS™ implementations and
was included in FlexX too.
This property only makes sense in combination with specific atom types that handle infor-
mation about hybridization states. Permitted hybridization states are currently s, 1, 2, 3, ar.
The states 1, 2, 3 mean sp1, sp2 and sp3 hybridization respectively. But be careful: [^ar] has
a double meaning, because an atom can be aromatic but not necessarily have the hybridiza-
tion state aromatic. This is due to different representations of aromatic systems, where the
SMILES expression c1ccccc1 represents benzene as well as the kekule form
C1-C=C-C=C-C=1, but in the first case all atoms get the hybridization state aromatic, in the
second they will have an SP2 hybridization, and all these atoms match the pattern [c]. So to
get it right, use [a] for Hueckel aromaticity and [^ar] for SYBYL aromaticity.
This behavior implies a special handling of bonds between aromatic atoms too. The sim-
ple case is that a bond has the bond type aromatic, which is always matched as an aromatic
bond. But to match something like a:a or aa in a thiophene, the underlying single and double
bonds must be accepted as well, because a thiophene is usually represented by alternating
single and double bonds. So a:a or aa matches
all bonds between two aromatic atoms irrespective of the actual bond type. Sometimes aro-
matic bonds are used to represent delocalized systems as well. For those cases it may be
useful to match an aromatic bond between aliphatic atoms, e.g. [A]:[A], or to find a bond of
type aromatic in an aromatic system with a pattern like [a]:;!=;!-[a] that excludes single and
double bonds in aromatic systems.
11.13.4 Implicit hydrogens, valences and formal charges
SMARTS™ includes several rules that depend on connected hydrogens. Usually FlexX ex-
pects a completely and correctly protonated molecule as ligand.
If the transformation level 4 (H) is enabled (see command SELINIT (7.5.4)), then missing
hydrogens can be assigned by FlexX itself.
To do this correctly and to match rules that expect implicit hydrogens ([X] or [h]) that are not
explicitly given in the molecule structure, FlexX calculates the number of missing hydrogens
based on the number of adjacent heavy atoms and the assigned bond order. Every bond
contributes valence fractions to the overall bond order of an atom. The total valence of an
atom is calculated by
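\[
\mathrm{valence} = b_{\mathrm{single}} + b_{\mathrm{amide}} + 2\, b_{\mathrm{double}} + 1.5\, b_{\mathrm{aromatic}} + 3\, b_{\mathrm{triple}}
\]
where each b denotes the number of bonds of the respective type at the atom.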
Odd numbers of aromatic bonds yield fractional valences and FlexX reports a warning for
those cases. For internal reasons fractional valences are usually rounded up by 0.5 to
the next full valence (a single aromatic bond equals a valence of two, three aromatic bonds a
valence of 5 and so on). The valence can be directly checked by the [v<n>] atomic primitive.
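The valence bookkeeping described above can be sketched as follows. This Python fragment is an illustration only, not FlexX code: the names HALF_UNITS and total_valence are hypothetical, and the contributions are counted in half-valence units so that the rounding of fractional values stays exact.

```python
# Bond contributions in units of half a valence:
# single = 1, amide = 1, double = 2, aromatic = 1.5, triple = 3
HALF_UNITS = {"single": 2, "amide": 2, "double": 4, "aromatic": 3, "triple": 6}


def total_valence(bond_counts):
    """Return (valence, was_fractional) for a dict {bond type: number of bonds}.

    Fractional totals (odd numbers of aromatic bonds) are rounded up by 0.5
    to the next full valence, following the rule stated in the text.
    """
    half_units = sum(HALF_UNITS[kind] * count for kind, count in bond_counts.items())
    was_fractional = half_units % 2 == 1
    valence = (half_units + 1) // 2  # rounds x.5 up, leaves whole numbers alone
    return valence, was_fractional


one_aromatic = total_valence({"aromatic": 1})
three_aromatic = total_valence({"aromatic": 3})
aromatic_carbon = total_valence({"aromatic": 2, "single": 1})
```

The flag in the second position marks the cases for which, according to the text, FlexX would report a warning.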
For each chemical element a number of allowed valence states are defined in the static data
file chempar.dat in the @valence_states section, where additional properties such as the
resulting formal charge and the number of free electrons for each state are defined. Refer to
11.5.4 for more information.
It is assumed that negative formal charges will usually be found on acids or need to be com-
pensated by missing hydrogens, so that the properties [C-1] and [Ch1] would have more or
less identical meaning. SMARTS™ provides a mechanism which allows you to work with
implicit hydrogens for a given number of bonds [X<n>]. So [CX4] matches any carbon that
is connected to four neighbors, including implicit hydrogens that were not necessarily given
in the molecule. So this pattern would match a terminal methyl group independent of the
number of assigned hydrogens in the range from 0-4, including methane and all negatively
charged derivatives (C,[C-4],[C-3],[C-2],[C-1]), and of course any carbon connected to four
heavy atoms. The number of implicit hydrogens is therefore the total number of addable
hydrogens to reach a neutral state of an atom.
Note that this assumption is made for all atoms including nitrogen and oxygen, which means
that both C[O-] and C[O] will match the pattern [X2] (one explicit bond to carbon and, in
the case of the alcoholate, one potential/implicit hydrogen). Formal charges have a dou-
ble meaning in FlexX. On the one hand the formal charge is defined by the valence state,
which is usually correct. On the other hand FlexX uses so-called delocalized systems, where
integral formal charges are distributed over several atoms. It is highly recommended that
charge rules are used only for charges and not for missing hydrogens. As a FlexX-specific
extension to SMARTS™, testing for fractional charges such as [O-0.5] is allowed. If no num-
ber is given, + and - characters are counted and summed to get the total charge. To give
some examples, [-] means any atom with charge -1 and [Ca++] means a calcium ion. But
[*+-+-] is identical to [+0] or [-0] and would match only uncharged atoms.
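The counting rule for charge specifications can be written down in a few lines. The following Python sketch is only an illustration of the rule as stated above, not FlexX code; the function name parse_charge is hypothetical, and it handles just the charge part of an atom expression (for example the ++ in [Ca++] or the -0.5 in [O-0.5]).

```python
import re


def parse_charge(spec):
    """Interpret the charge part of an atom expression and return it as a float.

    '++' or '+-+-' : sign characters are counted and summed
    '-1', '+0'     : sign followed by an explicit number
    '-0.5'         : fractional charge (FlexX-specific extension per the text)
    """
    if re.fullmatch(r"[+-]+", spec):
        return float(spec.count("+") - spec.count("-"))
    match = re.fullmatch(r"([+-])(\d+(?:\.\d+)?)", spec)
    if match is None:
        raise ValueError(f"not a charge specification: {spec!r}")
    sign = 1.0 if match.group(1) == "+" else -1.0
    return sign * float(match.group(2))


calcium_ion = parse_charge("++")
mixed_signs = parse_charge("+-+-")
half_charge = parse_charge("-0.5")
```

Mixed sign sequences cancel out, which is why [*+-+-] behaves like [+0].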
11.13.5 Bond primitives
The supported bond types are shown in the following table:
Symbol  Atomic property requirements
-       Single bond
=       Double bond
#       Triple bond
-^      Amide bond (FlexX-specific extension)
:       Aromatic bond
~       Any bond
@       Any ring bond
.       Not connected
The default or implicit bond type in SMARTS™ is single OR aromatic (-,:). As a FlexX-
specific extension the amide bond type (-^) has been introduced because FlexX uses this
bond type internally to represent the non-rotatable bond in amide groups. This SMARTS™
expression should only be used for FlexX-specific patterns and you should not expect to find
it in any other SMARTS™ implementation.
The amide bond is usually compatible with a single bond, so to match it, it is sufficient to
write a single bond. In ring systems like C1CCCC1, the ring closing bond is usually taken
as any ring bond, so the expression C1CCCC@1 is absolutely identical. You can see
that a bond type given before the closing ring identifier specifies the closing bond. A bond
descriptor can appear before or after a ring label. It is important to note that
a bond specifier before the label specifies the type of the closing bond (e.g. C=1CCCCC1
equals C1CCCCC=1) whereas a bond specifier after a label specifies the bond to the follow-
ing atom (e.g. C1=CCCCC1). The not-connected bond (.), the so-called component-level
grouping, is not really supported by FlexX, because salts and complexes that are not cova-
lently connected are currently not supported and such structures are rejected upon loading.
But in transformation patterns, this kind of bond may be used to cut molecules into frag-
ments, where the right part of the expression/molecule will be removed. As an example the
command
transform c1ccccc1C(=O)-OCC c1ccccc1C(=O)-[OH].CC
removes the ethyl group from the ester and yields the free benzoic acid. The resulting
molecule from the right-hand side of the transform command will have the ethyl group
deleted.
11.13.6 Logical operators
SMARTS™ allows atomic primitives to be combined using a couple of logical operators
given in the table below.
Symbol  Expression     Example, meaning
!       not            [!#6] means any atom, but not carbon
,       or             [#6,#7] means any carbon or any nitrogen
none    and (highest)  [X2H1] combines two expressions tightly into one
&       and (high)     [a&#6] aromatic carbon
;       and (low)      [a;#6,#7] aromatic carbon or aromatic nitrogen
In contrast to most computer languages, SMARTS™ does not use brackets to define the
precedence of an expression. It uses two kinds of logical AND expressions. Additionally,
FlexX actually uses three ways to define the AND expression. The most recommended type
is the implicit AND expression, which has the highest precedence. Let's look at an example.
[a#6] and [a&#6] have identical meaning, but different atomic primitives are stored in one
subgraph vertex. Atomic primitives separated by logical expressions are stored in a vari-
ants queue that is interpreted sequentially based on the precedence of the logical expres-
sion. However, the expression [X2H2] has a higher precedence than [X2&H2]. As another
example take [X2H2&a,C]. The highest precedence is assigned to [X2H2], after that [a] is
recognized, and finally a logical OR test on carbon is performed.
Logical expressions may be given for bond types as well, the pattern *@;-,=&!#* matches two
atoms connected by a cyclic bond that may be a single or double bond, but not a triple bond.
The implicit AND combination is not allowed for bond types.
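The effect of the two explicit AND operators is easiest to see by evaluating an expression against a concrete atom. The Python sketch below is an illustration only and not how FlexX stores or evaluates patterns: it follows the precedence of standard SMARTS (! binds tightest, then &, then the comma, then ;), supports only the primitives a, A and #<n>, and ignores the implicit AND and recursive patterns. The function names and the dictionary layout of an atom are assumptions made for this example.

```python
def atom_matches(expression, atom):
    """Evaluate the inside of a bracket atom, e.g. 'a;#6,#7', against an atom.

    atom is a dict with the keys 'atomic_number' (int) and 'aromatic' (bool).
    """
    # ';' is the low-precedence AND: every part must match.
    return all(_match_or(part, atom) for part in expression.split(";"))


def _match_or(expression, atom):
    # ',' is OR: at least one alternative must match.
    return any(_match_and(part, atom) for part in expression.split(","))


def _match_and(expression, atom):
    # '&' is the high-precedence AND.
    return all(_match_primitive(part, atom) for part in expression.split("&"))


def _match_primitive(token, atom):
    if token.startswith("!"):
        return not _match_primitive(token[1:], atom)
    if token == "a":
        return atom["aromatic"]
    if token == "A":
        return not atom["aromatic"]
    if token.startswith("#"):
        return atom["atomic_number"] == int(token[1:])
    raise ValueError(f"unsupported primitive: {token!r}")


aromatic_n = {"atomic_number": 7, "aromatic": True}
aliphatic_n = {"atomic_number": 7, "aromatic": False}

low_and = atom_matches("a;#6,#7", aliphatic_n)   # aromatic AND (C OR N)
high_and = atom_matches("a&#6,#7", aliphatic_n)  # (aromatic AND C) OR N
```

For an aliphatic nitrogen the first expression fails, because the aromatic test applies to both alternatives, while the second succeeds through its plain nitrogen alternative.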
11.13.7 Recursive SMARTS™
Any SMARTS™ expression may be used to define an atomic environment. The definition
of such recursive patterns is usually enclosed in $(). The specified atom must be the first
atom of the recursive expression and represents an atomic property like any other atomic
primitive. As an example, let's take a carbonyl group pattern C=O. To specify the carbonyl
carbon only, you can write [$(C=O)]. The resulting subgraph specifies only one atom; the
double bond to the carbonyl oxygen is just used as a further constraint. As another example,
an amide nitrogen can be represented by [$(N C(=O))].
To use recursive SMARTS™ such as [$(C(O)O)] in a batch script, use the following
expression: " [$(C(O)O)] ". Note the blanks between the quotation marks and the brackets!
11.13.8 Branches
Branches are represented by brackets "()". A carboxylic acid can be represented by C(=O)O;
the branching bond (=O) must be included in the brackets. The expression C=(O)O would
be accepted as well, but the double bond will be assigned to the next atom. Thus that expres-
sion is identical to C(O)=O.
11.14 Defining subgraphs using SMARTS™
The most important application of SMARTS™ in FlexX is to enhance the subgraph defi-
nition using a more standardized mechanism. As an example, let's take the Acceptor_N
definition from the previous section.
@subgraph 0 1 Acceptor_N
atom 1 N2ar CHARGE 0.0 NOF_BONDS 2
atom 2 R
atom 3 R
bond 1 2 UN 3 UN
data
...
end
This definition can be written using SMARTS™:
@subgraph 0 1 Acceptor_N
smarts [nD2+0](~*)~*
data
...
end
The atomic order in SMILES matches the occurrence in the pattern.
11.15 Using templates, vector bindings
An extension of the recursive pattern recognition mechanism in SMARTS™ is the usage of
so-called vector bindings or substructure templates that can be predefined by the keyword
@vector_binding in any static data file. Once defined, its name can be used as a template for
recursive expressions.
@vector_binding phenol [$(Oc1ccccc1)]
@vector_binding ester [$([OD2]-C(=O)*)]
@subgraph 1 0 some_oxygen
smarts [$phenol,$ester]
data
...
end
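The way a named vector binding stands in for a recursive expression can be sketched in a few
lines of Python. This is only an illustration, not the FlexX implementation: the dictionaries,
the function names lookup and expand, and the textual substitution itself are assumptions made
for the example. The lookup order (local file first, then all other static data files) follows
the description in this section.

```python
import re

# Hypothetical in-memory tables; in FlexX the bindings live in static data files.
LOCAL_BINDINGS = {
    "phenol": "[$(Oc1ccccc1)]",
    "ester": "[$([OD2]-C(=O)*)]",
}
OTHER_FILE_BINDINGS = {
    "carbonyl_c": "[$(C=O)]",
}

# A reference is "$" followed by a name; "$(" (a recursive expression) is not matched.
_REFERENCE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def lookup(name, local, fallback):
    """Resolve a binding name, preferring the local file (assumed lookup order)."""
    if name in local:
        return local[name]
    if name in fallback:
        return fallback[name]
    raise KeyError(f"no vector binding named {name!r}")


def expand(pattern, local, fallback):
    """Replace every $name in the pattern by the body of the named binding."""
    def substitute(match):
        body = lookup(match.group(1), local, fallback)
        # Drop the outer atom brackets so the body fits inside another atom expression.
        if body.startswith("[") and body.endswith("]"):
            body = body[1:-1]
        return body
    return _REFERENCE.sub(substitute, pattern)


expanded = expand("[$phenol,$ester]", LOCAL_BINDINGS, OTHER_FILE_BINDINGS)
print(expanded)  # [$(Oc1ccccc1),$([OD2]-C(=O)*)]
```

The expanded string is the pattern one would otherwise have to write by hand in the smarts
line of the subgraph.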
Readability is much better with predefined vector bindings. Usually vector
binding definitions are local, but if no appropriate vector binding is found in the local file,
FlexX searches all other static data files for a vector binding definition with the given
name. Patterns for SYBYL atom types are defined as vector bindings as well. A classical
@defgroup statement definition like
@defgroup Nar N.2 N.ar N.pl3 N.am
can be redened as
@vector_binding Nar [$N.2, $N.ar, $N.pl3, $N.am]
@vector_binding Nar2 [{N.2},{N.ar},{N.pl3},{N.am}]
@subgraph 1 0 my_subgraph
smarts $Nar
data
...
end
Note: The definition of Nar2 is only valid in the FlexX environment. To write compatible
SMARTS™, we recommend the more common vector binding syntax instead of direct
SYBYL-type matching. However, direct SYBYL-type matching is much faster
because the SYBYL types are represented internally by single numbers and no further
subgraph matching is necessary.
11.16 Transforming molecules via SMARTS™
The subgraph recognition mechanism based on SMARTS™ is extended by a substructure
modification facility called transformation. A new command transform in the ligand menu
allows direct application of transformation rules.
The static data file transform.dat contains all rules used during molecule import.
A transformation rule is defined by two SMARTS™ patterns, a match pattern and a
transformation pattern. The match pattern describes a subgraph in a molecule, whereas the
transformation pattern describes properties that should be assigned to the recognized atoms.
A simple example is given below:
transform [CD3](~[OD1])[OD1] >> *(=[+0.0])-[-1.0]
The left side matches terminal carboxylic acid groups. The right side explicitly defines the
properties to be set. It starts with an asterisk that corresponds to the carbon atom in the
match pattern. An asterisk in a transform pattern is just a placeholder that keeps the atom
enumeration identical to the match pattern; nothing is done to this atom. The bond
between the carbon and the first oxygen should be a double bond and is set to this type if
necessary. A formal charge of 0.0 is then assigned to the atom matching the first
oxygen; this is the only property of that atom that is changed. The bond between the carbon
and the second oxygen becomes a single bond, and a formal charge of -1.0 is assigned to the
second oxygen. Atom properties that are not explicitly defined in the transform pattern are
not affected.
11.16.1 Formal charges and hydrogens
A typical application of these transformation rules is to assign formal charges and
protonation states to an atom. Formal charge and protonation state depend on each
other, so the two transformation rules below have identical meaning: each implies the
other.
transform [CD3](~[OD1])[OD1] >> *(=[+0.0])-[O-]
transform [CD3](~[OD1])[OD1] >> *(=[+0.0])-[OH0]
The situation becomes more difficult if the target atom is not part of an acidic or basic group,
but just a carbon atom. The first rule below (1) neutralizes all groups: every atom gets as
many hydrogens as it needs to reach a formal charge of 0. The second rule (2) is less
obvious: not all atoms are part of titratable groups, yet the transform command would add
hydrogens to every atom until a formal charge of +1 is reached for that atom. To avoid being
trapped in such cases, it is a good idea to specify the number of hydrogens together with the
corresponding formal charge, as in example (3). There we have a guanidinium-like carbon
atom; a formal charge of +1 is placed on the carbon, which then has the SYBYL type
C.cat.
(1) transform [*] >> [+0]
(2) transform [*] >> [+1]
(3) transform [$(C(N)(N)N)] >> [C.catH0+1]
So the [Hn] rule has priority over the charge assignment. If an explicit number of
hydrogens is specified, an additional formal charge is just assigned without any influence on
the protonation. If no protonation is specified, the number of hydrogens is adjusted to reach
the specified formal charge (refer to 11.5.4 for details). The example below shows a potential
rule to delocalize a carboxylic acid. Here all hydrogens are removed and a formal charge of
-0.5 is then applied to both oxygen atoms.
transform C(~O)~O >> C(:[OH0-0.5]):[OH0-0.5]
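The interplay between [Hn] and the formal charge can be mimicked with a small Python model.
This is a sketch under stated assumptions, not the FlexX code: the Atom class and the assign
function are hypothetical, and the model assumes that each added or removed hydrogen shifts
the formal charge by exactly one unit.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    element: str
    hydrogens: int
    charge: float


def assign(atom, charge=None, hydrogens=None):
    """Apply a charge and/or hydrogen count in the spirit of the rules described above."""
    if hydrogens is not None and charge is not None:
        # Explicit [Hn] has priority: both values are set as given,
        # and the charge does not influence the protonation.
        atom.hydrogens = hydrogens
        atom.charge = charge
    elif hydrogens is not None:
        # Only [Hn] given: the charge follows the change in protonation (assumption).
        atom.charge += hydrogens - atom.hydrogens
        atom.hydrogens = hydrogens
    elif charge is not None:
        # Only a charge given: hydrogens are added or removed to reach it (assumption).
        delta = round(charge - atom.charge)
        atom.hydrogens = max(0, atom.hydrogens + delta)
        atom.charge = charge
    return atom


acid_a = assign(Atom("O", hydrogens=1, charge=0.0), charge=-1.0)   # like [O-]
acid_b = assign(Atom("O", hydrogens=1, charge=0.0), hydrogens=0)   # like [OH0]
trapped = assign(Atom("C", hydrogens=4, charge=0.0), charge=1.0)   # like rule (2)
safe = assign(Atom("C", hydrogens=0, charge=0.0), charge=1.0, hydrogens=0)  # like rule (3)
print(acid_a, acid_b, trapped, safe, sep="\n")
```

In this model the two carboxylate variants end in the same state, rule (2) pushes a fifth
hydrogen onto a saturated carbon, and the combined form of rule (3) leaves the hydrogen count
alone.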
Note: The transform rules are processed from left to right. The order of occurrence of the
labels must therefore be identical on both sides of the rule, e.g.
[C:1](=O)[C:2] >> [C:1](=O).[C:2] # correct order
[C:1](=O)[C:2] >> [C:2].[C:1](=O) # wrong order
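The label-order requirement lends itself to a mechanical check before a rule is tried in FlexX.
The following Python sketch is an assumption-based helper, not part of FlexX: the function name
labels_in_order is hypothetical, and labels are recognized simply as ":<number>]" inside an
atom expression.

```python
import re

# An atom label is written as ":<number>" directly before the closing bracket.
_LABEL = re.compile(r":(\d+)\]")


def labels_in_order(rule):
    """Return True if both sides of a rule list their atom labels in the same order."""
    match_side, separator, transform_side = rule.partition(">>")
    if not separator:
        raise ValueError("rule has no '>>' separator")
    return _LABEL.findall(match_side) == _LABEL.findall(transform_side)


good = labels_in_order("[C:1](=O)[C:2] >> [C:1](=O).[C:2]")
bad = labels_in_order("[C:1](=O)[C:2] >> [C:2].[C:1](=O)")
print(good, bad)  # True False
```

The check only compares label order; it does not validate the SMARTS™ syntax itself.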
Furthermore, the rules must be defined so that the bonds are cut first and additional
atoms/linkers are added afterwards:
[C:1][N:2](-[C:3])[C:4] >> \
[C:1](-[1*]).[N:2](.[C:3](-[1*]))(-[2*])(-[2*])(-[2*]).[C:4](-[1*])
# cut bond, then add linkers => correct
[C:1][N:2](-[C:3])[C:4] >> \
[C:1](-[1*]).[N:2](-[2*])(-[2*])(-[2*]).[C:3](-[1*]).[C:4](-[1*])
# add linkers, then cut bond => wrong
Furthermore, if a rule should match more than once, make sure that the matches of the
SMARTS™ pattern do not overlap, i.e. an atom can be matched by a SMARTS™ pattern only once.
So make sure that your matching rule does not match too many atoms. Ideally, only the atoms
directly next to the bond to be cut are matched. Recursive SMARTS™ expressions can be used to
match the environment. For example, if you wanted to cut at an ester group, you could
write
[*:1][O:2][C:3](=[O:4])[*:5] >> [*:1][O:2][1*].[2*][C:3](=[O:4])[*:5]
But in this case the match pattern is quite large. The molecule COC(=O)C(=O)COC could
not be shredded twice because the pattern would have to match both carbonyl carbons twice.
It is better to use recursive SMARTS™ expressions which match only the atoms next
to the cut:
[$([O:2]([*]))][$([C:3](=[O:4])[*])] >> [O:2]([*])[1*].[2*][C:3](=[O:4])[*]
If even recursive SMARTS™ expressions do not work, you will have to define a special rule
with higher priority for these particular cases.
11.16.2 Atom type assignment
As mentioned earlier, FlexX traditionally makes heavy use of SYBYL atom types. This
special property can be used for matching as well if the text definition of the atom type is
given in {}. The SMARTS™ expression {C.3} would match all atoms that are of type C.3. In
transformation rules, you can use the same expression to explicitly enforce a special SYBYL
type for a given atom. A SYBYL atom type is internally represented by a hybridization
state and the element. The element/hybridization state combinations are defined in the file
chempar.dat. The following example assigns a SYBYL type C.3 to all atoms that match
the SMARTS™ pattern on the left side of the expression. The changed properties are the
element, hybridization state and of course the SYBYL type number.
transform [#6X4] >> C.3
The following table describes in detail what kind of assignments are possible at the moment.
Element          Chemical element
C.3              SYBYL type, element, hybridization state
[a],[A]          Aromaticity
[+<n>],[-<n>]    Formal charge
[H<n>]           Number of hydrogens
bond-types       All bond types can be inter-converted
Note: Transformation is a very powerful mechanism given to the user and should be used
very carefully because it cannot always be assured that the resulting molecule is correct
and can be docked correctly and all properties are still consistent. It is advisable to use
transformation rules in the context of the ligand initialization procedure during loading and
not in a batch script afterwards. Nevertheless the TRANSFORM function is quite useful for
testing a transformation rule before putting it into transform.dat.
11.17 Structure correction and atom type assignment
Traditionally, FlexX takes its ligands from files in SYBYL MOL2 format. All subgraph
definitions depend on file formats and the underlying atom type definition is further used for
torsion and interaction geometry assignment. To enable ligands to be imported from other
sources such as MDL's SD file format, or crystal structures such as the PDB format, correct
assignment of atom and bond types is necessary.
Depending on which transformation level is enabled, the transformation rules from
transform.dat in the static data directory are used during ligand import. There are several
levels of assignment during import of a ligand structure.
From this list it becomes clear that at the stage of structure correction, the complete range
of SMARTS™ properties is available. At the stage of aromaticity assignment it is clear that
using properties that depend on a correct assignment of aromaticity makes no sense.
11.18 Transformation rules (transform.dat)
The static data file transform.dat is new since release 2.0 and covers many things that were
distributed over different static data files or were hard-coded in earlier versions
of FlexX. This is a consistent extension of the principle of separating chemistry and
algorithms. Nearly everything FlexX knows about correct chemistry is placed in these static
data files, which gives computational chemists the chance to review what
FlexX is doing.
The transformation rules placed in transform.dat cover the complete initialization process
during ligand loading, which checks and adjusts the chemical properties of the ligands:
correction of errors, valence check, aromaticity perception, atom type assignment,
protonation and formal charges.
For this purpose a new keyword @transform has been introduced. The syntax of a
transformation rule is
@transform <class/level> <prio> matchpattern >> transformpattern
The syntax is similar to the transform command described above, but contains a class/level
and priority number as well. The class/level ID defines the group to which a transformation
belongs. The priority number defines the order of application within a specific group. Rules
with higher priority values are applied before rules with lower values. However, different
patterns do not exclude each other.
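The ordering described above (group by class/level, then apply higher priority values first)
can be sketched in Python as follows. The sketch is illustrative only: the function name
order_rules and the sample level and priority numbers are hypothetical, the patterns are
borrowed from earlier examples in this chapter, and keeping the file order for equal
priorities is an assumption.

```python
from collections import defaultdict


def order_rules(lines):
    """Group @transform lines by class/level and sort each group by descending priority."""
    groups = defaultdict(list)
    for line in lines:
        tokens = line.split(None, 3)
        if len(tokens) < 4 or tokens[0] != "@transform":
            continue  # not a transformation rule
        level, prio, rest = tokens[1], int(tokens[2]), tokens[3]
        groups[level].append((prio, rest))
    # sorted() is stable, so rules with equal priority keep their file order (assumption).
    return {
        level: [rest for prio, rest in sorted(rules, key=lambda rule: -rule[0])]
        for level, rules in groups.items()
    }


# Hypothetical level/priority values attached to patterns shown earlier in this chapter.
sample = [
    "@transform 5 1 [CD3](~[OD1])[OD1] >> *(=[+0.0])-[-1.0]",
    "@transform 5 3 C(~O)~O >> C(:[OH0-0.5]):[OH0-0.5]",
    "@transform 2 1 [#6X4] >> C.3",
]
ordered = order_rules(sample)
for level, rules in ordered.items():
    print(level, rules)
```

Within level 5 the rule with priority 3 is listed, and would be applied, before the rule with
priority 1.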
No overlapping matches are allowed within an application of one single rule. As an example,
take a bis-phosphate as a ligand, e.g. OP(=O)(-O)OP(=O)(-O)O, and a transformation
rule that matches one side of the phosphate groups, OP(=O)(-O)[OD2]. At first glance, you
would expect to match both phosphates, but the bridging oxygen between them would
be matched twice for the same subgraph, which is not allowed. Another example is
a subgraph for a phenyl ring, c1ccccc1: this would match benzene six times, and that is
definitely not what we want.
So the phosphate rule will only match one of the two PO3 groups because the bridging
oxygen between them will not be matched a second time. It is usually a good idea to use
recursive SMARTS™ expressions if only certain atoms should be modified; complete subgraphs
are better if bonds need adjusting. The following example shows the difference between the
explicit subgraph and the recursive representation. The second line will be able to match all
four single-bonded oxygens in the bis-phosphate; the first one will only match two of them.
Example
@transform 5 1 bad_RPO3 OP(=O)(-[OD1])[OD2] >> [O-]P(=O)(-[O-])O
@transform 5 1 good_RPO3 [$(OP(=O)(-[OD1])[OD2])] >> [O-]
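The no-overlap restriction can be illustrated with plain atom-index tuples in Python. This is
a sketch, not the FlexX matcher: the function select_non_overlapping is hypothetical, the
candidate matches are written out by hand instead of being computed from SMARTS™, and the
first-come selection order is an assumption.

```python
def select_non_overlapping(matches):
    """Keep a match only if none of its atoms was used by an earlier kept match."""
    used = set()
    kept = []
    for match in matches:
        if used.isdisjoint(match):
            kept.append(match)
            used.update(match)
    return kept


# Bis-phosphate OP(=O)(-O)OP(=O)(-O)O with atoms numbered 0..8 in SMILES order;
# atom 4 is the bridging oxygen.
explicit_matches = [(0, 1, 2, 3, 4), (8, 5, 6, 7, 4)]   # whole OP(=O)(-[OD1])[OD2] subgraphs
recursive_matches = [(0,), (3,), (7,), (8,)]            # only the first atom of the pattern

# Benzene: the ring pattern can start at any of the six atoms.
benzene_matches = [tuple((start + k) % 6 for k in range(6)) for start in range(6)]

print(len(select_non_overlapping(explicit_matches)))   # 1
print(len(select_non_overlapping(recursive_matches)))  # 4
print(len(select_non_overlapping(benzene_matches)))    # 1
```

With the explicit subgraph, the shared bridging oxygen blocks the second phosphate; the
single-atom matches of the recursive form never collide.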
Summary: Within one rule overlaps are avoided, but different rules in a single level can
match and modify atoms several times. The priority only sets the order of the application of
rules.
It is highly advisable to test any new transformation rule with the transform command before
inserting it into the transform.dat file. The transformation engine is intended to change
simple atom properties; modifying 3D coordinates upon bond type changes is not possible.
The resulting molecule can of course be passed through the built-in minimizer after the
transformation process, but this is not the default behavior.
A detailed description of the transform.dat file that comes with your current FlexX release
can be found in the transform.dat file in your static_data directory. Please consult
the latest rules and advice there.
11.19 *Ligand formal charges (fcharges.dat)
Note: The static data file fcharges.dat comes with FlexX 2.2 only for backward compatibility;
in further versions only the rules in transform.dat will be supported.
If the flag ASSIGN_FORMAL_CHARGES equals 1, subgraphs from the fcharges file are
mapped onto the ligand in order to assign formal charges to the atoms. The data section
of subgraphs in fcharges consists of a charge value which is assigned to the ligand atom
mapped to atom 1 of the subgraph.
11.20 *Automatic correction of localized systems (delocalized.dat)
Note: The static data file delocalized.dat comes with FlexX 2.2 only for backward
compatibility; in further versions only the rules in transform.dat will be supported.
If the flag ASSIGN_DELOCALIZED equals 1, subgraphs from the deloc file are mapped onto
the ligand in order to automatically change localized systems into delocalized ones. The
subgraph file contains definitions for both directions (converting localized to delocalized
and vice versa). Subgraphs for converting from localized to delocalized belong to class 1,
otherwise to class 2.
The data section contains one line for each atom that must be modified:
<atom no> <new type> <new fcharge> [<bond to> <new bond type>]*
<atom no> defines the atom in the subgraph, <new type> and <new fcharge> define the
new type and formal charge values for this atom. The subsequent data pairs <bond to>,
<new bond type> define a bond attached to <atom no> (and to <bond to>) and a new
type for it.
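The layout of such a data line (three fixed fields followed by any number of bond pairs) can
be made concrete with a short Python parser. This is a sketch only: the function name
parse_modification is hypothetical, and the sample values (atom type O.co2, bond type ar) are
chosen for illustration and are not taken from the delocalized.dat file shipped with FlexX.

```python
def parse_modification(line):
    """Split '<atom no> <new type> <new fcharge> [<bond to> <new bond type>]*' into fields."""
    tokens = line.split()
    if len(tokens) < 3 or (len(tokens) - 3) % 2 != 0:
        raise ValueError(f"malformed data line: {line!r}")
    # Everything after the third field comes in (bond partner, new bond type) pairs.
    bonds = [(int(tokens[i]), tokens[i + 1]) for i in range(3, len(tokens), 2)]
    return {
        "atom": int(tokens[0]),
        "new_type": tokens[1],
        "new_fcharge": float(tokens[2]),
        "bonds": bonds,
    }


# Hypothetical example line: atom 2 becomes O.co2 with charge -0.5, its bond to atom 1 becomes 'ar'.
entry = parse_modification("2 O.co2 -0.5 1 ar")
print(entry)
```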
11.21 *Defining the descriptors for calculating logP values (logp.dat)
The data area in the subgraph definition consists of the logP and the refractivity atom
parameters (optional). The refractivity atom parameters are not used in FlexX.
data
logp <partial_logP_value>
[ar_I <refractivity_I>]
[ar_III <refractivity_III>]
end
The parameter <LOGP_CLASS> is related to the parameter <class> of the subgraph
definition (see section 11.10.2).
The subgraphs of <LOGP_CLASS> = 1 contain the Wildman/Crippen descriptors [34] and
the subgraphs of <LOGP_CLASS> = 2 contain the Ghose/Crippen descriptors [13].
11.22 *Graphics (graphic.dat)
This file contains the default settings for variables relating to FlexX graphics. The
description is not yet complete, but the file is self-explanatory. If you want to change any
defaults relating to graphics, do not hesitate to look in this file. The default graphics
settings sections are arranged very similarly to the parameters that must be entered for the
SELxxx commands.
This file also contains some entries which can only be changed directly in the file, for
example the definitions of colors.
The graphic.dat file is divided into sections; the name of each section must be preceded
by @. At the start of the file, colors are defined for certain objects found in FlexX. The
following sections set defaults for the SELxxx commands, and finally, at the end of the file,
the actual definitions of colors can be found.
The skeleton of graphic.dat is:
@atom-colors
<element name> <color>
@contact-colors
<contact type name> <color>
@particle-colors
<particle name> <color>
@ligand-defaults
@lig-ref-coords-defaults
@receptor-defaults
@docking-defaults
@combilib-defaults
@ensemble-defaults
@ens-rlig-defaults
@pharm-defaults
@grid-gaussian-defaults
@colors
<red_value> <green_value> <blue_value> [<lucent_value>] <COLORNAME>
The settings in sections @atom-colors, @contact-colors, @particle-colors and
@colors can only be changed in this le and cannot be accessed from FlexX itself.
Before we go into details about each section of the file, we will first describe colors and color
modes.
11.22.1 Colors
A valid <color> is one of the following expressions:
A number between 0 and 360 interpreted as an angle in the color circle. In FlexX, the
value 0 represents invisible. Thus, it is possible to exclude parts of a drawing (e.g. main
directions, receptor surface) by setting their color to 0. Values 1 to 360 run through the
color circle (from dark blue, through red, yellow, green to blue).
A color name defined under @colors. Each color name can be preceded with the
word "trans" for translucent color, for example "trans light green" gives a translucent
light green color. When entering color names at the SELCOL command prompts, note
the catch described in 6.1.3!
Columns (menu and drawing object), left to right: RECEPTOR, LIGAND, LIGAND (ref coords),
DOCKING, CLIB, PHARM, ENSEMBLE (protein), ENSEMBLE (ref lig), ENSEMBLE/GRAPH
Color mode                      available for          see e.g. description
ATOM                            x x x x x x            7.5.17
ACCESS                          x x                    7.6.14
CENDIST                         x x x x x              7.5.17
COMPONENT                       x                      8.4.3
CONTACT                         x x x x x x x          7.5.17
ENERGY (of docking solution)    x x x                  7.5.17
ENERGY (of interaction)         x                      7.8.24
FRAGMENT                        x x x                  7.5.17
INVISIBLE                       x x x x x x x          7.5.17
OPT_ENERGY (of interaction)     x                      7.8.24
POLYCOL                         x x                    8.4.2
SECSTR                          x x                    7.6.14
SURF_ATOM                       x x x x x              7.5.17
SURFPATCH                       x x x x x              7.5.17
UNIQUE                          x x x x x x x x x      7.5.17
Table 11.2: All available color modes for drawing various objects in FlexX. Each row shows
a color mode. The columns show the menus (menu and drawing object) for which that color
mode can be used. In the last column, a link takes you to an example where that color
mode is used; there you will find an explanatory text about the color mode.
Three (RGB) or four (RGBA) floating-point numbers with values between 0.0 and 1.0
representing RGB(A) values (the fourth number for RGBA defines opacity; if only three
values are given, 1.0, i.e. solid, is assumed for the fourth value). In the file the values
are separated by blanks. For the SELCOL commands, the values may also be separated
by slashes, e.g. 0.8/0.33/0.0 or 0.8/0.33/0.0/1.0. When entering color names
at the SELCOL command prompts, note the catch described in 6.1.3!
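The three forms of a <color> can be told apart mechanically. The Python sketch below is a
reading aid under stated assumptions, not FlexX code: the function parse_color and its tuple
return values are hypothetical, and it assumes that a single number is always meant as a
color-circle angle while three or four numbers are RGB(A) values.

```python
def parse_color(text):
    """Classify a color string as invisible, color-circle angle, RGB(A) value or color name."""
    parts = text.replace("/", " ").split()   # slashes are allowed at the SELCOL prompts
    try:
        numbers = [float(part) for part in parts]
    except ValueError:
        numbers = None                       # not purely numeric, so treat it as a name

    if numbers is not None:
        if len(numbers) == 1 and 0 <= numbers[0] <= 360:
            return ("invisible",) if numbers[0] == 0 else ("angle", numbers[0])
        if len(numbers) in (3, 4) and all(0.0 <= n <= 1.0 for n in numbers):
            red, green, blue = numbers[:3]
            opacity = numbers[3] if len(numbers) == 4 else 1.0   # solid by default
            return ("rgba", red, green, blue, opacity)
        raise ValueError(f"not a valid color: {text!r}")

    translucent = parts[0] == "trans" and len(parts) > 1
    name = " ".join(parts[1:] if translucent else parts)
    return ("name", name, translucent)


print(parse_color("0"))                  # ('invisible',)
print(parse_color("120"))                # ('angle', 120.0)
print(parse_color("0.8/0.33/0.0"))       # ('rgba', 0.8, 0.33, 0.0, 1.0)
print(parse_color("trans light green"))  # ('name', 'light green', True)
```

Whether a color name actually exists under @colors is not checked here.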
11.22.2 Color modes
Color modes should not be confused with colors! Color modes describe the kind of coloring
scheme which should be used for drawing various objects. For example, the color mode
ATOM can be used for coloring molecules. This means the molecule is colored in many
colors which reflect the various atom or element types. For different types of objects, there
are different color modes available. Table 11.2 shows all the available color modes and the
objects for which they are available. If you are viewing the PDF version of this
documentation, a link in the table will take you to the definition of the color mode found in
one of the SELCOL commands.
We will now move on to look at the different sections of the graphic.dat file.
11.22.3 Defining atom colors (@atom-colors)
Syntax: <element name> <color>
<element name> must be the chemical name of an element. These entries determine the
colors of atoms and bonds in color mode ATOM.
11.22.4 Defining colors for interaction (contact) types (@contact-colors)
Syntax: <interaction name> <color>
<interaction name> must be the name of an interaction (otherwise known as contact) type.
These colors determine the colors of interaction types in color mode CONTACT.
11.22.5 Defining defaults for the graphics settings
In the following sections of the graphic.dat file, defaults can be set for many of the
selections made with the commands SELADM, SELGRA, SELCOL, SELLAB and DRAW in the
various menus. The selections can in fact be categorized into several groups, which will be
described before moving on to detailed descriptions of these sections of the graphic.dat
file.
Colors
See above 11.22.1.
Color modes
See above 11.22.2.
Switches
Syntax: switch <name> <int_val>
Description: Switches take an integer value as a way of making a selection. The available
choices may be yes/no or a range of values. If you are working in FlexX, the choices
will be offered at the parameter prompt. For yes/no questions you can enter either y,
yes or 1 for yes, and similarly n, no or 0 for no. In the file, enter the integer
value for the answer.
Scalars
Syntax: scalar <name> <double_val>
Description: Scalars take a floating-point value. These are used, for example, for setting
range limits of docking solution energies for the color mode ENERGY. If you are
working in FlexX, the parameter prompt will offer a minimum and maximum value that
you may enter for the floating-point number.
Lists
Syntax: list <name> <int> [<int> . . . ]
Description: Lists take a list of integers as their value. If you are working in FlexX, you may
enter the list as integers or integer ranges (format a-b) separated by commas or blanks (note
the catch described in 6.1.3). For example: 1,4,7-9. In the file graphic.dat you must give
every integer explicitly, separated by blanks. For example: 1 4 7 8 9
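Converting the interactive list notation into the explicit form required in graphic.dat is a
small mechanical step. The following Python sketch shows one way to do it; the function names
are hypothetical, and negative integers are not handled.

```python
def expand_int_list(text):
    """Expand '1,4,7-9' (commas or blanks as separators) into [1, 4, 7, 8, 9]."""
    values = []
    for item in text.replace(",", " ").split():
        if "-" in item:
            start, end = item.split("-", 1)
            values.extend(range(int(start), int(end) + 1))   # ranges are inclusive
        else:
            values.append(int(item))
    return values


def to_file_form(text):
    """Return the blank-separated form expected in graphic.dat."""
    return " ".join(str(value) for value in expand_int_list(text))


print(to_file_form("1,4,7-9"))  # 1 4 7 8 9
```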
Gauss mode
For describing the sections containing the default graphics settings for the commands
SELADM, SELGRA, SELCOL, SELLAB and DRAW, we will take just one section as an
example and run through it in detail. There are sections in the graphic.dat file for:
@ligand-defaults
@lig-ref-coords-defaults
@receptor-defaults
@docking-defaults
@combilib-defaults
@ensemble-defaults
@ens-rlig-defaults
@pharm-defaults
@grid-gaussian-defaults
These sections in the graphic.dat file are arranged in a similar order to the parameter
ordering you will meet at the actual command in FlexX. Although the names of the entries
in the file are not exactly the same as the command parameter names, they should be easy
to recognize.
One important entry in the graphic.dat file in all these sections is the switch ORG_MODE
for the SELADM command. You will not come across this switch when working with FlexX.
All other entries in these sections correspond to parts of the SELxxx or DRAW commands.
We take as our example the SELxxx/DRAW commands from the RECEPTOR menu and
compare them with the graphic.dat section @receptor-defaults. The comparison can
be seen in Table 11.3. Not all command parameters have entries in the graphic.dat file.
If the usage or meaning of the file entry deviates from that of the command parameter, an
additional explanation is offered. Refer to 7.6.14 for an explanation of the entries where no
text is given.
# --------------------------------------------------------------------
# Definitions for the default RECEPTOR graphics settings
# --------------------------------------------------------------------
@receptor-defaults
# ---------------
# RECEPTOR SELADM
# ---------------
switch MOL_OBJ_NUMBER 4
switch ORG_MODE 1
switch TMP_FILES 1
switch APPEND_MODE 0
# -----------------------
# RECEPTOR SELGRA
# -----------------------
switch MOL_DISP_MODE 1
switch DRAW_HYDROGEN 2
switch DRAW_ACTIVE_SITE 1
switch DRAW_COMPLETE_AMINO_ACIDS 1
switch DRAW_INTERACT_GEOMS 0
switch DRAW_IA_POINTS 0
switch DRAW_ALL_CONTACT_TYPES 1
list ACTIVE_CONTACT_TYPES 0 1 2 3 4 5 6 7
switch DRAW_SITE_PARTICLES 0
switch DRAW_SURFACE 0
switch DRAW_BACKBONE 0
# -----------------------
# RECEPTOR SELCOL
# -----------------------
# default color mode for drawing receptor
colormode RECEPTOR ATOM
# default color for UNIQUE color mode
color RECEPTOR blue
# default color mode for drawing geometries
colormode GEOMETRY CONTACT
# default color for UNIQUE color mode
color GEOMETRY blue
# default color mode for drawing surface
colormode SURFACE UNIQUE
# default color for UNIQUE color mode
color SURFACE light green
# default color for reentrant/concave patches
color SURFPATCH_COLOR_0 red
# default color for saddle patches
color SURFPATCH_COLOR_1 red
# default color for convex patches
color SURFPATCH_COLOR_2 red
# nof colors / start / end colors for CEN_DIST rainbow
switch N_NOF_CENDIST_COLORS 10
color N_CENDIST_COLOR_0 blue
color N_CENDIST_COLOR_1 red
# default color mode for drawing backbone
colormode BACKBONE SECSTR
# default color for UNIQUE color mode
color BACKBONE light green
# default color for helix
color SECSTR_COLOR_0 red
# default color for sheet
color SECSTR_COLOR_1 dark green
# default color for loop / turn
color SECSTR_COLOR_2 blue
# -----------------------
# RECEPTOR SELLAB
# -----------------------
switch ATOM_NAMES 1
switch INFILE_NUMBERS 0
switch FORMAL_CHARGES 0
switch PARTIAL_CHARGES 0
switch AMINO_ACID_NAMES 1
switch AMINO_ACID_NUMBERS 1
switch CHAIN_IDS 0
# -----------------------
# RECEPTOR DRAW
# -----------------------
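To inspect such a defaults section outside FlexX, the entries can be read with a few lines of
Python. This is a minimal sketch, not a FlexX tool: the function parse_graphic_dat is
hypothetical, it handles only the switch, scalar, list, color and colormode entries of the
defaults sections, and it ignores the @atom-colors, @contact-colors, @particle-colors and
@colors sections, which use a different line layout.

```python
KNOWN_KINDS = ("switch", "scalar", "list", "color", "colormode")


def parse_graphic_dat(text):
    """Collect the default entries of each @section, keyed by (entry kind, entry name)."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue                                  # blank line or comment
        if line.startswith("@"):
            current = sections.setdefault(line[1:], {})
            continue
        tokens = line.split()
        if current is None or len(tokens) < 3 or tokens[0] not in KNOWN_KINDS:
            continue                                  # layout not covered by this sketch
        kind, name, values = tokens[0], tokens[1], tokens[2:]
        if kind in ("switch", "list"):
            parsed = [int(value) for value in values]   # ORG_MODE may carry several integers
        elif kind == "scalar":
            parsed = float(values[0])
        else:
            parsed = " ".join(values)                   # color names may contain blanks
        current[(kind, name)] = parsed
    return sections


SAMPLE = """\
@receptor-defaults
# RECEPTOR SELADM
switch MOL_OBJ_NUMBER 4
switch ORG_MODE 1
list ACTIVE_CONTACT_TYPES 0 1 2 3 4 5 6 7
colormode RECEPTOR ATOM
color RECEPTOR blue
color SURFACE light green
"""

receptor = parse_graphic_dat(SAMPLE)["receptor-defaults"]
print(receptor[("colormode", "RECEPTOR")], receptor[("color", "SURFACE")])
```

Entries are keyed by kind and name together because the same name (for example RECEPTOR) is
used both for a colormode and for a color entry.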
graphic.dat entry name      command parameter name                     type       value                                             see
SELADM:
MOL_OBJ_NUMBER              <graphics object number>                   switch     [1-255]                                           7.6.12
    selects the graphics object number for the receptor drawing when not in fifo mode
ORG_MODE                                                               switch     [0|1|2 <start fifo object> <end fifo object>]
    selects the receptor drawing organization mode; 1: default, 2: fifo. Use this switch with
    value 2 to set defaults for the start and end graphics objects in fifo mode, otherwise
    leave this switch set to 1.
TMP_FILES                   <temp files>                               switch     [0|1]                                             7.6.12
APPEND_MODE                 <append>                                   switch     [0|1]                                             7.6.12
SELGRA:
MOL_DISP_MODE               <mol display mode>                         switch     [1-4]                                             7.6.13
DRAW_HYDROGEN               <hydro>                                    switch     [0-2]                                             7.6.13
DRAW_ACTIVE_SITE            <active site>                              switch     [0|1]                                             7.6.13
DRAW_COMPLETE_AMINO_ACIDS   <complete aa>                              switch     [0|1|2]                                           7.6.13
DRAW_INTERACT_GEOMS         <interact geoms>                           switch     [0|1]                                             7.6.13
DRAW_IA_POINTS              <ia points>                                switch     [0|1]                                             7.6.13
DRAW_ALL_CONTACT_TYPES      <all contact types>                        switch     [0|1]                                             7.6.13
ACTIVE_CONTACT_TYPES        <contact type selection>                   list       [0-14]                                            7.6.13
DRAW_SITE_PARTICLES         <particles>                                switch     [0|1]                                             7.6.13
DRAW_SURFACE                <surf>                                     switch     [0-3]                                             7.6.13
DRAW_BACKBONE               <backbone>                                 switch     [0-5]                                             7.6.13
SELCOL:
RECEPTOR                    <receptor color mode>                      colormode  [INVISIBLE|ATOM|UNIQUE]                           7.6.14
RECEPTOR                    receptor color mode UNIQUE: <color>        color      <color name>                                      7.6.14
GEOMETRY                    <interact geoms color mode>                colormode  [INVISIBLE|UNIQUE|CONTACT|ACCESS]                 7.6.14
GEOMETRY                    interact geoms color mode UNIQUE: <color>  color      <color name>                                      7.6.14
SURFACE                     <surface color mode>                       colormode  [INVISIBLE|UNIQUE|SURF_ATOM|CEN_DIST|SURFPATCH]   7.6.14
    several color and switch definitions for setting surface coloring follow this entry;
    these follow the command parameters required by <surface color mode>
BACKBONE                    <backbone color mode>                      colormode  [INVISIBLE|UNIQUE|SECSTR]                         7.6.14
    several color and switch definitions for setting backbone coloring follow this entry;
    these follow the command parameters required by <surface color mode>
SELLAB:
ATOM_NAMES                  <atom name>                                switch     [0|1]                                             7.6.15
INFILE_NUMBERS              <infile number>                            switch     [0|1]                                             7.6.15
FORMAL_CHARGES              <formal charge>                            switch     [0|1]                                             7.6.15
PARTIAL_CHARGES             <partial charge>                           switch     [0|1]                                             7.6.15
AMINO_ACID_NAMES            <aa name>                                  switch     [0|1]                                             7.6.15
AMINO_ACID_NUMBERS          <aa number>                                switch     [0|1]                                             7.6.15
CHAIN_IDS                   <chain ID>                                 switch     [0|1]                                             7.6.15
Table 11.3: This table compares the entries for receptor-defaults in graphic.dat with the
command parameters for the SELxxx commands in the RECEPTOR menu. The other sections for
setting graphic defaults in graphic.dat are very similar and often follow the order of the
command parameters; deviations from this are explained in the table.
Deviations from standard entry definitions
ORG_MODE in combilib-defaults: Note that ORG_MODE 2 has a special meaning
for combilib-defaults. If you look at the SELADM command in the CLIB menu
(8.2.2), you will see that <graphics object number> set to 0 does not mean fifo mode as
it usually does. Therefore, you will notice that the entry for ORG_MODE in the section
combilib-defaults:
switch ORG_MODE 2 0 0
does not follow the rule shown in Table 11.3.
11.22.6 Defining colors (@colors)
In this section color names are assigned to RGB(A) values. These color names are then
available for use in this file and in FlexX.
Syntax: <red value> <green value> <blue value> [<lucent value>] <COLORNAME>
See also 11.22.1 for more information about colors in FlexX.
12 *Program interfaces
12.1 Interface to MOE
12.1.1 Triggering FlexX From Within MOE
With Release 2.1, FlexX is compatible with the Molecular Operating Environment MOE™,
which can be obtained from the Chemical Computing Group (www.chemcomp.com). For
the installation details, please see page 22. Should you stumble, please contact us at
www.biosolveit.de/support.
12.1.2 MOE as a Ring Conformer Generator
MOE can be used to generate ring conformers. This procedure is a little slower than
using dedicated ring conformer tools, but FlexX is prepared for it: the installation
directory contains an SVL script called bsit_rcgenerator.svl. If you want to use MOE as
RCGenerator, open File → Global Preferences in the GUI, switch to the tab Ring Conformer
Generator and select MOE in the combobox. FlexX will find the SVL script automatically and
add it to the argument list. There are no further adjustments to be made if you work in GUI
mode. After you press the Apply button, FlexX will use this setting as the default in the GUI
and in the commandline interface.
Although we recommend that you use the GUI to define the RCGenerator as described above,
you can change this setting manually in the commandline interface or in batch scripts. To
do this, execute the following commands:
SET RING_MODE 4
SET RCGENERATOR <path_to_moebatch> "-d flexx -run <path_to_svl_script>"
Note that in this case, FlexX will not save these settings as default. They are just set for the
current session.
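If you generate batch scripts programmatically, the two commands above can be assembled by a small helper. The following Python sketch is only an illustration: the function name and the example paths are hypothetical, and only the two SET lines are taken from this section.

```python
def moe_rcgenerator_commands(moebatch_path, svl_script_path):
    """Return the two FlexX SET commands that select MOE as ring conformer generator.

    Both paths are placeholders supplied by the caller; they correspond to
    <path_to_moebatch> and <path_to_svl_script> in the commands shown above.
    """
    arguments = "-d flexx -run {}".format(svl_script_path)
    return [
        "SET RING_MODE 4",
        'SET RCGENERATOR {} "{}"'.format(moebatch_path, arguments),
    ]


if __name__ == "__main__":
    # Hypothetical installation paths, for illustration only.
    for line in moe_rcgenerator_commands("/opt/moe/bin/moebatch",
                                         "/opt/leadit/bsit_rcgenerator.svl"):
        print(line)
```

The resulting lines can be pasted at the top of a batch script. As noted above, they apply to the current session only.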
12.2 Interface to Sybyl
FlexX is compatible with Sybyl™ from Tripos.
Please contact us at www.biosolveit.de/support.
12.3 Interface to WHATIF
Since version 1.11, the FlexX - WHATIF interface is no longer maintained.
12.4 Interface to SCA
The interface to SCA no longer exists.
12.5 Interface to CORINA
CORINA [9, 25] is a 3D structure generator used by FlexX to generate ring conformations
or to clean up ligand molecule structures. FlexX requires CORINA versions no older than
v2.6.
In the latest versions of CORINA, the driver option nh is no longer supported. If your
CORINA is that new, please remove the nh from the respective CORINA call of
RCGENERATOR in your config.dat.
Ring conformation generation:
External usage of CORINA: The value of <RING_MODE> must be set to 1, and the
flag value of <RCGENERATOR> must point to the CORINA executable to use.
CORINA has a specially tailored interface for FlexX which is activated by the driver
option -d flexx. Every ring system will then be written into corina_in_*.mol2 in
the temporary directory specified in the configuration file. CORINA creates conformations
which are written into files named corina_out_*.mol2 (and some temporary
files). FlexX subsequently processes these files. Tracing information and error output
will be written into the standard CORINA trace file named corina.trc in the
current directory.
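When troubleshooting the external CORINA call, it can help to look at the exchange files in the temporary directory. The following Python sketch simply lists files matching the two name patterns given above. The function name is hypothetical, and the temporary directory is whatever your configuration file specifies.

```python
from pathlib import Path


def corina_exchange_files(temp_dir):
    """List CORINA ring-conformer exchange files found in temp_dir.

    Only the patterns corina_in_*.mol2 and corina_out_*.mol2 are taken
    from the manual; everything else here is an illustrative assumption.
    """
    temp = Path(temp_dir)
    return {
        "input": sorted(p.name for p in temp.glob("corina_in_*.mol2")),
        "output": sorted(p.name for p in temp.glob("corina_out_*.mol2")),
    }


if __name__ == "__main__":
    import tempfile

    # Create a throw-away directory with dummy files to demonstrate the listing.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("corina_in_1.mol2", "corina_out_1.mol2", "unrelated.txt"):
            (Path(tmp) / name).touch()
        print(corina_exchange_files(tmp))
```

An input file without a corresponding output file may indicate that CORINA failed for that ring system; in that case corina.trc is the place to look.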
For the generation of stereoisomers of rings (which is switched off by default), it is
necessary to activate the respective switch for CORINA. This is done by adding stergen
to the list of CORINA flags in the line starting with <RCGENERATOR> in your
configuration.
With Release 3.1, FlexX comes with an integrated version of CORINA which can
optionally be used to compute ring conformers for docking.
FlexX can be configured to use this built-in CORINA under File → Global Preferences.
RING_MODE is set to 3 automatically in this case, whereas you have to set it explicitly
to 3 in the commandline interface.
Please note that stereoisomerism in rings is currently not supported with the internal
version of CORINA.
Please note: To use the built-in CORINA you need a CORINA_F license which is
available from BioSolveIT.
General compound cleanup (3D coordinate generation):
Here the value of <3D_GEN> must be set to 1 and <3DGENERATOR> must point to the
CORINA executable to be used. For this purpose CORINA has a specially tailored interface
for FlexX switched on by the driver option -i t=sdf -o t=sdf. Every ligand molecule
structure will be written into flexclean__in_*.sdf in the specified temporary directory (see
above). CORINA creates its output structures in files named flexclean__out_*.sdf.
Error output will again be written into flexclean__trc_*.trc in your temporary directory.
Please note that switching on the <3D_GEN> flag may be sensible for some file formats
but not for others.
Similarly, your <3DGENERATOR> may not be appropriate for the file format
you plan to employ. We therefore advise you to consider carefully before you change or
(de)activate these flags. Please be aware that different versions of CORINA may produce
different results with FlexX. For example, in the case of 1ivf, versions v2.4 and v2.61 deliver
the following scores:
+---+-------+-------+------+------+------+------+-----+------+------+------+------+-----+
|No.|Total |Match- |Lipo- |Ambig-|Clash-|Rot- |RMS- |Simil.|#Match|Avg. |Max. |Frag.|
| |Score |Score |Score |Score |Score |Score |Value|Index | |Volume|Volume|No. |
+---+-------+-------+------+------+------+------+-----+------+------+------+------+-----+
| SunOS corina 2.4 |
| 1|-34.500|-44.177|-2.196|-5.420| 0.693|11.200|6.968| 5.240| 9| 0.032| 0.365| 0|
| SunOS corina 2.61 |
| 1|-37.079|-43.640|-4.892|-7.124| 1.977|11.200|1.204| 0.798| 10| 0.143| 1.266| 2|
-----------------------------------------------------------------------------------------
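To compare such result rows outside FlexX, the pipe-delimited lines can be split into named fields. The Python sketch below parses the two rows from the table above. The short field names are our own labels for the table columns, not FlexX identifiers.

```python
# Field labels chosen for this sketch; they follow the column order of the table.
FIELDS = ["no", "total", "match", "lipo", "ambig", "clash", "rot",
          "rms", "simil", "n_match", "avg_volume", "max_volume", "frag_no"]


def parse_score_row(row):
    """Split one '|'-delimited score row into a dict of floats."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != len(FIELDS):
        raise ValueError("expected {} columns, got {}".format(len(FIELDS), len(cells)))
    return dict(zip(FIELDS, (float(cell) for cell in cells)))


# The two rows from the table above (CORINA 2.4 and CORINA 2.61).
ROW_V24 = "| 1|-34.500|-44.177|-2.196|-5.420| 0.693|11.200|6.968| 5.240| 9| 0.032| 0.365| 0|"
ROW_V261 = "| 1|-37.079|-43.640|-4.892|-7.124| 1.977|11.200|1.204| 0.798| 10| 0.143| 1.266| 2|"

if __name__ == "__main__":
    old, new = parse_score_row(ROW_V24), parse_score_row(ROW_V261)
    for key in ("total", "rms"):
        print(key, old[key], new[key], round(new[key] - old[key], 3))
```

For these two rows, the total score differs by about 2.6 units and the RMS value by more than 5, which illustrates how strongly the CORINA version can influence a docking result.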
12.6 Interface to CONFORT
The interface to CONFORT works in the same way as the CORINA interface described above.
12.7 The FlexV graphical interface
Your alternative to the native GUI visualisation when working with FlexX is the
combination of the Commandline Mode and FlexV. FlexV supports all graphical features in FlexX
and is a generic viewer for more or less all BioSolveIT products. The following therefore
applies to the Commandline Mode only.
When you execute a display command, a FlexV viewer is started and linked to FlexX
automatically. The linking is done via named pipes (_pipe_*). The pipes are used for sending
commands only. The graphics are stored in .gdf files. If the graphics files are temporary,
they are written into the directory specified with the parameter TEMP, with
flexv_tmp_*.gdf as filenames.
Sometimes it is useful to have two FlexV windows open at the same time, for
example when comparing two results. This can easily be done: type toflexv b at the FlexX
prompt. This command, toflexv, sends commands to FlexV in the same way as the
internal drawing functions do. The parameter b stands for "break the pipes". As a result, your
running FlexV will be disconnected from FlexX and stay separate from it. When you
now draw your second object and type display, FlexX will recognize that your previous
instance of FlexV has been disconnected and will start a new one. Admittedly,
the drawback of this is that you cannot reconnect to the first FlexV later on. (For more on
commands like this, please refer to the FlexV documentation, which is freely available from
the BioSolveIT web site http://www.biosolveit.de/download.)
Remember that FlexV is only a visualization tool and knows (almost) nothing about
molecules. It is therefore not possible to change the coloring of atoms or bonds or change
the labels. All these actions must be done in FlexX before executing the DRAW commands.
IV
APPENDIX
A
Default rdf and edf files
A.1 The receptor description: An rdf file
#
# receptor descriptor file
#
# ------------------------------------------------------------------------
# default.rdf: This file contains all default settings for a receptor
# description file. Please fill in filenames at locations marked as
# <file> and go through all records for further modifications.
# General remarks:
# - Rules in one record are always processed from top to bottom,
# a rule may overwrite previous rules. Therefore specific rules
# must always be added at the end of a record. See the User
# Guide for more detailed information (sections 6 and 8.3).
# - The _ character is used instead of a blank in the PDB file
# - Wildcards (*) are allowed for several parameters, BUT NOT REGULAR
# EXPRESSIONS OR EXTENSIONS like AS* for ASP and ASN. Exceptions
# are listed in the comments.
# ------------------------------------------------------------------------
# PDB file from which the receptor should be read.
# ------------------------------------------------------------------------
# The default directory is PDB (see config.dat).
@pdb_file <file>
# Selection of atoms to be read from the PDB file.
# ------------------------------------------------------------------------
# Rules have the format:
#
# include/exclude <atom name> <amino acid name> <chain ID> <amino acid number>
#
# Wildcards (*) are allowed at all positions. For atom names the construct
# E* can be used to describe all atoms of element E.
# The underscore represents the empty character which is an allowed chain ID in PDB.
@atoms
include * * * *
# Name of files containing the active site definition
# ------------------------------------------------------------------------
# The default directory is SITE (see config.dat). This rule gives a list
# of pocket files. All together, these files describe the active site of the
# protein. However, for base placement, the search can be limited to one
# pocket (see TRIHASH command). The former rule @active_site_file still
# works for compatibility reasons. If no pocket is defined, an active
# site can be defined during loading if a ligand with reference coordinates
# was previously loaded.
@pockets
<pocket name> <pocket file>
# Name of file containing the protein surface.
# ------------------------------------------------------------------------
# The default directory is SURF (see config.dat). This file is optional,
# the surface will be computed if not present. The surface file can then
# be written afterwards.
@surface_file <file>
# Initial probe location
# ------------------------------------------------------------------------
# If the active site cannot be reached from the exterior, the surface at
# the active site is not correctly computed. In this case, the probe for
# calculating the surface can be manually placed to the interior of the
# active site. Parameters are the x,y,z coordinates of the initial probe
# location.
# @probe_location <x> <y> <z>
# Assignment of amino acid templates.
# ------------------------------------------------------------------------
# The following list of rules specifies how to associate a given
# amino acid with a specific amino acid template. The templates are
# defined in the static data file AMINO (see config.dat). Its purpose is
# to fix additional degrees of freedom such as protonation and the state of
# charge, and to set additional information such as atom types and bond types.
# Template names consist of amino acid 3-letter code plus modifier(s).
# Meaning of the modifiers of the amino acid three-letter codes:
#
# h cysteine not involved in s-s bond
# 1/2 tautomers of his
# +/- state of charge within chargeable side chain
# n(+) n-terminal, possibly charged at terminal amino group
# c(-) c-terminal, possibly charged at terminal carboxylate group
#
# Modifiers are added to the 3-letter code in the order given above. For
# a more detailed description of templates see the AMINO data file.
# Format of rules:
#
# <amino acid name> <chain ID> <amino acid number> <template_name>
#
# Wildcards are allowed for the first three parameters. The state of charge
# of the terminal amino acids is defined by the final two rules.
#
# WARNING: check the HIS tautomers carefully!
@templates
ALA * * ala
ARG * * arg+
ASN * * asn
ASP * * asp-
CYS * * cys
GLN * * gln
GLU * * glu-
GLY * * gly
HIS * * his2
ILE * * ile
LEU * * leu
LYS * * lys+
MET * * met
PHE * * phe
PRO * * pro
SER * * ser
THR * * thr
TRP * * trp
TYR * * tyr
VAL * * val
# Templates for N/C-termini. MUST be specified!
# nterm = take template for amino acid with uncharged backbone amino group
# nterm+ = take template for amino acid with positively charged backbone amino
# group
# cterm see above
# cterm- see above
#
* * first nterm+
* * last cterm-
# Hetero atoms to be loaded as part of the receptor
# ------------------------------------------------------------------------
# There are two ways of adding a hetero group to the receptor:
# 1. @hetero_atoms
# For simple or frequently occurring molecules like metal ions, water,
# a heme, the coordinates can be taken directly from the PDB file by
# specifying the hetero group ID. The format is:
#
# in/exclude <HET-ID> <chain> <nr>
#
# Wildcards are allowed for all parameters.
# NOTE: A template must be defined for the hetero group (see @template
# above).
# 2. @hetero_files
# In order to avoid template generation, a hetero group can be loaded
# directly from a mol2 file. The coordinates in the file must describe
# the location of the hetero group relative to the protein. Atom types,
# bond types, hydrogens, charges must be set appropriately (as for the
# preparation of ligand molecules). Each filename is a single rule.
#
@hetero_atoms
exclude * * *
#@hetero_files
# <file>
# Setting alternate locations
# ------------------------------------------------------------------------
# Decide in favor of a certain alternate location indicator in
# column 17 of a pdb ATOM or HETATM record. The format is:
#
# <aa code> <chain> <nr> <indicator>
#
# Wildcards are allowed for the first three parameters.
@alternate_locations
* * * A
# Torsion angles at terminal hydrogen atoms
# ------------------------------------------------------------------------
# Rules for specifying torsion angles to terminal hydrogen atoms not
# contained in the PDB file. The format is:
#
# <amino acid template> <chain ID> <amino acid number> \
# <atom 1> <atom 2> <atom 3> <torsion angle>
#
# Wildcards are allowed for the first three parameters. For template names,
# the wildcard character can also be used for terminal characters.
# For <amino acid number> first and last can also be used.
# Currently only one torsion angle per definition is allowed, i.e. the
# hydrogens are kept rigid during docking. The rules are applied only if
# the corresponding hydrogen atom is not contained in the PDB file.
#
# WARNING: Check hydrogens at hydroxy groups (TYR and THR) carefully!
@h_torsions
nterm* * first _c _ca _n 180.
cterm * last _o _c _oxt 0.
arg* * * _ne _cz _nh1 180.
arg* * * _ne _cz _nh2 180.
asn* * * _od1 _cg _nd2 0.
asp * * _od2 _cg _od1 0.
aspn* * * _od2 _cg _od1 0.
aspc* * * _od2 _cg _od1 0.
cysh* * * _ca _cb _sg 180.
gln* * * _oe1 _cd _ne2 0.
glu * * _oe2 _cd _oe1 0.
glun* * * _oe2 _cd _oe1 0.
gluc* * * _oe2 _cd _oe1 0.
lys* * * _cd _ce _nz 180.
ser* * * _ca _cb _og 180.
tyr* * * _ce1 _cz _oh 180.
thr* * * _ca _cb _og1 180.
# Atom type ambiguity
# ------------------------------------------------------------------------
# There may be some unidentified atom types in a pdb file. This lack of
# information is explicitly indicated in a PDB file by special atom names
# beginning with the letter A (concerns amino acids HIS, GLN, ASN). With
# "@assign" it is possible to lift this ambiguity. Note that no alternate
# location indicator is given for such cases. Format:
#
# <aa code> <chain> <nr> <old_atom_name1> <new_atom_name1> .....
#
# Wildcards are allowed for the second and third parameter.
@assign
ASN * * _AD1 _OD1 _AD2 _ND2
GLN * * _AE1 _OE1 _AE2 _NE2
HIS * * _AD1 _ND1 _AE1 _CE1
HIS * * _AD2 _CD2 _AE2 _NE2
End of sample rdf file.
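The general remarks in the rdf file state that rules in a record are processed from top to bottom and that a later rule may overwrite an earlier one. The following Python sketch illustrates this "last matching rule wins" behaviour for the @templates record. It is not FlexX code. It assumes whole-field wildcards only (a lone *), ignores the keywords first and last, and the specific HIS rule in the example is hypothetical.

```python
def field_matches(pattern, value):
    """A field matches if the pattern is the wildcard '*' or equals the value.

    Partial patterns such as 'AS*' are deliberately not supported, in line
    with the remark that extensions like AS* are not allowed.
    """
    return pattern == "*" or pattern == str(value)


def resolve_template(rules, aa_name, chain, number):
    """Return the template of the LAST rule that matches the residue.

    rules is a list of (aa_name, chain_id, aa_number, template) tuples in
    file order; later rules overwrite earlier ones.
    """
    chosen = None
    for aa_pat, chain_pat, num_pat, template in rules:
        if (field_matches(aa_pat, aa_name)
                and field_matches(chain_pat, chain)
                and field_matches(num_pat, number)):
            chosen = template
    return chosen


if __name__ == "__main__":
    rules = [
        ("HIS", "*", "*", "his2"),   # default rule from the sample file
        ("HIS", "A", "57", "his1"),  # hypothetical specific rule, added at the end
    ]
    print(resolve_template(rules, "HIS", "A", 57))
    print(resolve_template(rules, "HIS", "B", 12))
```

This is why specific rules must be added at the end of a record: placed before the general rule, the specific HIS rule would be overwritten by it.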
A.2 The ensemble description: An edf file
#
# ensemble descriptor file
#
# ------------------------------------------------------------------------
# default.edf: This file contains all default settings for an ensemble
# description file. Please fill in filenames at locations marked as
# <file> and go through all records for further modifications.
# General remarks:
# - Rules in one record are always processed from top to bottom,
# a rule may overwrite previous rules. Therefore specific rules
# must always be added at the end of a record.
# - The _ character is used instead of a blank in the PDB file
# - Wildcards (*) are allowed for several parameters, BUT NOT REGULAR
# EXPRESSIONS OR EXTENSIONS like AS* for ASP and ASN. Exceptions
# are listed in the comments.
# - The united protein description is addressed as slot 0. You cannot
# assign rules for slot 0.
# - A selection format is allowed for some parameters
# (e.g. ensemble slots):
# - No whitespaces (blanks, tabs) are allowed, because
# they delimit the different parameters
# - Wildcards (*) are allowed
# - Commas (,) are used for enumerations
# - Dashes (-) are used for intervals where appropriate
# - Combinations of enumerations and intervals are possible
# Example: 1,2,3,4,5 or 1-6 or 3-6,9,11-14
# - It is possible to generate filenames which depend on the
# corresponding pdb filename. There are two generators available:
# ~ = : is replaced by the corresponding pdb filename
# ~ # : is replaced by the slot id
# ------------------------------------------------------------------------
# PDB file from which the receptor should be read.
# ------------------------------------------------------------------------
# The default directory is PDB (see config.dat).
# <slot> : slot id 1-30, non-ambiguous reference to the structure
# (need not be sorted or continuous, but should be for clarity)
# <file> : name of the PDB file
#
# REMARK : Slot 1 must be given. It is used as reference. If you want
# to exclude a slot from the ensemble temporarily it is
# sufficient to comment out the pdb file here. All corresponding
# rules will be ignored then.
@pdb_files
<slot> <file>
# Selection of atoms to be read from the PDB file.
# ------------------------------------------------------------------------
# Rules have the format:
#
# include/exclude <slot> <atom name> <amino acid name> \
# <chain ID> <amino acid number>
#
# Wildcards (*) are allowed at all positions. For atom names the construct
# E* can be used to describe all atoms of element E.
# <slot> : id selection, format see above
@atoms
include * * * _ *
# Active site definition
# ------------------------------------------------------------------------
# The default directory is SITE (see config.dat). If no pocket is defined,
# an active site can be defined during loading by reference ligands that
# can be defined below.
@pockets
<slot> <pocket name> <pocket file>
# Reference ligands
# ------------------------------------------------------------------------
# The default directory is LIGAND (see config.dat).
#
@ref_lig_files
<slot> <ref_lig file>
# Name of file containing the protein surface.
# ------------------------------------------------------------------------
# The default directory is SURF (see config.dat). This file is optional,
# the surface will be computed if not present. The surface file can then
# be written afterwards.
@surface_files
<slot> <surface file>
# Alignment of atoms
# ------------------------------------------------------------------------
# This rule defines which atoms of the particular ensemble structures are
# matched onto which atoms of the reference structure (1). This alignment
# is used for superimposing the structures and for aligning instances for
# the united protein description.
#
# Format:
# <slot> <atom> <aa> <chain> <aa_nr> <atom> <aa> <chain> <aa_nr> [<offset>]
#| ensemble structures 2-30 | matched on reference structure 1 |
#
# Wildcards (*) are allowed at all positions, they refer to ANY atom, amino
# acid etc. In contrast, the wildcard (~) can be used for the reference
# structure to refer to CORRESPONDING atoms, amino acids etc.
@align
# for ensemble structure 2-30 | matched on reference structure 1
# slot atm aa chain aa_nr atm aa chain aa_nr offset
* * * * * ~ * ~ ~ 0
# Initial probe location
# ------------------------------------------------------------------------
# If the active site cannot be reached from the exterior, the surface at
# the active site is not correctly computed. In this case, the probe for
# calculating the surface can be manually placed to the interior of the
# active site. Parameters are the x,y,z coordinates of the initial probe
# location. Refer to the reference structure in slot 0!
# @probe_location <x> <y> <z>
# Assignment of amino acid templates.
# ------------------------------------------------------------------------
# The following list of rules specifies how to associate a given
# amino acid with a specific amino acid template. The templates are
# defined in the static data file AMINO (see config.dat). Its purpose is
# to fix additional degrees of freedom such as protonation and the state of
# charge, and to set additional information such as atom types and bond types.
# Template names consist of amino acid 3-letter code plus modifier(s).
# Meaning of the modifiers of the amino acid three-letter codes:
#
# h cysteine not involved in s-s bond
# 1/2 tautomers of his
# +/- state of charge within chargeable side chain
# n(+) n-terminal, possibly charged at terminal amino group
# c(-) c-terminal, possibly charged at terminal carboxylate group
#
# Modifiers are added to the 3-letter code in the order given above. For
# a more detailed description of templates see the AMINO data file.
# Format of rules:
#
# <amino acid name> <slot> <chain ID> <amino acid number> <template_name>
#
# Wildcards are allowed for the first three parameters. The state of charge
# of the terminal amino acids is defined by the final two rules.
# <slot> : id selection, format see above
#
# WARNING: check the HIS tautomers carefully!
@templates
ALA * * * ala
ARG * * * arg+
ASN * * * asn
ASP * * * asp-
CYS * * * cys
GLN * * * gln
GLU * * * glu-
GLY * * * gly
HIS * * * his2
ILE * * * ile
LEU * * * leu
LYS * * * lys+
MET * * * met
PHE * * * phe
PRO * * * pro
SER * * * ser
THR * * * thr
TRP * * * trp
TYR * * * tyr
VAL * * * val
# Templates for N/C-termini. MUST be specified!
# nterm = take template for amino acid with uncharged backbone amino group
# nterm+ = take template for amino acid with positively charged backbone amino
# group
# cterm see above
# cterm- see above
#
* * * first nterm+
* * * last cterm-
# Hetero atoms to be loaded as part of the receptor
# ------------------------------------------------------------------------
# There are two ways of adding a hetero group to the receptor:
# 1. @hetero_atoms
# For simple or frequently occurring molecules like metal ions, water,
# a heme, the coordinates can be taken directly from the PDB file by
# specifying the hetero group ID. The format is:
#
# in/exclude <slot> <HET-ID> <chain> <nr>
#
# Wildcards are allowed for all parameters.
# NOTE: A template must be defined for the hetero group (see @template
# above).
# <slot> : id selection, format see above
# 2. @hetero_files
# In order to avoid template generation, a hetero group can be loaded
# directly from a mol2 file. The coordinates in the file must describe
# the location of the hetero group relative to the protein. Atom types,
# bond types, hydrogens, charges must be set appropriately (as for the
# preparation of ligand molecules). Each filename is a single rule.
#
@hetero_atoms
exclude * * * *
#@hetero_files
#<slot> <file>
# Setting alternate locations
# ------------------------------------------------------------------------
# Decide in favor of a certain alternate location indicator in
# column 17 of a pdb ATOM or HETATM record. The format is:
#
# <slot> <aa code> <chain> <nr> <indicator>
#
# Wildcards are allowed for all parameters.
# <slot> : id selection, format see above
@alternate_locations
* * * * A
# Torsion angles at terminal hydrogen atoms
# ------------------------------------------------------------------------
# Rules for specifying torsion angles to terminal hydrogen atoms not
# contained in the PDB file. The format is:
#
# <amino acid template> <slot> <chain ID> <amino acid number> \
# <atom 1> <atom 2> <atom 3> <torsion angle>
#
# <slot> : id selection, format (see above)
# Wildcards are allowed for the first three parameters. For template names,
# the wildcard character can also be used for terminal characters.
# For <amino acid number> first and last can also be used.
# Currently only one torsion angle per definition is allowed, i.e. the
# hydrogens are kept rigid during docking. The rules are applied only if
# the corresponding hydrogen atom is not contained in the PDB file.
#
# WARNING: Check hydrogens at hydroxy groups (TYR and THR) carefully!
@h_torsions
nterm* * * first _c _ca _n 180.
cterm * * last _o _c _oxt 0.
arg* * * * _ne _cz _nh1 180.
arg* * * * _ne _cz _nh2 180.
asn* * * * _od1 _cg _nd2 0.
asp * * * _od2 _cg _od1 0.
aspn* * * * _od2 _cg _od1 0.
aspc* * * * _od2 _cg _od1 0.
cysh* * * * _ca _cb _sg 180.
gln* * * * _oe1 _cd _ne2 0.
glu * * * _oe2 _cd _oe1 0.
glun* * * * _oe2 _cd _oe1 0.
gluc* * * * _oe2 _cd _oe1 0.
lys* * * * _cd _ce _nz 180.
ser* * * * _ca _cb _og 180.
tyr* * * * _ce1 _cz _oh 180.
thr* * * * _ca _cb _og1 180.
# Atom type ambiguity
# ------------------------------------------------------------------------
# There may be some unidentified atom types in a pdb file. This lack of
# information is explicitly indicated in a PDB file by special atom names
# beginning with the letter A (concerns amino acids HIS, GLN, ASN). With
# "@assign" it is
# possible to lift this ambiguity. Note that no alternate location indicator
# is given for such cases. Format:
#
# <aa code> <slot> <chain> <nr> <old_atom_name1> <new_atom_name1> .....
#
# Wildcards are allowed for the second and third parameter.
# <slot> : id selection, format see above
@assign
ASN * * * _AD1 _OD1 _AD2 _ND2
GLN * * * _AE1 _OE1 _AE2 _NE2
HIS * * * _AD1 _ND1 _AE1 _CE1
HIS * * * _AD2 _CD2 _AE2 _NE2
End of sample edf file.
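The selection format for ensemble slots described in the general remarks of the edf file (enumerations with commas, intervals with dashes, a wildcard, no whitespace) can be illustrated with a short parser. The Python sketch below is not part of FlexX. It assumes that the wildcard stands for all slots from 1 to 30, the range given in the @pdb_files comment, and the function name is hypothetical.

```python
def parse_slot_selection(selection, max_slot=30):
    """Expand a slot selection such as '3-6,9,11-14' into a sorted list of ids.

    Assumptions of this sketch: '*' selects slots 1..max_slot, and slot 0
    (the united protein description) can never be selected.
    """
    if not selection or any(ch.isspace() for ch in selection):
        raise ValueError("selection must be non-empty and contain no whitespace")
    if selection == "*":
        return list(range(1, max_slot + 1))
    slots = set()
    for part in selection.split(","):
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
            if low > high:
                raise ValueError("invalid interval: " + part)
            slots.update(range(low, high + 1))
        else:
            slots.add(int(part))
    if any(s < 1 or s > max_slot for s in slots):
        raise ValueError("slot ids must lie between 1 and {}".format(max_slot))
    return sorted(slots)


if __name__ == "__main__":
    for example in ("1,2,3,4,5", "1-6", "3-6,9,11-14"):
        print(example, "->", parse_slot_selection(example))
```

The three inputs in the demonstration are the examples given in the edf comments.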
B
Examples of script files
B.1 Script 1: dock_one
The first example is a simple script which loads a protein and a ligand and writes a list of
ligand placements into a multi-mol2 file. The variables $(protein), $(ligand), $(nof_write)
are parameters.
Example
# SCRIPT: Dock a single ligand and generate output file
# parameters are: $(protein) = name of protein rdf file
# $(ligand) = name of ligand mol2 file
# $(nof_write) = number of placements to write
output " >> Docking " $(ligand) " into " $(protein)
output "-----------------------------------------------------------"
# Part 1: Loading data
receptor
read $(protein)
end
ligand
read $(ligand)
end
# Part 2: compute placements
docking
selbas a # automatically select base frag.
placebas 3 # place base frag. with triangle alg.
complex all # add all fragments
# Part 3: results
info y 0 # output a summary table
listsol $(nof_write) # output a table of first $(nof_write) placements
end
ligand
write $(ligand)_pred y y 1-$(nof_write) n
end
output " >> Done. Result written to " $(PREDICT) $(ligand) "_pred.mol2."
# Part 4: cleanup and quit
delall y # delete data
quit y
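To see which commands a script such as dock_one results in for concrete parameter values, the $(name) placeholders can be expanded outside FlexX. The Python sketch below is purely illustrative and says nothing about how FlexX itself handles script variables. The function name and the example file names are hypothetical.

```python
import re

# Matches placeholders of the form $(name) as used in the script above.
PLACEHOLDER = re.compile(r"\$\((\w+)\)")


def substitute_parameters(script_text, params):
    """Replace each $(name) by params[name]; unknown names are left untouched."""
    def replace(match):
        name = match.group(1)
        return str(params.get(name, match.group(0)))
    return PLACEHOLDER.sub(replace, script_text)


if __name__ == "__main__":
    snippet = "\n".join([
        "receptor",
        "read $(protein)",
        "end",
        "ligand",
        "write $(ligand)_pred y y 1-$(nof_write) n",
        "end",
    ])
    # Hypothetical parameter values, for illustration only.
    print(substitute_parameters(snippet, {"protein": "1abc", "ligand": "lig1",
                                          "nof_write": 10}))
```

Placeholders without a value, such as $(PREDICT) in the script above, are left as they are so that they remain visible in the expanded text.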
B.2 Script 2: dock_list
The second example goes through a list of mol2 ligand files and docks them into the same
receptor. A table with the docking score for each ligand molecule is written to a file.
Example
#
# SCRIPT: Dock a set of ligands into one receptor and generate a docking
# score table
#
# parameters are: $(protein) = name of protein rdf file
# $(liglist) = name of the ligand list file
# $(outfile) = name of the output file
output " >> Docking ligands from " $(liglist) " into " $(protein)
output "-----------------------------------------------------------"
# Part 1: Loading data
receptor
read $(protein)
end
# Part 2: go through the list of ligand molecules
for_each $(lig) in $(liglist)
output " --- Processing ligand " $(lig)
ligand
read $(lig) # load the ligand
end
docking
selbas a # automatically select base frag.
placebas 3 # place base frag. with triangle alg.
complex all # add all fragments
seloutp $(outfile) a # open output file
info n 0 # generate one-liner about solution
seloutp screen # switch back to screen
delete # delete placements
end
ligand
delete # delete the ligand
end
end_for
The file $(liglist) must be located in the directory where FlexX is executed. It contains
one line per ligand with the ligand mol2 filename.
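A ligand list file of this form can be generated from a directory of mol2 files. The following Python sketch writes one filename per line. The function name and the example directory and file names are hypothetical. It also assumes that bare filenames (without directory) are what you want in the list, which depends on your ligand directory setting.

```python
from pathlib import Path


def write_ligand_list(ligand_dir, list_file):
    """Write the names of all *.mol2 files in ligand_dir to list_file, one per line.

    Returns the list of names written, in alphabetical order.
    """
    names = sorted(p.name for p in Path(ligand_dir).glob("*.mol2"))
    Path(list_file).write_text("".join(name + "\n" for name in names))
    return names


if __name__ == "__main__":
    import tempfile

    # Demonstration with dummy files in a throw-away directory.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("lig_b.mol2", "lig_a.mol2", "notes.txt"):
            (Path(tmp) / name).touch()
        list_path = Path(tmp) / "ligands.lst"
        write_ligand_list(tmp, list_path)
        print(list_path.read_text(), end="")
```

Remember to place the resulting list file in the directory from which FlexX is started, as required above.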
C
Additional copyright
notes
The following software/data is used in/with FlexX:
Base software: Copyright © 2001 by Fraunhofer Gesellschaft (FhI-SCAI)
getline library: Copyright © 1993 by Chris Thewalt
PVM library version 3.4: Parallel Virtual Machine System. University of Tennessee,
Knoxville TN. Oak Ridge National Laboratory, Oak Ridge TN. Emory University,
Atlanta GA. Authors: J. J. Dongarra, G. E. Fagg, G. A. Geist, J. A. Kohl, R. J. Manchek,
P. Mucci, P. M. Papadopoulos, S. L. Scott, and V. S. Sunderam. © 1997 All Rights
Reserved¹
Python library: Copyright 1991-1995 by Stichting Mathematisch Centrum, Amsterdam,
The Netherlands. All Rights Reserved²
¹ PVM copyright notice: Permission to use, copy, modify, and distribute this software and its documentation
for any purpose and without fee is hereby granted provided that the above copyright notice appears in all copies
and that both the copyright notice and this permission notice appear in supporting documentation.
Neither the Institutions (Emory University, Oak Ridge National Laboratory, and University of Tennessee) nor
the Authors make any representations about the suitability of this software for any purpose. This software is
provided "as is" without express or implied warranty. PVM version 3 was funded in part by the U.S.
Department of Energy, the National Science Foundation and the State of Tennessee.
² Python copyright notice: Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appears in all
copies and that both that copyright notice and this permission notice appear in supporting documentation, and
that the names of Stichting Mathematisch Centrum or CWI or Corporation for National Research Initiatives or
CNRI not be used in advertising or publicity pertaining to distribution of the software without specific, written
prior permission.
While CWI is the initial source for this software, a modified version is made available by the Corporation for
National Research Initiatives (CNRI) at the Internet address ftp://ftp.python.org.
STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH REGARD
TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS,
IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
Anti-Grain Geometry library version 2.3: Copyright © 2002-2005 Maxim Shemanarev
(http://www.antigrain.com)³
zlib library version 1.2.3: Copyright © 1995-2005 Jean-loup Gailly and Mark Adler
minizip library version 1.01e: Copyright © 1998-2005 Gilles Vollant
libxml2 library: Copyright © 1998-2003 Daniel Veillard. All Rights Reserved.⁴
SMARTS™ may be a registered trademark of Daylight Chemical Information Systems.
The torsion angle data (torsion_standard.dat/torsion_fine.dat) is derived from the
Cambridge Structural Database. The copyright © of these files is shared by GMD
Forschungszentrum Informationstechnik GmbH, the Cambridge Crystallographic
Data Center (CCDC), and BASF AG, Ludwigshafen.
³ Anti-Grain Geometry copyright notice: Permission to copy, use, modify, sell and distribute this software is
granted provided this copyright notice appears in all copies. This software is provided "as is" without express
or implied warranty, and with no claim as to its suitability for any purpose.
⁴ libxml2 copyright notice: Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the
following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions
of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE DANIEL VEILLARD BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
USE OR OTHER DEALINGS IN THE SOFTWARE.
Except as contained in this notice, the name of Daniel Veillard shall not be used in advertising or otherwise to
promote the sale, use or other dealings in this Software without prior written authorization from him.
Bibliography
[1] F.C. Bernstein, T.F. Koetzle, G.J.B. Williams, E.F. Meyer Jr., M.D. Brice, J.R. Rodgers,
O. Kennard, T. Shimanouchi, and M. Tasumi. The protein data bank: a computer based
archival file for macromolecular structures. Journal of Molecular Biology, 112:535-542,
1977. 92, 266, 278, 280
[2] H.-J. Böhm. LUDI: rule-based automatic design of new substituents for enzyme
inhibitor leads. Journal of Computer-Aided Molecular Design, 6:593-606, 1992. 17
[3] H.-J. Böhm. The development of a simple empirical scoring function to estimate the
binding constant for a protein-ligand complex of known three-dimensional structure.
Journal of Computer-Aided Molecular Design, 8:243-256, 1994. 17, 307
[4] M. Clark, R.D. Cramer, and N.V. Opdenbosch. Validation of the general purpose Tripos
5.2 force field. Journal of Computational Chemistry, 10:982-1012, 1989. 304, 307
[5] M. Clark, R.D. Cramer III, and N. Van Opdenbosch. Validation of the general purpose Tripos
5.2 force field. Journal of Computational Chemistry, 10:982-1012, 1989. 293, 294
[6] H. Claußen, C. Buning, M. Rarey, and T. Lengauer. FlexE: Efficient Molecular Docking
Considering Protein Structure Variations. Journal of Molecular Biology, 308:377-395,
2001. 191
[7] H. Claußen, C. Buning, M. Rarey, and T. Lengauer. Molecular Docking into the Flexible
Active Site of Aldose Reductase Using FlexE. In Rational Approaches to Drug Design:
Proceedings of 13th European Symposium on Quantitative Structure-Activity Relationships.
Prous Science, Barcelona, 2001. 191
[8] M.D. Eldridge et al. Empirical scoring functions I: Development of a fast empirical
scoring function to estimate the binding affinities of ligands in receptor complexes.
Journal of Computer-Aided Molecular Design, 11:425-445, 1997. 304, 307
[9] J. Gasteiger, C. Rudolph, and J. Sadowski. Automatic generation of 3D-atomic
coordinates for organic molecules. Tetrahedron Computer Methodology, 3:537-547, 1990. 21,
348
[10] D.K. Gehlhaar, G.M. Verkhivker, P.A. Rejto, C.J. Sherman, D.B. Fogel, L.J. Fogel, and
S.T. Freer. Molecular recognition of the inhibitor AG-1343 by HIV-1 protease:
conformationally flexible docking by evolutionary programming. Chemistry & Biology, 2:317-324,
1995. 304, 307
[11] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. PVM:
Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing.
The MIT Press, Cambridge, Massachusetts, 1994.
http://www.netlib.org/pvm3/book/pvm-book.html. 142
[12] D.K. Gehlhaar, G. Verkhivker, P.A. Rejto, D.B. Fogel, L.J. Fogel, and S.T. Freer. Docking
conformationally flexible small molecules into a protein binding site through
evolutionary programming. In J.R. McDonnell, R.G. Reynolds, and D.B. Fogel, editors,
Proceedings of the Fourth Annual Conference on Evolutionary Programming, pages 615-627,
1995. 304, 307
[13] A.K. Ghose and G.M. Crippen. Atomic physicochemical parameters for
three-dimensional-structure-directed quantitative structure-activity relationships. 2. Modeling
dispersive and hydrophobic interactions. Journal of Chemical Information and Computer
Science, 27:21-35, 1987. 268, 298, 337
[14] G. Klebe and T. Mietzner. Correlation of crystal data to analyze and predict
ligand/receptor interactions. In D. W. Jones, editor, Organic Crystal Chemistry. Oxford
University Press, Oxford, UK, 1992. 17
[15] G. Klebe and T. Mietzner. A fast and efficient method to generate biologically relevant
conformations. Journal of Computer-Aided Molecular Design, 8:583-606, 1994. 21
[16] B. Kramer, M. Rarey, and T. Lengauer. CASP-2 experiences with docking flexible ligands
using FlexX. PROTEINS: Structure, Function and Genetics, Suppl 1:1(1):221-225, 1997. 17
[17] B. Kramer, M. Rarey, and T. Lengauer. Evaluation of the FlexX incremental construction
algorithm for protein-ligand docking. PROTEINS: Structure, Function and Genetics, 37:114,
1999. 17
[18] M. Rarey. Rechnergestützte Vorhersage von Rezeptor-Ligand-Wechselwirkungen, volume 268
of GMD-Bericht. Oldenbourg Verlag, Munich, Germany, 1996. 17
[19] M. Rarey, B. Kramer, and T. Lengauer. Multiple automatic base selection:
Protein-ligand docking based on incremental construction without manual intervention. 1997.
17
[20] M. Rarey, B. Kramer, and T. Lengauer. Docking of hydrophobic ligands with
interaction-based matching algorithms. Bioinformatics, 15:243-250, 1999. 17
[21] M. Rarey, B. Kramer, and T. Lengauer. The particle concept: Placing discrete water
molecules during protein-ligand docking predictions. PROTEINS: Structure, Function
and Genetics, 34(1):17-28, 1999. 17
[22] M. Rarey, B. Kramer, T. Lengauer, and G. Klebe. A fast flexible docking method using an
incremental construction algorithm. Journal of Molecular Biology, 261(3):470-489, 1996.
17, 230
[23] M. Rarey and T. Lengauer. A recursive algorithm for efficient combinatorial library
docking. Perspectives in Drug Discovery and Design, 20:63-81, 2000. 166
[24] M. Rarey, S. Weng, and T. Lengauer. Placement of medium-sized molecular fragments
into active sites of proteins. Journal of Computer-Aided Molecular Design, 10:41-54, 1996.
17
[25] J. Sadowski, J. Gasteiger, and G. Klebe. Comparison of automatic three-dimensional
model builders using 639 X-ray structures. Journal of Chemical Information and Computer
Science, 34:1000-1008, 1994. 21, 348
[26] I. Schellhammer and M. Rarey. FlexX-Scan: Fast structure-based virtual screening.
PROTEINS: Structure, Function and Bioinformatics, 57:504-517, 2004. 230
[27] M. Stahl and H.-J. Böhm. Development of filter functions for protein-ligand docking.
Journal of Molecular Graphics and Modelling, 16:121-132, 1998. 208
[28] M. Stahl and M. Rarey. Detailed Analysis of Scoring Functions for Virtual Screening.
Journal of Medicinal Chemistry, 44:1035-1042, 2001. 307
[29] M. Stahl. Modifications of the scoring function in FlexX for virtual screening
applications. Perspectives in Drug Discovery and Design, 20:83-98, 2000. 310
[30] V. Sunderam, J. Dongarra, A. Geist, and R. Manchek. The PVM concurrent computing
system: Evolution, experiences, and trends. Parallel Computing, 20(4), 1994. 21, 276
[31] Symyx Technologies Inc., www.symyx.com, 2440 Camino Ramon, Suite 300, San Ramon,
CA 94583. CTFile Formats November 2007, 2007. 92, 278
[32] TRIPOS Associates, Inc., St. Louis, Missouri, USA. SYBYL Molecular Modeling Software
Version 6.x, 1994. 69, 70, 87, 91, 92, 93, 266, 278, 313, 317, 318
[33] A.C. Wallace, R.A. Laskowski, and J.M. Thornton. LIGPLOT: a program to generate
schematic diagrams of protein-ligand interactions. Protein Engineering, 8(2):127-134,
1995. 138
[34] S.A. Wildman and G.M. Crippen. Prediction of physicochemical parameters by atomic
contributions. Journal of Chemical Information and Computer Science, 39:868-873, 1999.
268, 297, 337
Index
@align, 193
@pdb_files, 192
@ref_lig_files, 192
[FLEXIBLE_OPTIMIZE, 269, 271
[KEEP_RCGEN_FILES, 270
[MOL_NAME, 270
[PLACE_PARTICLES, 270
[RING_MODE, 271
[SDF_MOL_ID_NUM, 271
[SIZE_LIMIT, 273
[STEREO_MODE, 273
FlexV, 75
Docking , 50
Ligands , 48
Receptor , 35, 37
3D View, 36
3D_GEN, 269
3D_GEN_FORMAT, 269
ACNT, 208
active, 43
active site, 37, 280
definition, 72
write out, 65, 105
active site file, 103
additional modules, 141
admin setting
receptor, 107
admin settings
dockings, 126
ligand, 94
placements, 126
protein, 107
advanced setup, 265
aliases, 276
alternate locations, 37
alternative directory, 57
amides, 320
amino acids
mapping, 281
amino.dat, 202, 246, 312
AMINO4PPI, 246
amino_gen.dat, 202
aminogen.dat, 312
Appendix, 353
Asn, 40
assignment, 39
ASTEX dataset, 103
atom type assignment, 334
atom types, 69, 299
atomic charges, 70
automated docking, 86
base fragments
placing of, 115
Base Placement, 50
base placement, 66
bat, 278
batch
branch, 257
command
FOR_EACH/END_FOR, 255
FOREVER, 255
IF/ELSE/ENDIF, 257
INCR, 258
INPUT, 258
OUTERR, 258
OUTPUT, 258
PROCSIZE, 258
SELINP, 257
SETVAR, 258
TIMER, 258
WAIT, 258
WHILE, 255
input, 258
loops, 255
output, 258
progsize, 258
timer, 258
variables, 257
wait, 258
batch mode, 56, 57
arguments, 57
binding site, 37, 38
bond, 21, 320
bond length
heavy atoms, 298
hydrogens, 299
SYBYL, 299
bond types, 69
buried waters, 42
buriedness, 208, 220
Böhm function, 303, 312
case sensitivity, 55
cavities, 138
CCG, 22, 347
Cdocking parameters, 294
chain, 38
charges
formal, 267
partial, 267
charges.dat, 202, 316
Chemical Computing Group, 347
chemistry, 30
chempar.dat, 298
combinatorial libraries
introduction, 148
command
?, 80
2DPLOT, 138
ACNT, 209
ACTIVE, 104, 196
AMINO4PPI, 249
ATLIST, 104
AUTODOCK, 86
BUILD, 196
CAND, 217
CAVITY, 138
CCONSTR, 221
CGRID, 221
CHECKPDB, 204
CLASH, 102
CLOSE, 150, 228
CLUSTER, 118
CLUSTERIA, 204
COC(=O)C(=O)COC, 334
COMPGA, 210
COMPLEX, 116, 187
COMPLEX ALL, 66
CONTACT, 134
CPHARM, 161
DECRYPT, 113
DEEPSITE, 107
DELALL, 85
DELETE, 93, 106, 119, 152, 167, 184, 195
DELETEFG, 218
DELETEG, 211
DELETESC, 220
DELPHARM, 162
DISPLAY, 84
DRAW, 100, 112, 130, 160, 187, 198, 202, 215,
236
DRAWFG, 218
DRAWSC, 220
EDIT, 106, 184, 195
EDITGEN, 204
END, 80
ENUM, 155
ERASE, 85
EVAL, 225
EXPORT, 125
EXTEND, 155
EXTENDCORE, 153
EXTENDMR, 166
EXTENDR, 165
EXTRACT, 154, 167
EXTRACTTOP, 168
FILTER, 184, 218
FIXRMSD, 102
FLIPSTER, 102
FROMPDB, 63, 88
GENERATE, 233
GENRDF1, 203
GENRDF2, 203
GET, 228
GRAINF, 101, 112, 131, 161, 199, 202
GRID, 210
HELP, 68, 80
INFO, 90, 106, 119, 152, 184, 188, 232, 233
INFOALIGN, 195
INFOENS, 195
INFOGEN, 203
INFORMSD, 195
LIGAND/SELINIT, 277
LIST, 81
LISTALL, 123
LISTFG, 217
LISTG, 211
LISTINST, 123, 204
LISTMACRO, 230
LISTMAT, 122
LISTONE, 123
LISTP, 167
LISTRMS, 122
LISTSC, 220
LISTSCO, 113
LISTSOL, 120
MAIN, 80
MANUAL, 80
MAPREF, 93, 152
MATCHING, 138
MDRAW, 100, 130
MINCONF, 101
MINFO, 155
MINIMIZE, 101
OPEN, 227
OPTC, 164
OPTIMIZE, 117
OVERLAP, 133, 221
PDBINFO, 62, 103
PERMUTE, 150
PICKPH, 184
PING, 244
PLACEBAS, 115, 187
PLACEC, 163
PLACER, 164
PLACESEQ, 165
PLP, 132
PRINTSOL, 125
QHIST, 125
QUERY, 123
QUIT, 80
READ, 86, 87, 103, 119, 149, 183, 184, 193
READC, 164
READG, 210
READMACRO, 229
READPDB, 203
READRDF, 194
READREF, 93, 151
RELEASE, 154, 167
RELEXTCORE, 154
RESETCORE, 154
RGROUP, 153
RMSDMATRIX, 196
RMSHIST, 136
RUNDOCK, 87
SAS, 94, 107, 135
SASTAB, 134
SCALE, 210
SCAN, 228
SCANMODE, 231
SCORE, 131
SCRIPT, 86
SEEK, 228
SELADM, 94, 107, 126, 155, 185, 196, 199,
212, 235
SELBAS, 114, 163
SELCOL, 96, 110, 128, 157, 186, 197, 200, 215,
236
SELECT, 118, 152
SELECTR, 166
SELENS, 204
SELGAUSS, 216
SELGRA, 95, 108, 127, 156, 185, 197, 200, 213,
235
SELLAB, 99, 111, 129, 159, 198, 201
SELOUTP, 81, 147
SELPHARM, 219
SELSCO, 113
SERVCOM, 244
SERVER, 243
SET, 81, 86
SETFILTER, 151
SETREF, 94, 152
SMARTS, 91
SMILES, 88
SOLTAB, 120
SORT, 119
STARTSERV, 244
SUPER, 196
SWITCH, 153
TOFLEXV, 82
TOPHARM, 219
TRANSFORM, 91
TRIHASH, 106
WATER, 136
WRITE, 65, 91, 105, 118, 194
WRITEC, 164
WRITECFG, 81
WRITEG, 210
WRITESC, 220
WRITESOL, 167
WRITONE, 101
WRITRAND, 102
command escape, 56
command line options, 55
command parameters, 55
commandline, 56, 60
arguments, 57
switches, 57
commandline options, 255
commandline options and their arguments, 57
compatibility
MOE, 22
complex optimization, 117
configuration, 29
hierarchy of levels, 29
static_data, 266
conformation energy, 287
conformations, 286
CONFORT, 349
constrained docking, 70
constraints
essential, 44
interactions, 43
logical expressions, 44
optional, 44
pharmacophore, 43
spatial, 43
contact.dat, 319
contype, 19
CORINA, 268, 269, 348
deprecated driver options, 348
corina, 19
crypted files, 113
csv
EXPORT, 125
Cygwin, 23
dat, 278
data preparation
ligand, 69
protein, 72
DATABASE, 246
DDB, 305
Debian, 23
decryption, 113
degrees of freedom
constraining, 70
delete all, 85
delocalized.dat, 336
depth, 220
depth calculation, 208
directory paths, 266
discretization, 208
displaceable, 42
displaced, 43
DISPLAY, 76
docking, 50, 58, 66
constrained, 70
deletion, 119
export solutions, 52
LISTINST, 204
pose export, 52
SCANMODE, 231
docking algorithm, 114, 291
docking database, 305
docking results
output destination, 81
dockings
color settings, 128
drawing, 130
graphics settings, 127
label settings, 129
DRAW, 76
draw grid points, 215
drawing, 76
DRUGSCORE, 208
DrugScore, 246, 249
edf file, 192
default, 353
editing
RDF, 106
email, 26
EMBED_SETTINGS, 269
EMBED_SETTINGS_PP, 269
enantiomer, 102
encryption, 113
ensemble, 191
ACTIVE, 196
BUILD, 196
DELETE, 195
DRAW, 198
EDIT, 195
GRAINF, 199
INFOALIGN, 195
INFOENS, 195
INFORMSD, 195
READ, 193
READRDF, 194
RMSDMATRIX, 196
SELADM, 196
SELCOL, 197
SELGRA, 197
SELLAB, 198
slot, 192, 193
SUPER, 196
WRITE, 194
ensemble description file, 192
ensembles
generation, 202
environment variable
strings, 275
errors, 59
exit option, 59
exit FlexX, 69
explanation-to-aminopchargesgen.dat, 267
export, 52, 53
PipelinePilot, 53
poses in CSV, 52
export poses, 58
extensions, 278
FAQ, 26
MOE and FlexX, 22
fcharges.dat, 336
file extensions, 278
file format, 277
file suffixes, 278
files
mol2, 37
SD, 37
xml, 53
filter
CLOSE, 228
GET, 228
LISTMACRO, 230
OPEN, 227
READMACRO, 229
SCAN, 228
SEEK, 228
filter functions, 293
firewall, 35
first steps, 60
fixing ring conformations, 70
fixing torsional angles, 70
flags, 275
FlexE
introduction, 191
flexible ring systems, 21
FlexV, 349
atom coloring, 350
first use, 62
graphical interface, 76, 349
libraries needed, 23
two instances with one FlexX, 349
FlexX-Pharm, 178
constraint types, 178
Gaussian constraints definition, 219
interaction constraints, 179
spatial constraints, 180
force field parameters, 293
fxx, 30, 278
g(x), 208
Gaussians, 208
as lters, 216
pharmacophore constraints, 219
selection, 216
type, 211
gdf, 278
Generating ensembles, 202
genrdf
CHECKPDB, 204
EDITGEN, 204
GENRDF1, 203
GENRDF2, 203
INFOGEN, 203
READPDB, 203
getting started, 35
Gln, 40
global commands, 80
graph
DRAW, 202
GRAINF, 202
SELADM, 199
SELCOL, 200
SELGRA, 200
SELLAB, 201
graphical user interface, 19
graphics, 75
color modes, 339
colors, 338
graphics objects, 76
graphics problems, 23
graphics.dat, 337
GRID, 208
grids, 208
visualization, 215
GUI, 19
H-atoms, 74, 270
help, 25, 68
His, 40
host id, 57
hydrogen
torsions, 37
hydrogen atoms
protein, 74, 270
hydrogens, 105
angles, 299
I/O
table output, 125
ID, 45, 47
importing PDB, 88
info, 68
INIT_MOL2_RECEPTOR, 270
initialization procedure, 88
installation, 19
admin tools, 29
MOE, 22
insufficient memory, 23
interaction
geometry, 307, 308
interaction constraints, 179
interaction geometries, 307
definition, 308
interaction types, 301
interactive mode, 55
interface options, 58
interfaces
corina, 348
MOE, 347
Sybyl, 348
internal clash test, 102
introduction, 17
ions, 36
KEEP_3D_GEN_FILES, 270
known limitations, 22
lattice energy, 208
library, 48
license
PipelinePilot remote protocol, 54
ligand, 61
color settings, 96
delete, 93
drawing of, 100
EVAL, 225
graphics settings, 95
information, 90
label settings, 99
minimization, 90, 101
random conformation, 102
read, 87
reading reference, 93
reference, 37
SAS
calculation of, 94
specific conformation, 101
write, 91
ligand preparation, 35
ligands, 48
leaving them untouched, 48
limitations in docking, 48
mol2 input, 48
protonation, 48
SD input, 49
Linux distributions
Ubuntu, Debian, 23
lipophilic contact area, 134
loading Projects, 57
loading receptor les, 57
log, 278
logging
session, 57
logical expressions, 44, 47
LogP, 337
logp.dat, 337
MDRAW, 76
menu, 79
DATABASE, 246
PVM, 141
MIMUMBA, 306
models, 38
MOE, 22, 347
mol, 278
mol2, 37, 69, 74, 270, 278
molecule cleanup, 89
molecule initialization, 277
nested scripts, 253
NMR structures, 38
number of poses to export, 58
OpenGL, 23
overlap
computation between ligand and protein,
221
overlap volume, 133
pair potentials, 246
parallel, 276
parallel computing, 20, 54
parallel script execution, 21
parameters, 29
partial charges, 61
Particles
active, 43
displaced, 43
phantom, 43
particles, 42
PBC
configuring PBC, 240
PING, 244
SERVCOM, 244
SERVER, 243
server, 276
shutdown a server, 243
starting a server, 242
STARTSERV, 244
pcharges_gen.dat, 202
PDB, 74
pdb, 278
pdb file
read receptor directly from pdb file, 64
pdf, 278
PERMUTE tutorial, 169
phantom, 43
pharmacophore constraints, 43
pharmacophores, 50, 53
Boolean expressions, 44
ID, 45, 47
interaction constraints, 43
logical expressions, 44
spatial constraints, 43
PipelinePilot, 53
PLACEBAS_CACHING, 270
placement
lists, 125
placements
color settings, 128
drawing, 130
graphics settings, 127
label settings, 129
PLP scoring, 305
PMF, 249
poses
export, 52
post processing, 52
PPI, 246, 249
AMINO4PPI, 249
using old potential files, 249
PRINT_SIZE, 271
PRINT_TIMES, 271
program parameters, 285
project
read, 86
rundock, 87
Project File, 30
protein
active site, 104
assignment, 39
chain, 38
charges, 316
color settings, 110
delete, 106
drawing of, 112
from NMR structures, 38
graphics settings, 108
H-torsions, 37
hydrogens, 74, 270
label settings, 111
MOL2 format, 74, 105
PDB contents, 103
protonation, 37, 40
reading of, 103
SAS
calculation of, 107
tautomers, 37, 40
writing of, 105
protein ensembles
introduction, 191
protein preparation, 35
protonation, 40, 74, 270
protonation state, 277
proxy configuration, 35
PVM, 20, 54, 141
aborting and recovering, 145
batch les, 143
command
ADD, 143
INFO, 142
OFMERGE, 145
RECOVER, 145
REMOVE, 143
TOPVM, 143
configuring PVM, 142
filenames in scripts, 82, 93, 125, 126, 140, 147
kill work process, 146
merging of files, 82, 93, 125, 126, 140, 147
parallel, 276
preliminaries, 141
problems, 82, 93, 125, 126, 140, 142, 146, 147
starting PVM, 142
working with PVM, 146
pxx, 278
Python, 259
PyFlexX, 259
ask for params, 260
getting started, 259
special commands, 260
start-up configuration, 261
working with PyFlexX, 259
query history, 125
QUERY_BURIEDNESS, 271
QUERY_SASTAB, 271
RCGENERATOR, 347, 348
rdf, 64, 278
rdf file, 279
default, 353
receptor, 36, 64
ACTIVE, 104
charges, 316
CLUSTERIA, 204
PDB contents, 103
rdf file, 64
SELENS, 204
writing of, 105
receptor description file (rdf), 64
generation of, 64
receptor ligand overlap, 286
reference ligand, 37, 61
protonation, 61
RIGID_TORSIONS, 271
ring conformer generator, 19, 347
RMSD, 61
RMSD calculation, 102
rotamers, 37
SAS, 134
SASTAB, 134
SCA, 348
score
PLP, 132
score table, 58
scoring, 268
differences, 132
licensing, 246
pair potentials, 246
scoring function
adjusting, 113
factors, 305
parameters, 303
script file, 86
examples, 363
scripts, 56, 253
parameters, 255
variables in, 255
sdf, 37, 278
SECONDARY_TORSION_MODE, 271
SELADM, 77
SELCOL, 77
SELGRA, 77
SELLAB, 77
session logging, 57
settings
installation level, 30
setup
advanced, 265
shell, 55
shell command, 85
Single Interaction Scan, 50
SIS, 50
slot, 192
SMARTS, 317, 325
aromaticity perception, 327
hydrogens, 328
logical operators, 329
recursive, 330
ring perception, 327
subgraphs, 330
smi, 278
SMILES, 277
SMILES parsing, 88
solutions
clustering, 118
export, 52
list all, 123
list one, 123
list query, 123
selection of, 118
solutions table, 58, 124
spatial constraints, 180
sphere, 281
spots
DRAW, 236
GENERATE, 233
INFO, 233
SELADM, 235
SELCOL, 236
SELGRA, 235
ssh, 53
start-up, 55
startup options, 57, 255
static data files, 266
decrypting, 113
structure correction, 334
subgraph matching, 93
support, 26
surface
calculation, 281
description file, 278
probe radius, 296
radius of water sphere, 307
switches, 255
Sybyl, 348
SYBYL atom types, 69
system id, 57
tautomers, 40
technical reference, 265
templates
missing hydrogens, 74
partial charges, 268
use of, 331
Token not numeric, 23
torsion angles, 21, 320
torsion status, 254
TORSION_MODE, 273
torsions, 37
Tree View, 36
triangle hash table, 106
Tripos, 348
Tripos force field, 304
troubleshooting, 22
MOE and FlexX, 22
tutorial, 35, 60
docking, 66
info, 68
ligand, 61
receptor, 64
receptor description file, 64
Ubuntu, 23
USE_PVM_FEATURE, 275
user guide, 29
valence states, 300
van der Waals radii, 298
verbosity, 58
visualization, 75
warnings, 59
water, 36
water locations, 136
water molecules, 42
WHATIF, 348
Windows, 23
insufficient memory, 23
writing configuration file, 81
xml, 53