Use of 3D QSAR Models For Database Screening: A Feasibility Study
The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) for database screening are examined.
A protocol requiring minimal user intervention has been established to align training and test set molecules
using FlexS. Isozymes of human carbonic anhydrase (hCA) serve as the model system; all results are
exemplified by studying affinity toward hCA II and selectivity between hCA I and II. The predictive power of
the obtained models is assessed by predicting 663 compounds not included in the training set and is
compared to that of 2D QSAR models derived from fragment-based (MACCS) or property-based (VSA) descriptors.
Predictive power is evaluated with respect to two criteria: a numerical one, concerning the absolute
accuracy of prediction, and a categorical one, characterizing the ability to assign a compound to the correct
activity class.
Chart 1. Formulas of the Nine Scaffolds Contained in Training Set and Test Set Ligands.
mous time effort makes such an approach inappropriate for drase (hCA) isozymes. CAs are zinc containing hydrolases
large screening scenarios. Thus, we decided to develop a (EC 4.2.1.1), which catalyze the reversible hydration of
protocol which employs the property- and interaction-based carbon dioxide to bicarbonate under the release of one
FlexS22 program as alignment engine. FlexS is fast enough proton.29,30 They are involved in a variety of important
for screening purposes, easily extensible to combinatorial physiological processes, such as pH and CO2 regulation, bone
libraries, and uses a physicochemical description of the resorption, calcification, metabolic reactions, tumorigenicity,
molecules in terms of Gaussian functions which is consistent and electrolyte secretion.31,32 Therefore, inhibitors of carbonic
with the methodology applied in CoMSIA. The statistical anhydrases offer the opportunity to treat several physiological
performance of the models obtained by this fully automated disorders, e.g., as drugs against glaucoma, mountain sickness,
alignment procedure is compared to the results of a study and epilepsy.33 Due to the high structural similarity and the
previously conducted in our laboratory.23 There, the same broad tissue distribution of the different isoforms of CAs
data set was used, but the molecules were manually aligned selectivity is an issue of major concern in the design of CA
and minimized in the binding pocket. inhibitors. In the present study we used the pKi value
In addition to 3D QSAR we also derived 2D models and measured against hCA II (in the following sections referred
applied both to an external test set of 663 molecules. Two to as pKi(II)) as an example for affinity prediction and the
kinds of 2D descriptors are used: The public version of the difference of the pKi values toward hCA I and hCA II (in
well-known MDL MACCS keys,24,25 which encode the the following sections referred to as ∆pKi(I-II)) as an
presence or absence of distinct structural fragments in a example for selectivity prediction.
molecule, and the 32 partitioned van der Waals surface area
(VSA) descriptors developed by Labute,26 which are property METHODS
based. The performance in terms of external predictivity is
compared on a numerical and categorical basis: predictive All 3D QSAR studies and all PLS analyses were per-
r2 and Spearman’s r2 assess the correlation between experi- formed using SYBYL7.1.34 MACCS keys24 and VSA
mental and calculated activity values, Spearman’s rank descriptors26 were calculated using MOE 2005.06.35 VSA
correlation coefficient characterizing the ability to correctly descriptor selection was accomplished by the SVL script
predict the rank order of the compounds according to their (Scientific Vector Language) AutoQSAR.36 Automation of
activity. While these parameters characterize the overall ligand alignment was realized via the Python interface
numerical accuracy of the models, the ability to enrich “pyflexs” running FlexS 1.20.2.22 Compound prediction was
compounds with a desired activity is even more important implemented in SPL (Sybyl Programming Language).
for database screening purposes. In contrast to other virtual Data Set and Preparation. In order to obtain a suitable
screening approaches such as docking/scoring or pharma- training set 144 ligands were taken from our previous 3D
cophore matching, however, one limitation of QSAR is that QSAR study and subjected to the FlexS alignment procedure.
it works only on a rather limited piece of chemical space For details about selection criteria and composition of this
covered by the training set compounds. Hence, only focused data set we refer to ref 16. The following procedure was
libraries can be screened reliably, and in order to obtain pursued to automate ligand alignment: For each of the nine
enrichment or classification ratings a certain activity thresh- scaffolds (Chart 1, throughout this paper we will refer to
old has to be predefined for class assignment (e.g., if pKi > these scaffolds by the corresponding capital letters A to I)
7, compound is classified “active”). In this study, we report comprised in the training and the test sets two items had to
sensitivity and specificity, classical enrichment plots, receiver be defined: (1) a so-called MAPREF which is an anchoring
operating characteristic (ROC) curves, the area under the fragment used to initiate the incremental construction
ROC curve (AUC), and hit rates to compare different models algorithm of FlexS and (2) a reference ligand for spatial
from an objective point of view.27,28 alignment. The assignment of atoms in each ligand to the
All studies reported herein are based on pKi values of corresponding MAPREF substructure is achieved by graph
sulfonamide-type inhibitors toward human carbonic anhy- matching and does not require any user intervention. The
386 J. Chem. Inf. Model., Vol. 48, No. 2, 2008 HILLEBRECHT AND KLEBE
second item to be predefined is one complete reference ligand fragment in a molecule. This protocol allows better dif-
onto which the respective candidate molecules are superim- ferentiating molecules possessing not only qualitatively the
posed. This reference should be at best a compound with same substructures but differing in counts. This is often the
maximum spatial extensions adopting a “representative” case in focused libraries.
conformation for the individual chemical class, if possible In addition to such fragment oriented models also the
extracted from a crystal structure. The utilization of a property based VSA descriptors were used to derive 1D
MAPREF substructure as alignment template ensures internal QSAR models. Each of the 32 VSA descriptors is calculated
consistence of the produced alignment. Without such con- as the sum of all atomic contributions to the approximated
straints placements lacking proper superimposition of the van der Waals surface areas with the respective property
anchoring fragments with that of the reference ligand might falling into a certain range. In MOE this type of descriptor
achieve higher similarity scores because the overall similarity can be calculated for three properties: SlogP (calculated
comprising various properties of the entire molecule of a octanol/water partition coefficient), SMR (calculated molar
distinct pose can be higher than the score obtained for the refractivity), and PEOE (Gasteiger-Marsili partial charges).
desired pose where common substructures are mutually These descriptors show a rather low degree of correlation
overlaid. Using this procedure all except six molecules could among each other, capture different aspects of protein-ligand
be aligned ending up with a final training set size of 138 interactions and transport phenomena, and their broad
molecules. applicability to many QSAR related problems has been
The training set protocol was applied to 663 compounds shown. In order to select a suitable subset of descriptors the
serving as a real life test example. It has to be noted that AutoQSAR36 procedure was applied. It identifies the “best”
this test set represents a rather unbalanced sample since the model based on leave-one-out cross-validation. This proce-
only selection criteria were availability of experimental data dure should only generate rather crude 1D QSAR models.
to be predicted and affiliation to one of the crude chemo- It will not evaluate the capacity of VSA descriptors
types A to I considered in the training set. Most of the exhaustively, and the obtained models should rather serve
compounds could be aligned applying default settings in as a first overview for comparison to get an idea as to what
FlexS, only for some rather rigid and large molecules the can be achieved by a crude “quick and dirty” 1D model.
threshold for the minimum van der Waals overlap volume
had to be reduced from 0.6 to 0.4 in order to obtain a Evaluation of Predictivity. In order to assess and compare
superimposition solution. the predictive power of the different models, several statisti-
cal parameters as well as plots are reported: The predictive
3D QSAR Analyses. For CoMFA16 analysis the interac-
r2 value is usually used to characterize the performance in
tion energies between a probe atom and the ligand atoms
were calculated using a grid box of 26 × 34 × 25 points terms of external predictivity of QSAR models. Additionally,
with 1 Å spacing, embedding all ligands with a margin of at Spearman’s rank correlation coefficient was calculated which
least 4 Å in each direction. The same box dimensions were quantifies the ability of a model to correctly predict the
also used for CoMSIA18 studies. A positively charged sp3- relative order instead of the absolute numeric value of the
carbon atom was used as a probe atom for calculating steric modeled variable.
and electrostatic CoMFA fields applying SYBYL standard Since our study is intended to assess the predictive power
parameters (TRIPOS standard field, dielectric constant 1/r, of QSAR models for database screening purposes it is
cutoff 30 kcal/mol). CoMSIA fields were computed for steric, reasonable to apply also figures-of-merit commonly used in
electrostatic, hydrophobic, and hydrogen-bonding properties, virtual screening, where the correct prediction of class
using a probe of charge +1, radius of 1, hydrophobicity and memberships is even more important. Therefore, a somewhat
hydrogen-bonding properties of +1, and an attenuation factor arbitrary threshold had to be defined which determines class
α of 0.3 for the Gaussian distance-dependent function. All affiliation: For pKi(II) the 2% or 5% of molecules with the
fields were scaled with the CoMFA_STD scaling procedure, highest (or lowest, respectively) activity were considered as
assigning equal weights to each field. The response variables the “high activity” (or “low activity”) class, whereas the
(pKi(II), ∆pKi(I-II)) were correlated with the field descrip- remaining part of the molecules is assigned to the comple-
tors using SAMPLS37 in a leave-one-out cross-validation mentary class. This setting simulates screening experiments
analysis. The optimal number of PLS components was where one wants to enrich compounds which possess a
determined by subsequently extracting one more latent remarkably high affinity toward the target of interest,
variable until the corresponding q2 value is not further compared to the bulk of the training set. Defining the 2% or
increased by more than five percent.38 Afterward, a PLS 5% of lowest activity as the class of interest, the situation
analysis39 was performed without cross-validation using the represents antitarget modeling, where a distinct receptor must
optimal number of components, applying no column filtering. not be inhibited (this holds particularly for, e.g., cytochrome
1D and 2D QSAR Analyses. The MDL MACCS keys P450 or hERG channel blockers). The same aspects are
have already proved useful for screening purposes in examined for the activity difference ∆pKi(I-II). This
conjunction with PLS Discriminant Analysis (PLS-DA) corresponds to a screening scenario with the aim of enriching
models.40 We decided to use them for building numeric 2D compounds selective toward hCA I or hCA II, respectively.
QSAR models. The public version was used that evaluates As direct measures for correct class prediction the sensitivity
the presence of 166 distinct molecular fragments. Instead of Se (also called “true positives rate” or “recall”), the specific-
computing binary fingerprints which store the information ity Sp, and the hit rate H (or “precision”) are reported27,28
about the presence or the absence of one particular substructure
the “counted” version was used resulting in 166 integers
each capturing the exact number of occurrences of the

$$\mathrm{Se} = \frac{TP}{TP + FN} \times 100\%$$

$$\mathrm{Sp} = \frac{TN}{TN + FP} \times 100\%$$

$$H = \frac{TP}{TP + FP} \times 100\%$$

where FP and FN are the number of false positives and false
negatives, and TP and TN are the number of true positives
and negatives, respectively.
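
As a minimal sketch of how Se, Sp, and H follow from the four counts, the Python function below applies one fixed activity threshold to both the experimental and the predicted values. The function name, the argument names, and the `high_is_positive` switch are illustrative assumptions; the original workflow was implemented in SPL and is not reproduced here.

```python
def classification_rates(experimental, predicted, threshold, high_is_positive=True):
    """Return (Se, Sp, H) in percent for one fixed activity threshold.

    A compound counts as "positive" when its value lies on the chosen side
    of the threshold: above it for a "high activity" class, below it for a
    "low activity" class (hypothetical switch, set by high_is_positive).
    """
    tp = fp = tn = fn = 0
    for y_exp, y_pred in zip(experimental, predicted):
        if high_is_positive:
            actual, called = y_exp >= threshold, y_pred >= threshold
        else:
            actual, called = y_exp <= threshold, y_pred <= threshold
        if actual and called:
            tp += 1          # true positive
        elif actual:
            fn += 1          # false negative: missed active
        elif called:
            fp += 1          # false positive: inactive labeled active
        else:
            tn += 1          # true negative

    def percent(numerator, denominator):
        # An empty class leaves the ratio undefined; report NaN instead of failing.
        return 100.0 * numerator / denominator if denominator else float("nan")

    sensitivity = percent(tp, tp + fn)
    specificity = percent(tn, tn + fp)
    hit_rate = percent(tp, tp + fp)
    return sensitivity, specificity, hit_rate


# Toy values, invented for illustration only (not taken from the data set).
se, sp, h = classification_rates(
    experimental=[8.2, 7.5, 6.1, 5.0, 7.9],
    predicted=[7.6, 6.8, 6.5, 7.2, 8.1],
    threshold=7.0,
)
```

Passing `high_is_positive=False` corresponds to the antitarget-style setting, in which the compounds with the lowest values form the class of interest.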
To give an even more comprehensive illustration of the
results several kinds of plots are reported: (1) A plot
displaying the predicted activity/selectivity value of a
molecule on the ordinate versus its experimental one on the
abscissa. (2) A classical enrichment plot. On the x-axis this
graph displays the amount of database entries screened, and
on the y-axis it shows the amount of actives retrieved from
the database. (3) A receiver operating characteristic (ROC)
curve. This type of plot is quite popular in many scientific
areas like psychology, medicine, acoustics, or criminology
to assess the ability of a diagnostic system to distinguish
signal from noise. On the x-axis a ROC plot displays the
term 1 - specificity which corresponds to the “noise” in
the data set, and on the y-axis it shows the sensitivity which
can be thought of as the “signal” that is to be identified by
the ranking procedure. Nevertheless, its application in drug
design and especially in virtual screening is still not standard
although it exhibits some advantages compared to the usually
applied enrichment curves.28,41 Most important the area under
the curve (AUC) of a ROC plot can be used to directly
compare the achieved accuracy of a computer test. Furthermore,
the shape of the “ideal curve” of an enrichment plot
depends on the ratio of actives to inactives in the database,
which is not the case for the ROC curves. Finally, enrichment
plots only capture one aspect of a screening experiment,
namely the power to retrieve actives, i.e., the sensitivity,
whereas ROC curves also illustrate the second important
aspect, the ability to discard inactives, i.e., the specificity.

Figure 1. Comparison of performance of different models evaluated via cross-validation. The leave-one-out-q2 is shown for
prediction of pKi(II) (left) and ∆pKi(I-II) (right), respectively.

Table 1. Statistical Results of the Different QSAR Analyses

method            CoMFA     CoMFA     CoMSIA    CoMSIA    MACCS     VSA
alignment         manual    FlexS     manual    FlexS

dep. var. pKi(II)
q2                0.853     0.798     0.860     0.822     0.818     0.790
sPRESS            0.504     0.593     0.489     0.552     0.556     0.629
r2                0.949     0.880     0.943     0.867     0.884     0.837
S                 0.297     0.457     0.313     0.476     0.444     0.553
F                 423.840   243.854   453.840   441.585   264.532   27.33
no. comp.         6         4         5         2         4         17

dep. var. ∆pKi(I-II)
q2                0.758     0.715     0.786     0.784     0.670     0.584
sPRESS            0.598     0.629     0.552     0.548     0.684     0.799
r2                0.977     0.851     0.950     0.905     0.802     0.660
S                 0.184     0.455     0.368     0.364     0.529     0.720
F                 633.455   190.350   330.962   316.402   188.962   14.028
no. comp.         9         4         4         4         3         13

Table 2. Numerical Measures of Predictivity for Different QSAR Methods

                                CoMFA     CoMSIA    MACCS     VSA
pred. r2        pKi(II)         0.454     0.482     0.302     -0.710
                ∆pKi(I-II)      -0.079    0.001     -0.476    -0.893
Spearman’s r2   pKi(II)         0.407     0.443     0.393     0.288
                ∆pKi(I-II)      0.115     0.118     0.069     0.000
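
The enrichment plot described in the text, and the dependence of its ideal curve on the share of actives, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, compounds are ranked by descending score, and both axes are expressed in percent.

```python
def enrichment_curve(scores, is_active, descending=True):
    """Return (% of database screened, % of actives retrieved) points.

    Compounds are ranked by score (assumed: higher score = screened first)
    and the curve records how many actives have been found after each
    additional compound.
    """
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    order = sorted(range(n_total), key=lambda i: scores[i], reverse=descending)
    points = [(0.0, 0.0)]
    found = 0
    for rank, index in enumerate(order, start=1):
        if is_active[index]:
            found += 1
        points.append((100.0 * rank / n_total, 100.0 * found / n_actives))
    return points


def ideal_enrichment_curve(n_total, n_actives):
    """Curve of a perfect ranking: every active is retrieved before any inactive."""
    return [
        (100.0 * k / n_total, 100.0 * min(k, n_actives) / n_actives)
        for k in range(n_total + 1)
    ]


# Toy ranking, invented for illustration: 5 compounds, 2 of them active.
curve = enrichment_curve(
    scores=[9.1, 8.4, 7.7, 6.0, 5.2],
    is_active=[True, False, True, False, False],
)

# The ideal curve saturates after 40% of a 5-compound database with 2 actives,
# but after 20% of a 10-compound database with 2 actives.
ideal_small = ideal_enrichment_curve(5, 2)
ideal_large = ideal_enrichment_curve(10, 2)
```

The two ideal curves differ although both databases contain two actives, which is the dependence on the ratio of actives to inactives noted in the text.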
Figure 2. Plots of experimental versus predicted pKi(II) values for the four QSAR approaches considered in this study. A: CoMFA; B:
CoMSIA; C: MACCS; D: VSA. Dashed lines mark a range of ±1 logarithmic unit deviation; the solid line indicates perfect correlation.
The shape of the points indicates the individual scaffold class of the respective ligands, the capital letters corresponding to the structural
assignment given in Chart 1. The bold circle in part D marks two extreme outliers discussed in the text.
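
The two numerical measures listed in Table 2 can be sketched as follows. The exact formulas are not spelled out in this section, so two assumptions are made and should be checked against the original definitions: the predictive r2 is taken in its common form 1 - PRESS/SD, where SD is the sum of squared deviations of the experimental test set values from the training set mean, and the Spearman coefficient is computed as the Pearson correlation of average ranks. All function names are hypothetical.

```python
from math import sqrt


def predictive_r2(y_exp, y_pred, train_mean):
    """Assumed definition: 1 - PRESS / SD, with SD taken around the training set mean."""
    press = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    sd = sum((e - train_mean) ** 2 for e in y_exp)
    return 1.0 - press / sd


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Toy values, invented for illustration: predictions shifted by +2 log units.
y_exp = [6.0, 7.0, 8.0, 9.0]
y_shifted = [8.0, 9.0, 10.0, 11.0]
r2_shifted = predictive_r2(y_exp, y_shifted, train_mean=7.5)   # strongly negative
rho_shifted = spearman_rho(y_exp, y_shifted)                   # rank order is perfect
```

The toy case shows why both measures are worth reporting: a systematic offset drives the predictive r2 below zero while the rank order, which matters for screening, remains intact.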
of the model quality can still be estimated based on the most of the training set molecules does not exceed an upper
achieved q2 in our study. Obviously all approaches are not limit of 160 but takes values of 212 and 231 for both ill-
able to give accurate numerical predictions for the selectivity predicted compounds, respectively. Obviously, the field- and
variable ∆pKi(I-II). Nevertheless, we will show in the fragment-based descriptors are more robust with respect to
following sections that these models are not completely structural extrapolation, at least in the case of our data set.
useless with respect to screening purposes for which accurate It has to be noted, however, that in CoMFA/CoMSIA a
numerical predictions are less important than correct assign- descriptor used to evaluate a compound contains several
ment to affinity classes. thousand values, however, for MACCS only 166 and for
External Numerical Predictivity: Experimental versus VSA even only 32, respectively. Thus, it is not too surprising
Predicted Plots. Figure 2 shows the plots of experimental that a set of VSA values is only a crude approximative
versus predicted pKi(II) values for all four approaches. Each description of a molecule. Moreover, no information about
point corresponds to one molecule and indicates its member- the spatial distribution of the properties encoded by the VSA
ship to a chemical class by point shape. The distribution of descriptors about the molecules is contained in the latter
points along the abscissa (i.e., the experimental pKi(II)) descriptors.
reflects again the imbalance of the test set. Compounds with As expected deviations from correct predictions are
pKi(II) values below 6.5 are rarely found, whereas those generally worse for selectivity prediction of ∆pKi(I-II) as
possessing pKi(II) between 6.5 and 9.0 are clearly over- shown in Figure 3. However, despite the poor predictive r2
represented. The plots demonstrate the value of CoMFA for all models the overall appearance of the plots suggests
(Figure 2A) and CoMSIA (Figure 2B) for this large scale that the 3D QSAR models can give rough estimates in terms
application since most of the test compounds are predicted of selectivity. The group of thiadiazolesulfonamides (scaffold
correctly within ±1 logarithmic unit. The same holds for A, Chart 1) with experimental ∆pKi(I-II) > -1 is remarkably
the MACCS key approach (Figure 2C) albeit with some more underpredicted by all four models. This can be easily
molecules falling outside this tolerance. Closer inspection explained by the fact that in the training set this chemical
of the plots reveals a tendency of the MACCS keys and even class has a mean ∆pKi(I-II) of -1.91 with a maximum of
more of the VSA descriptors (Figure 2D) to overpredict some -0.71. The poorly predicted subset, however, exhibits a
of the thiadiazolsulfonamides (scaffold A, Chart 1) and the mean ∆pKi(I-II) of -0.52. Correct prediction would require
benzothiazolsulfonamides (C). In order to shed some light clear extrapolation. In the case of benzothiazolsulfonamides
on the reason for this finding, we pick two molecules marked (scaffold C) the MACCS and VSA models perform signifi-
by a bold circle in Figure 2D whose affinity is predicted cantly worse compared to the 3D models.
more than 3 logarithmic units too high by the VSA model, External Categorical Predictivity: Sensitivity, Specific-
whereas the other approaches provide a reasonable estimate. ity, and Hit Rates. In order to assess the categorical external
A projection of these molecules into the PCA space of the predictivity of the established models we will first report
training set calculated using VSA descriptors reveals them the resulting sensitivities (Se), specificities (Sp), and hit rates
as extreme structural outliers. They contain very bulky (H) after a distinct threshold for the modeled variable has
phenylpyridinium groups which are not present in any of been defined. Each compound is labeled “positive” or
the training set compounds. The most relevant descriptor “negative” according to the above-mentioned arbitrarily
responsible for the overprediction is SMR_VSA5 (strongly chosen threshold. The thresholds are selected such that
correlated to molecule volume and polarizability) which for compounds with the highest/lowest 2% or 5% of pKi(II)/
Figure 3. Plots of experimental versus predicted ∆pKi(I-II) for the four QSAR approaches considered in this study. A: CoMFA; B:
CoMSIA; C: MACCS; D: VSA. Dashed lines mark a range of ±1 logarithmic unit deviation; the solid line indicates perfect correlation.
The shape of the points indicates the individual scaffold class of the respective ligands, the capital letters corresponding to the structural
assignment given in Chart 1.
∆pKi(I-II) are retrieved. In a screening scenario one is corresponding results are obtained for the difference ∆pKi-
usually only interested in identifying compounds with (I-II).
“extreme” activities either to find those with high affinity External Categorical Predictivity: ROC Plots. Figure
for a particular target, with low affinity for an antitarget, or 5 shows two examples of ROC plots monitoring the
with extraordinary selectivity profiles. In a real life situation screening progress. The main diagonal corresponds to a
an even lower amount (≤1%) would be of interest due to random classifier unable to discriminate signal from noise.
the large size of databases screened, but for our present study Thus, for any possible threshold the same percentage for
this would result in a very small absolute number of sensitivity (signal) and 1 - specificity (noise) is achieved.
molecules. Most likely rather unstable statistical results would Its AUC is 50.0%, and any classifier better than random
be suggested. The same threshold will then be applied to therefore has to produce an AUC above this lower limit. The
the calculated values, and the comparison with the experi-
curve of an ideal classifier coincides with the left and the
mentally determined classification yields the assignment to
top edge of the coordinate system and encloses an AUC of
“true positives/negatives” and “false positives/negatives” (TP,
100.0%.
TN, FP, FN). The main difference between this kind of
evaluation and the ROC and enrichment plots described Figure 5A shows the ROC plots for the retrieval of the
below is that the latter methods monitor the evolution of Se 5% compounds with lowest pKi(II). For this example all four
or Sp in dependency on a variable threshold, whereas the methods perform similarly well. For a real life scenario the
approach applied here analyzes the classification performance left section of the plot is most interesting since it indicates
taking the predicted values “as is”. how much signal (actives) can be identified by the model
The resulting sensitivities, specificities, and hit rates are still discarding most of the noise (inactives). In our case about
shown in Figure 4 for the various approaches and thresholds. 35% of actives can be retrieved without extracting false
The plots reveal a trend toward the 1D/2D methods exhibit- positives. For the remaining part of the screening, the
ing a higher sensitivity, i.e., they tend to omit less actives MACCS descriptors perform slightly worse compared to the
than the 3D models. This is achieved at the cost of a reduced other models. At higher noise levels ([1 - specificity] = 0.4)
specificity as they also label many inactives as actives. This the 3D methods slightly outperform the 1D and 2D ap-
results in most cases in higher hit rates for the 3D QSAR proaches.
approaches. The MACCS keys-based models perform sig- In Figure 5B, the ROC curves to identify the 5%
nificantly better compared to the VSA models; they yield compounds with lowest ∆pKi(I-II) (i.e., the most selective
high sensitivities in conjunction with reasonable specificities. ones for hCA II) are shown. Here, the differences between
The plots also show that the results are generally better to the four approaches are more pronounced. The worst
identify compounds with minimal pKi(II) or maximal ∆pKi- performance is indicated for the VSA model intermediately
(I-II), respectively, compared to the case of maximal pKi- even dropping below the random line. All approaches exhibit
(II) or minimal ∆pKi(I-II). This finding corresponds to the a poorer performance in the “interesting” left part of the plot
observation described above that compounds with very low compared to the previous example. In this area MACCS and
pKi (i.e., the classes of scaffolds H and I) are usually well 3D models perform similarly, whereas at specificities below
predicted, whereas those with high pKi(II) (particularly the 80% ([1 - specificity] > 0.2) 3D QSAR models outperform
scaffolds A and C) are often overpredicted. In consequence the 1D/2D models.
Figure 4. Comparison of sensitivity (A, D), specificity (B, E), and hit rate (C, F) for the four QSAR methods. A, B, C: pKi(II); D, E, F:
∆pKi(I-II).
Figure 5. Receiver operating characteristic (ROC) curves for the retrieval of the 5% compounds with lowest pKi(II) (A) and ∆pKi(I-II)
(B), respectively, applying the four QSAR approaches considered in this study. The main diagonal corresponds to a random classifier,
whereas the left and top edge of the plot would show the ROC line of an ideal classifier.
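
A ROC curve and its AUC can be computed from a ranked list as sketched below. This is a minimal illustration, not the procedure used for Figure 5: the function names are hypothetical, a higher score is assumed to mean "more likely a member of the class of interest" (for retrieval of the lowest pKi values one would pass the negated predictions), and the area is obtained with the trapezoidal rule.

```python
def roc_points(scores, is_active):
    """Return ROC points as (1 - specificity, sensitivity) pairs in [0, 1].

    The threshold is lowered step by step through the distinct score values;
    compounds with identical scores enter the selection together.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, flag in ranked if flag)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one inactive compound")
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1      # signal: an active passes the threshold
            else:
                fp += 1      # noise: an inactive passes the threshold
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy rankings, invented for illustration.
ideal = auc(roc_points([0.9, 0.8, 0.3, 0.1], [True, True, False, False]))
uninformative = auc(roc_points([0.5, 0.5, 0.5, 0.5], [True, True, False, False]))
```

With all scores tied, the curve collapses onto the main diagonal and the area equals 0.5, matching the 50% AUC of a random classifier; the perfect ranking follows the left and top edges and yields an area of 1.0.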
In order to present a concise comparison for the other prediction (∆pKi(I-II), Figure 6B) the 1D/2D models
retrieval experiments, Figure 6 shows bar plots denoting the outperform 3D QSAR in successfully retrieving the most
corresponding AUCs. For classification with respect to pKi- hCA I selective compounds, whereas the opposite is true
(II) (Figure 6A), all approaches perform comparably well; for identification of hCA II selective compounds. As
however, the VSA models tend to be worse in identifying demonstrated above, 3D QSAR methods obviously suffer
compounds with maximum pKi(II). In the case of selectivity less from the overprediction problem than the 1D/2D
Figure 6. Bar plots displaying the area under the curve (AUC)
values of the ROC curves for the four QSAR approaches considered
in this study and activity thresholds obtained when pKi(II) (A) and
∆pKi(I-II) (B) are predicted.
screen databases. Since we were especially interested in 3D has to keep in mind that they will depend on the training
methods, a protocol had to be established which is reliable and test set composition and cannot be transferred generally.
and robust enough to produce consistent spatial alignments Furthermore, additional investigations considering other
of the molecules under consideration. Due to the large QSAR techniques (other statistical methods, 4D,42 5D,43 6D44
number of compounds encountered in real life screening approaches) need to be done to collect more experience on
scenarios the protocol, once set up, has to be applicable the scope and limitations of QSAR methods for database
without further manual intervention. FlexS in combination screening.
with automated recognition of a chemical compound class
of ligands to be superimposed via the MAPREF methodology ACKNOWLEDGMENT
was chosen to successfully accomplish this task. We could
demonstrate that CoMFA and CoMSIA models based on this The authors acknowledge the kind support of BioSolveIT
with respect to special FlexS parameters, in particular Markus
alignment perform comparably well with similar models
Lilienthal. The Chemical Computing Group (CCG) is
based on manually derived alignments using the protein’s
acknowledged for provision of one research license of MOE.
binding pocket as a reference point along with subsequent The authors are grateful to Prof. Dr. Claudiu T. Supuran
force field relaxation. Since the superimposition comprises (University of Florence) for making the data set of CA I
a rather elaborate and time-consuming step we also tested and CA II inhibitors available to us.
the performance of alignment-free 1D and 2D QSAR models
particularly with respect to database screening. Therefore, Supporting Information Available: Atomic coordinates
fragment-based MACCS descriptors and property-based VSA of all molecules of the data set with assigned pKi(I), pKi(II),
descriptors were computed based on the 2D molecular and ∆pKi(I-II) values as SD file. This material is available
free of charge via the Internet at http://pubs.acs.org.
information. The external predictivity was assessed based
on a test set of 663 compounds with known activities. Of
REFERENCES AND NOTES
course, this number of test molecules does not touch the size
of a real library, but it should be sufficiently large for the (1) Hansch, C.; Fujita, T. ρ-σ-π analysis - A method for the correlation of
intended benchmark test. In summary, the 3D QSAR models biological activity and chemical structure. J. Am. Chem. Soc. 1964,
and the MACCS keys performed quite well with respect to 86, 1616-1626.
(2) Free, S. M., Jr.; Wilson, J. W. A Mathematical Contribution to
affinity prediction (pKi(II)), whereas the VSA descriptors did Structure-Activity Studies. J. Med. Chem. 1964, 7, 395-9.
not yield models with comparable predictive (3) Kubinyi, H. QSAR: Hansch Analysis and Related Approaches;
power. In terms of numerical affinity prediction the 3D VCH: Weinheim, 1993; Vol. 1.
(4) Zheng, W.; Tropsha, A. Novel variable selection quantitative structure-
QSAR models significantly outperformed the 1D and 2D property relationship approach based on the k-nearest-neighbor
approaches. They tend to be more specific than the MACCS principle. J. Chem. Inf. Comput. Sci. 2000, 40, 185-94.
keys but at the cost of a lower sensitivity. The models are (5) Brown, N.; Lewis, R. A. Exploiting QSAR methods in lead optimization.
Curr. Opin. Drug Discovery Dev. 2006, 9, 419-24.
difficult to mutually rank against each other since relevance, (6) Shen, M.; Beguin, C.; Golbraikh, A.; Stables, J. P.; Kohn, H.; Tropsha,
predictive value, and applicability depend on the specific goal A. Application of predictive QSAR models to database mining:
of the project, e.g., whether retrieval of only a few com- identification and experimental validation of novel anticonvulsant
compounds. J. Med. Chem. 2004, 47, 2356-64.
pounds with an enhanced activity is intended or whether as (7) Oloff, S.; Mailman, R. B.; Tropsha, A. Application of validated QSAR
many actives as possible should be detected. Thus, even if models of D1 dopaminergic antagonists for database mining. J. Med.
the MACCS approach is easier to perform and computationally Chem. 2005, 48, 7322-32.
less demanding it does not make the 3D methods (8) Tropsha, A. Application of Predictive QSAR Models to Database
Mining. In Chemoinformatics in Drug Discovery; Oprea, T. I., Ed.;
superfluous for screening purposes. Also with respect to the Wiley-VCH: Weinheim, 2004; Vol. 23, pp 437-455.
aspect of generality of the descriptors the molecular fields (9) Moro, S.; Bacilieri, M.; Cacciari, B.; Bolcato, C.; Cusan, C.; Pastorin,
of a ligand approximate better the concept of molecular G.; Klotz, K. N.; Spalluto, G. The application of a 3D-QSAR
(autoMEP/PLS) approach as an efficient pharmacodynamic-driven
recognition, and second they should not suffer from the fact filtering method for small-sized virtual library: application to a lead
of missing connectivity information as the MACCS keys do. optimization of a human A3 adenosine receptor antagonist. Bioorg.
Most likely, the 3D QSAR methods are much more robust Med. Chem. 2006, 14, 4923-32.
(10) Pastor, M.; Cruciani, G.; McLay, I.; Pickett, S.; Clementi, S. GRid-
in handling data sets composed of compounds with structur- INdependent descriptors (GRIND): a novel class of alignment-
ally rather diverse molecular skeletons. Nevertheless, we independent three-dimensional molecular descriptors. J. Med. Chem.
could show that the PLS coefficients of the derived MACCS 2000, 43, 3233-43.
(11) Carosati, E.; Mannhold, R.; Wahl, P.; Hansen, J. B.; Fremming, T.;
model can be interpreted meaningfully and used to extract Zamora, I.; Cianchetta, G.; Baroni, M. Virtual screening for novel
knowledge about the influence of individual fragments on openers of pancreatic K(ATP) channels. J. Med. Chem. 2007, 50,
the dependent variables. 2117-26.
(12) Benedetti, P.; Mannhold, R.; Cruciani, G.; Ottaviani, G. GRIND/
With respect to a quantitative selectivity prediction, only ALMOND investigations on CysLT1 receptor antagonists of the
the internal consistency was convincing. None of the quinolinyl(bridged)aryl type. Bioorg. Med. Chem. 2004, 12, 3607-
techniques was able to make satisfying numerical forecasts 17.
(13) Murcia, M.; Ortiz, A. R. Virtual screening with flexible docking and
on this large data set. However, this can be attributed to the COMBINE-based models. Application to a series of factor Xa
rather imbalanced composition of the test set and the inhibitors. J. Med. Chem. 2004, 47, 805-20.
generally higher error contained in a “composed variable”. (14) Ortiz, A. R.; Pisabarro, M. T.; Gago, F.; Wade, R. C. Prediction of
drug binding affinities by comparative binding energy analysis. J. Med.
The results on categorical predictivity suggest that the 3D Chem. 1995, 38, 2681-2691.
models can still give crude estimates about selectivity. (15) Zhang, Q. Y.; Wan, J.; Xu, X.; Yang, G. F.; Ren, Y. L.; Liu, J. J.;
The main purpose of this study was to systematically Wang, H.; Guo, Y., Structure-based rational quest for potential novel
inhibitors of human HMG-CoA reductase by combining CoMFA 3D
assess the performance of 3D QSAR models with respect to QSAR modeling and virtual screening. J. Comb. Chem. 2007, 9, 131-
database screening. Although the results are convincing, one 8.
396 J. Chem. Inf. Model., Vol. 48, No. 2, 2008 HILLEBRECHT AND KLEBE
(16) Cramer, R. D. Comparative Molecular Field Analysis (CoMFA). 1. Effect of Shape on Binding of Steroids to Carrier Proteins. J. Am. Chem. Soc. 1988, 110, 5959-5967.
(17) Cramer, R. D.; DePriest, S. A.; Patterson, D. E.; Hecht, P. The Developing Practice of Comparative Molecular Field Analysis. In 3D QSAR in Drug Design: Theory Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, 1993; pp 443-485.
(18) Klebe, G.; Abraham, U.; Mietzner, T. Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. J. Med. Chem. 1994, 37, 4130-46.
(19) Klebe, G. Comparative Molecular Similarity Indices Analysis: CoMSIA. Perspect. Drug Discovery Des. 1998, 12/13/14, 87-104.
(20) Klebe, G. Structural Alignment of Molecules. In 3D QSAR in Drug Design: Theory Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, 1993; Vol. 1, pp 173-199.
(21) Lemmen, C.; Lengauer, T. Computational methods for the structural alignment of molecules. J. Comput.-Aided Mol. Des. 2000, 14, 215-32.
(22) Lemmen, C.; Lengauer, T.; Klebe, G. FLEXS: a method for fast flexible ligand superposition. J. Med. Chem. 1998, 41, 4502-20.
(23) Hillebrecht, A.; Supuran, C. T.; Klebe, G. Integrated approach using protein and ligand information to analyze selectivity- and affinity-determining features of carbonic anhydrase isozymes. ChemMedChem 2006, 1, 839-53.
(24) MDL Information Systems, Inc. 14600 Catalina Street, San Leandro, CA 94577.
(25) Durant, J. L.; Leland, B. A.; Henry, D. R.; Nourse, J. G. Reoptimization of MDL keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 2002, 42, 1273-80.
(26) Labute, P. A widely applicable set of descriptors. J. Mol. Graphics Modell. 2000, 18, 464-77.
(27) Witten, I. H.; Frank, E. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed.; Morgan Kaufmann Publishers: San Francisco, 2005.
(28) Triballeau, N.; Bertrand, H. O.; Acher, F. Are You Sure You Have a Good Model? In Pharmacophores and Pharmacophore Searches; Langer, T., Hoffmann, R. D., Eds.; Wiley-VCH: Weinheim, 2006; Vol. 32, pp 325-364.
(29) Maren, T. H. Carbonic anhydrase: chemistry, physiology, and inhibition. Physiol. Rev. 1967, 47, 595-781.
(30) Lindskog, S. Structure and mechanism of carbonic anhydrase. Pharmacol. Ther. 1997, 74, 1-20.
(31) Geers, C.; Gros, G. Carbon dioxide transport and carbonic anhydrase in blood and muscle. Physiol. Rev. 2000, 80, 681-715.
(32) Supuran, C. T.; Scozzafava, A. Carbonic Anhydrase Inhibitors. Curr. Med. Chem.: Imm., Endoc., Metab. Agents 2001, 1, 61-97.
(33) Supuran, C. T.; Scozzafava, A. Applications of carbonic anhydrase inhibitors and activators in therapy. Expert Opin. Ther. Pat. 2002, 12, 217-242.
(34) SYBYL molecular modeling package, Version 7.1; Tripos Inc.: 1699 South Hanley Road, Suite 303, St. Louis, MO 63144, 2005.
(35) MOE; Chemical Computing Group: Montreal, Canada.
(36) AutoQSAR. The SVL script AutoQSAR is freely available to MOE licensees and can be downloaded at http://svl.chemcomp.com.
(37) Bush, B. L.; Nachbar, R. B., Jr. Sample-distance partial least squares: PLS optimized for many variables, with application to CoMFA. J. Comput.-Aided Mol. Des. 1993, 7, 587-619.
(38) Thibaut, U.; Folkers, G.; Klebe, G.; Kubinyi, H.; Merz, A.; Rognan, D. Recommendations for CoMFA Studies and 3D QSAR Publications. In 3D QSAR in Drug Design: Theory Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, 1993; Vol. 1, pp 711-716.
(39) Wold, S.; Johansson, E.; Cocchi, M. PLS - Partial Least-Squares Projections to Latent Structures. In 3D QSAR in Drug Design: Theory Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, 1993; pp 523-550.
(40) Evers, A.; Hessler, G.; Matter, H.; Klabunde, T. Virtual screening of biogenic amine-binding G-protein coupled receptors: comparative evaluation of protein- and ligand-based virtual screening protocols. J. Med. Chem. 2005, 48, 5448-65.
(41) Triballeau, N.; Acher, F.; Brabet, I.; Pin, J. P.; Bertrand, H. O. Virtual screening workflow development guided by the “receiver operating characteristic” curve approach. Application to high-throughput docking on metabotropic glutamate receptor subtype 4. J. Med. Chem. 2005, 48, 2534-47.
(42) Hopfinger, A. J.; Wang, S.; Tokarski, J. S.; Jin, B.; Albuquerque, M.; Madhav, P. J.; Duraiswami, C. Construction of 3D-QSAR Models Using the 4D-QSAR Analysis Formalism. J. Am. Chem. Soc. 1997, 119, 10509-10524.
(43) Vedani, A.; Dobler, M. 5D-QSAR: the key for simulating induced fit? J. Med. Chem. 2002, 45, 2139-49.
(44) Vedani, A.; Dobler, M.; Lill, M. A. Combining protein modeling and 6D-QSAR. Simulating the binding of structurally diverse ligands to the estrogen receptor. J. Med. Chem. 2005, 48, 3700-3.

CI7002945