(Q)SAR Guidance
North American Free Trade Agreement (NAFTA)
Technical Working Group on Pesticides (TWG)
November, 2012
Contributors
Mary Manibusan, US EPA
Joel Paterson, PMRA
Dr. Ray Kent, US EPA
Dr. Jonathan Chen, US EPA
PREFACE
Integrated Approaches to Testing and Assessment (IATA) and
(Q)SAR
Pesticide regulatory agencies have traditionally relied on extensive in vivo and in
vitro testing to support regulatory decisions on human health and environmental
risks. While this approach has provided strong support for risk management
decisions, there is a clear recognition that it can often require a large number of laboratory animal studies, which consume significant time and resources for testing and evaluation. Even with the extensive information from standard in vivo and in vitro testing, pesticide regulators are
often faced with questions and issues relating to modes of action for toxicity,
novel toxicities, susceptible populations, and other factors that can be
challenging to address using traditional approaches.
1. In this context, mode of action for toxicity is the description of key events and processes,
starting with interaction of an agent with the cell through functional and anatomical changes,
resulting in cancer or other health endpoints. Mechanism of action for toxicity is the detailed
molecular description of key events in the induction of cancer or other health endpoints and
represents a more detailed understanding and description of events than is meant by mode of
action. Mode of action for toxicity can also be differentiated from the pesticidal mode of action
which is the specific biochemical or physical effect(s) by which the pesticide kills, inactivates or
otherwise controls pests. Mechanism and mode of action for toxicity are important components of
adverse outcome pathways (AOPs).
outcomes in individual organisms and populations that are the bases for risk
assessments.
2. (Q)SAR is the study of the correlation between chemical structure and associated biological
activity, with the ultimate goal of predicting the activity of untested chemicals based on structurally
related compounds with known activity. The parentheses around the “Q” in (Q)SAR indicate that
the term refers to both qualitative predictive tools (i.e., structure-activity relationships (SARs)) and
quantitative predictive methods (quantitative structure-activity relationships (QSARs)). Although
the term (Q)SAR is often used to refer to predictive models, especially computer-based models, it
should be noted that (Q)SAR is actually inclusive of a wide variety of computerized and non-
computerized tools and approaches.
Moving IATA from a long-term vision into mainstream practice for pesticide
assessments will require the development and application of biochemical and
cellular assays, along with the further development and broader application of
existing tools such as (Q)SAR. Towards that end, the United States
Environmental Protection Agency Office of Pesticide Programs (US EPA OPP)
has partnered with the Pest Management Regulatory Agency (PMRA) of Health
Canada to develop common approaches to IATA for the human health and
ecological risk assessment of pesticides. The formalized framework for this
partnership is a North American Free Trade Agreement (NAFTA) Joint Project on
“21st Century Toxicology: Integrated Approaches to Testing and Assessment”.
While this project is intended to cover a broad array of computational toxicity
tools, a key current activity is the development of this NAFTA (Q)SAR guidance
document for pesticide risk assessors.
TABLE OF CONTENTS

PREFACE
GLOSSARY
1. EXECUTIVE SUMMARY
2. INTRODUCTION
3. BACKGROUND INFORMATION ON (Q)SAR
4. PROBLEM FORMULATION AND (Q)SAR
5. EVALUATING THE ADEQUACY OF (Q)SAR PREDICTIONS
   5.0 Introduction
6. COMBINING INFORMATION FROM MULTIPLE PREDICTIONS
   6.0 Introduction
7. INCORPORATING (Q)SAR INTO WEIGHT OF EVIDENCE ASSESSMENTS
8. CONCLUSIONS AND FUTURE VISION FOR (Q)SAR
   8.2 Adverse Outcome Pathway: Conceptual Framework

LIST OF TABLES
Table 3–1: Pesticidal Mode of Action and Associated Chemical Class for a Select Group of Insecticides

LIST OF FIGURES
Figure 2–0: (Q)SAR Guidance Document Schematic
Figure 5–1: Evaluating the Adequacy of a (Q)SAR Prediction for a Pesticide
GLOSSARY
This glossary provides additional explanation of common scientific terms in order to enhance communication between (Q)SAR experts and users of (Q)SAR models, particularly in the field of pesticides.
The glossary has two parts: abbreviations (acronyms) and terms with more detailed explanations.
ABBREVIATIONS
EC European Commission
ER Estrogen Receptor
EU European Union
HPV High Production Volume Chemicals Program (US EPA)
Log P Logarithm to the base 10 of the 1-octanol/water partition coefficient, also Log Kow
NHEERL National Health and Environmental Effects Research Laboratory (US EPA ORD)
PCKOC Organic Carbon Partition Coefficient model components within the EPI Suite (US EPA)
PMRA Pest Management Regulatory Agency (Health Canada)
Q² Cross-validated correlation coefficient
Q²ext External correlation coefficient
TERMS
Adverse outcome pathway (AOP) A conceptual construct that portrays existing knowledge concerning the linkage between a direct molecular initiating event and an adverse outcome at a biological level of organization relevant to risk assessment.
Algorithm A sequence of instructions for carrying out a defined task. Typically the
instructions are mathematical equations or computer code.
Analog A chemical compound that has a similar structure and similar chemical
properties to those of another compound, but differs from it by one or a
few atoms or functional groups.
Apical endpoint Observable effects of exposure to a toxic chemical in a test animal. The
effects reflect relatively gross changes in animals after substantial
durations of exposure.
Congeneric series A group of chemicals with a common base structure (e.g. aliphatic
alcohols) but differing in the arrangement of common substituents. The
polychlorinated biphenyls are considered a congeneric series.
Cross-validation A statistical technique for assessing the predictive ability of a QSAR by
the removal of different proportions of the chemicals from the training
set, developing a QSAR on the remaining chemicals and using that
QSAR to predict the activity of those removed. This procedure is
repeated a number of times, so that a number of statistics can be derived
from the comparison of predicted data with the known data.
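As an illustration only (not part of the original guidance), the cross-validation procedure described above can be sketched in a few lines of Python; the sketch below assumes the numpy and scikit-learn libraries and uses made-up descriptor and activity values rather than any real QSAR data set.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Illustrative (made-up) molecular descriptors and measured activities
# for a small training set of chemicals.
X = np.array([[1.2, 0.3], [2.1, 0.8], [3.3, 1.1], [4.0, 1.9], [5.2, 2.4]])
y = np.array([0.5, 1.1, 1.8, 2.4, 3.1])

# Each chemical is removed in turn, a model is developed on the remaining
# chemicals, and the removed chemical is predicted (leave-one-out cross-validation).
y_cv = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())

# Cross-validated correlation coefficient (Q2) compares the predicted data with
# the known data: Q2 = 1 - PRESS / total sum of squares.
press = np.sum((y - y_cv) ** 2)
q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)
print(f"Q2 (leave-one-out) = {q2:.2f}")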
Data mining A collective term that refers to all procedures (informatic and statistical)
that are applied to large heterogeneous data sets, in order to develop a
data matrix amenable to statistical methods.
Domain of Applicability The domain of applicability of a (Q)SAR model is the chemical structure
and response space in which the model makes predictions with a given
reliability. It can be thought of as a theoretical region in multi-dimensional
space in which the model is expected to make reliable predictions. It
depends on the nature of the chemicals in the training set, and the
method used to develop the model and helps the user of the model to
judge whether the prediction for a new chemical is reliable or not.
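A domain of applicability can be operationalized in several ways; purely as an illustration (and not as a method endorsed by this guidance), the short Python sketch below implements a simple descriptor-range ("bounding box") check using the numpy library and made-up descriptor values.

import numpy as np

# Illustrative (made-up) descriptor values (e.g., log Kow and molecular weight)
# for the chemicals in a model's training set.
train_descriptors = np.array([
    [1.2, 110.0],
    [2.1, 154.2],
    [3.3, 178.3],
    [4.0, 206.3],
])

# Descriptor values for a new chemical to be predicted.
query = np.array([5.6, 350.4])

# Flag the query as outside the domain if any descriptor falls outside the
# range spanned by the training set chemicals.
lower, upper = train_descriptors.min(axis=0), train_descriptors.max(axis=0)
in_domain = bool(np.all((query >= lower) & (query <= upper)))
print("within applicability domain" if in_domain else "outside applicability domain")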
Endpoint The measure of a biological effect, e.g., LC50 or EC50. A large number of
endpoints are used in regulatory assessments of chemicals. These
include lethality, carcinogenicity, immunological responses, organ
effects, developmental and reproductive effects, etc. In (Q)SAR analysis,
it is important to develop models for individual toxic endpoints.
External validation A validation exercise in which the chemical structures selected for
inclusion in the test set are different from those included in the training
set, but which should be representative of the same chemical domain.
The QSAR model developed by using the training set chemicals is then
applied to the test set chemicals in order to assess the predictive ability
of the model.
Genetic algorithm A statistical method that selects the best combination of descriptors to
describe a given property, modeled on the principle of the survival of the
fittest (best) in the breeding of organisms.
LD50 Median Lethal Dose. Statistically derived single dose causing death in
50% of test animals when administered by the route indicated (oral,
dermal, inhalation), expressed as a weight of substance per unit weight
of animal, e.g., mg/kg.
Lipinski’s rule of 5 A rule of thumb developed by Christopher Lipinski for evaluating whether
the properties of a chemical are likely to make it an orally active drug in
humans. The rule states that, in general, an orally active drug has no
more than one violation of the following criteria: not more than 5
hydrogen bond donors, not more than 10 hydrogen bond acceptors, a
molecular weight under 500, and an octanol-water partition coefficient
less than 5.
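For illustration only, the rule can be checked programmatically; the sketch below assumes the open-source RDKit toolkit and applies the four criteria listed above to a structure supplied as a SMILES string.

from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def rule_of_five_violations(smiles):
    """Count rule-of-5 violations for a structure given as a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    violations = 0
    if Lipinski.NumHDonors(mol) > 5:          # more than 5 hydrogen bond donors
        violations += 1
    if Lipinski.NumHAcceptors(mol) > 10:      # more than 10 hydrogen bond acceptors
        violations += 1
    if Descriptors.MolWt(mol) >= 500:         # molecular weight not under 500
        violations += 1
    if Descriptors.MolLogP(mol) >= 5:         # calculated log P not less than 5
        violations += 1
    return violations

# Ethanol, as a trivial example, has no violations.
print(rule_of_five_violations("CCO"))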
Mechanism of Action (Toxicity) The detailed molecular description of key events in the induction of cancer or other health endpoints. Mechanism of action for toxicity represents a more detailed understanding and description of events than is meant by mode of action. Mechanism of action for toxicity is an important component of an adverse outcome pathway (AOP).
Mode of Action (Pesticide) The mode of action of a pesticide refers to the specific biochemical or physical effect(s) by which the pesticide kills, inactivates, or otherwise controls pests.
Mode of Action (Toxicity) The description of key events and processes, starting with interaction of an agent with the cell through functional and anatomical changes, resulting in cancer or other health endpoints. Mode of action for toxicity is an important component of an adverse outcome pathway (AOP).
OECD QSAR Toolbox The OECD QSAR Toolbox is a software application intended to be used
by governments, chemical industry, and other stakeholders in filling gaps
in (eco)toxicity data needed for assessing the hazards of chemicals. The
Toolbox incorporates information and tools from various sources in a
logical workflow. Crucial to this workflow is the grouping of chemicals into
chemical categories (http://www.oecd.org/document/54/0,3746,en_2649_34379_42923638_1_1_1_1,00.html).
Outlier A data point that is far removed from other members of the dataset.
Typically, the outlier of a (Q)SAR model has a cross-validated
standardized residual greater than three standard deviation units.
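As a purely illustrative sketch (using made-up values and the numpy library), the rule of thumb above can be applied by standardizing the cross-validated residuals; here the residuals are scaled by their root-mean-square error as a simple approximation of standard deviation units.

import numpy as np

# Illustrative observed activities and cross-validated predictions;
# the last chemical is deliberately poorly predicted.
y_obs = np.array([0.5, 1.1, 1.8, 2.4, 3.1, 3.6, 4.2, 4.9, 5.5, 6.0])
y_cv  = np.array([0.6, 1.0, 1.9, 2.3, 3.2, 3.5, 4.3, 4.8, 5.6, 4.0])

residuals = y_obs - y_cv
standardized = residuals / np.sqrt(np.mean(residuals ** 2))

# Flag chemicals whose standardized residual exceeds three units.
print("outlier indices:", np.where(np.abs(standardized) > 3)[0])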
Point of Departure Commonly abbreviated POD, the point of departure is the dose-response
point that marks the beginning of a low dose extrapolation. This point is
often the lower bound on an observed incidence or on an estimated
incidence from a dose-response model.
Read-across Endpoint information for one or more chemicals (the source chemical(s))
is used to predict the same endpoint for another chemical (the target
chemical), which is considered to be “similar” in some way (usually on
the basis of structural similarity or similar mode or mechanisms of
action). Sometimes, it is also referred to as “data bridging.” In principle,
read-across can be used to estimate physicochemical properties,
toxicity, environmental fate, and ecotoxicity. For any of these endpoints,
it may be performed in a qualitative or quantitative manner.
SAR Structure-activity relationship — a qualitative relationship (i.e., an
association) between a molecular (sub)structure and the presence or
absence of a biological activity, or the capacity to modulate a biological
activity imparted by another substructure.
SMILES Simplified Molecular Input Line Entry System — a computer-compatible,
standardized, two-dimensional description of chemical structure. The
SMILES string is written by following a small number of rules. In brief,
each non-hydrogen atom (hydrogen is only explicitly included in special
circumstances) is denoted by its symbol; double and triple bonds are
shown by “=” and “#” symbols, respectively; branches are shown in
parentheses; and rings are opened and closed by the use of numbers.
For example, CCO represents ethanol, and c1ccccc1N represents aniline (the digits indicate the beginning and ending of the ring, and lower case “c” indicates aromatic carbon).
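As an illustration (assuming the open-source RDKit toolkit), the SMILES examples above can be parsed and converted to a canonical form; note that different valid SMILES strings for the same structure, such as CCO and OCC for ethanol, yield the same canonical string.

from rdkit import Chem

for smiles in ("CCO", "OCC", "c1ccccc1N"):
    mol = Chem.MolFromSmiles(smiles)            # returns None if the SMILES is invalid
    print(smiles, "->", Chem.MolToSmiles(mol))  # canonical SMILES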
Structural alert A molecular (sub)structure associated with the presence of a specific
(usually adverse) biological activity.
Substructure A portion of the overall structure of a chemical that may be associated or
correlated with a biological activity or property of the chemical.
TD50 The statistically derived median toxic dose of a drug or toxin at which
toxicity occurs in 50% of the test population.
Test set A set of chemicals, not included in the training set used to develop a
QSAR, that is used to validate (assess the predictive ability of) the
QSAR. It is sometimes called an “independent” or “external” test set or
validation set. For the purpose of (Q)SAR validation, it is important that
the test set has the same domain of applicability as the training set, and
contains a sufficient number of chemical structures.
Toxicity pathway A cellular response pathway that, when sufficiently perturbed, is
expected to result in adverse health effects (NRC, 2007). Toxicity
pathways are important components of adverse outcome pathways
(AOPs).
Training set A set of chemicals used to derive a QSAR. The data in a training set are
typically organized in the form of a matrix of chemicals and their
measured properties or effects observed in a toxicity test. A
homogeneous training set is a set of chemicals which belong to a
common chemical class or share a common chemical functionality or a
common mechanism of action. A heterogeneous training set is a set of
chemicals which belong to multiple chemical classes, or which do not
share a common chemical functionality or common mechanism of action.
Validation The testing of a (Q)SAR tool to assess its reliability and relevance. The
OECD Guidance Document on the Validation of (Quantitative) Structure-Activity Relationship (Q)SAR Models (OECD Series on Testing and
Assessment No. 69) defines validation as the process by which the
reliability and relevance of a particular approach, method, process or
assessment is established for a defined purpose.
1. EXECUTIVE SUMMARY
Moving IATA from a long-term vision into mainstream practice for pesticide
assessments will require the development and application of new predictive tools
and the further development and broader application of existing tools such as
(Q)SAR. In recognition of these requirements and the need to develop common
approaches to IATA for the risk assessment of pesticides, the United States
Environmental Protection Agency (US EPA) Office of Pesticide Programs (OPP)
and the Pest Management Regulatory Agency (PMRA) of Health Canada have
established a North American Free Trade Agreement (NAFTA) Joint Project on
“21st Century Toxicology: Integrated Approaches to Testing and Assessment”.
This document does not reproduce or replace the many journal articles, reports, and textbooks that provide guidance on the development and application of (Q)SAR, but
provides an introduction to the evaluation of (Q)SAR tools and their application to
pesticide regulatory risk assessments. While the focus of this document is on the
application of (Q)SAR to pesticide risk assessments, the principles and issues
are general enough to be applied to other types of chemicals. Regardless of the
type of risk assessment scenario, (Q)SAR experts should be consulted and peer
review procedures used to ensure scientific excellence and rigor.
The document is organized into eight sections including this executive summary.
(Q)SAR tools are adequate for consideration in pesticide assessments. Section 5
also includes a discussion of the documentation of (Q)SAR tools and predictions.
Section 8 provides conclusions and perspectives on the future vision for (Q)SAR
and pesticides. It is noted that the conclusions of the NAS report on Toxicity Testing in the 21st Century: A Vision and a Strategy, with respect to increased reliance on existing knowledge bases for chemical classes and alternative testing methods, are especially relevant for pesticide regulatory authorities and will require research on new testing technologies and integrated approaches to testing and assessment (IATA) for more efficient and effective reviews that do not compromise public health or the environment. (Q)SAR tools are one example of an alternative method that could be applicable to IATA, and the increasing use of these tools by pesticide authorities makes it important to communicate a systematic and transparent approach to using (Q)SAR in pesticide assessments.
This guidance document is consistent with the current hazard/risk assessment
paradigm with an overall emphasis on not using (Q)SAR in isolation. In addition
to the validity and relevance of the individual (Q)SAR tools and predictions, the
defensibility of predictions depends on biological consistency and plausibility
across all scientific lines of evidence in a holistic weight of evidence approach.
Future applications will involve anchoring (Q)SAR predictions with what is known
about chemical classes/categories, biological mode of action, toxicity pathways
and population effects. Eventually, (Q)SAR predictions will be built into larger
conceptual frameworks or Adverse Outcome Pathways (AOPs) that delineate the
documented, biologically plausible, measurable, and testable processes by which
a chemical induces molecular perturbations and subsequent biological responses
that are relevant for risk assessment.
In addition to the eight sections discussed above, the document also includes an
appendix of the web pages for a number of national and international
organizations that may be useful to evaluators seeking additional information on
general (Q)SAR concepts, and the development, validation, and evaluation of
(Q)SAR tools and predictions (Appendix I), an appendix summarizing the content
of the European Commission’s (Q)SAR model and prediction reporting formats, together with examples of detailed information templates that could be considered when
(Q)SAR predictions are used as critical sources of data in pesticide assessments
(Appendix II), and an appendix of several examples of the application of (Q)SAR
tools and methods to pesticides and other chemicals (Appendix III).
2. INTRODUCTION
(>10%) pesticide metabolites and degradates in human dietary and
environmental risk assessments.
The advent of the Food Quality Protection Act (FQPA) has necessitated an
increased refinement of pesticide risk assessments including a closer scrutiny of
all metabolites and degradates. In order to determine whether these metabolites
and degradates should be included in human dietary and environmental risk
assessments in the absence of detailed toxicity data, the US EPA OPP has relied
upon various types of structural similarity evaluations. Also, more recently, the
agency has explored the use of (Q)SAR models to predict the potential toxicity of
pesticide metabolites/degradates in order to provide scientific rationales and
support for requiring additional toxicity testing, to substantiate the use of
metabolites/degradates in estimates of total toxic residues, or to exclude
metabolites/degradates from further testing based on a lack of toxicity concerns.
Similarly, the US EPA OPP has made sporadic use of bridging techniques and
structure activity relationships to identify whether additional ecotoxicity testing of
environmental degradates should be required and whether these residues should
be included in environmental exposure estimates for pesticides.
Since empirical data are typically available on the parent pesticide, one of the
key factors considered when determining whether (Q)SAR model predictions for
the toxicity or ecotoxicity of metabolites and degradates are reliable enough to be
used is how well predictions from the same model for the parent pesticide
compare to the empirical data for the parent pesticide.
There is also an ongoing threshold of toxicological concern (TTC) project that is
designed to determine a level of concern for various chemical classes of
antimicrobial pesticides. Human exposures below the TTC would not be
considered to be of concern and no additional toxicological data would be
required. SAR will be used to characterize the toxicity of all chemicals within
specific classes of antimicrobial chemicals. This is an American Chemistry
Council (ACC) Biocide Panel sponsored project conducted through the
International Life Sciences Institute (ILSI) with the US EPA participating on the
ILSI Steering Committee and Expert Working Group.
For ecological risk assessments, OPP has made increasing use of bridging
techniques and structure activity relationships (SARs) to identify whether
additional testing of degradates/transformation products should be required and
whether these residues should be included in modeling exposure estimates.
The Pest Management Regulatory Agency (PMRA) of Health Canada takes into
account the same kinds of considerations as the US EPA OPP when addressing
the potential toxicity of metabolites/degradates of chemical pesticides. If
metabolites or degradates of a pesticide are identified in plants or soil, but not in
rat metabolism studies, the agency will require the submission of available
toxicity data on those metabolites. Also, toxicity data on metabolites/degradates
are sometimes voluntarily submitted to the PMRA by applicants. In terms of the
application of (Q)SAR, the PMRA can include a request for (Q)SAR predictions
on metabolites/degradates when requiring the submission of existing data and
can also generate (Q)SAR predictions to help identify potential concerns.
2.1.1 US EPA, Office of Pollution Prevention and Toxics (OPPT)
OPPT has also developed a number of publicly available (Q)SAR tools that are
used in regulating substances under TSCA. Examples include the EPI Suite
program which includes several models for estimating physical-chemical
properties and environmental fate parameters. EPI Suite also contains the
ECOSAR model for predicting ecotoxicity. Other tools developed by OPPT
include OncoLogic, an expert system for predicting carcinogenicity, and the Analog Identification Methodology (AIM) tool for identifying structural analogs.
2.1.2 US EPA, Office of Research and Development (ORD)
The US EPA Office of Research and Development (ORD), National Health and Environmental Effects Research Laboratory (NHEERL), Mid-Continent Ecology Division
(MED) has been developing (Q)SAR models and related databases since the
1980s. Examples include a database of ecotoxicity information (ECOTOX) as
well as ASTER3, a collection of databases and (Q)SAR models for toxicity to
aquatic species. ASTER also includes models to estimate physical-chemical
properties, bioconcentration, and environmental fate.
3. ASTER is a US EPA intranet application only accessible to US EPA staff and contractors.
http://www.epa.gov/med/Prods_Pubs/aster.htm
2.1.3 US FDA, Office of Food Additive Safety
The US Food and Drug Administration’s (FDA) Office of Food Additive Safety
(OFAS) has utilized (Q)SAR analysis in the pre-market review of food contact
substances for many years and has recently implemented the use of multiple
commercial and publicly available (Q)SAR software models (Lo Piparo et al.,
2011; Arvidson et al., 2010; Bailey et al., 2005). OFAS is also investigating the
potential application of metabolism prediction software to the review of food
contact substances. OFAS uses (Q)SAR analysis as a decision support tool in
conjunction with open literature data and submitted test results, and (Q)SAR may
also be used to identify the need for additional toxicity testing during pre-
submission consultations for food contact substances.
Health Canada and Environment Canada have extensive experience with the
use of (Q)SAR to address selected data requirements for new substances under
the Canadian Environmental Protection Act (CEPA). Adequately validated
(Q)SAR predictions may be submitted by notifiers or in some cases generated by
government evaluators to address physical-chemical properties,
persistence/bioaccumulation, human health effects, ecotoxicity endpoints and
other endpoints included in the New Substances notification requirements under
CEPA. For instance, predictions are sometimes used for assessing substances
with low production volumes and in cases where the substance cannot be
isolated in pure enough form to provide meaningful test results. (Q)SAR data are
generally combined with empirical data and expert judgment in a weight of
evidence approach. (Q)SAR was also utilized by both departments for the
categorization (prioritization) of existing substances on the Domestic Substances
List (DSL) for further assessment. Environment Canada used (Q)SAR predictions
to assist with determinations of persistence, bioaccumulation and inherent
toxicity to non-human organisms from existing substances while Health Canada
used (Q)SAR to generate physical-chemical data to support determinations of
greatest potential for human exposure and as part of the hazard tools used to
prioritize chemicals for inherent toxicity to humans when data for specific
endpoints were not available. (Q)SAR can also be used by both departments as
supporting information in screening level risk assessments for DSL substances
when experimental data are not available.
with the aim of facilitating the application of (Q)SAR approaches in regulatory
settings and their regulatory acceptance. One of the most important products
from the OECD (Q)SAR project has been the principles for the validation of
(Q)SAR models (OECD, 2004). Comprehensive guidance has also been
produced on the development and application of grouping methods for chemicals
including chemical categories, read-across and trend-analysis (OECD, 2007a).
The OECD has also done extensive work on software for identifying structural
characteristics and mode/mechanism of action data on chemicals, systematically
grouping them into chemical categories and applying read-across, trend analysis
and (Q)SARs to fill data gaps. The end result of these efforts, the OECD QSAR
Toolbox, is intended for use by government agencies and stakeholders for
addressing gaps in the toxicity and ecotoxicity databases used in the hazard and
risk assessment of chemicals and is freely available (OECD, 2011a).
2.2 Purpose of the NAFTA (Q)SAR Guidance Document
The purpose of this guidance document is to help pesticide evaluators evaluate (Q)SAR-related information and to identify the important issues that
may be involved when incorporating (Q)SAR information into the risk assessment
process. It is recognized that there is an ever-expanding volume of journal
articles, national and international reports and guidance documents, and
academic textbooks on the subject of (Q)SAR. This document does not
reproduce or replace these journal articles, reports, guidance documents, and
textbooks on (Q)SAR, but provides an introduction to the evaluation of (Q)SAR
tools and their application to pesticide regulatory risk assessments.
While many of the illustrative examples in this document involve the application
of (Q)SAR to the prediction of toxicity in pesticide hazard assessments, the
general principles discussed can also be applied to (Q)SAR predictions for
ecotoxicity, physical chemical parameters, and other activities and properties of
relevance to pesticide assessments. Similarly, although many issues are raised
in the context of the prediction of apical endpoints, pesticide evaluators should
recognize that most of these issues will also apply when (Q)SAR is eventually
used in IATA to predict key events related to mechanism/mode of action for
toxicity and AOPs such as receptor binding, gene activation, enzyme
inhibition/activation, etc.
Sections 4 through 7 of this document provide information on problem formulation and (Q)SAR, evaluating the adequacy of
(Q)SAR predictions, combining information from multiple predictions, and
incorporating predictions into weight of evidence assessments. Each section can
also be considered as stand-alone guidance on its particular subject area.
Figure 2–0: (Q)SAR Guidance Document Schematic
BACKGROUND INFORMATION ON (Q)SAR (Section 3)
• Definition of (Q)SAR
• Types of (Q)SAR tools and approaches
• Importance of data quality in (Q)SAR model development
• Importance of mode/mechanism of action in (Q)SAR development
• Examples of (Q)SAR tools and their applications
CONCLUSIONS AND FUTURE VISION FOR (Q)SAR (Section 8)
• Toxicity Testing in the 21st Century: Shift in the Risk Assessment Paradigm
• Weight of Evidence Approach: Biological Plausibility
• Adverse Outcome Pathway: Conceptual Framework
• Expert Scientific Judgement and Peer Review
3. BACKGROUND INFORMATION ON (Q)SAR
3.0 Introduction
The purpose of this section is to provide some brief background information on
the definition of (Q)SAR, types of (Q)SAR tools and approaches and some key
issues associated with the development of (Q)SAR tools. In particular, the
importance of data quality and mode/mechanism of action in the development of
(Q)SAR models is highlighted. Also, while (computerized) (Q)SAR models are
frequently cited in examples elsewhere in this document, this section illustrates
that (Q)SAR actually consists of a range of tools and approaches.
statistical algorithms developed through a variety of techniques (e.g., univariate
regression, multiple linear regression, partial least squares analysis).
[Figure: chemical structures of deltamethrin and cypermethrin]
Listed below are some common criteria used to identify structurally similar
substances. Many of these have been proposed by the OECD and the US EPA
as a basis for building chemical categories (OECD, 2007a; US EPA, 1999).
functional analogs are not necessarily structural analogs and vice versa
(Saliner et al., 2005; Russom et al., 1997).
Table 3–1 lists examples of pesticidal modes of action for a select group of insecticides. Most pesticidal modes of action include more than one chemical class. Consequently, ‘similarity’ can be based on at least three aspects of a pesticide: (i) pesticidal mode of action (e.g., acetylcholinesterase inhibition),
(ii) pesticide classification (e.g., insecticide) and (iii) chemical
classification/common functional group (e.g., carbamate, organophosphate, etc.).
Therefore, a weight of evidence approach can be important when defining
similarity for the purpose of developing (Q)SAR tools and approaches. As
discussed in example 5, Appendix III, information on the pesticidal mode of
action and structural similarity can be combined with pharmacokinetic and
empirical animal study results in a weight of evidence approach in pesticide
assessments.
Table 3–1: Pesticidal Mode of Action and Associated Chemical Class for a Select Group of Insecticides (adapted from Insecticide Resistance Action Committee (IRAC); http://eclassification.irac-online.org/)

Pesticidal Mode of Action                    Chemical Class
Acetylcholinesterase inhibition              Organophosphates
GABA-gated chloride channel antagonism       Phenylpyrazoles (Fiproles)
GABA-gated chloride channel antagonism       Organochlorines (cyclodienes)
Acetylcholinesterase inhibition              Carbamates
matching. These estimation tools rank chemicals based on (structural)
characteristics or features of each chemical that are similar (match/overlap), and
features that are dissimilar (mismatch/difference) (Saliner et al., 2005; Monev,
2004). Figure 3–2 provides a schematic of the measures that can be described
in similarity indices. Similarity indices can utilize two- or three-dimensional
structural information and examples include correlation-type indices (e.g.,
Tanimoto Index (also known as the Jaccard coefficient), Hodgkin-Richards Index,
Cosine-similarity index), dissimilarity measures (e.g., Euclidean distance index,
Hamming distance), and composite measures of similarity and dissimilarity (e.g.,
Hamann measure, Yule measure). For an overview of these approaches see
Saliner et al., 2005; Monev, 2004; and Urbano-Cuadrado et al., 2008. It is also
important to note that a high degree of similarity based on mathematical similarity indices does not necessarily indicate similarities in the mode of action (MOA) for the effects of concern.
[Figure 3–2: schematic of the similarity measures (a, b, c, d) counted over the structural features of Chemical A and Chemical B]
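Purely as an illustration (not a recommendation of any particular index or software), the sketch below computes a Tanimoto similarity on structural fingerprints using the open-source RDKit toolkit; in one common notation, the Tanimoto index is c / (a + b - c), where a and b are the numbers of features present in chemicals A and B and c is the number of features they share.

from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

mol_a = Chem.MolFromSmiles("c1ccccc1N")   # aniline
mol_b = Chem.MolFromSmiles("c1ccccc1O")   # phenol

# Morgan (circular) fingerprints encode the structural features of each chemical.
fp_a = AllChem.GetMorganFingerprintAsBitVect(mol_a, radius=2, nBits=2048)
fp_b = AllChem.GetMorganFingerprintAsBitVect(mol_b, radius=2, nBits=2048)

print("Tanimoto similarity:", DataStructs.TanimotoSimilarity(fp_a, fp_b))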
3.2.1 Analogs
Analog approaches have traditionally involved predicting an endpoint or property
of one chemical based on the available data for the same endpoint or property of
a similarly structured chemical (OECD, 2007a). An example of an analog
technique is bridging or extrapolating the results of toxicological studies on a
parent pesticide compound to a metabolite or transformation product of that
same parent pesticide. When using an analog approach for bridging from a
parent pesticide to metabolites or transformation products, it is important to have sufficient evidence linking a particular substructure or substructures to the toxicity endpoint of interest and to confirm that the substructure is conserved from the parent pesticide to the metabolite or transformation product.
especially when the boundary of the category is difficult to define. The
observation of a quantitative trend (increasing, decreasing, or constant) in the
experimental data for a given endpoint across chemicals in a category can also
be used as the basis for interpolation or extrapolation (i.e., trend analysis). In
addition, it is possible to develop a QSAR within a chemical category by plotting
the activities versus the properties of chemicals with empirical data. By using a
combination of tools, i.e., read-across, trend analysis and (Q)SAR, the matrix of
properties/activities for chemicals under consideration can be rendered less
uncertain through the greater use of existing data (OECD, 2007a).
[Figure: data matrix for a chemical category showing chemicals versus properties/activities, with cells containing empirical data, missing data, and data gaps filled by interpolation or extrapolation]
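As a purely illustrative sketch of trend analysis within a category (using made-up values and the numpy and scipy libraries), a simple linear relationship between a property and an activity can be fit for the category members with empirical data and then used to interpolate a missing value.

import numpy as np
from scipy import stats

# Illustrative (made-up) data for category members with empirical results,
# e.g., log Kow versus log(1/LC50) for a homologous series.
log_kow = np.array([1.0, 1.5, 2.0, 3.0, 3.5])
activity = np.array([0.8, 1.2, 1.7, 2.6, 3.0])

fit = stats.linregress(log_kow, activity)

# Interpolate the activity of a category member lacking empirical data.
missing_member_log_kow = 2.5
predicted = fit.slope * missing_member_log_kow + fit.intercept
print(f"interpolated activity = {predicted:.2f} (r^2 = {fit.rvalue ** 2:.2f})")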
Chemical category approaches have been used for assessing chemicals with
data gaps by the US EPA’s OPPT, in the US EPA HPV Challenge Program,
under the REACH legislation, and in the OECD SIDS program (van Leeuwen et al.,
2009). Additional examples of categories can be found in Enoch, 2010; Enoch et
al., 2009; US EPA, 1999; and Worth and Patlewicz, 2007.
freeware) models available for predicting human health related and
environmental activities, physical-chemical properties, and other parameters.
presumptive. Ideally, (Q)SAR models should not only achieve statistical association but also have a mechanistic foundation.
Local QSAR models are generally developed for individual classes of chemicals.
Their training sets usually consist of highly structurally homogeneous or
congeneric chemicals or classes of chemicals with similar known biological
activity/function (e.g., peroxisome proliferators). Local models require fewer
training set chemicals and tend to perform better, presumably because they are
more likely to focus on a single mechanism of action. However, they are often
limited in scope to a small subset of narrowly defined chemicals. Global models
are generally derived using training sets of structurally heterogeneous or non-
congeneric chemicals. Due to the diversity of the training set chemicals, these
models often cover a range of different mechanisms of action, usually resulting in
poorer predictive performance than local models, unless the training sets are
subdivided based on mechanism of action. Global predictive models tend to be
more adept in discovering new insights, but may be more likely to yield incorrect
results if the predicted chemical structure is not well-represented in the training
set. Several publications have investigated the ability of global and local (Q)SAR
approaches to fulfill regulatory requirements (e.g., EC, 2010; Yuan et al., 2007;
and Worth et al., 2011).
In (Q)SAR model development, usually a set of chemicals with reliable data are
collected for a particular biological/chemical activity. Typically the original test
data are randomly separated into a training set and a validation set, with the
training data set used to develop a model and the validation data set used to test the assumption that the model works for chemicals not involved in the development of the original model (Leonard and Roy, 2006).
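For illustration only (assuming the numpy and scikit-learn libraries, with simulated descriptor and activity values), the random separation into a training set and a validation set described above might look as follows.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Simulated descriptors (X) and activities (y) for 40 chemicals.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.1, size=40)

# Randomly hold out one quarter of the chemicals as a validation set.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit the model on the training set and check it on chemicals it has not seen.
model = LinearRegression().fit(X_train, y_train)
print("R2 on the validation set:", round(r2_score(y_val, model.predict(X_val)), 2))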
information on the training set chemicals can also be obtained prospectively. In
the more commonly used retrospective approach, data are collected from readily
available sources (e.g., the open literature). This often results in noisier
predictions because of the lack of control or consistency in study protocols,
interpretation criteria, etc., although this can be compensated for by evaluating
the available data and selecting only consistent higher quality studies (i.e.,
chemical identity/form confirmed, concentrations/purity measured, standard test
protocols, assays optimized for the type of chemicals, etc.). Also, it is important
to verify that the identities of the chemicals in the training set correspond to their
structural representations used in the predictions. Sometimes information on the
metabolism of the chemicals tested and mechanisms of action can also be
obtained retrospectively to help enhance the interpretability of the predictions.
observed activity and that the relationship can be reliably modeled. For example,
in vitro systems (e.g., Ames mutagenicity) are often less complex and more
reliably modeled than many in vivo systems (e.g., carcinogenicity, teratogenicity).
However, this is not always the case, as in vivo fish acute toxicity LC50 values are
well predicted for several modes of action because chemical concentration in
water is a good surrogate for chemical activity in the blood (MacKay et al., 1983).
Also, if in vitro systems include metabolic components, their complexity for
(Q)SAR development will increase.
Several reviews have been written on the types of tools available (EC, 1995a,b;
Hulzebos et al., 1999; Jensen et al., 2008; Pavan et al., 2005a,b; Rorije and
Hulzebos, 2005; Tsakovska et al., 2005, 2008), but it should be kept in mind that
the inventories of available tools are constantly changing with emerging research
in this area.
With the development of Simplified Molecular Input Line Entry System (SMILES)
notation (Weininger, 1988) as a means to identify structure information in a
computer-readable format, and the advancement of desktop computing in the 1970s, (Q)SAR tools have become more readily accessible to risk assessors
(Benfenati, 2007). Although initially (Q)SAR approaches were primarily used in
the drug and pesticide discovery and development fields, these techniques
became especially important to regulatory risk assessment after the promulgation
of the Toxic Substances Control Act (TSCA) (Zeeman et al., 1995). The use of
QSARs in assessing potential toxic effects of organic chemicals on ecologically
relevant species and humans evolved as computational efficiency and
toxicological understanding advanced, and in many cases has proved to be
scientifically-credible for use in estimating toxicity for substances with little or no
available empirical data (OECD, 2007b).
(Q)SAR models also exist for specific endpoints such as skin sensitization
(Patlewicz et al., 2008), eye irritation (Tsakovska et al., 2005), acute toxicity and
repeated-dose endpoints for mammalian species (Tsakovska et al., 2008),
bioaccumulation (Arnot and Gobas, 2004), mutagenicity and carcinogenicity
(Benfenati et al., 2009; Benigni et al., 2007a,b), estimating physical chemical
properties (EC, 1995a,b; Dearden and Worth, 2007), toxicity to aquatic species
(EC, 1995a,b; Netzeva et al., 2007; Pavan et al., 2005a,b), and reproductive
toxicity (Jensen et al., 2008).
Under REACH, information on models that meet the OECD validation principles and are proposed for use in filling data gaps is currently being gathered. A searchable catalog of these models, including the background information required to validate them (authors/source of the model, related publications, the endpoint estimated and the related experimental protocol, and the algorithm with its training and validation sets, including all input variables), can be found at the following website: http://qsardb.jrc.ec.europa.eu/qmrf/index.jsp. Some actual
example cases are listed in Appendix III.
3.6 Summary
(Q)SAR tools and approaches involve the study of correlations between chemical
structure and associated biological activity, physical-chemical properties or other
properties, with the ultimate goal of predicting the activity or properties of
untested chemicals using available data from structurally-related compounds.
While frequently associated with computerized models, (Q)SAR tools actually
encompass a wide range of approaches such as analogs, chemical categories,
and computer or non-computer based SAR and (Q)SAR. The development of
reliable (Q)SAR models depends upon a number of factors, among which,
experimental data are probably the most important. In particular, data quality and
a good understanding of the available information on mode/mechanism of action
can contribute to the confidence in (Q)SAR model predictions. Types of endpoints or properties in a pesticide context that can be predicted using
(Q)SAR and related methods include in vivo ecotoxicity and human health-
related toxicity endpoints, specialized in vitro endpoints, metabolism, physical-
chemical parameters, and environmental fate parameters. While this document is
not intended to recommend or endorse individual (Q)SAR tools, it is recognized
that there are currently a variety of computerized and non-computerized,
commercial and non-commercial (Q)SAR tools for predicting the endpoints or
properties described above. Sections 2 and 3 of this document were designed to
provide a brief introduction and background information on (Q)SAR tools and
approaches. The subsequent sections of this document (4, 5, 6, and 7) focus on
issues associated with applying (Q)SAR predictions to pesticides including
problem formulation and (Q)SAR (section 4), evaluating the adequacy of (Q)SAR
predictions (section 5), combining information from multiple predictions (section
6) and incorporating (Q)SAR into weight of evidence assessments (section 7).
4. PROBLEM FORMULATION AND (Q)SAR
4.0 Introduction
Problem formulation is an important initial step for framing the specific
question(s) to be addressed in assessments of human health and environmental
risks from pesticides. In its Guidelines for Ecological Risk Assessment, the US
EPA has indicated that problem formulation involves the on-going integration of
the available information that eventually leads to three products: assessment
endpoints, a conceptual model of the risk to be investigated, and an analysis plan
(US EPA, 1998).
Since guidance on the general problem formulation process for the risk
assessment of chemicals such as pesticides has been outlined in other published
documents (e.g., US EPA, 1998; Doull et al., 2007), the details of that guidance
will not be discussed here. Instead, this section will focus on the preliminary
analysis of (Q)SAR predictions as one of the several potential sources of
information to be integrated at the problem formulation stage. Preliminary
analysis of a (Q)SAR prediction for a pesticide at the problem formulation stage essentially involves answering the following questions:
• What are the characteristics of the pesticide that is the subject of the prediction?
• What are the characteristics of the (Q)SAR tool and the prediction?
Answering these questions at the problem formulation stage may enable an
evaluator to immediately determine that a prediction is not suitable or relevant for
addressing the specific pesticide risk assessment question. Alternatively, these
questions may lead to a more in-depth evaluation of whether the (Q)SAR
prediction is adequate or “fit for purpose” (see section 5) and eventually to the
consideration of how the results of a fit for purpose prediction could be
incorporated into an overall weight of evidence decision (see section 7).
(Q)SAR predictions are generally used to try to gain some insights into the
toxicity, ecotoxicity, behavior in the environment or other aspects of a pesticide in
the absence of empirical data. Consideration of a (Q)SAR prediction for the
premarket assessment of a pesticide would likely involve one of the following
scenarios: 1) submission of a (Q)SAR prediction by a registrant to address a data
requirement or as supporting evidence for a data requirement for a pesticide, a
metabolite or a transformation product, or 2) use of a prediction by an evaluator
to identify or support a data requirement for a pesticide, metabolite or
transformation product.
pesticides or to justify a requirement for a study on a metabolite or transformation
product for which no data have been submitted. This would also include cases in
which (Q)SAR predictions are used as supporting information when questioning
the reliability of experimental data, leading to a requirement for the submission of
more reliable studies. Criteria for what constitutes a reliable (Q)SAR prediction
and acceptable levels of uncertainty would likely be less stringent for scenarios in
which (Q)SAR predictions are used to drive data requirements compared to
cases where (Q)SAR is used to support waiving data requirements.
While these premarket scenarios are likely to be the most frequent applications
of (Q)SAR, there may also be instances where (Q)SAR tools could be used post-
market such as the toxicity characterization of a novel impurity (e.g.,
leachable/extractable) not originally characterized during the pre-market approval
process.
4.2.1 Chemical Identifiers and Mixtures
Examples of the types of pesticides for which a (Q)SAR prediction may be
required include discrete substances; individual isomers or mixtures of isomers;
crystalline structures (e.g., minerals); substances with unknown or variable
composition, complex reaction products and biological materials; polymers; other
mixtures or formulations; and complex salts and metal-containing compounds.
Therefore, accurate information on the identity, composition and structure of a
pesticide is critical to determining whether a prediction was based on a correct
structure. Confusion can result when common or trade names are applied to
multiple isomers of a pesticide, salt forms, acid/base forms or polymeric and
monomeric forms. The use of more precise chemical nomenclature (e.g.,
International Union of Pure and Applied Chemistry (IUPAC)) can assist with more
accurate identification (IUPAC, 2010). While Chemical Abstract Service (CAS)
numbers (American Chemical Society, 2010) are frequently used as unique
identifiers for pesticides, in some cases they may actually represent isomer
mixtures, polymers, and unknown or variable composition substances rather than
discrete, single chemicals, so it may be necessary to review the CAS number to
clearly determine which structure(s) it actually represents.
In general, mixtures cannot be run through (Q)SAR models, nor can synergistic
or antagonistic effects of chemicals in mixtures be accounted for because models
typically use single, discrete chemical structures as input. For mixtures of
discrete organic chemicals, one option may be to make separate predictions for
each chemical and compare and contrast the results. Alternatively, if one
component of a mixture is predominant, in some cases that component may be
used to represent the entire mixture. However, for pesticides with variable
compositions (e.g., oligomers, natural fats, or mixtures that change composition
depending on reaction conditions) evaluators should be aware that (Q)SAR
predictions generated using a representative structure may not accurately reflect
the true nature of the material used in the pesticide application.
Information on degradates and metabolites may be available from empirical
pharmaco/toxicokinetic studies or from in silico models of potential metabolic or
degradation pathways. If degradates and metabolites are identified, it will also be
important to consider their stability and in the case of (Q)SAR predictions of
metabolites, the likelihood that they could occur in vivo. Individual pesticide
regulatory agencies have specific criteria they use to identify probable, stable
metabolites.
Examples of isomeric forms to consider include stereoisomers that differ in their
spatial orientation of atoms. The pyrethroid insecticide fenvalerate is a racemic
mixture of stereoisomers (i.e., R/S enantiomers) of a chiral active ingredient,
although the S-isomer in the mixture (esfenvalerate) has the greatest insecticidal
activity (WHO/FAO, 1996) (see Figure 4–1). Because these different isomers
have the same molecular formula, molecular weight, and physical-chemical
properties, it can be difficult for some (Q)SAR models to distinguish them,
especially models that do not take stereoisomerism into account.
Figure 4–1: Fenvalerate ((RS)-alpha-Cyano-3-phenoxybenzyl (RS)-2-(4-chlorophenyl)-3-methylbutyrate) and esfenvalerate ((S)-alpha-Cyano-3-phenoxybenzyl (S)-2-(4-chlorophenyl)-3-methylbutyrate)
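As an illustration of why some models cannot distinguish stereoisomers (assuming the open-source RDKit toolkit, and using the two enantiomers of alanine simply because their SMILES strings are short; the fenvalerate/esfenvalerate case is analogous), dropping the stereochemical annotations collapses the two isomers to an identical structural input.

from rdkit import Chem

isomer_1 = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
isomer_2 = Chem.MolFromSmiles("N[C@H](C)C(=O)O")

# With stereochemistry retained, the canonical SMILES differ.
print(Chem.MolToSmiles(isomer_1), Chem.MolToSmiles(isomer_2))

# With stereochemistry stripped, the two enantiomers become indistinguishable.
print(Chem.MolToSmiles(isomer_1, isomericSmiles=False),
      Chem.MolToSmiles(isomer_2, isomericSmiles=False))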
and molecular editors (Daylight Chemical Information Systems, 2008; IUPAC,
2010b; Dalby et al., 1992). For multiple (batch) chemical entries, the Structure
Data Format (SDF) file and SMILES (SMI) file formats are commonly used
(Dalby et al., 1992). These structural entry methods have strengths and
limitations, and in some cases, it may be necessary to verify the accuracy of the
structural representations from these methods to ensure that correct structures
are used for predictions.
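Purely as an illustrative sketch (assuming the open-source RDKit toolkit; the records and reference values below are hypothetical), batch SMILES entries can be parsed and their canonical forms compared against reference structures before predictions are generated.

from rdkit import Chem

# Hypothetical batch entries and reference canonical SMILES (e.g., from a curated registry).
records = {"ethanol": "OCC", "aniline": "c1ccccc1N", "bad entry": "C1CC"}
reference = {"ethanol": "CCO", "aniline": "Nc1ccccc1"}

for name, smiles in records.items():
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        print(f"{name}: could not be parsed; check the structural entry")
        continue
    canonical = Chem.MolToSmiles(mol)
    status = "matches reference" if reference.get(name) == canonical else "check structure"
    print(f"{name}: {canonical} ({status})")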
Many of the concepts discussed in this section overlap with the evaluation of the
scientific validity of a (Q)SAR tool as discussed in section 5.1. However, at the
problem formulation stage, it is intended that the evaluator will gain a basic
understanding of these issues. This may enable an immediate decision that the
(Q)SAR prediction is not adequate for the assessment context or it could lead to
a more detailed evaluation as discussed in section 5.1, especially with respect to
the application of the OECD (Q)SAR validation principles (section 5.1.1).
A starting point for characterizing a (Q)SAR tool at the problem formulation stage
is a sufficient understanding of the general methodology behind the tool. Is the
tool based on simple analog extrapolations, read-across or trend-analysis
approaches using chemical categories, a structural alert/rule based SAR/expert
system, a statistical (e.g., regression based) QSAR derived from a specific
database of chemicals and their descriptors or some other method? Each of
these methods has strengths and limitations that can influence how they should
be interpreted and the reliability of predictions from them. For instance,
SAR/expert systems based on structural alerts may be supported by expert
reviews of relevant research, and can include a mechanistic rationale to support
predictions. However, in some cases these systems do not include structural
alerts associated with inactivity, may have limited databases of alerts, and may
not have a clearly defined domain of applicability. Statistical QSAR models based
on training sets of active/inactive chemicals and descriptors of chemical structure
may provide insights into associations between specific structures and activity
that were not previously investigated, help to identify structures that modify or
eliminate specific activities, and may be capable of generating quantitative
predictions (e.g., probabilities or specific numerical values) rather than
dichotomous active/inactive (yes/no) predictions. However, in some cases QSAR
models may overemphasize statistical associations in the absence of
mechanistic rationales, their domains may be restricted by the structural diversity
in their training sets, and their training sets may include chemicals with a variety
of different mechanisms which can result in poor predictive performance and/or
considerable uncertainty in their predictions. A number of reviews of the
strengths and limitations of (Q)SAR models are available in the scientific
literature (e.g., Hulzebos et al., 2001; Greene, 2002).
Gaining an understanding of the empirical data from which the (Q)SAR tool was
derived is another important starting point for determining whether a (Q)SAR
prediction is likely to be relevant to a pesticide assessment. It may be possible to
quickly discount (Q)SAR tools derived from studies based on outdated protocols
not conducted according to GLP standards, based on endpoints that are vague
or inconsistent, interpreted according to non-standard criteria, involving
chemicals significantly structurally dissimilar to the pesticide of interest, and/or
obtained from non-peer reviewed sources. On the other hand, (Q)SAR tools
based on higher quality empirical data may be subjected to a more detailed
evaluation and potentially included in a weight of evidence assessment.
Investigating other details of the (Q)SAR tool used may also assist in determining
the relevance of a (Q)SAR prediction to a pesticide assessment during problem
formulation. For example, details on a (Q)SAR model such as the specific name
of the model, version number, date it was developed, and contact information for
the developer can be important for determining the relevance of a (Q)SAR
prediction. Model developers can make significant changes from one version to
another such as increasing the number and diversity of the chemicals in the
training set, modifying the library of descriptors or structural alerts, and modifying
model algorithms. As a result, predictions from a newer version of a model may
not be comparable to predictions from previous versions. Model developers can
even discontinue support for older versions making it difficult to obtain additional
information on training sets, interpretation criteria, etc.
are used by SAR/expert systems to identify potentially active compounds.
Information on how these features influenced the overall prediction either
quantitatively or qualitatively can impact on the level of reliability assigned to a
(Q)SAR prediction (see section 5.4). In particular, it can be important to
investigate whether the structural fragments, descriptors, and/or physical-
chemical properties that drive a prediction are consistent with available
information on mechanism of action or not.
When using QSAR models, it can be important to review the identities of the
compounds similar to the test pesticide that influenced the prediction. This would
likely be obtained from an analysis of the training set compounds that formed the
basis for the model algorithm. Similarly, for SAR/expert systems, the compounds
that were utilized to support the development of any structural alerts identified in
the test pesticide could be reviewed. The compounds that make up a category or
group used in a read-across or trend analysis approach can also be considered
as compounds that are similar to the test pesticide and that directly influenced
the (Q)SAR prediction from that approach. Regardless of the type of tool used,
the identities of the compounds that influenced the prediction, how their similarity
to the test pesticide was assessed and the degree of similarity, how they
influenced the prediction, the nature of the empirical data for them that is related
to the predicted endpoint, how well they are predicted by the (Q)SAR tool (i.e.,
internal validation), and whether a mode and/or mechanism of action has been
established are all important considerations when determining the reliability of
the (Q)SAR prediction for a test pesticide (see section 5.4).
on a pesticide with relevant and reliable (Q)SAR predictions could also help build
defensible rationales for requiring additional empirical studies on specific
endpoints, mode of action, etc. (e.g., targeted testing).
The integration of the empirical database with (Q)SAR predictions at the problem
formulation stage could also be important for more detailed evaluations of the
reliability of the (Q)SAR predictions at a later stage in the assessment (see
section 5.4). Questions to consider include whether a predicted endpoint for a
pesticide is consistent with and supported by empirical data for related endpoints
for the same pesticide or whether the prediction contradicts these data. Empirical
data for similar compounds, metabolites and degradation products can be
particularly important to consider when assessing the reliability of a (Q)SAR
prediction. Knowledge of the toxicity database for a parent pesticide compound
could impact on the level of confidence assigned to a (Q)SAR prediction for a
metabolite. In some cases, the consistency of the results of (Q)SAR predictions
for a parent pesticide versus a metabolite may be useful in determining the
confidence in the prediction for the metabolite, especially if the parent compound
contains structural alerts known to be associated with specific mechanisms of
toxicity and those alerts are preserved or activated following metabolic
transformation (e.g., substructures associated with DNA/protein binding). Also,
an evaluation of the existing empirical data for a pesticide may provide
justification for using more or less conservative criteria to interpret a (Q)SAR
prediction for that same pesticide.
For a postulated (eco)toxicological mode of action, the extent to which the initial
chemical-biological system relationship is understood and how well the cascade
of key events leading to the adverse outcome is understood (i.e., mode of action,
mechanism of action, adverse outcome pathway) in taxa under consideration
could directly influence the level of confidence in (Q)SAR predictions for
endpoints associated with this mode of action. For instance, when a
(eco)toxicological mode of action has already been established for a structurally
similar compound, or for a chemical class in which the pesticide in question
resides, this mode of action could be used at the problem formulation stage to
focus (Q)SAR predictions on particular endpoints and taxa, bridge from the
structurally similar compound to inform dose selection for any study required for
the pesticide in question, provide support for waiving the need for specific studies
based on the current pesticide dataset, and/or help to rule out the relevance of
the observed or predicted effect to humans or other species, so that additional
studies are unlikely to be required.
4.5 Summary
The initial step in framing the questions to be addressed in the human health or
environmental assessment of a pesticide is problem formulation. Although the
questions to be addressed in pesticide risk assessments have traditionally been
framed in terms of the available empirical data, (Q)SAR predictions are another
source of information that can be considered during the problem formulation
process. The assessment context in which (Q)SAR is being applied, the
characteristics of the pesticide that is the subject of the prediction, the
characteristics of the (Q)SAR tool and the prediction, and the available empirical
data including mode of action data that could impact on the application of
(Q)SAR are all important factors to consider when integrating (Q)SAR into a
problem formulation. This type of preliminary analysis of the (Q)SAR information
on a pesticide could lead to an immediate conclusion that (Q)SAR is not suitable
for the particular pesticide assessment question or it could set the stage for a
more in depth evaluation of whether a (Q)SAR prediction is fit for purpose for
integrating into a weight of evidence decision (see section 5).
5. Evaluating the Adequacy of (Q)SAR Predictions
5.0 Introduction
framework have been adapted to guide pesticide evaluators through the
information to be considered when evaluating whether predictions from (Q)SAR
tools are adequate for use in pesticide assessments. A schematic for the
resulting modified framework is shown in Figure 5–1. Evaluating the adequacy of
(Q)SAR predictions relies on much of the same information initially considered at
the problem formulation stage (see section 4), but with a more focussed
consideration of validity, applicability, relevance, and reliability. This type of
evaluation can be done in advance of or at least independently of the process of
combining the prediction with other information in a weight of evidence
assessment (see section 7). Since clear and complete documentation of (Q)SAR
tools and predictions is important both to the evaluation of the adequacy of
predictions and their incorporation into weight of evidence assessments, this
section also includes a discussion of documentation.
Figure 5–1: Evaluating the Adequacy of a (Q)SAR Prediction for a Pesticide
(modified from ECHA, 2008 and Worth et al., 2011)
[Figure 5–1 shows the Scientific Validity of the (Q)SAR Tool, the Applicability of the (Q)SAR Tool to the Pesticide, and the Reliability of the (Q)SAR Prediction combining to determine the Adequacy of the (Q)SAR Prediction.]
scientific basis for making decisions on the acceptability of (Q)SAR predictions,
and to improve the transparency and consistency of (Q)SAR reporting leading to
a greater mutual acceptance of predictions (OECD, 2007c).
In response, the OECD developed the Principles for the Validation, for
Regulatory Purposes, of (Q)SAR Models which can be used as guidance for the
types of information to review when determining if a (Q)SAR model is acceptable
or not for use in a regulatory or decision-making framework. The principles
include “1) a defined endpoint, 2) an unambiguous algorithm, 3) a defined
domain of applicability, 4) appropriate measures of goodness-of-fit, robustness
and predictivity, and 5) a mechanistic interpretation, if possible.” (OECD, 2004).
The OECD also drafted and finalized a separate guidance document (Guidance
Document on the Validation of (Quantitative) Structure-Activity Relationships
[(Q)SAR] Models) that includes a discussion of the principles and information on
how to validate (Q)SARs for different applications (OECD, 2007c).
The five OECD (Q)SAR validation principles are presented in sections 5.1.1.1–
5.1.1.5 along with a summary of some of the key issues identified in the OECD
guidance document and other sources that should be considered in the context
of evaluating (Q)SAR tools for application to specific purposes in pesticide risk
assessments. For further details on the principles and their application,
evaluators should consult the OECD guidance document (OECD, 2007c).
Evaluators may also be interested in consulting a recent paper by Dearden et al.
(2009) which outlined 21 types of errors related to the OECD (Q)SAR validation
principles which were identified in various (Q)SAR analyses published in the
scientific literature.
5.1.1.1 Principle 1 — Defined Endpoint
The purpose of this principle is to make sure that the endpoint being predicted by
a given (Q)SAR tool is transparent. According to the OECD a “defined endpoint”
can be considered as “any physicochemical, biological or environmental effect
that can be measured and therefore modeled.” (OECD 2007c).
No (Q)SAR model can be better than the data upon which it is based. Optimally,
all of the training set data for a particular (Q)SAR model should correspond to the
specific regulatory endpoint of interest, have been generated using the same
experimental protocol (ideally a standardized guideline type protocol), and be
interpreted using evaluation criteria that correspond to those of the specific
pesticide regulatory program. While this type of approach would help to ensure
the reliability and relevance of (Q)SAR predictions, (Q)SAR model developers
often have to rely on studies conducted under different protocols and conditions
in order to ensure sufficient numbers and diversity of chemical structures in the
model training sets (OECD, 2007c).
Evaluators should take these potential sources of variability and uncertainty into account when evaluating the validity of a (Q)SAR tool.
5.1.1.3 Principle 3 — Defined Domain of Applicability
Netzeva et al. (2005) have defined the applicability domain of a (Q)SAR model as
“the response and chemical structure space in which the model makes
predictions with a given reliability.” This means the range of chemical structures,
physicochemical properties, mechanisms, and responses over which the (Q)SAR
tool can generate reliable predictions for the intended regulatory purpose. The
domain of applicability is dependent upon the set of chemicals on which the tool
is based (e.g., (Q)SAR model training set).
There is a balance between the overall range of the domain of applicability and
the predictivity of a (Q)SAR tool. Models with large training sets and diverse
domains of applicability may be capable of generating predictions for a wider
variety of chemical structures than smaller more structurally and mechanistically
homogeneous models, but there is a greater chance that many of those
predictions will be unreliable (OECD, 2007c; ECHA, 2008). Using information on
mechanisms, mode of action, and/or adverse outcome pathways to group
chemicals can improve predictive performance for large heterogeneous training
sets.
The OECD guidance document on the validation of (Q)SAR models summarizes
a variety of different methods for defining domain of applicability including the use
of structural features that enhance (toxicophores) or modulate toxicity to define
the mechanistic domain, characterizing the descriptor or interpolation space by
graphing and distance (geometric) analysis, using Williams plots to visualize
outliers in descriptor and response space, comparing the structural and physical-
chemical similarity of the test chemical to the training set by fragment based
approaches, and other methods (OECD, 2007c). A number of reviews of different
methods for defining domain of applicability have also been published (Nikolova
and Jaworska, 2003; Dimitrov et al., 2005a; Jaworska et al., 2005; Netzeva et al.,
2005).
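As a minimal illustration of one of the simpler approaches listed above, the sketch below implements a range-based check of the descriptor (interpolation) space: a test chemical is flagged when any of its descriptor values falls outside the minimum-to-maximum range of the training set. The descriptor names and values are hypothetical, and leverage- or distance-based methods would be more rigorous.

```python
# Minimal sketch of a range-based applicability domain check. A test chemical
# is flagged as outside the domain if any of its descriptor values falls
# outside the range spanned by the training set. Values are hypothetical.

def descriptor_ranges(training_descriptors):
    """Compute the (min, max) range for each descriptor across the training set."""
    names = training_descriptors[0].keys()
    return {name: (min(row[name] for row in training_descriptors),
                   max(row[name] for row in training_descriptors))
            for name in names}

def outside_domain(test_descriptors, ranges):
    """Return the descriptors for which the test chemical falls out of range."""
    return [name for name, (lo, hi) in ranges.items()
            if not (lo <= test_descriptors[name] <= hi)]

training = [{"logKow": 1.2, "MW": 180.0},
            {"logKow": 3.5, "MW": 310.0},
            {"logKow": 2.1, "MW": 250.0}]
test_pesticide = {"logKow": 4.8, "MW": 305.0}

violations = outside_domain(test_pesticide, descriptor_ranges(training))
print("Outside domain on:", violations if violations else "none")
```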
In the context of (Q)SAR predictions for pesticide active ingredients, the need to
assess the domain of applicability cannot be over-emphasized. A long-standing
limitation of many commercial and non-commercial (Q)SAR models has been
domains of applicability that are not sufficiently representative of the structures
and mechanisms of action associated with pesticide active ingredients. This is in
part related to the nature of pesticide data (i.e., confidential unpublished studies
accessible only by regulatory agencies). Fortunately, this is changing over time
as resources such as the US EPA ToxRef database should make it possible to
build models and other tools with domains of applicability that are more
encompassing of pesticide active ingredients.
5.1.1.4 Principle 4 — Appropriate Measures of Goodness-of-Fit, Robustness and Predictivity
Goodness-of-fit for regression-based (Q)SAR models is commonly assessed using the coefficient of determination (R2), which measures the proportion of the variation in the predicted values that can be explained by the regression equation, and the standard error of the estimate (s), which measures the dispersion of the predicted values around the regression line. Well-fitted models
have R2 values close to 1 and low s values. Poorly-fitted models are unlikely to be useful for regulatory applications. However, it should be noted that
deceptively high R2 and low s values can be obtained by including a large
number of variables or descriptors in the regression equation (i.e., over-fitting a
model). Generally, better predictive performance can be obtained when the ratio
of the number of chemicals in the training set to the number of descriptors in the
regression equation (i.e., the Topliss ratio) is 5:1 or more. Note that R2 and s
values alone are not enough to assess model validity as they do not provide
information on the predictive performance for chemicals not included in the
training set of a (Q)SAR model (OECD, 2007c; ECHA, 2008).
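A minimal sketch of these goodness-of-fit measures is shown below, computing R2, the standard error of the estimate (s), and the ratio of training chemicals to descriptors for a hypothetical set of observed and predicted values; the exact formulas used by a given model developer may differ slightly.

```python
# Minimal sketch (hypothetical data): computing R2 and the standard error of
# the estimate (s) for a fitted regression QSAR, and checking the ratio of
# training chemicals to descriptors against the 5:1 rule of thumb noted above.
import math

def goodness_of_fit(observed, predicted, n_descriptors):
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    # s uses n - k - 1 degrees of freedom for k descriptors plus an intercept.
    s = math.sqrt(ss_res / (n - n_descriptors - 1))
    ratio = n / n_descriptors
    return r2, s, ratio

observed = [2.1, 3.4, 2.9, 4.0, 3.2, 2.5, 3.8, 4.4, 2.2, 3.0]
predicted = [2.3, 3.1, 3.0, 3.9, 3.4, 2.4, 3.6, 4.5, 2.1, 3.2]
r2, s, ratio = goodness_of_fit(observed, predicted, n_descriptors=2)
print(f"R2 = {r2:.2f}, s = {s:.2f}, chemicals per descriptor = {ratio:.0f}")
```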
Predictivity can be assessed by external validation, either through the use of a
test set of chemicals separate from the (Q)SAR model training set or by
separating a set of chemicals into a training set and a test set at the design stage
(Gramatica, 2007). External validation is usually measured by an external
correlation coefficient (Q2ext). External test sets should be of sufficient size and
representative of the types of chemicals to be predicted using the (Q)SAR model.
In some cases, model developers may also present the results of internal
validation techniques such as leave-one-out (LOO) and leave-many-out (LMO)
methods. For these methods, one or more chemicals is removed from the
training set, the model is re-built, the removed chemicals are predicted, the
process is repeated, and the average predictivity across the various versions of
the model is estimated as a cross-validated regression coefficient (Q2). One of
the reasons internal validation statistics are presented is that there may be
limited data from which to construct an independent external test set because
(Q)SAR model developers generally want to maximize the number of training set
chemicals, leaving few chemicals for external validation testing.
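The sketch below illustrates the leave-one-out procedure for a deliberately simple one-descriptor linear model with hypothetical data; real (Q)SAR models typically involve many descriptors and more elaborate fitting, but the cross-validation logic is the same.

```python
# Minimal sketch of leave-one-out (LOO) cross-validation for a one-descriptor
# linear QSAR, reporting the cross-validated Q2. Data are hypothetical.

def fit_simple_regression(xs, ys):
    """Ordinary least squares fit of y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) /
         sum((x - mean_x) ** 2 for x in xs))
    return a, mean_y - a * mean_x

def loo_q2(xs, ys):
    """Leave each chemical out in turn, refit, predict it, and compute Q2."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_simple_regression(train_x, train_y)
        press += (ys[i] - (a * xs[i] + b)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot

logkow = [0.5, 1.1, 1.8, 2.4, 3.0, 3.6, 4.1, 4.9]        # hypothetical descriptor
log_toxicity = [1.0, 1.4, 2.1, 2.6, 3.2, 3.5, 4.3, 4.8]  # hypothetical endpoint
print(f"LOO Q2 = {loo_q2(logkow, log_toxicity):.2f}")
```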
Q2 or Q2ext values of >0.5 and >0.9 are considered to represent good and
excellent performance, respectively, but it should be noted that predictivity is
dependent on the statistical method used and the composition of the test set.
Also, as stated previously, predictions outside the domain of the training set are
likely to be less reliable than predictions within the domain of applicability, so that
validation principle 4 is closely linked to validation principle 3 (OECD, 2007c;
ECHA, 2008).
It should be noted that not all elements of principle 4 are applicable to all (Q)SAR
tools, so the assessment of goodness-of-fit, robustness, and predictivity may
have to be made on a case by case basis. Rule-based SAR/Expert Systems that
use databases of structural alerts are one example in which there is generally no
training set and as such LOO, LMO, and other methods will not be applicable.
Also, when considering Cooper statistics for external validation testing, the
determination of specificity and negative predictivity may be difficult if the expert
system is only based on structural alerts for activity.
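For readers unfamiliar with Cooper statistics, the sketch below computes sensitivity, specificity, positive and negative predictivity, and concordance from hypothetical external validation counts; it is illustrative only and not a prescribed calculation format.

```python
# Minimal sketch (hypothetical counts): Cooper statistics for external
# validation of a qualitative (positive/negative) (Q)SAR tool.

def cooper_statistics(tp, fp, tn, fn):
    return {
        "sensitivity": tp / (tp + fn),            # actives predicted active
        "specificity": tn / (tn + fp),            # inactives predicted inactive
        "positive predictivity": tp / (tp + fp),  # positive calls that are correct
        "negative predictivity": tn / (tn + fn),  # negative calls that are correct
        "concordance": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical external test set: 40 actives and 60 inactives.
stats = cooper_statistics(tp=32, fp=9, tn=51, fn=8)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```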
5.1.1.5 Principle 5 — Defined Mechanism of Action, if Possible
The fifth validation principle states that a (Q)SAR model should be associated
with a mechanistic interpretation wherever possible. Although it is recognized that
mechanistic information is not always available for (Q)SAR models, whenever it
is available, it should be investigated and reported. A transparent mechanistic
interpretation can assist in the determination of whether the domain of
applicability of a model is suitable for predictions for the chemical of interest, help
with the interpretation of outliers, guide hypothesis testing, and provide support
for the biological plausibility (i.e., toxicological interpretation) and reliability of the
predictions from a model. However, the absence of a clearly identified
mechanistic basis for a model does not necessarily mean that the model is not
potentially useful for a given regulatory application (OECD, 2007c).
5.2 Applicability of the (Q)SAR Tool to the Pesticide
Whether a (Q)SAR tool can be considered as applicable to a pesticide depends
upon the characteristics of the pesticide (see section 4.2) and the domain of
applicability of the (Q)SAR tool (see section 5.1.1.3).
Similarly, if the isomeric form of a pesticide could have an impact on the endpoint
or property to be predicted, the (Q)SAR tool will need to be capable of
differentiating between isomers to be applicable. A QSAR model that uses 2-D
structural descriptors and only accepts 2-D structural representations of
chemicals to be predicted will not be very useful for predicting differences in
toxicity between stereoisomeric forms of a pesticide. A better approach would be
to use a QSAR model capable of recognizing structural representations of
isomers, that includes isomer specific descriptors, and whose training set is
sufficiently diversified with respect to data on different isomeric forms.
Section 5.1.1.3 outlines the concept of defining domain of applicability during the
evaluation of the validity of a (Q)SAR tool. While it is possible for some (Q)SAR
tools to make predictions for pesticides outside their domains of applicability,
those predictions are likely to be less reliable at best or in some cases the
pesticides will be so far outside the domain of applicability that the (Q)SAR tools
should not be considered as applicable. As discussed, there are a number of
commercial and non-commercial (Q)SAR models that include automated
methods for assessing whether a chemical lies within their domain of applicability
based on limits on descriptor values, the presence of unrecognized structural
features, and other parameters. Also, a variety of different methods of defining
domain of applicability have been published (OECD 2007c; Nikolova and
Jaworska, 2003; Dimitrov et al., 2005a; Jaworska et al., 2005; Netzeva et al.,
2005).
In order for a (Q)SAR tool to be relevant, the endpoint or property that it predicts
must correspond to the endpoint or property for which a data requirement exists
in a given pesticide assessment context. A (Q)SAR model capable of generating reliable predictions for the mutagenicity of chemicals in Salmonella typhimurium
TA1538 may provide useful information on the in vitro mutagenicity of a pesticide,
but it will not provide specific information to address a data requirement for an in
vivo clastogenicity study. Similarly, a positive prediction for general pre-natal
developmental toxicity for a pesticide may not be sufficient to address a question
about whether a pesticide can induce specific skeletal malformations.
The type of information that a (Q)SAR tool can generate can also impact on its
relevance to a pesticide assessment question. In particular, (Q)SAR models are
usually designed to generate qualitative or quantitative predictions for particular
endpoints. A model that can provide a qualitative (e.g., yes/no, positive/negative)
estimate of the toxicity of a pesticide to freshwater fish may provide some useful
information, but will be of limited relevance if a prediction of an acute LC50 in trout
is required for a particular assessment context.
Predictions for compounds falling outside the domain of applicability are not necessarily inaccurate, but they are generally considered less reliable than predictions for compounds falling within the domain of applicability.
The age of the QSAR model and its training set may also have impacts on the
consideration of the domain of applicability of the model and the reliability of the
prediction. An older, global type QSAR model may make a negative prediction for
a pesticide because its training set is populated with a limited number of
chemicals that contain the key structural elements in the pesticide and that all
tested negative in historical empirical studies. However, a more up-to-date
model, whose training set has been tested in more modern empirical studies, has
been segregated into groups according to mechanism of action, and contains a
larger number of compounds from the same chemical class as the pesticide of
interest, many of which have positive empirical test results, may generate a
positive prediction that is more reliable even though the pesticide falls within the
domains of applicability of both models. Consequently, the use of the most up-to-
date versions of models and training sets is recommended and could be
particularly important when combining information from multiple predictions (see
section 6).
5.4.2 Strengths and Limitations of the (Q)SAR Tool
The strengths and limitations of a (Q)SAR tool can impact on the evaluation of
the reliability of the predictions from that tool (Hulzebos et al., 2001; Greene,
2002). One source of strengths and limitations is the general methodologies on
which various (Q)SAR tools are based (e.g., analog approaches, chemical
categories, SAR and QSAR models, etc.) (see section 4.3). An example already
cited in this document is the lack of structural alerts linked to inactivity or negative
test results in some SAR/expert systems. If no structural alerts are identified for a
pesticide using this type of system and this is considered as equivalent to a
prediction of inactivity (negative), the prediction may be less reliable than a
positive prediction from the same system or a negative prediction from another
type of (Q)SAR tool that uses descriptors, alerts or other parameters directly
related to inactivity, depending on the assessment context. Similarly, the
overemphasis on statistical associations and lack of a mechanistic basis for
predictions may make some statistical QSAR models less reliable.
Built-in biases are another source of strengths and limitations of (Q)SAR tools
that could influence the reliability of predictions. For instance, some QSAR
models for pharmaceutical applications have training sets with distributions of
positive and negative compounds designed to generate higher specificity versus
sensitivity scores (Section 5.1.1.4). This type of bias needs to be taken into
account when models of this type are applied to pesticides as they may generate
a higher proportion of false negative predictions. The European Chemicals
Agency noted a potential source of bias for biodegradation models in their
guidance for the implementation of the REACH legislation. Because QSAR
models for biodegradation are often biased towards non-ready biodegradability,
predictions of biodegradability may be less reliable than predictions of non-ready
biodegradability (ECHA, 2008).
The sources of data for training set compounds, and the sources of data or
methods of calculation for descriptors (see section 5.1.1.1) can be another type
of strength or limitation of (Q)SAR model that could impact on the reliability of
predictions. While empirical datasets for registered pesticides usually consist of
peer reviewed guideline type studies, many model training sets are based on
open literature studies of varying quality. Also, as noted by Doull et al. (2007), for
some chemical classes, potential training set data may not be available from the
published literature. Similarly, the sources of the descriptor values and/or
methods used to estimate them may need to be scrutinized when evaluating the
reliability of a QSAR model prediction. Whether calculated descriptors, especially
obscure types, are reproducible or whether methods used to estimate descriptors
for older versions of QSAR models have been supplanted by newer methods
could impact on the acceptability of predictions. These considerations also apply
to chemical category/read-across approaches. The methods used to identify
similar compounds, and the sources used for the endpoint related, physical-
chemical property, mechanistic and other data used to support chemical category
development and read-across predictions may need to be carefully considered
when determining the reliability of those predictions (OECD, 2007a).
(Q)SAR model algorithms are typically derived from the results of empirical studies on training set chemicals. How those empirical study results are interpreted can influence
nature of the algorithm, the predictive performance of the model, and ultimately
the reliability of predictions from that model. For a carcinogenicity (Q)SAR model
developed from a training set of rodent bioassays, the bioassay results may have
been interpreted as positive based on a specific percentage increase in tumor
incidence over controls, a statistically significant increase in incidence over
controls, a statistically significant trend over several dose groups, and/or other
criteria. The reliability of a prediction from such a model could be influenced by
whether the study interpretation criteria were consistent among the training set
chemicals and whether the criteria correspond to regulatory agency specific
interpretation criteria.
The criteria for interpreting predictions that have been developed by the
originator of the (Q)SAR tool and the rationale for them should also be taken into
account when evaluating the reliability of predictions. Statistical-based QSAR
models often generate probabilities (i.e., 0 – 1.0) for dichotomous (e.g.,
positive/negative) endpoints and the model developers recommend specific
criteria for interpreting the predicted probabilities (e.g., TOPKAT criteria: ≥0.7 and
≤1.0 = positive; ≥0.0 and <0.3 = negative; ≥0.3 and <0.7 = inconclusive; Accelrys
Inc., 2004). Criteria of this nature are usually developed based on internal and/or
external validation testing to optimize the predictive performance of the model.
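The sketch below simply applies the probability criteria quoted above to a predicted probability; the cut-off values are those cited from Accelrys Inc. (2004), and other model developers may recommend different criteria.

```python
# Minimal sketch applying the quoted TOPKAT-style interpretation criteria
# (>=0.7 positive, <0.3 negative, otherwise inconclusive) to a predicted
# probability from a statistical QSAR model.
def interpret_probability(p: float) -> str:
    """Map a predicted probability (0 to 1) to a qualitative call."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p >= 0.7:
        return "positive"
    if p < 0.3:
        return "negative"
    return "inconclusive"

for prob in (0.12, 0.45, 0.83):
    print(f"p = {prob:.2f} -> {interpret_probability(prob)}")
```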
5.4.4 Predictive Performance of the (Q)SAR Tool for Similar
Chemicals
Testing the predictive performance of a (Q)SAR tool on chemicals that are similar
to the pesticide in question and have empirical data available for them can
provide another source of information for evaluating the reliability of predictions.
Chemicals from the same chemical class as the pesticide in question, as well as
isomers, salts, and other forms could be considered for testing the predictive
performance of the (Q)SAR tool. For example, a starting point for testing the
predictive performance of a (Q)SAR tool for a sodium salt of an organic acid
would be to generate a prediction for a de-salted acid form of the compound for
which empirical data are available. Which chemicals to use would depend on the
type and quality of empirical data available for them, the parameter used to
assess similarity (e.g., physical-chemical parameters, structure, metabolism) and
the degree of similarity.
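As a simple illustration of generating a de-salted form before prediction, the sketch below keeps the largest fragment of a multi-fragment SMILES using RDKit; the structure is hypothetical, and further standardization (e.g., neutralizing the resulting anion) may be needed depending on the (Q)SAR tool.

```python
# Minimal sketch (hypothetical structure): producing a de-salted form of a
# pesticide by keeping the largest covalent fragment of a multi-fragment
# SMILES. Full standardization may require additional steps not shown here.
from rdkit import Chem

def largest_fragment_smiles(smiles: str) -> str:
    """Return the canonical SMILES of the largest fragment (drops counterions)."""
    mol = Chem.MolFromSmiles(smiles)
    fragments = Chem.GetMolFrags(mol, asMols=True)
    biggest = max(fragments, key=lambda m: m.GetNumHeavyAtoms())
    return Chem.MolToSmiles(biggest)

sodium_salt = "CC(=O)[O-].[Na+]"   # hypothetical sodium salt of an organic acid
print(largest_fragment_smiles(sodium_salt))   # the de-salted acid fragment
```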
reliability of the prediction and seek additional data or predictions for structurally
similar chemicals. On the other hand, there may be cases where precursor
effects in a target organ consistently reported in short-term studies may be used
to support a (Q)SAR prediction of a related longer-term effect (e.g.,
carcinogenicity) in the same target organ and species by the same route of
exposure.
5.5 Documentation of (Q)SAR Tool and Prediction
In order for an evaluator to critically review the adequacy of a (Q)SAR prediction
for a pesticide, the (Q)SAR tool and the prediction must be documented with a
sufficient level of transparency. This is similar to the concept of sufficient
documentation for empirical studies as delineated in empirical study guidelines
and in guidance for producing robust study summaries for regulatory purposes.
What constitutes a sufficient level of transparency will depend on the assessment
context, specific data reporting requirements or policies of the regulatory agency
and the type of (Q)SAR tool. When predictions of toxicity, ecotoxicity,
environmental fate, etc. are used in a prioritization or screening context, it may
not be necessary to provide full details on the adequacy of the (Q)SAR
predictions. However, acceptance of a (Q)SAR prediction as a critical data point in a pesticide assessment would likely require less uncertainty, and thus more extensive documentation, analogous to the pesticide data evaluation records (DERs) used to capture critical information from conventional toxicity, exposure, and other study types.
While the recommendations in Table 5–1 are fairly general, they include a
number of information elements that may be more suitable for (Q)SAR models
than other tools such as analog and category approaches. Specific guidance on
reporting formats for analog and category approaches has been developed by
the OECD (OECD, 2007a).
Table 5–1: Recommended General Types of Information to Include when
Documenting (Q)SAR Predictions
Information on the chemical
• Chemical (systematic) and common names
• CAS number
• Structural formula
• Form of the chemical (including relevant stereochemistry)
• Structural entry format
summarizing the information fields in each, and the EC website link for the
templates and additional guidance on their use are included in Appendix II.
5.6 Summary
Evaluating the adequacy of a (Q)SAR prediction is an essential step for
determining whether the prediction is a useful source of data for a pesticide
assessment. Adequate or fit for purpose predictions can be incorporated into
weight of evidence assessments, whereas inadequate predictions necessitate a
reliance on other sources of data alone. Whether a prediction is adequate or fit
for purpose should always be determined within a specific assessment context; for example, a prediction that is adequate to support a request that specific empirical studies be conducted may not be adequate to replace those studies in an assessment. In this section, a framework has been presented for evaluating
the adequacy of predictions based on a consideration of the validity of the
(Q)SAR tool, the applicability of the (Q)SAR tool to the pesticide of interest, the
relevance of the (Q)SAR tool to the pesticide assessment context, and the
reliability of the prediction (Figure 5–1). This framework relies on information
obtained during problem formulation for (Q)SAR (see section 4) and it is flexible
enough to be useful for a variety of different assessment contexts. The next
section of this document (section 6) deals with combining information from
multiple predictions. While there are specific issues associated with combining
multiple predictions, in general, the framework for evaluating the adequacy of
(Q)SAR predictions outlined in section 5 can also be applied to multiple
predictions. For either single or multiple predictions that have been determined to
be adequate, the next step is the incorporation into a weight of evidence
assessment which is the subject of section 7 of this document.
6. Combining Information from Multiple Predictions
6.0 Introduction
Combining information from multiple (Q)SAR predictions can be thought of as
analogous to combining the results of multiple in vivo and in vitro studies to
strengthen a weight of evidence argument for a toxicity or ecotoxicity endpoint.
Because different (Q)SAR tools may have different prediction paradigms and
different strengths and limitations, combining predictions has the potential to
increase the confidence in the overall prediction. However, it should be noted that
combining predictions from multiple (Q)SAR tools does not eliminate the need to
ensure that each prediction is adequate or fit for purpose (see section 5). In
particular, it is important that each (Q)SAR tool used be scientifically valid,
applicable to the pesticide of interest, and relevant to the assessment context.
6.1 Approaches to Combining Multiple Predictions
Combining the output of multiple (Q)SAR models into an overall prediction has
been referred to as consensus modeling, battery approaches, or weight of
evidence approaches to (Q)SAR modeling (Abshear et al., 2006; OECD, 2007;
Hewitt et al., 2007; Matthews et al., 2008, 2009a; Ellison et al., 2010; Hewitt et
al., 2010). An example of a fairly simple approach to consensus predictions is the
set of interpretation criteria in the KnowITAll computational system described by
Abshear et al. (2006) which are summarized in Table 6–1 below. These criteria
range from single hit/unanimity requirements for true/false predictions based on
the worst case scenario and vice versa for the best case scenario, to the
assessment of counts of true and false predictions for the majority and
percentage agreement scenarios. For toxicity predictions, a value of true can be
considered as equivalent to a positive prediction and a value of false as equivalent
to a negative prediction.
Scenario: Worst Case
1. If any model returns a value of true, return a value of true
2. Only if all models return a value of false, return a value of false
Scenario: Best Case
1. If any model returns a value of false, return a value of false
2. Only if all models return a value of true, return a value of true
Scenario: Majority Rules
1. If the majority of the models return true, the consensus will be true
2. If the majority of the models return false, the consensus will be false
Scenario: Percentage Agreement
1. If a specified percentage of the models returns a true value, the consensus will be true
2. Otherwise, the consensus will be false
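The four scenarios can be expressed compactly in code. The sketch below is an illustrative implementation of the criteria in Table 6–1 and is not drawn from the KnowITAll system itself; the handling of ties under the majority rule, for example, is an assumption.

```python
# Minimal sketch of the consensus scenarios summarized above, applied to a list
# of True/False (positive/negative) predictions from individual models.
def consensus(predictions, scenario, percentage=0.5):
    """Combine boolean model outputs according to the named scenario."""
    if scenario == "worst_case":       # any True -> True
        return any(predictions)
    if scenario == "best_case":        # any False -> False
        return all(predictions)
    if scenario == "majority":         # simple majority of True values
        return sum(predictions) > len(predictions) / 2
    if scenario == "percentage":       # True if the specified fraction agree
        return sum(predictions) / len(predictions) >= percentage
    raise ValueError(f"unknown scenario: {scenario}")

model_outputs = [True, False, True, True]   # hypothetical outputs from four models
for name in ("worst_case", "best_case", "majority", "percentage"):
    print(f"{name}: {consensus(model_outputs, name, percentage=0.75)}")
```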
domain of applicability), and the known predictive performance of the models or
tools. Finally, when predictions are made for a quantitative (i.e., continuous)
endpoint (e.g., LC50, LOAEL, TD50, EC50, etc.), it may, in some cases, be
possible to average the predicted numerical values or combine them using other
statistical methods. Consensus predictions can vary in complexity depending on
the tools considered and the methods of counting, scoring or weighting the
individual predictions.
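As a minimal illustration of combining quantitative predictions, the sketch below averages hypothetical LC50 predictions on a log scale (i.e., a geometric mean); this is one common choice, not a method prescribed by this guidance.

```python
# Minimal sketch (hypothetical values): combining quantitative predictions for
# a continuous endpoint. Averaging on a log scale is shown because many
# toxicity endpoints (e.g., LC50) are roughly log-normally distributed.
import math

def geometric_mean(values):
    """Geometric mean, equivalent to averaging the predictions on a log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

lc50_predictions_mg_per_l = [1.8, 3.2, 2.5]   # hypothetical outputs from three models
print(f"Consensus LC50 estimate: {geometric_mean(lc50_predictions_mg_per_l):.2f} mg/L")
```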
and 57% for Hazard Expert alone. 4 However, not all studies have demonstrated
improved model statistics for consensus versus individual models. Hewitt et al.
(2007) used genetic algorithms to construct a range of models for four different
data sets (silastic membrane flux, toxicity of phenols to the ciliated protozoan
Tetrahymena pyriformis, acute toxicity in fathead minnow and flash point). There
was no consistent improvement in statistical fit or predictivity (i.e., R2, Q2, root mean square error) for average predictions from either a consensus of the 10 best models (i.e., models with the highest R2 and Q2 values) or a consensus of a diverse set of models that best covered the available model space, compared to the single regression model with the best R2 and Q2 values.
4
Although the COMPACT model is not currently available, the example was included to illustrate
some of the advantages of combining predictions from different (Q)SAR models.
information (Matthews et al., 2009a). This type of confirmatory information can
increase the overall confidence in predictions (Contrera et al., 2007), especially
when additional mechanistic insights are provided by one or more of the models.
Individual models can emphasize a set of structural features in a molecule while
placing reduced or no emphasis on other features (Gramatica et al., 2007).
Consequently, combining multiple (Q)SAR models that are based on different
methodologies can help to relate the activity of a compound to different aspects
of its structure, confirming the impact of key structural features on activity or
providing additional insights into the key parts of a compound's structure that
influence activity (Contrera et al., 2007). Confirmatory predictions also increase
the likelihood that the structures in the active compounds are causally related to
the activity in question and that the compounds come from clusters with the same
mechanism of action (Matthews et al., 2008; 2009a). However, combining
predictions from models based on the same methodology (e.g., several statistical
(Q)SAR models based on similar descriptors) and developed from the same
training set would not be expected to provide much additional information.
Conflicting predictions from individual models for the same endpoint may
represent another source of complexity in consensus approaches. While the
evaluation of the adequacy of the individual (Q)SAR predictions (see section 5)
may help to resolve some differences, conflicting predictions may still occur if
different (Q)SAR paradigms are employed and/or the models are based on
different training data sets. While simplified approaches for reconciling conflicting
predictions may be adopted, such as the criteria summarized by Abshear et al.
(2006), quantitative or semi-quantitative weighting of models may be necessary
to account for differences in model domains of applicability, scoring criteria for
training set data, performance measures, mechanistic bases and other
characteristics. When multiple factors are influencing the weighting, multiple
predictions can be very complex to interpret, especially when there is a need to
additionally weight the (Q)SAR predictions against other data available on a
chemical.
While selection of models for specific scenarios will most likely be made on a
case by case basis taking into account the factors described above and expert
judgment, one example of an approach that has been used previously is the
selection of a molecular fragment-based (Q)SAR model and a descriptor-
based model, both having high predictive performance for a toxicological
endpoint of interest. Such a combination could have a good chance of improving
the combined domain of applicability of the models, and positive predictions
could provide strong evidence that the fragments/descriptors associated with the
toxicological activity are highly significant and well separated from the structural
features of inactive molecules.
potentially enhance predictive performance compared to the individual models
(Matthews et al., 2009a). Knowing the objective of combining predictions and the
characteristics of the (Q)SAR tools used can help the pesticide evaluator
determine whether it is appropriate to combine predictions or not.
The adequacy of the training sets in the models is another important factor in
determining whether it is appropriate to combine predictions (Matthews, 2009b).
If confirmatory predictions are required from multiple models then the models
should have comparable training sets in terms of coverage of the chemicals to be
predicted. Also, the scoring systems or criteria used to characterize the data on
the training set chemicals should be comparable. For example, two models
based on similar training sets of Ames test results, but constructed using different
scoring systems for what constitutes a positive versus a negative assay result
may not give reliable predictions when combined. Whether the training sets have
been designed to be balanced or heavily weighted towards active or inactive
chemicals should also be considered. If the training set for one model has a
relatively low ratio of active to inactive chemicals (A/I ratio) and a high sensitivity
prediction is desired, then it may be better to combine the predictions from this
model with predictions from models that have higher A/I ratios in order to
enhance the chance of correctly predicting positive chemicals.
document may be used to capture the individual predictions and the rationale for
combining predictions.
6.5 Summary
In general, the aims of combining multiple predictions include enhancing
predictive performance, expanding domain of applicability, obtaining
complementary or confirmatory information, and ultimately increasing confidence
in (Q)SAR predictions. Although there can be advantages to combining multiple
predictions, it is not always necessary, as a single prediction from a validated,
applicable and relevant (Q)SAR tool is likely to be more acceptable than
combined predictions from tools with significant limitations. The selection of
appropriate tools for generating multiple predictions for a pesticide will involve a
trade-off of the desired advantages against the potential disadvantages and a
consideration of the assessment context, characteristics of the pesticide,
characteristics of the (Q)SAR tools, and the available empirical data. Combining
information from multiple predictions does not represent a new data stream for
consideration in pesticide assessments. Rather it is a variation in the (Q)SAR
data stream that needs to undergo the same types of problem formulation,
evaluation of adequacy, and weight of evidence considerations as single (Q)SAR
predictions prior to being incorporated into pesticide assessments.
7. Integration of (Q)SAR Predictions into Hazard
Assessments
7.0 Introduction
Traditional pesticide risk assessments in regulatory agencies have routinely been
based on the results of laboratory animal testing and estimates of exposure
according to the following four key steps from the National Academy of Sciences
(NAS) risk assessment paradigm (NRC, 1983):
• Hazard Identification
• Dose Response Assessment
• Exposure Assessment
• Risk Characterization
Twenty-four years later, the NAS presented a vision for toxicity testing and risk assessment in the document Toxicity Testing in the 21st Century: A Vision and a Strategy; this document recommended the use of predictive tools such as
(Q)SAR (NRC, 2007). In the outline of this vision, the NAS described some of the
risk assessment-related applications of (Q)SAR including the prediction of
toxicity, ADME properties, environmental fate, and ecologic effects for chemicals.
Consistent with the NAS vision and the existing risk assessment paradigm, the
emphasis in this section is on the integration of (Q)SAR tools into the hazard
identification (biological endpoint) component of the risk assessment process for
pesticides. This section builds upon concepts discussed in section 4 (Problem
Formulation for (Q)SAR) and section 5 (Evaluating the Adequacy of (Q)SAR
Predictions), and discusses the process of integrating the overall toxicity
database (including (Q)SAR predictions) to arrive at conclusions regarding
hazard with consideration of confidence and level of uncertainty. As with
traditional pesticide risk assessments, the characterization of the hazards and
associated uncertainties is communicated to risk managers for consideration in
regulatory decision making.
In the second situation, where an evaluator uses a (Q)SAR prediction, there is usually no empirical information involved, and it is still important to assess the validity and reliability of the chosen (Q)SAR method or model for the evaluator's purpose.
The basic steps in the integration of (Q)SAR predictions and empirical data into a
weight of evidence analysis are listed in Figure 7–1 and described in more detail
in subsequent sections.
Figure 7–1: Weight of Evidence Analysis: Integration of (Q)SAR Predictions
and Empirical Data
1. Problem formulation: What is the goal of the assessment, and what is the role of (Q)SAR in that assessment?
4. Database sufficiency: Is the database sufficient for risk assessment? What data are missing? What is the level of database uncertainty?
* For most pesticides, a complete understanding of the mode of toxicological action may be
absent. To the extent that a toxicological mode of action is postulated for an analog of the
pesticide of interest, it would be important to consider this information to help build confidence in
a predicted endpoint.
also involves ensuring that the correct structure was the subject of the prediction
and whether it is appropriate to use (Q)SAR predictions for that structure.
Empirical data related to the (Q)SAR prediction and available mode of action
data are important to consider if they either support the (Q)SAR prediction or
contradict it. These data will be discussed further in the following sections.
Toxicity studies tend to be very detailed. The description of the conduct of the
study (i.e., materials and methods) is typically very extensive. The results
sections of the studies are also reported in even greater detail and many of the in
vivo studies cover a variety of biological endpoints. The studies also have a
conclusion section in which the data submitter will often propose a point of
departure for the study depending on how the submitter interprets the outcome of
the study. The reviewer or evaluator of each study prepares a written evaluation
of the study (“Data Evaluation Record” or “DER” at EPA and PMRA), and in this
DER the evaluator summarizes the key points of the study. Perhaps most
importantly, the evaluator records his or her own conclusions about the adequacy
of the study and the appropriate points of departure for the endpoints supported
by the study. In the normal review of empirical data for pesticide risk assessment,
each study is evaluated individually for scientific rigor and those studies that are
considered acceptable are integrated to “tell a story” or “paint a picture” of the
hazard profile of the pesticide chemical.
The review of a (Q)SAR prediction is similar to the review of empirical data. In the
typical scenario, the pesticide applicant will submit a documented (Q)SAR
prediction including the purpose of the prediction, the rationale for selection of a
model (or models), information on how the query structure was entered into the
model, a discussion of the OECD validation principles as applied to the model,
why the training set chemicals are applicable to the query structure, why the
(Q)SAR prediction satisfies a data requirement (or supports a waiver from the
data requirement) and a discussion of limitations and uncertainties associated
with the prediction. The evaluator reviews the submitted (Q)SAR prediction and
all of the supporting documentation and prepares a written record of the review
much like the DER for empirical data (see section 5.5).
Once the reviews of the empirical data and the (Q)SAR prediction have taken
place, and provided that the review indicates that the prediction is scientifically
valid, the evaluator is in a position to determine whether the prediction is reliable,
i.e., whether it fills a data gap or supports a waiver from a data requirement.
Consider, for example, a pesticide that demonstrates clear systemic toxicity in short-term dermal toxicity tests.
read-across extrapolations from analog chemicals indicate low potential for
dermal absorption, then there is an apparent conflict. However, if a closer
examination of the physical-chemical properties of the pesticide and skin irritation
testing data reveals that the pesticide is likely to be poorly absorbed at low
concentrations, but at high concentrations it is corrosive, destroying the barrier properties of the skin and facilitating access to the systemic circulation, then it may be
possible to resolve the differences between the (Q)SAR predictions and the
empirical data.
Depending on the basis for a conflict between a (Q)SAR prediction and the
results of empirical studies, it may be necessary to fully examine the adequacy of
the (Q)SAR prediction (see section 5), and also the adequacy of the empirical
data. (Q)SAR tools are reductionist methods that may not fully account for the
impact of physical-chemical properties and pharmacokinetics/dynamics, may
over-emphasize the contribution of a particular structural alert or property or may
miss a toxicologically relevant alert because of database/training set limitations.
In general, (Q)SAR predictions should not be used to override the results of well-
conducted, guideline type studies for the same endpoint. However, in cases
where the empirical studies are of questionable reliability because they are non-
guideline studies, conducted according to older protocols, restricted to examining
specific research endpoints or have other limitations, it may be necessary to give
greater weight to reliable and relevant (Q)SAR predictions and/or develop
recommendations for further testing to help resolve the conflict. Weighting of
predictions and empirical data is discussed further in section 7.5.
There are three likely outcomes to the weight of evidence evaluation: 1) The
(Q)SAR prediction adequately addresses the data requirement, i.e., the data
requirement is satisfied or the study need not be done; 2) The (Q)SAR prediction
is not relevant to the data requirement, i.e., the (Q)SAR prediction is scientifically
adequate but does not address the specific data requirement; and 3) The
(Q)SAR prediction is scientifically acceptable and addresses the data
requirement, but the data requirement is critical, requiring a high degree of
certainty and confidence that the prediction as submitted is unable to meet. In
this last instance, the result of reviewing the empirical data and the (Q)SAR
prediction together may point the way to follow up action such as additional
information from the submitter on the identity of, and relevant data for the training
set chemicals; or perhaps targeted mode of action testing on the query structure
and specific training set chemicals that could help fulfill the data requirement.
data points within each individual empirical study, combining information from
similar empirical studies within one line of evidence, and finally, integrating
(Q)SAR, empirical and other lines of evidence together to arrive at an
assessment conclusion (Health Canada, 2011).
Just as with empirical data based weight of evidence, approaches that integrate
(Q)SAR and empirical lines of evidence usually include a qualitative weighting or
ranking of the importance of the different lines of evidence for the overall
assessment conclusion. Such a weighting involves a consideration of the
adequacy (i.e., see section 5) and uncertainties associated with the different lines
of evidence. Regardless of the weighting/ranking approach adopted,
transparency is critical and can be addressed via comprehensive narrative
rationales outlining the approaches followed in considering each line of evidence
and integrating the lines of evidence together. It is particularly important to outline
the approaches taken when there are conflicts between the (Q)SAR and the
empirical data lines of evidence. For ease of interpretation, tabular presentations
of the (Q)SAR and empirical data lines of evidence can also be considered.
While qualitative approaches are likely to be used most often, it is also possible
to consider quantitative scoring systems or mathematical algorithms that may be
more systematic for weighting (Q)SAR and empirical data lines of evidence than
qualitative, expert judgement-based approaches. Examples of such systems
usually involve numerical weightings for each line of evidence, multiplying the
scores for each line of evidence by its weighting, and summing up the weighted
scores into an overall result. Just as transparency is critical for qualitative
weighting/ranking approaches, it is especially critical to clearly outline the
rationale behind any quantitative weight of evidence scoring systems.
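The sketch below illustrates the general form of such a scoring system: hypothetical lines of evidence each receive a score and a weight, and the weighted scores are summed. The particular lines of evidence, scores, and weights are assumptions and would need transparent justification in practice.

```python
# Minimal sketch of the type of quantitative scoring system described above:
# each line of evidence receives a score and a weight, and the weighted scores
# are summed into an overall result. All entries below are hypothetical.
def weighted_score(lines_of_evidence):
    """Sum score * weight over all lines of evidence."""
    return sum(entry["score"] * entry["weight"] for entry in lines_of_evidence)

evidence = [
    {"line": "(Q)SAR prediction (consensus of two models)", "score": 1.0, "weight": 0.3},
    {"line": "Empirical data on a structural analog",       "score": 0.5, "weight": 0.5},
    {"line": "Mode of action information",                  "score": 1.0, "weight": 0.2},
]
print(f"Overall weight of evidence score: {weighted_score(evidence):.2f}")
```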
exposure assessment, but also the quality of the data, level of uncertainty, and
confidence in the overall assessment. Any risk mitigation decision that is based
on risk assessment conclusions, must be made with a clear understanding of the
level of uncertainty surrounding the risk assessment conclusions and what level
of confidence should be placed on those conclusions to support a regulatory
decision. If, for example, the level of uncertainty in the database is high because
most of the non-cancer endpoints are predicted and the level of confidence in the
overall risk assessment is low, the risk manager should be cognizant of this low level of confidence before selecting an adequate risk mitigation option. In short, the regulatory option selected should be consistent with the level of uncertainty identified for the predicted and empirical datasets, so as not to overstate or understate the confidence in these datasets. If that level of database uncertainty can
be addressed by additional research, the decision on when the data will be
required to be submitted and when they can be considered in future risk
assessment and management decisions may be dependent on the potential
health outcome. In any scenario, it is critical that the risk manager has all the
relevant information from the risk assessor in order to develop appropriate risk
management options and make a good regulatory decision based on sound
science.
7.7 Summary
The integration of (Q)SAR predictions into the risk assessment involves many
steps which are similar to the risk assessment paradigm: hazard identification,
dose response assessment, exposure assessment and risk characterization. The
only difference between a risk assessment based on traditional empirical data and one that involves in silico predictions is the judgement of the adequacy of the (Q)SAR predictions and the determination of database completeness.
The steps involved in integrating (Q)SAR predictions rely on starting with a solid problem formulation to establish what the (Q)SAR prediction is intended to inform in the assessment; the type of assessment to be performed will indicate the amount of uncertainty that is deemed acceptable. For example, a screening level assessment would allow for more uncertainty than a full risk assessment. The determination that the (Q)SAR prediction is valid and reliable for the purpose described in the problem formulation step is critical to proceeding with the subsequent steps. Without the determination of scientific adequacy, the (Q)SAR prediction would be rendered unacceptable and, therefore, could not be considered in the risk assessment.
The next step of integrating the (Q)SAR prediction with extant scientific data on
that compound or a structural analog is critical to a scientific weight of evidence
analysis. Issues of reproducibility of observations, consistency of effects across
species, strain, time of exposure and routes of exposure, as well as the
determination of biological plausibility and incorporation of mode of action
information are considered in this analysis. After developing the weight of scientific
evidence, the risk assessor proceeds forward to determine the completeness of
the database to support the risk assessment.
If the database is considered deficient and missing critical studies, the weight of evidence, including any mode of action information, should be informative in determining the type of study(ies) needed to address the database deficiency; these may be in vitro and/or short-term studies, depending on the data gap. As the concept of the adverse outcome pathway becomes elucidated for the particular toxicity endpoint, the determination of which studies are needed will become more clearly defined. In short, the combination of the (Q)SAR prediction,
empirical data, mode of action, and/or adverse outcome pathway in a weight of
evidence approach will inform the risk assessment on database deficiency and
identification of critical research needed to address this level of uncertainty.
8. Conclusions and Future Vision for (Q)SAR and Pesticides
While there is a need for more efficient review processes, there is also the impetus of research on newer technologies and a recognition of the accelerated pace of scientific innovation.
authorities to build upon existing knowledge of pesticide toxicity to develop
integrated approaches for testing and assessment (IATA) of pesticides. These
parallel regulatory and risk assessment changes and advancements in the state-
of-the-science propel the agencies forward and expedite the transition towards
global application of newer, swifter risk assessment and testing methodologies.
Computational tools vary widely depending on the purpose for their use in risk
assessment. (Q)SAR tools represent an example of an alternative testing method
that could be a useful component of integrated approaches to testing and
assessment. (Q)SAR tools have had a long history of use by industry and
regulatory agencies for hazard determinations and other applications; there are
also many different types of commercial and non-commercial (Q)SAR tools that
are either currently available or under rapid development. However, in spite of
the regulatory experience and the on-going developments in the field of (Q)SAR,
there are very few examples of formal (Q)SAR guidance documents that discuss
the unique issues and considerations associated with the application of (Q)SAR
to pesticide regulatory risk assessments.
One way in which (Q)SARs could contribute to the AOP approach would be
through the identification of structural alerts associated with key events in an
AOP, particularly molecular initiating events (MIEs). The OECD has noted that a
close linkage between an MIE and an observed adverse outcome in vivo can be
used as a basis for developing a chemical category for the relationship between
chemical structure and the in vivo endpoint. Thus, rather than just relying on
intrinsic chemical activity, AOPs potentially provide a comprehensive mechanistic
basis for forming toxicologically meaningful categories for making predictions
using read-across or (Q)SAR models (OECD, 2012b). As noted previously the
European Commission Joint Research Centre is developing a reporting format
for describing key events/intermediate effects in AOPs in collaboration with the
OECD and ECHA (OECD, 2012a).
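To illustrate how structural alerts tied to a molecular initiating event might be screened in practice, the short Python sketch below uses the open-source RDKit toolkit to test structures against SMARTS patterns. The two patterns shown are simplified, hypothetical examples for illustration only and are not alert definitions endorsed by this guidance.

```python
# Minimal sketch of SMARTS-based structural alert screening (assumes RDKit is installed).
# The two alert patterns are simplified, hypothetical examples, not endorsed alert definitions.
from rdkit import Chem

ALERTS = {
    "organothiophosphate fragment (illustrative)": "P(=S)(O)O",
    "carbamate ester fragment (illustrative)": "OC(=O)N",
}

def flag_alerts(smiles):
    """Return the names of the illustrative alerts matched by the input structure."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    hits = []
    for name, smarts in ALERTS.items():
        pattern = Chem.MolFromSmarts(smarts)
        if pattern is not None and mol.HasSubstructMatch(pattern):
            hits.append(name)
    return hits

# Example: diazinon, using the SMILES listed in the case study tables later in this document
print(flag_alerts("CCOP(=S)(OCC)Oc1cc(C)nc(n1)C(C)C"))
```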
In summary, (Q)SAR tools are important predictive technologies for today's risk
assessments as well as those of tomorrow, provided that they are applied with
appropriate constraints and cautionary guidance, and that a meaningful attempt is
made to build data bridges, in a weight of evidence approach, between (Q)SAR
and emerging predictive technologies in order to maximize overall biological
predictive capability (Benigni et al., 2007b).
To this end, the NAFTA pesticide regulatory authorities are currently assembling
a (Q)SAR expert committee to provide advice to pesticide evaluators in complex
assessments that seek to integrate (Q)SAR predictions with empirical data in a
weight of evidence approach for hazard/risk determinations that may trigger
regulatory risk management decisions. One of the mandates of this expert
References
Accelrys Inc. 2004. TOPKAT, Version 6.2, User Guide, Accelrys Inc., San Diego,
CA, USA.
American Chemical Society. 2010. CAS RegistrySM and CAS Registry System.
CAS® A Division of the American Chemical Society.
Ankley, G.T., G.P. Daston, S.J. Degitz, N.D. Denslow, R.A. Hoke, S.W. Kennedy,
A.L. Miracle, E.J. Perkins, J. Snape, D.E. Tillitt, C.R. Tyler, and D. Versteeg.
2006. Toxicogenomics in regulatory ecotoxicology. Environ. Sci. Technol.
40(13):4055–4065.
Ankley, G.T., R.S. Bennett, R.J. Erickson, D.J. Hoff, M.W. Hornung, R.D.
Johnson, D.R. Mount, J.W. Nichols, C.L. Russom, P.K. Schmieder, J.A.
Serrano, J.E. Tietge, and D.L. Villeneuve. 2010. Adverse outcome pathways: A
conceptual framework to support ecotoxicology research and risk assessment.
Environ. Toxicol. Chem. 29(3):730–741.
Arnot, J.A., F.A.P.C. Gobas. 2004. A Food Web Bioaccumulation Model for
Organic Chemicals in Aquatic Ecosystems. Environ. Toxicol. Chem.
23(10):2343–2355.
Benfenati, E. 2010. The CAESAR project for in silico models for the REACH
legislation. Chem. Cent. J. 4(Suppl 1):I1.
Benigni, R., and C. Bossa. 2008. Predictivity and reliability of QSAR models: The
case of mutagens and carcinogens. Toxicol. Mechan. Methods 18:137–147.
Bradbury, S.P. 1994. Predicting modes of toxic action from chemical structure: an
overview. SAR QSAR Environ. Res. 2:89–104.
Bradbury, S.P., C.L. Russom, G.T. Ankley, T.W. Schultz, and J.D. Walker. 2003.
Overview of data and conceptual approaches for derivation of quantitative
structure-activity relationships for ecotoxicological effects of organic chemicals.
Environ. Toxicol. Chem. 22(8):1789–1798.
Bradbury, S.P., T.C.J. Feijtel, and C.J. van Leeuwen. 2004. Meeting the scientific
needs of ecological risk assessment in a regulatory context. Environ. Sci.
Technol. 38(23):463A–470A.
Brown, T.N. and F. Wania. 2008. Screening chemicals for the potential to be
persistent organic pollutants. A case study of arctic contaminants. Environ. Sci.
Technol. 42:5202–5209.
Contrera, J.F., N.L. Kruhlak, E.J. Matthews, and R.D. Benz. 2007. Comparison of
MC4PC and MDL-QSAR rodent carcinogenicity predictions and the
enhancement of predictive performance by combining QSAR models. Reg.
Toxicol. Pharm. 49:172–182.
Cronin, M.T.D., J.D. Walker, J.S. Jaworska, M.H.I. Comber, C.D. Watt, and A.P.
Worth. 2003. Use of QSARs in International Decision-Making Frameworks to
Predict Ecologic Effects and Environmental Fate of Chemical Substances.
Environ Health Perspect 111:1376–1390.
Dalby, A., J.G. Nourse, W.D. Hounshell, A.K.I. Gushurst, D.L. Grier, B.A. Leland,
and J. Laufer. 1992. Description of several chemical structure file formats used
by computer programs developed at molecular design limited. J. Chem. Inf.
Comput. Sci. 32:244–255.
Dearden, J.C., M.T.D. Cronin, and K.L.E. Kaiser. 2009. How not to develop a
quantitative structure-activity or structure-property relationship (QSAR/QSPR).
SAR and QSAR in Environmental Research 20:241–266.
Demchuk, E., P. Ruiz, J.D. Wilson, F. Scinicariello, H.R. Pohl, M. Fay, M.M.
Mumtaz, H. Hansen, and C.T. De Rosa. 2008. Computational toxicology methods
in public health practice. Toxicol. Mech. Methods 18:119–135.
Dimitrov, S.D., L.K. Low, G.Y. Patlewicz, P.S. Kern, G.D. Dimitrova, M.H.I.
Comber, R.D. Philips, J. Niemela, P.T. Bailey, and O.G. Mekenyan. 2005b. Skin
sensitization: Modeling based on skin metabolism simulation and formation of
protein conjugates. Int. J. Toxicol. 24:189–204.
EC. 2008a. QSAR Model Reporting Format (Version 1.2). European Commission
Directorate General Joint Research Centre. Institute for Health and Consumer
Protection. Toxicology and Chemical Substances Unit. Ispra 26/05/2008.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/qsar_tools/qrf/QM
RF_version_1.2.pdf
EC. 2008b. QSAR Prediction Reporting Format (QPRF) (version 1.1, May 2008).
European Commission Directorate General Joint Research Centre. Institute for
Health and Consumer Protection. Toxicology and Chemical Substances Unit.
Ispra 26/05/2008.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/qsar_tools/qrf/QP
RF_version_1.1.pdf
Ellison, C.M., J.C. Madden, P.N. Judson, and M.T.D. Cronin. 2010. Application
of in silico tools in a weight of evidence approach to aid toxicological assessment.
Poster 4.4. 14th International Workshop on Quantitative Structure-Activity
Relationships.
Enoch, S.J. 2010. Chapter 7. Chemical category formation and read-across for
the prediction of toxicity. In: Puzyn, T., J. Leszczynski, and M.T. Cronin (eds.)
Recent Advances in QSAR Studies, Springer, pp. 209–219.
Enoch, S.J., M.T.D. Cronin, J.C. Madden, and M. Hewitt. 2009. Formation of
structural categories to allow for read-across for teratogenicity. QSAR Comb. Sci.
28(6-7):696–708.
Eriksson, L., J. Jaworska, A.P. Worth, M.T.D. Cronin, R.M. McDowell, and P.
Gramatica. 2003. Methods for reliability and uncertainty assessment and for
applicability evaluations of classification- and regression-based QSARs. Environ.
Health Perspect. 111(10):1361–1375.
Geiss, K.T. and J.M. Frazier. 2001. QSAR modeling of oxidative stress in vitro
following hepatocyte exposures to halogenated methanes. Toxicology in Vitro
15(4-5), 557–563.
Gramatica, P. and E. Papa. 2007. Screening and ranking of POPs for global half-
life: QSAR approaches for prioritization based on molecular structure. Environ.
Sci. Technol. 41:2833–2839.
Gramatica, P., E. Giani, and E. Papa. 2007. Statistical external validation and
consensus modeling: A QSPR case study for KOC prediction. J. Mol. Graph.
Model. 25:755–766.
Greene, N. 2002. Computer systems for the prediction of toxicity: an update. Adv.
Drug Deliv. Rev. 54:417–431.
Hansch, C. and T. Fujita. 1964. A method for the correlation of biological activity
and chemical structure. J. Am. Chem. Soc. 86:1616–1626.
Hester, S.D., D.C. Wolf, S. Nesnow, and S.F. Thai. 2006 Transcriptional profiles
in liver from rats treated with tumorigenic and non-tumorigenic triazole conazole
fungicides: Propiconazole, triadimefon, and myclobutanil. Toxicol. Pathol.
34:879–94.
Hewitt, M., C.M. Ellison, E.J. Enoch, J.C. Madden, and M.T.D. Cronin. 2010.
Integrating (Q)SAR models, expert systems and read-across approaches for the
prediction of developmental toxicity. Repro. Tox. 30:147–160.
Hewitt, M., M.T.D. Cronin, J.C. Madden, P.H. Rowe, C. Johnson, A. Obi, and S.J.
Enoch. 2007. Consensus QSAR Models: Do the Benefits Outweigh the
Complexity? J. Chem. Inf. Model 47:1460–1468.
Jaworska, J.S., M. Comber, C. Auer, and C.J. van Leeuwen. 2003. Summary of a
workshop on regulatory acceptance of (Q)SARs for human health and
environmental endpoints. Environ. Health Perspect. 111(10):1358–1360.
Judson, R.S., K.A. Houck, R.J. Kavlock, T.B. Knudsen, M.T. Martin, H.M.
Mortensen, D.M. Reif, D.M. Rotroff, I. Shah, A.M. Richard, and D.J. Dix. 2010. In
vitro screening of environmental chemicals for targeted testing prioritization: The
ToxCast Project. Environ. Health Perspect. 118(4):485–492.
Klopman, G., S.K. Chakravarti, N. Harris, J. Ivanov, and R.D. Saiakhov. 2003. In-
silico screening of high production volume chemicals for mutagenicity using the
MCASE QSAR expert system. SAR QSAR Environ. Res. 14:165–180.
Lai, D.Y., Y.-T. Woo, M.F. Argus, and J.C. Arcos. 1996. Cancer risk reduction
through mechanism-based molecular design of chemicals. In: Designing Safer
Chemicals: Green Chemistry for Pollution Prevention (DeVito, S.C., and R.L.
Garrett, eds.). ACS Symposium Series 640. Washington, DC:American Chemical
Society, 1996; 62–73.
Leonard, J.T. and K. Roy. 2006. On selection of training and test sets for
development of predictive QSAR models. QSAR Comb. Sci. 25(3):235–251.
Lewis, D.F.V., M.G. Bird, and M.N. Jacobs. 2002. Human carcinogens: an
evaluation study via the COMPACT and Hazard Expert procedures. Hum. Exp.
Toxicol. 21:115–122.
MacKay, D., S. Paterson, and M. Joy. 1983. Application of fugacity models to the
estimation of chemical distribution and persistence in the environment. In:
Swann, R.L. and A. Eschenroeder, eds., Fate of Chemicals in the Environment.
American Chemical Society Symposium Series, Vol. 225, Chapter 9, pp 175–
196.
Matthews, E.J. and J.F. Contrera. 1998. A new highly specific method for
predicting the carcinogenic potential of pharmaceuticals in rodents using
enhanced MCASE QSAR-ES software. Reg. Toxicol. Pharm. 28:242–264.
Matthews, E.J., C.J. Ursem, N.L. Kruhlak, R.D.Benz, D.A. Sabaté, C. Yang, G.
Klopman, and J.F. Contrera. 2009a. Identification of structure-activity
relationships for adverse effects of pharmaceuticals in humans: Part B. Use of
(Q)SAR systems for early detection of drug-induced hepatobiliary and urinary
tract toxicities. Reg. Toxicol. Pharm. 54:23–42.
Matthews, E.J., N.L. Kruhlak, R.D. Benz, and J.F. Contrera. 2008. Combined use
of MC4PC, MDL-QSAR, BioEpisteme, Leadscope PDM, and Derek for Windows
software to achieve high-performance, high-confidence, mode of action-based
predictions of chemical carcinogenesis in rodents. Toxicol. Mech. Method.
18:189–206.
Matthews, E.J., N.L. Kruhlak, R.D. Benz, and J.F. Contrera. 2007a. A
comprehensive model for reproductive and developmental toxicity hazard
identification: I. Development of a weight of evidence QSAR database. Reg.
Toxicol. Pharm. 47:115–135.
National Research Council (NRC). 2007. Toxicity testing in the 21st century: A
vision and a strategy. National Academy of Sciences, Washington, DC.
http://dels.nas.edu/resources/static-assets/materials-based-on-reports/reports-in-
brief/Toxicity_Testing_final.pdf
Netzeva, T., M. Pavan, and A. Worth. 2007. Review of data sources, QSARs,
and integrated testing strategies for aquatic toxicity. European Commission, Joint
Research Centre, EUR 22943 EN, 103 p. Ispra, Italy.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/doc/EUR_22943_
EN.pdf
OECD. 2005. The OECD Guidance Document on the Validation and International
Acceptance of New or Updated Test Methods for Hazard Assessment (OECD
Series on Testing and Assessment No. 34. ENV/JM/MONO(2005)14).
OECD. 2007b. Report on the regulatory uses and applications in OECD member
countries of (quantitative) structure-activity relationship [(Q)SAR] models in the
assessment of new and existing chemicals. OECD Environmental Health and
Safety Publications. Series on Testing and Assessment No. 58. ENV/JM/MONO
(2006)25: 79 p. http://www.oecd.org/dataoecd/55/22/38131728.pdf
OECD. 2009. Guidance document for using the OECD QSAR application
toolbox to develop chemical categories according to the OECD guidance on
grouping of chemicals. OECD Series on Testing and Assessment No. 102.
ENV/JM/MONO (2009)5: 118 p.
http://www.oecd.org/dataoecd/50/60/42294034.pdf
Overton, C.E. 1901. Studien über die Narkose. Gustav Fischer, Jena, Germany.
Patlewicz, G., A.O. Aptula, D.W. Roberts, and E. Uriarte. 2008. A minireview of
available skin sensitization (Q)SARs/Expert Systems. QSAR Combin. Sci.
27(1):60–76.
Pavan, M., A.P. Worth, and T.I. Netzeva. 2005a. Preliminary analysis of an
aquatic toxicity dataset and assessment of QSAR models for narcosis. European
Commission, Joint Research Centre, EUR 21749 EN, 42 p. Ispra, Italy.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/doc/Report_QSAR
_model_for_narcosis.pdf
Pavan, M., A.P. Worth, and T.I. Netzeva. 2005b. Comparative assessment of
QSAR models for aquatic toxicity. European Commission, Joint Research
Centre, EUR 21750 EN, 122 p. Ispra, Italy.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/doc/Report_Comp
arative_assessment_QSAR_models.pdf
Pavan, M., and A. Worth. 2008. A set of case studies to illustrate the applicability
of DART (Decision Analysis by Ranking Techniques) in the ranking of chemicals.
European Commission, Joint Research Centre, EUR 23481 EN, 80 p. Ispra, Italy.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/doc/EUR_23481_EN.pdf
Perkins, R., H. Fang, W. Tong, and W.J. Welsh. 2003. Quantitative structure-
activity relationship methods: Perspectives on drug discovery and toxicology.
Environ. Toxicol. Chem. 22(8):1666–1679.
Richard, A.M., I. Swirsky Gold, and M.C. Nicklaus. 2006. Chemical structure
indexing of toxicity data on the internet: moving towards a flat world. Curr.
Opinion Drug Disc.Dev. 9(3):314–325.
Russom, C.L., R.L. Breton, J.D. Walker, and S.P. Bradbury. 2003. An overview of
the use of quantitative structure-activity relationships for ranking and prioritizing
large chemical inventories for environmental risk assessments. Environ. Toxicol.
Chem. 22(8):1810–1821.
Russom, C.L., S.P. Bradbury, S.J. Broderius, D.E. Hammermeister, and R.A.
Drummond. 1997. Predicting modes of toxic action from chemical structure:
Acute toxicity in the fathead minnow (Pimephales promelas). Environ. Toxicol.
Chem. 16(5):948–967.
Saliner, G., G.Patlewicz, and A.P. Worth. 2005. A similarity based approach for
chemical category classification. European Commission, Joint Research Centre
Report number: EUR 21867 EN. Institute for Health and Consumer Protection,
Toxicology and Chemical Substances Unit, European Chemicals Bureau, I-21020
Ispra (VA) Italy.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/doc/Report_Chemi
cal_Similarity_for_Category_Classification.pdf
Schmieder, P.K., M.A. Tapper, J.S. Denny, R.C. Kolanczyk, B.R. Sheedy, T.R.
Henry, and G.D. Veith. 2004. Use of trout liver slices to enhance mechanistic
interpretation of estrogen receptor binding for cost-effective prioritization of
chemicals within large inventories. Environ. Sci. Technol. 38:6333–6342.
Tong, W., W.J. Welsh, L. Shi, H. Fang, and R. Perkins. 2003. Structure-activity
relationship approaches and applications. Environ. Toxicol. Chem. 22(8):1680–
1695.
Tsakovska, I., T. Netzeva, and A. Worth. 2005. Evaluation of (Q)SARs for the
prediction of eye irritation/corrosion potential. Physicochemical exclusion rules.
European Commission, Joint Research Centre, EUR 21897 EN, 42 p. Ispra, Italy.
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/doc/Evaluation_of
_Eye_Irritation_QSARs.pdf
Urbano-Cuadrado, M., I.L. Ruiz, and M.A. Gomez-Nieto. 2008. Description and
application of similarity-based methods for fast and simple QSAR model
development. QSAR Comb. Sci. 27(4):457–468.
Ursem, C.J., N.L. Kruhlak, J.F. Contrera, P.M. MacLaughlin, R.D. Benz, and E.J.
Matthews. 2009. Identification of structure-activity relationships for adverse
effects of pharmaceuticals in humans. Part A: Use of FDA post-market reports to
create a database of hepatobiliary and urinary tract toxicities. Reg. Toxicol. Pharm.
54:1–22.
van der Jagt, K., S. Munn, J. Torslov, and J. de Bruijn. 2004. Alternative
approaches can reduce the use of test animals under REACH. European
Commission, Joint Research Centre, Institute for Health Consumer Protection.
EUR 21405 EN. 31 p.
http://publications.jrc.ec.europa.eu/repository/bitstream/111111111/8790/1/EUR
%2021405%20EN.pdf
van Leeuwen, C.J., G.Y. Patlewicz, and A.P. Worth. (2007). Intelligent testing
strategies. In: Risk Assessment of Chemicals: An Introduction. 2nd Edition. Van
Leeuwen, C.J. and T.G. Vermeire, (eds.).
van Leeuwen, K. T.W. Schultz, T. Henry, B. Diderich, and G.D. Veith. 2009.
Using chemical categories to fill data gaps in hazard assessment. SAR QSAR
Environ. Res. 20(3-4):207–220.
Veith, G.D., D.L. De Foe, and B.V. Bergstedt. 1979. Measuring and estimating
the bioconcentration factor of chemicals in fish. J. Fish Res. Board Can.
36:1040–1047.
Veith, G.D., D.J. Call, and L.T. Brooke. 1983. Structure-toxicity relationships for
the fathead minnow, Pimephales promelas: Narcotic industrial chemicals. Can. J.
Fish. Aquat. Sci. 40:743–748.
Walker, J.D., D. Knaebel, K. Mayo, J. Tunkel, and D.A. Gray. 2004. Use of
QSARs to promote more cost-effective use of chemical monitoring resources. 1.
Screening industrial chemicals and pesticides, direct food additives, indirect food
additives and pharmaceuticals for biodegradation, bioconcentration and aquatic
toxicity potential. Water Qual. Res. J. Canada 29(1):35–39.
Williams, C., M. Wolf, and A.M. Richard. 2009. DSSTox chemical-index files for
exposure-related experiments in ArrayExpress and Gene Expression Omnibus:
enabling toxico-chemogenomics data linkages. Bioinformatics 25(5):692–694.
Woo, Y.T., and D.Y. Lai. 2010. (Q)SAR analysis of genotoxic and nongenotoxic
carcinogens: a state-of-the-art overview. In: Cancer Risk Assessment: Chemical
Carcinogenesis, Hazard Evaluation, and Risk Quantification, edited by Hsu, C.H.
and T. Stedeford, Wiley. Chapter 20, 517–556.
Woo, Y.-T., D.Y. Lai, J.C. Arcos, and M.F. Argus. 1985. Chemical Induction of
Cancer, Structural Bases and Biological Mechanisms. Vol IIIB: Aliphatic and
Polyhalogenated Carcinogens. Orlando, FL: Academic Press.
Woo, Y.-T., D.Y. Lai, J.L. McLain, M.K. Manibusan, and V. Dellarco. 2002. Use of
Mechanism-Based Structure-Activity Relationships Analysis in Carcinogenic
Potential Ranking for Drinking Water Disinfection By-Products. Environ. Health
Perspect. 110(Suppl 1):75–87.
Woo, Y.-T., D.Y. Lai, M.F. Argus, and J.C. Arcos. 1998. Integrative approach of
combining mechanistically complementary short-term predictive tests as a basis
for assessing the carcinogenic potential of chemicals. Environ. Carcinog.
Ecotoxicol. Rev. (Part C of J. Environ. Sci. Health) C16(2):101–122.
Worth, A., and G. Patlewicz. 2007. A compendium of case studies that helped to
shape the REACH guidance on chemical categories and read across. European
Commission, Joint Research Centre, EUR 22481 EN. 188 p.
http://publications.jrc.ec.europa.eu/repository/bitstream/111111111/1234/1/7212
%20-%20Compendium%20of%20case%20studies_final_060707.pdf
Worth, A.P. 2010. The role of QSAR methodology in the regulatory assessment
of chemicals, Chapter 13 in Recent Advances in QSAR Studies: Methods and
Applications. Puzyn, T., J. Leszczynski, and M.T.D. Cronin (eds). Springer,
Heidelberg, Germany, pp. 367–382.
Yuan, H., Y. Wang, and Y. Cheng. 2007. Local and Global Quantitative
Structure-Activity Relationship Modeling and Prediction for the Baseline Toxicity.
J. Chem. Inf. Model. 47(1):159–169.
Zeeman, N., C.M. Auer, R.G. Clements, J.V. Nabholz, and R.S. Boethling. 1995.
U.S. EPA regulatory perspectives on the use of QSAR for new and existing
chemical evaluations. SAR/QSAR Environ. Res. 3:179–201.
Zvinavashe, E., A.J. Murk, and I.M.C.M. Rietjens. 2008. Promises and pitfalls of
quantitative structure-activity relationship approaches for predicting metabolism
and toxicity. Chem. Res. Toxicol. 21:2229–2236.
The list below is by no means exhaustive, and since the field of (Q)SAR is
constantly expanding, pesticide evaluators are advised to regularly monitor
various national and international agency websites and the open literature for
developments in the area of (Q)SAR of interest to them.
Descriptions of the key (Q)SAR activities from the Danish EPA webpage (i.e., the
(Q)SAR database and the Advisory list for self-classification of dangerous
substances) are provided below.
“The Danish EPA has for a number of years worked with the development and
use of (Q)SAR’s, also called ‘computer models’ for prediction of properties of
chemical substances. (Quantitative) Structure Activity Relationships — (Q)SAR
— are relations between structure properties of chemical substances and some
other property. The other property can be a physical-chemical property or a
biological activity, including the ability to cause toxic effects.”
“The Danish EPA has made a database, which comprise predictions from more
than 70 (Q)SAR models on endpoints for physico-chemical properties, fate, eco-
toxicity, absorption, metabolism and toxicity. The database is constantly growing
as new models are obtained and developed. More than half of all the estimates
The IHCP website provides information and links to a number of documents and
several downloadable (Q)SAR tools:
The following is a description of the OECD (Q)SAR project from the OECD
website:
While much of the information on the webpage is in Dutch, the majority of the
publications are available in English.
“The goal of the Sustainable Futures Initiative (SF) is to make new chemicals
safer, available faster, and at lower cost. It works by giving chemical developers
the same risk-screening models that EPA uses to evaluate new chemicals before
they enter the market.”
The computer-based models and tools freely available for download from the
OPPT webpage include the following:
• EPI Suite
• ECOSAR
• PBT Profiler
• Oncologic
• Analog Identification Methodology (AIM)
• NonCancer Screening Protocol
• E-FAST
• ChemSteer
The webpage also includes links to databases and further information on ICSAS
activities including:
• Database Projects
• Maximum Recommended Therapeutic Dose (MRTD) Database
• Human Liver Adverse Effects Database
• Genetic Toxicity, Reproductive and Developmental Toxicity, and
Carcinogenicity Database
• Salmonella Mutagenicity E-state Descriptors
• Chemical Structure Similarity Searching
• The Computational Toxicology Program and ComTox Consulting Service
• ComTox Regulatory Application of ICSAS MCASE/MC4PC-ES by the
Center for Food Safety and Applied Nutrition
• Application of Computational Toxicology to Assess Clinical Adverse Drug
Reactions
• Publications
While not specifically designed for documenting (Q)SAR models and predictions
for pesticides, the QMRF and QPRF can be viewed as examples of detailed
information templates of the type that may need to be considered when (Q)SAR
predictions are to be used as critical sources of data in pesticide assessments.
Tables 12.1 and 12.2 below list the information fields included in the QMRF and
the QPRF. The most current versions of the QMRF and the QPRF, guidelines for
reviewing the QMRF, a QMRF editor for filling in the QMRF, guidance on creating
SDF files associated with QMRFs, and examples of completed QMRFs can be
downloaded from the following website:
http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology/qsar_tools/QRF
The website above also provides access to the JRC database of QMRFs.
In addition, the European Chemicals Agency has developed guidance for the use
of the QMRF and QPRF to report on (Q)SARs and to input information from
these templates into IUCLID 5 (ECHA, 2009).
1. QSAR identifier
1.1 QSAR identifier (title)
1.2 Other related models
1.3 Software coding the model
2. General Information
2.1 Date of QMRF
2.2 QMRF author(s) and contact details
2.3 Date of QMRF update(s)
2.4 QMRF update(s)
2.5 Model developer(s) and contact details
2.6 Date of model development and/or publication
2.7 Reference(s) to main scientific papers and/or software package
2.8 Availability of information about the model
2.9 Availability of another QMRF for exactly the same model
9. Miscellaneous information
9.1 Comments
9.2 Bibliography
9.3 Supporting information
1. Substance
1.1 CAS number
1.2 EC number
1.3 Chemical name
1.4 Structural formula
1.5 Structure code
a. SMILES
b. InChi
c. Other structural representation
d. Stereochemical features
2. General information
2.1 Date of QPRF
2.2 QPRF author and contact details
3. Prediction
3.1 Endpoint (OECD Principle 1)
a. Endpoint
b. Dependent variable
3.2 Algorithm (OECD Principle 2)
a. Model or submodel name
b. Model version
c. Reference to QMRF
d. Predicted value (model result)
e. Predicted value (comments)
f. Input for prediction
g. Descriptor values
3.3 Applicability domain (OECD Principle 3)
a. Domains
i. Descriptor domain
ii. Structural fragment domain
iii. Mechanistic domain
iv. Metabolic domain
b. Structural analogues
c. Considerations on structural analogues
3.4 The uncertainty of the prediction (OECD Principle 4)
3.5 The chemical and biological mechanisms according to the model
underpinning the predicted result (OECD Principle 5)
4. Adequacy (Optional)
4.1 Regulatory purpose
4.2 Approach for regulatory interpretation of the model result
4.3 Outcome
4.4 Conclusion
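To show how the QPRF fields listed above might be captured in a structured, machine-readable form, the following Python sketch defines a minimal record. The field selection is abbreviated and illustrative and is not a substitute for the official template.

```python
# Minimal, illustrative sketch of a QPRF-style prediction record.
# Field names are abbreviated from the template above; this is not the official format.
from dataclasses import dataclass, field

@dataclass
class QPRFRecord:
    cas_number: str
    chemical_name: str
    smiles: str
    endpoint: str                        # OECD Principle 1
    model_name: str                      # OECD Principle 2
    model_version: str
    qmrf_reference: str
    predicted_value: str
    descriptor_values: dict = field(default_factory=dict)
    in_descriptor_domain: bool = True    # OECD Principle 3
    structural_analogues: list = field(default_factory=list)
    uncertainty_notes: str = ""          # OECD Principle 4
    mechanistic_rationale: str = ""      # OECD Principle 5

# Illustrative entry only; the substance identifiers are drawn from the EcoSAR case study below.
record = QPRFRecord(
    cas_number="333-41-5",
    chemical_name="Diazinon",
    smiles="CCOP(=S)(OCC)Oc1cc(C)nc(n1)C(C)C",
    endpoint="Fish 96-hr LC50",
    model_name="EcoSAR",
    model_version="1.1",
    qmrf_reference="(QMRF reference would be cited here)",
    predicted_value="(predicted value and units would be recorded here)",
)
print(record.chemical_name, record.endpoint)
```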
Example No. 1
Case Study:
Use of EcoSAR QSAR Models to Estimate the Acute Toxicity of Organophosphate and
Carbamate Pesticide Classes to Fish Species.
Issue: Currently the US EPA’s Office of Pesticide Programs receives acute toxicity data
for fish species via the FIFRA registration process. OPP typically does not obtain test
data for degradate chemicals of active ingredients and relies on QSAR approaches to
determine the potential hazard associated with these substances. In addition, the Office
of Water is interested in using QSAR approaches to fill data gaps to meet minimum data
requirements in the development of water quality criteria for pesticide active
ingredients. These are typically for other chordate and arthropod taxa, but at times
data gaps exist for fish and salmonid species. A potential issue with using QSAR models
is that most tools were developed to support the TSCA legislation which deals primarily
with industrial organic compounds. To this end, an analysis was conducted to
determine the reliability and validity of QSAR models for use in estimating the acute
toxicity to fish for a set of organophosphate and carbamate pesticides acting via an
acetylcholinesterase inhibition mode of action.
2. General information:
a. QPRF author and contact details: This analysis was compiled by Chris
Russom, US EPA, ORD, NHEERL, MED, Duluth, MN
(Russom.chris@epa.gov), using EcoSAR V 1.1, with EcoSAR outputs
completed in June 2011.
a. Empirical Data Set: A data set was compiled for use in validating model
predictions by selecting test results from the ECOTOX database
(www.epa.gov/ecotox) and the OPP database of studies submitted for
registration of active ingredients (Brian Montague, OPP/EFED, personal
communication).
The focus of this evaluation was the use of QSAR models to estimate
acute toxicity to fulfill minimum data requirements for use by OW in
deriving Agency benchmarks. Upon examination of the empirical
validation data set, rainbow trout or bluegill sunfish tended to be the
most sensitive fish species (Figure 1). Since the ECOSAR model is generic for fish
(see 3b below), and a critical minimum data requirement under the
Water Quality Criteria Guidelines is a salmonid (USEPA 1985),
comparisons of the QSAR model estimates were made using the average
test concentration of empirical test results for rainbow trout when
possible, and bluegill test data only when rainbow trout data were not
available (See Attachment 2).
[Figure 1 not reproduced: empirical acute toxicity values for the fish species in the validation data set.]
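The species-selection rule described above (use the mean rainbow trout LC50 when acceptable data exist, otherwise fall back to bluegill data) can be expressed as a short routine. The sketch below is illustrative only, and the example records are placeholders rather than values from Attachment 2.

```python
# Sketch of the comparison-value selection rule described above: use the mean rainbow trout
# LC50 when acceptable data exist, otherwise the mean bluegill LC50. The example records
# are placeholders, not values from Attachment 2.
from statistics import mean

def comparison_lc50(records):
    """records: list of (species, lc50_mg_per_L) tuples for a single chemical."""
    trout = [v for s, v in records if s == "Oncorhynchus mykiss"]
    if trout:
        return mean(trout), "rainbow trout"
    bluegill = [v for s, v in records if s == "Lepomis macrochirus"]
    if bluegill:
        return mean(bluegill), "bluegill"
    return None, None

example = [("Lepomis macrochirus", 1.8), ("Lepomis macrochirus", 2.2)]
print(comparison_lc50(example))   # (2.0, 'bluegill')
```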
ii. Test Protocols: Acute toxicity test data used in the training set
followed either ASTM or OPP standard testing procedures (ASTM,
2007; U.S. EPA 1996); the data were therefore drawn from several
laboratories.
Figure 2: Log molar LC50 for fish (empirical test data and EcoSAR V1.1
QSAR estimates) vs. the log of the octanol/water partition coefficient (logP).
[Figure not reproduced; axes: log molar LC50 vs. logP; series shown: non-polar
narcosis baseline, empirical test data, phenyl carbamate ester, and oxime
carbamate ester estimates.]
iii. Input for prediction: CAS Registry numbers were used as input for
EcoSAR model prediction, and SMILES strings were available
within EcoSAR for all chemicals except cis-Thiocarboxime. The
EcoSAR structures were verified against a second source
(Alanwood or ChemID). The SMILES string was written and used
as input for EcoSAR model prediction for cis-Thiocarboxime and
this structure was verified in a second source.
iv. Descriptor values: The logP value from KowWin Version 1.68 was
retrieved by EcoSAR as the descriptor variable for the resident
QSAR models.
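For readers who wish to reproduce a logP-type descriptor outside of EcoSAR, the sketch below parses a SMILES string with the open-source RDKit toolkit and computes a Crippen logP. This is only a stand-in for the KowWin Version 1.68 values actually retrieved by EcoSAR and will generally give somewhat different numbers.

```python
# Sketch: parse a SMILES string and compute a calculated logP descriptor with RDKit.
# The Crippen estimator is used here only as a stand-in for KowWin Version 1.68; the two
# methods will generally give somewhat different values.
from rdkit import Chem
from rdkit.Chem import Crippen

def calc_logp(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return Crippen.MolLogP(mol)

# Dimethoate, using the SMILES listed in the attachment tables of this case study
print(round(calc_logp("CNC(=O)CSP(=S)(OC)OC"), 2))
```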
a. Domains:
i. Descriptor variable: The logP values for the validation data set
ranged from 0.123 to 2.552 for the carbamates and from -0.096 to 5.863 for
the organophosphates. Comparing the training sets used for each
QSAR model against the chemicals whose values were estimated (i.e.,
chemicals not included in the training set) showed that all carbamate
chemicals had logP values within the model training set logP range,
except formetanate (logP=0.89 vs. a Phenyl carbamate ester training
set range of 1.52 to 3.06). For the organophosphates estimated using
the Monothiophosphate ester QSAR models, all chemicals in the
validation set were within the training set logP range (i.e., 2.4 to 4.7)
except Dichlofenthion (logP=5.202), Fenchlorfos (logP=4.865), Iodophos (logP=5.387),
and Trichloronate (logP=5.863). All chemicals estimated using the
Dithiophosphate ester QSAR models had logP values within the
training set logP range. Dicrotophos (logP=-0.096) was the only
substance estimated using the Phosphate ester QSAR model that
was outside the logP range for the training set (-0.74 to 4.85).
EcoSAR documentation acknowledges that in general, above a
a. Data sources: The model training set was not generated by a single
laboratory; the data may therefore vary due to differences in the genetic
stock of the fish, consistency in the application of test protocols (such as
static vs. flow-through exposures), and the use of analytical procedures to
measure test concentrations versus reporting of nominal values only.
b. Test Species: As mentioned earlier, these models are generic for fish; as
can be seen in Figure 1, fish species can have a range of sensitivities to these
substances.
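The descriptor-domain check described under "a. Domains" above amounts to asking whether a chemical's logP falls inside the training-set logP range of the relevant EcoSAR class. A minimal sketch follows, using the range values quoted in the text; other classes would need their own ranges.

```python
# Sketch of the descriptor-domain check described above: flag chemicals whose logP falls
# outside the training-set logP range for the relevant EcoSAR class. The ranges below are
# the ones quoted in the text; other classes would need their own ranges.
TRAINING_LOGP_RANGES = {
    "Phenyl carbamate ester": (1.52, 3.06),
    "Monothiophosphate ester": (2.4, 4.7),
    "Phosphate ester": (-0.74, 4.85),
}

def in_descriptor_domain(ecosar_class, logp):
    low, high = TRAINING_LOGP_RANGES[ecosar_class]
    return low <= logp <= high

# Dichlofenthion (logP = 5.202) falls outside the Monothiophosphate ester range (2.4 to 4.7)
print(in_descriptor_domain("Monothiophosphate ester", 5.202))   # False
# Formetanate (logP = 0.89) falls outside the Phenyl carbamate ester range (1.52 to 3.06)
print(in_descriptor_domain("Phenyl carbamate ester", 0.89))     # False
```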
[Figure 3 not reproduced: EcoSAR-estimated log molar LC50 for fish plotted against
empirically derived log molar LC50 for fish, with points grouped by EcoSAR class
(phenyl carbamate ester, oxime carbamate ester, phosphate ester, dithiophosphate
ester, monothiophosphate ester); azinphos-methyl is labelled as an outlier.]
3.5 The chemical and biological mechanisms according to the model underpinning
the predicted result (OECD Principle 5).
4. Adequacy
An evaluation of the scientific validity of the EcoSAR models used to estimate the toxicity of the
chemicals listed in Attachment 2, based on adherence to the OECD principles as outlined above,
finds that the method appears to be valid for the carbamates, but that issues exist for some of the
organophosphates. Both carbamate models provided acceptable estimates of toxicity for the five
substances in the validation set, with all chemicals estimated within an order of magnitude of the
empirical test value. Comparison of the EcoSAR QSAR estimates for the freshwater fish acute
LC50 to empirical data sets for the organophosphate chemicals found that most values agreed
within an order of magnitude (see Figure 3). One significant outlier was chlorpyrifos-methyl oxon,
the metabolically active form of chlorpyrifos-methyl. EcoSAR categorized the degradate as a
phosphate ester and, using this model, estimated the degradate to be nearly 4,000 times less toxic
than observed in empirical studies. Methidathion and azinphos-methyl were more than two orders
of magnitude more toxic than estimated by the EcoSAR dithiophosphate ester model. The QSAR
estimate for naled differed from the empirical test data by more than an order of magnitude; once
again, this substance was estimated using the phosphate ester model. The reliability of these
models for use with organophosphates depends on the development of new or refined chemical
classes that estimate toxicity more effectively.
Log P is the only molecular descriptor used in the EcoSAR models evaluated in this case
study. The coefficients of determination for these models (see Table 1 for R2 values) indicate
that factors other than partitioning into the organism are required to completely describe the
toxic response. To this end, these QSAR models would be improved by the identification and
inclusion of toxicologically relevant molecular descriptors, with linkages to key events within
the acetylcholinesterase inhibition adverse outcome pathway.
Risk assessors will need to determine how relevant these model outputs are to the
question being asked. For instance, are estimations within an order of magnitude
acceptable? What statistical criteria must be met for the internal performance of a
model to be considered acceptable (i.e., R2 > 0.60)? Similarly, what statistical
benchmarks should be used to determine the predictivity of a model? The answer to
these questions may depend on whether the risk assessment is for screening and
prioritization purposes or for deriving a final benchmark value. The purpose of this analysis was
to determine whether QSAR estimates could fill data gaps used in deriving Agency
benchmark values, and it is recommended that the models undergo further refinement
prior to this use.
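To make the acceptance questions above concrete, the sketch below applies two simple checks to four of the carbamate validation chemicals tabulated later in this example (EcoSAR estimates vs. average empirical LC50 values, mg/L): whether each estimate is within an order of magnitude of the empirical value, and a simple coefficient of determination computed on the log10 scale. The statistic produced is illustrative and is not the Table 1 R2 value cited in the text.

```python
# Sketch of two acceptance checks discussed above: (1) is each prediction within an
# order of magnitude of the empirical value, and (2) a simple coefficient of
# determination of the predictions against the observations on the log10 scale.
# Values are the EcoSAR estimates and average empirical LC50s (mg/L) for four of the
# carbamate validation chemicals tabulated later in this example.
import math

def within_order_of_magnitude(predicted, observed):
    return abs(math.log10(predicted) - math.log10(observed)) <= 1.0

def r_squared_log(predicted, observed):
    obs = [math.log10(v) for v in observed]
    pred = [math.log10(v) for v in predicted]
    obs_mean = sum(obs) / len(obs)
    ss_res = sum((p - o) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

predicted = [3.521, 3.684, 13.318, 9.456]   # Bendiocarb, cis-Thiocarboxime, Formetanate HCl, Pirimicarb
observed = [1.20, 1.50, 4.40, 79.00]
print([within_order_of_magnitude(p, o) for p, o in zip(predicted, observed)])
print(round(r_squared_log(predicted, observed), 2))
```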
References:
American Society for Testing and Materials (ASTM). 2007. Standard Guide for
Conducting Acute Toxicity Tests on Test Materials with Fishes, Macroinvertebrates, and
Amphibians. ASTM Subcommittee E46.01; E729-96(2007). 22 pp. American Society for
Testing and Materials, West Conshohocken, PA.
Mileson, B.E., J.E. Chambers, W.L. Chen, W. Dettbarn, M. Ehrich, A.T. Eldefrawi, D.W.
Gaylor, K. Hamernick, E. Hodgson, A.G. Karczmar, S. Padilla, C.N. Pope, R.J. Richardson,
D.R. Saunders, L.P. Sheets, L.G. Sultatos, and K.B. Wallace. 1998. Common mechanism
of toxicity: A case study of organophosphorus pesticides. Tox. Sci. 41:8-20.
Moore, D.R.J., R.L. Breton, and D.B. MacDonald., 2003. A comparison of model
performance for six quantitative structure–activity relationship packages that predict
acute toxicity to fish. Environ. Toxicol. Chem. 22, 1799–1809.
U.S. EPA. 1996. Ecological Effects Test Guidelines: OPPTS 850.1075. Fish acute toxicity
test, freshwater and marine. U.S. EPA, Office of Prevention, Pesticides, and Toxic
Substances, EPA 712-C-96-118. Public Draft; 13 pp.
U.S. EPA. 1985. Guidelines for deriving numerical national water quality criteria for the
protection of aquatic organisms and their uses. U.S. Environmental Protection Agency,
Office of Research and Development, Environmental Research Laboratories. PB85-
227049.
Greyed cells are substances that were not included in the final evaluation of model performance because the chemical was
part of the EcoSAR model training set.
Carbamates (N=17 total, with N=12 included in the EcoSAR model training set, N=5 used in model validation.)
endo-3-Chloro-exo-6-cyano-2-norbornanone, o-(Methylcarbamoyl) oxime | 15271-41-7 | Yes | No | Oxime carbamate | CNC(=O)ON=C1C(C2)C(C#N)CC2C1Cl
Formetanate hydrochloride | 23422-53-9 | Yes | No | Formamidine | CNC(=O)Oc1cccc(N=CN(C)C)c1
3,4,5-Trimethylphenyl methylcarbamate | 2686-99-9 | Yes | Yes | Phenyl methylcarbamate | CNC(=O)Oc1cc(C)c(C)c(C)c1
Carbofuran | 1563-66-2 | Yes | Yes | Benzofuranyl methylcarbamate | CNC(=O)Oc1cccc2CC(C)(C)Oc12
Organophosphates (N=47 total, with N=28 included in the EcoSAR model training set, N=19 used in model
validation.)
Azinphos-methyl | 86-50-0 | Yes | No | Benzotriazine organothiophosphate | S=P(OC)(OC)SCN1N=Nc2ccccc2C1(=O)
Chlorpyrifos-methyl oxon | 5598-52-7 | Yes | No | Degradate - Oxon | O=P(OC)(OC)Oc1nc(Cl)cc(Cl)c1Cl
Dioxathion | 78-34-2 | Yes | No | Heterocyclic organothiophosphate | CCOP(=S)(OCC)SC1OCCOC1SP(=S)(OCC)OCC
Fosthiazate | 98886-44-3 | Yes | No | Heterocyclic organothiophosphate | CCOP(=O)(SC(C)CC)N1CCSC1=O
Methidathion | 950-37-8 | Yes | No | Thiadiazole organothiophosphate | COc1nn(CSP(=S)(OC)OC)c(=O)s1
Methyl carbophenothion | 953-17-3 | Yes | No | Phenyl organothiophosphate | COP(=S)(OC)SCSc1ccc(Cl)cc1
Coumaphos | 56-72-4 | Yes | Yes | Heterocyclic organothiophosphate | S=P(OCC)(OCC)Oc1ccc2C(C)=C(Cl)C(=O)Oc2c1
Demeton | 8065-48-3 | Yes | Yes | Aliphatic organothiophosphate | S=P(OCC)(OCC)OCCSCC + O=P(OCC)(OCC)OCCSCC
Diazinon | 333-41-5 | Yes | Yes | Pyrimidine organothiophosphate | CCOP(=S)(OCC)Oc1cc(C)nc(n1)C(C)C
Dimethoate | 60-51-5 | Yes | Yes | Aliphatic amide organothiophosphate | CNC(=O)CSP(=S)(OC)OC
EPN | 2104-64-5 | Yes | Yes | Phenyl phenylphosphonothioate | CCOP(=S)(Oc1ccc(cc1)N(=O)=O)c2ccccc2
Phosmet | 732-11-6 | Yes | Yes | Isoindole organothiophosphate | COP(=S)(OC)SCN2C(=O)c1ccccc1C2=O
Tebupirimfos | 96182-53-5 | Yes | Yes | Pyrimidine organothiophosphate | CCOP(=S)(OC(C)C)Oc1cnc(nc1)C(C)(C)C
Temephos | 3383-96-8 | Yes | Yes | Phenyl organothiophosphate | COP(=S)(OC)Oc2ccc(Sc1ccc(OP(=S)(OC)OC)cc1)cc2
3,4,5-Trimethylphenyl methylcarbamate | 2686999 | Oncorhynchus mykiss | 4700 | 4700 | 4700 | 1
endo-3-Chloro-exo-6-cyano-2-norbornanone, o-(Methylcarbamoyl) oxime | 15271417 | Oncorhynchus mykiss | 13000 | 13000 | 13000 | 1
ECOSAR Version 1.1 Model Estimates and Average Empirical Test Data — Fish
Columns: CAS Registry Number | Pesticide | General Chemical Class | ECOSAR Class | Version 1.1 KowWin LogP | Version 1.1 ECOSAR Fish 96-hr LC50 Model Estimate (mg/L) | Average of Acceptable Empirical Fish 96-hr LC50 values (mg/L) | Ratio of ECOSAR Estimated value to Empirical Test data
Carbamate Validation Data Set: Chemical was NOT part of EcoSAR Model Training Set
22781233 | Bendiocarb | Carbamate | Carbamate esters, phenyl | 2.552 | 3.521 | 1.20 | 2.93
29118874 | cis-Thiocarboxime | Carbamate | Oxime Carbamate Ester | 0.123 | 3.684 | 1.50 | 2.46
15271417 | endo-3-Chloro-exo-6-cyano-2-norbornanone, o-(Methylcarbamoyl) oxime | Carbamate | Oxime Carbamate Ester | 1.089 | 1.797 | 13.00 | 0.14
23422539 | Formetanate hydrochloride | Carbamate | Carbamate esters, phenyl | 0.879 | 13.318 | 4.40 | 3.03
23103982 | Pirimicarb | Carbamate | Carbamate esters, phenyl | 1.399 | 9.456 | 79.00 | 0.12
Organophosphate Validation Data Set: Chemical was NOT part of EcoSAR Model Training Set
5598527 | Chlorpyrifos-methyl oxon | Organophosphate | Esters (phosphate) | 1.911 | 7.151 | 0.00 | 3865.41
953173 | Methyl carbophenothion (CAS: 953-17-3) | Organophosphate | Esters, Dithiophosphates | 4.463 | 0.109 | 0.76 | 0.14
The following case study was prepared by the Office of Pesticide Programs
(OPP), US EPA.
Case Study:
In pesticide risk assessment there is often an abundance of toxicity data on the parent active
ingredient and very little, if any, data on pesticide metabolites or environmental degradation
products. This can be a problem in trying to assess the risks of metabolites or environmental
degradates. In the case of environmental degradates, a screening level risk assessment may be
performed to determine if additional toxicity data on the degradate should be called in. The
hazard component of the screening level assessment is often based on structural analogy of a
degradate to the parent active ingredient. If the parent and degradate are closely related
structurally, then toxicity data on the parent active ingredient can be used to estimate the toxicity of the
degradate. If the margin of exposure between estimated toxicity and estimated exposure is not
considered large enough, additional toxicity data may be called in to enable a more
comprehensive risk assessment of a degradate. On occasion, the metabolite or degradate bears
little resemblance to the parent and an alternative analog with associated data must be found to
support the screening level risk assessment.
The herbicide dichlobenil is relatively stable in the environment except for aqueous photolysis.
A major photodegradate of dichlobenil in water (up to 19% of applied dichlobenil) has been
identified as 4-chloro-2(3H)benzoxazolone (BZZ). This photodegradate bears little resemblance
to the parent dichlobenil and therefore toxicity data on the parent are not considered useful for
assessing the toxicity of BZZ. In the absence of appropriate toxicity data the degradate, termed
BZZ, was determined to be of potential concern.
Chlorzoxazone, a chlorinated benzoxazolone, was selected as the analog; it has estimated physical
properties very close to those of BZZ (see table below), and so bioavailability is likely to be
similar. Chlorzoxazone is pharmacologically active as a muscle
relaxant at doses of 10-30 mg/kg/day (http://www.drugs.com/monograph/chlorzoxazone.html),
so the potential exists for BZZ to be biologically active at the same dosage, although no
assumption is made about the kind of effects that might be observed at this dose of BZZ.
*Benzoxazolone is the unsubstituted fused ring structure common to BZZ and chlorzoxazone
Screening Risk Assessment. The theoretical upper limit for BZZ based on estimated surface
water concentration of parent dichlobenil is 0.005 mg/L, and the corresponding dosage in a
young child is approximately 0.0005 mg/kg/day for a 10 kg child ingesting 1 liter of BZZ-
contaminated water per day. The MOE between the lowest effective pharmacological dose of
chlorzoxazone (10 mg/kg/day) and the theoretical intake is 10 mg/kg/day divided by 0.0005
mg/kg/day, or 20,000. Although there are many uncertainties in a screening level risk
assessment such as this, the MOE is sufficiently large to conclude that additional toxicity data
are unlikely to result in risks of concern from ingestion of BZZ formed in drinking water as a
result of registered uses of dichlobenil.
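The screening-level arithmetic above can be restated directly. The short sketch below reproduces the dose and margin-of-exposure calculation using the values given in the paragraph.

```python
# Restating the screening-level arithmetic from the paragraph above.
water_conc_mg_per_L = 0.005    # theoretical upper-limit BZZ concentration in surface water
daily_intake_L = 1.0           # liters of drinking water ingested per day by a young child
body_weight_kg = 10.0          # assumed body weight of the child

dose_mg_per_kg_day = water_conc_mg_per_L * daily_intake_L / body_weight_kg   # 0.0005 mg/kg/day

lowest_effective_dose = 10.0   # mg/kg/day, lowest pharmacological dose of chlorzoxazone
margin_of_exposure = lowest_effective_dose / dose_mg_per_kg_day              # 20,000

print(dose_mg_per_kg_day, margin_of_exposure)
```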
Case Study: Fomesafen cancer assessment and mode of action: use of mechanism-
based SAR
Several structurally related diphenyl ether pesticides with carcinogenicity data were
identified. Among these, Nitrofen, Lactofen, Acifluorfen and Oxyfluorfen were
considered the closest. Like Fomesafen, all four were hepatocarcinogenic in mice with
Oxyfluorfen being weakly/marginally active. The chemical structures are shown in the
figure below.
[Figure not reproduced: chemical structures of the diphenyl ether herbicides, including Nitrofen
and Oxyfluorfen.]
The mode of action of rodent hepatocarcinogenesis for both Lactofen and Acifluorfen
(HED MTARC; TXR #s 0051907 and 0052006, respectively) has been extensively
studied and shown to involve PPARα-mediated peroxisome proliferation. Lactofen can
be readily hydrolyzed by esterases to yield Acifluorfen as its primary metabolite.
Structure-activity relationship studies have shown that one of the major structural
requirements/alerts of most peroxisome proliferators is the presence of an acidic
functional group (e.g., carboxylic, sulfonic) either in the parent compound or a metabolite
(Woo and Lai 2003). The key question is whether Fomesafen can be hydrolyzed to a
carboxylic acid metabolite. In general, the amide (-CO-NH-) bond is quite resistant to
enzymatic hydrolysis. However, in Fomesafen, the presence of a sulfonyl group adjacent
to the amide linkage can significantly facilitate hydrolysis. Indeed, a metabolism study
by the submitter showed that up to 10% of Fomesafen may be hydrolyzed to yield a
carboxylic acid metabolite as the most significant metabolite. Thus, Fomesafen,
Acifluorfen, and Lactofen may actually have common carboxylic acid metabolite(s). It is
interesting to note that, despite structural similarity, Oxyfluorfen, which cannot be
metabolized to a carboxylic acid metabolite, is only weakly/marginally active as a
hepatocarcinogen. Attempts to demonstrate possible PPARα-mediated activity were
unsuccessful for Oxyfluorfen. Overall, these findings strengthen the biological
plausibility of the proposed mode of action.
The SAR study provided significant support to the weight of evidence of a PPARα mode
of action of Fomesafen-induced mouse liver tumors. Based on the current scientific
understanding of peroxisome proliferation (e.g., Klaunig et al., 2003) and previous EPA
decisions on structurally related herbicides (e.g., Lactofen and Acifluorfen), the level of
confidence in this assessment is high. While the proposed mode of action for liver
tumors in mice is theoretically plausible in humans, it is quantitatively implausible and
unlikely to take place in humans based on quantitative species differences in PPAR
activation and toxicokinetics. In accordance with the EPA Final Guidelines for
Carcinogen Risk Assessment (March 29, 2005), the CARC classified Fomesafen as “Not
Likely to be Carcinogenic to Humans”.
References:
Klaunig JE, Babich MA, Baetcke KP, Cook JC, Corton JC, David RM, DeLuca JG, Lai
DY, McKee RH, Peters JM, Roberts RA, Fenner-Crisp PA. (2003). PPARalpha agonist-
induced rodent tumors: modes of action and human relevance. Crit Rev Toxicol
33(6):655–780.
Woo, Y.T., and Lai, D.Y. (2003). Mechanism of action of chemical carcinogens and their
role in structure-activity relationships (SAR) analysis and risk assessment. In:
Quantitative Structure-Activity Relationship (QSAR) Models of Mutagens and
Carcinogens, R. Benigni, ed., CRC Press, p. 41.
This example does not include the entire text or conclusions of the screening
assessment document. The example is only an abstract of the following sections:
Substance Identity, Physical and Chemical Properties, Health Effects
Assessment, Appendix 6, and Appendix 7.
These sections have been abstracted to illustrate the application of (Q)SAR and
information on analog substances to assess the toxicity of a substance in a
weight-of-evidence type approach.
For a copy of the complete screening assessment document, please consult the
following website:
http://www.ec.gc.ca/ese-ees/default.asp?lang=En&n=403207BF-1
For the purposes of this document, this substance will be referred to as MAPBAP acetate,
derived from the DSL name. MAPBAP acetate belongs to a class of dyes known as
cationic triarylmethanes. The class can be further sub-divided into those where the
charge on the cation (triarylmethane moiety) is localized or delocalized. MAPBAP
acetate belongs to the latter sub-category (Hunger 2003) implying that the bond holding
the cationic and anionic components of the structure together is at least partly covalent.
Chemical Abstracts Service Registry Number (CAS RN): 72102-55-7
DSL name: Methylium, [4-(dimethylamino)phenyl]bis[4-(ethylamino)-3-methylphenyl]-, acetate
National Chemical Inventories (NCI) names [1]: Methylium, [4-(dimethylamino)phenyl]bis[4-(ethylamino)-3-methylphenyl]-, acetate (1:1) (TSCA); Methylium, [4-(dimethylamino)phenyl]bis[4-(ethylamino)-3-methylphenyl]-, acetate (AICS, PICCS, ASIA-PAC, NZIoC)
Other names: [4-(Dimethylamino)phenyl]bis[4-(ethylamino)-3-methylphenyl]methylium acetate
Chemical group (DSL Stream): Discrete organics
SMILES [3]: CN(c2ccc(cc2)C[(OC(=O)C)](c3cc(c(cc3)NCC)C)c1cc(c(cc1)NCC)C)C
No experimental data are available for MAPBAP acetate. At the Environment Canada-
sponsored Quantitative Structure-Activity Relationship (QSAR) Workshop in 1999
(Environment Canada 2000) modelling experts identified many structural classes of
pigment and dyes as being “difficult to model” using QSARs. Some physical and
chemical properties of many of the structural classes of dyes and pigments are not
amenable to prediction by models. Under such circumstances, a "read-across" approach
is considered, which employs close analogues to determine the approximate physical and
chemical properties of MAPBAP acetate. A search of the ChemIDPlus (2009) database
yielded a number of suitable analogues which are described in Table 2. Experimental
data for these analogues, when available, were used as extrapolated (read-across) values
for MAPBAP acetate or as supporting values for the weight of evidence.
A limited number of read-across data were found for the selected analogues and,
therefore, predicted values are also used for MAPBAP acetate and the uncertainties of the
predictions are noted.
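The read-across step described above depends on identifying structurally close analogues. The screening assessment relied on a search of the ChemIDPlus database, but, for illustration only, the sketch below shows how a candidate list might be ranked by Morgan-fingerprint Tanimoto similarity using the open-source RDKit toolkit. The target and candidate SMILES are simplified placeholders, not the MAPBAP acetate structure or the analogues described in Table 2.

```python
# Illustrative sketch: rank candidate analogues by Morgan-fingerprint Tanimoto similarity.
# Assumes RDKit is installed. The target and candidate SMILES are simplified placeholders,
# not the MAPBAP acetate structure or the analogues actually selected in the assessment.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def fingerprint(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

def rank_analogues(target_smiles, candidates):
    """candidates: {name: smiles}; returns (name, Tanimoto similarity) sorted high to low."""
    target_fp = fingerprint(target_smiles)
    scores = [(name, DataStructs.TanimotoSimilarity(target_fp, fingerprint(smi)))
              for name, smi in candidates.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)

target = "CN(C)c1ccc(cc1)C(c2ccccc2)c3ccccc3"            # placeholder triarylmethane-like target
candidates = {
    "triphenylmethane (placeholder)": "C(c1ccccc1)(c2ccccc2)c3ccccc3",
    "diphenylmethane (placeholder)": "C(c1ccccc1)c2ccccc2",
}
print(rank_analogues(target, candidates))
```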
[Table excerpt, partially reproduced:]
Koc (dimensionless): 9000 (analogue, CAS RN 2390-59-2)
Water solubility (mg/L) [3]
1 Values and units in brackets represent those originally reported by the authors or estimated by the models.
2 This value was modelled using the experimental analogue log Kow of 0.51 as input.
3 The importer of MAPBAP acetate has indicated that it is completely soluble at environmental pHs (e.g., pH 7).
No empirical toxicity data were identified for MAPBAP acetate. The sources of health
hazard information considered included available international reviews, assessments or
classifications, available empirical data, and the use of predictive models as appropriate.
Predictions were obtained from five different (Q)SAR models: TOPKAT (2004),
CASETOX (2008), Toxtree (2009), DEREK (2008), and Model Applier (2009).
Using the representative molecular structure of MAPBAP (with the acetic acid fragment
(acetate) attached to the carbon atom that is attached to the three aromatic rings), the following
results were obtained. Positive predictions were obtained for five different genotoxicity
endpoints, and only one of these (i.e., the rodent micronucleus assay) is corroborated by more
than one model (CASETOX and Toxtree). The Benigni-Bossa model within Toxtree
also predicts it to be a Salmonella typhimurium TA100 mutagen with metabolic
activation. On the other hand, the female rat cancer models of both CASETOX and
Model Applier gave positive predictions. The male rat cancer model of Model Applier, as
well as both mouse models (male and female) of CASETOX, gave positive predictions. The
presence of a structural alert indicative of genotoxic carcinogenicity is another piece of
supporting information that was obtained from the Benigni-Bossa model within
Toxtree. Applying the OncoLogic model to a closely similar structure containing a hydroxyl
group in place of the acetate group results in a positive carcinogenicity prediction. This
prediction is based on the presence of nitrogen-substituted groups on the aromatic rings.
It is important to note that the Toxtree micronucleus model is a coarse-grain filter for
preliminary screening of potential in vivo mutagens, and that OncoLogic did not use the
identical structure for prediction purposes. Also, the Ames point mutation models of
CASETOX and Model Applier predict negative results, whereas TOPKAT and DEREK
fail to provide any information. However, in the case of the cancer models, there are at least
three models (CASETOX, Model Applier and Toxtree) that classify this chemical as a
potential carcinogen. The CASETOX, Model Applier and Toxtree models are based
on distinct methodologies for making predictions, and since they point towards a similar
outcome, the result carries more weight.
Thus the model predictions were mixed for carcinogenicity (6 positive and 4 negative),
genotoxicity (6 positive and 7 negative), developmental (2 positive; 18 negative and 10
no result) and reproductive toxicity (1 positive and 12 no result).
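A weight-of-evidence tally of model calls such as the one summarized above can be kept in a very simple structure. The sketch below counts positive, negative, and non-informative calls, using a call list that reproduces the carcinogenicity totals quoted in the preceding paragraph; the per-model assignment is not implied.

```python
# Sketch of a simple weight-of-evidence tally across (Q)SAR model calls, using the codes
# from the tables below: P = positive, N = negative, ND = not in domain, NR = no result.
from collections import Counter

def tally(calls):
    counts = Counter(calls)
    return {"positive": counts["P"],
            "negative": counts["N"],
            "no information": counts["ND"] + counts["NR"]}

# Illustrative call list reproducing the carcinogenicity totals quoted above
# (6 positive and 4 negative); the per-model assignment is not implied here.
carcinogenicity_calls = ["P"] * 6 + ["N"] * 4
print(tally(carcinogenicity_calls))
```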
Potential structural analogues of MAPBAP acetate for the purposes of read-across for
human health toxicity information were identified using Leadscope (Leadscope 2008).
Gentian violet has been classified by the European Union as Carcinogenicity Category 2
(ECB 2002) based on carcinogenicity in experimental animals. One study did report
negative in vitro genotoxicity for mutations in a reverse mutation assay in several S.
typhimurium strains after exposure to gentian violet at concentrations ranging from 5 to
1000 µg/plate (NICNAS 1999). Malachite green has been classified by the European
Union as Reproductive Toxicity Category 3 (ECB 2003) based on developmental toxicity
in experimental animals. Also, the U.S. NTP (2005) reported equivocal evidence of
carcinogenicity in female rats and negative results for genotoxicity from an in vivo
micronucleus assay and an in vitro assay in S. typhimurium (NTP 1997, 1994). C.I. Basic
Violet 4 had negative in vitro genotoxicity data for chromosomal aberrations in Chinese
Hamster Ovary cells (NICNAS 1999) and was also found to be predominantly negative in
vitro in assays conducted in S. typhimurium and mouse lymphoma cells (CCRIS 2009).
Leucomalachite green was found to have some evidence of carcinogenicity in female
mice and had positive in vivo genotoxicity data (NTP 1996, 2005).
The information obtained from the (Q)SAR models, as well as from the potential analogues,
suggests that there may be carcinogenic or developmental toxicity hazards associated
with the substance.
The confidence in the toxicity database is considered to be low due to the lack of
available data for MAPBAP acetate.
ChemIDPlus. 2009. Tool for searching the NLM ChemIDplus database using Chemical
Name, CAS Registry Number, Molecular Formula, Classification Code, Locator Code,
and Structure or Substructure. http://chem.sis.nlm.nih.gov/chemidplus/
Green, FJ. 1990. The Sigma-Aldrich handbook of stains, dyes and indicators. Aldrich
Chemical Company, Inc. Milwaukee, Wisconsin.
[NTP] National Toxicology Program 1994. Study report number A90714. Available
from: http://ntp-
apps.niehs.nih.gov/ntp_tox/index.cfm?fuseaction=salmonella.salmonellaData&
endpointlist=SA&study%5Fno=A90714&cas%5Fno=569%2D64%2D2&activetab=sum
mary
[NTP] National Toxicology Program 1996. Study report number A31156. Study start
date 07/20/1996
[NTP] National Toxicology Program 1997. Study report number A82983. Study start
date 01/16/1997 Available from: http://ntp-
apps.niehs.nih.gov/ntp_tox/index.cfm?fuseaction=micronucleus.micronucleus
Data&cas_no=569%2D64%2D2&endpointlist=MN
Pfenninger Heinz and Bruttel Beat. 1985. Process for converting sparingly soluble
inorganic salts of cationic dyes and brighteners into more soluble salts of organic acids.
United States Patent 4559144. Available from:
http://www.freepatentsonline.com/4559144.html
[TOPKAT] Toxicity Prediction Program [Internet]. 2004. Version 6.2. San Diego (CA):
Accelrys Software Inc. Available from:
http://www.accelrys.com/products/topkat/index.html
Toxtree version 1.60. 2009. Tool to estimate toxic hazard by applying decision tree
approach. Developed by Ideaconsult Ltd Bulgaria.
(Q)SAR PREDICTIONS ON CARCINOGENICITY
Columns: Mice Male | Mice Female | Rat Male | Rat Female | Rat | Mice | Rodent | Mammal
Model Applier: N, N, P, P, P, N, N, -
Multicase Casetox: P, P, ND*, P, -, -, -, -
Topkat: NR, NR, NR, NR, -, -, -, -
Derek: -, -, -, -, -, -, -, NR
(Q)SAR PREDICTIONS ON GENOTOXICITY (table partially reproduced; models compared: TT, CT, TK, MA; calls are listed per endpoint in the order given in the source layout, coded P, N, ND or NR)
chrom. ab.: -, -, N, ND
chrom. ab. other rodent: -, -, -, ND
chrom. ab. rat: -, -, -, ND
micronucleus mice: -, -, P, ND
micronucleus rodent: -, -, P, ND
drosophila: -, -, ND, ND
drosophila HT: -, -, -, ND
drosophila SLRL: -, -, -, ND
mam. mutation: -, -, -, ND
mam. mutation DL: -, -, -, ND
UDS: -, -, P, NR
UDS human: -, -, -, N
lymphocytes: -, -, -
s. cerevisiae: -, -, -, N
yeast: -, -, -, N
hgprt: -, -, -, ND
e. coli: -, -, -, N
e. coli w: -, -, -, ND
microbial: -, -, -, P
salmonella: P, N, N, NR
BB cancer alert: -, -, -
(Q)SAR PREDICTIONS ON REPRODUCTIVE TOXICITY (table partially reproduced)
Model Applier (female and male endpoints): repro: ND, ND, ND, ND, ND, ND; sperm: -, -, -, ND, ND, ND
Multicase Casetox: NR, P, NR, NR
Model Applier (developmental endpoints): Retardation: N, ND, N, N; Weight decrease: N, ND, N, N; Fetal death: N, ND, N, N; Structural: N, ND, ND, N; Visceral: N, -, N, N
Teratogenicity: -, P, NR
Developmental: NR, -, -
Abbreviations: MA – Model Applier; CT – Multicase Casetox; TK – Topkat; TT – Toxtree; BB – Benigni-Bossa rule; ND – not in domain; NR – no result; P – positive; N – negative.
[Analogue hazard information, partially reproduced:]
Gentian violet (CAS RN 548-62-9): Genotoxicity, in vitro reverse mutation (entry not reproduced); Carcinogenicity (entry not reproduced).
Malachite green (CAS RN 569-64-2): Genotoxicity, in vitro gene mutation: negative in S. typhimurium TA97, TA98, TA100, TA102, TA104, TA1535 with and without activation (NTP 1994); chromosome aberration (entries not reproduced); Carcinogenicity: equivocal evidence of cancer in female rats (NTP 2005).
Case Study: Use of a Weight of Evidence (WOE) approach, including SAR information, to
waive the chronic toxicity/carcinogenicity study requirement in a biocide reregistration decision.
The chemical structures of the isothiazolone biocides can be divided into two sub-classes (Figure
1):
The issue to be discussed in this example case is whether chronic/cancer studies can be waived
based on existing conditions. The issue is discussed from the following different aspects.
• Pesticidal Mode of Action: BIT, CMIT, MIT, OIT, and DCOIT all share a common
pathway for antimicrobial activity:
• All inhibit cell respiration
• All inhibit the same class of dehydrogenase enzymes
These biocides react with microbial cells through cleavage of the S-N bond to form an
S-S linkage with the thiol group on target enzymes. Biocidal activity is a function of the
inhibition of cell respiration.
• Toxicity Profile for BIT: There are no carcinogenicity or chronic toxicity studies for
any of the benzene ring-containing isothiazolone chemicals (such as BIT). BIT is not
mutagenic: all acceptable guideline mutagenicity studies were negative. The toxicity
profile of BIT shows that it is an irritant following oral and dermal exposures, and
irritation is the effect observed following repeated dosing in subchronic toxicity studies.
o In a 90-day rat dermal study, skin irritation and histopathology were noted at
all doses (100, 300, and 1000 mg/kg/day), while systemic toxicity was
reported only at the limit dose (1000 mg/kg/day). Gastrointestinal irritation
was observed following oral dosing.
• Toxicity Profile for the non-benzene ring isothiazolone pesticides: The mutagenicity
data for the non-benzene ring-containing isothiazolones were largely negative, except
for a few positive observations in vitro with CMIT/MIT and DCOIT. Three
chronic/carcinogenicity studies are available for the non-benzene ring isothiazolone
pesticides, and all were negative for carcinogenicity, although two of these were found
to have major deficiencies in the chronic toxicity portion of the studies.
o The second study used dermal administration of a single dose level of 400 ppm
CMIT/MIT applied to the skin of mice for 30 months; the only significant
finding was dermal irritation.
All of the isothiazolones produced toxicity at the site of contact (i.e., irritation of the
gastrointestinal tract, skin, and respiratory tract) when administered at high doses. These
biocides produce minimal to no significant systemic toxicity; no histopathological changes
distant from the site of dosing were observed, which appears to correlate with the rapid
metabolism and excretion of these chemicals. Based on the read-across comparison, it is
concluded that:
o Skin irritation: Similar findings in all dermal studies (BIT, CMIT/MIT, OIT),
although at different dose levels.
o Skin histopathology: Similar findings for BIT and OIT; none found in the rabbit
study on CMIT/MIT.
o Clinical chemistry: Similar findings with BIT and OIT, and similar to the BIT oral
dog study.
o Severe skin irritation in the BIT dermal study.
o Skin sensitization: CMIT > DCOIT > OIT > MIT > BIT
Final Recommendation:
Based on a read-across comparison and weight-of-evidence (WOE) approach, it was recommended
that the chronic toxicity/carcinogenicity study for BIT not be required at this time, provided the
risk assessment is protective of irritation. This recommendation was based on the following
considerations: 1) the available cancer studies for the isothiazolone pesticides are negative; 2)
there is a lack of mutagenicity concern for BIT and the other isothiazolone pesticides; 3) BIT and
the other isothiazolones are irritants following oral, dermal, and inhalation exposures and produce
similar effects following subchronic exposures; 4) the isothiazolones as a group have a known
mode of action for antimicrobial activity; 5) irritation is the predominant effect and is the basis of
the points of departure (PODs); and 6) although the metabolism study for BIT showed an increased
half-life and accumulation of radioactivity in the thyroid compared to the other isothiazolone
chemicals, these observations were determined not to be of toxicological significance, as the
toxicological effects of BIT up to 90 days did not differ from the effects observed with the other
isothiazolone chemicals.
It is recommended that the available data be evaluated to inform the need for an uncertainty
factor (UF) to account for extrapolation from subchronic to chronic exposure durations for BIT.
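For illustration only, the bookkeeping behind a WOE-based waiver rationale of this kind could be captured as a simple checklist. The Python sketch below is a hypothetical aid rather than an agency procedure; the field names and the requirement that every line of evidence be satisfied are assumptions made for the example.

# Hypothetical illustration only: a checklist mirroring the WOE considerations
# summarized above. Field names and the "all lines of evidence satisfied" rule
# are assumptions for this sketch, not an agency decision procedure.
from dataclasses import dataclass, fields

@dataclass
class WoeChecklist:
    analog_cancer_studies_negative: bool           # consideration 1
    no_mutagenicity_concern: bool                  # consideration 2
    similar_irritation_profile_across_group: bool  # consideration 3
    common_pesticidal_mode_of_action: bool         # consideration 4
    irritation_is_basis_of_pods: bool              # consideration 5
    kinetic_differences_not_significant: bool      # consideration 6

def supports_waiver(checklist: WoeChecklist) -> bool:
    # In this sketch, a waiver is supported only if every line of evidence holds.
    return all(getattr(checklist, f.name) for f in fields(checklist))

bit_case = WoeChecklist(True, True, True, True, True, True)
print(supports_waiver(bit_case))  # prints True for the rationale summarized above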