Documentation in Cleaning Validation
The International Council for Harmonisation Quality Risk Management Guidance (ICH Q9)
lists both cleaning (in Annex II.4) and validation (in Annex II.6) as potential areas for the
application of quality risk management.1 This clearly implies that the ICH Q9 principle for
adjusting the level of "effort, formality, and documentation" based on the level of risk could
be applied to cleaning and its validation. Previous articles discussed how science-based and
data-derived scales could be created from HBELs (health-based exposure limits), from the
process capability (Cpu) of cleaning processes, from the detection limits for total organic
carbon (TOC) analyses of these compounds, or from visual inspection.2-5 Another article
discussed how these scales could be used to measure the level of risk in cleaning validation.6
This article builds on these discussions and shows how these HBEL-based and process
capability-based scales can be combined into a matrix that provides a clear visual guide for
adjusting the level of effort, formality, and documentation for cleaning validation based on
the level of risk.
ICH Q9 states two primary principles of quality risk management:
• "The evaluation of the risk to quality should be based on scientific knowledge and
ultimately link to the protection of the patient; and
• The level of effort, formality and documentation of the quality risk management
process should be commensurate with the level of risk".
From these two primary principles, it can be understood that if we can determine the level of
risk to a patient from cleaning, then the level of cleaning validation effort, its formality, and
its documentation could be adjusted accordingly. Stated simply, cleaning validation efforts
for low-risk products should not require the same level of effort as for high-risk products.
This is perfectly logical. The level of effort, formality, and documentation for cleaning
validation should correspond to the level of risk, which includes the available knowledge of a
cleaning process and the nature of the product.
ICH Q9 also notes that "...the ability to detect the harm (detectability) also factors in the
estimation of risk".
Now, if the hazard is intrinsic to an active pharmaceutical ingredient (API) and the risk
being considered is harm to a patient from exposure to residues of that API after cleaning,
then this equation can be further refined to:
So, if we can measure these parameters of toxicity, exposure, and detectability we should be
able to locate our position on the continuum shown in Figure 1.
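As a sketch, the refined relationship described above can be written as a single function of the three parameters the text names. The notation is an assumed shorthand reconstructed from the surrounding prose, not a reproduction of the original equation:

```latex
\text{Risk} = f\left(\text{Toxicity}_{\text{API residue}},\ \text{Exposure}_{\text{API residue}},\ \text{Detectability}_{\text{API residue}}\right)
```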
The scales presented in the preceding four articles2-5 offer scientifically based methods to
measure these risk parameters using actual data. Because they are grounded in science and
derived from actual data, they are good choices for evaluating the risk in cleaning, including
the risk to patient safety.
These scales were originally developed as replacements for the typical scales used in failure
modes and effects analysis (FMEA). The subjectivity of the scales typically used in FMEAs,
and the lack of a scientific/statistical basis for their risk priority numbers (RPNs), often
make both the scales and their RPNs inappropriate for use in the pharmaceutical industry, as
discussed in an earlier article.6 If pharmaceutical manufacturing is to advance to a science-
and risk-based approach, the scales for severity, occurrence, and detectability used in FMEAs
must be justified using science, process knowledge, and statistics. Such scales should be
derived from, and based on, empirical data. Such data exist for cleaning and are readily
available in pharmaceutical manufacturing. As stated in the
introduction, scales already exist that could be used for the following criteria:
• Severity: the HBEL-derived toxicity scale
• Occurrence: the process capability-derived scale
• Detectability: the visual detectability index or the TOC detectability index
For example, in a cleaning process, if a failure mode could result in residues of an API
remaining on equipment, then the HBEL-derived toxicity score of that API could be used as a
severity score. Furthermore, if the process capability of the cleaning process is known, then
its process capability-derived score could be used as an occurrence score (as the effectiveness
of the cleaning process and the probability of residues remaining are known). Finally, if
either the visual detectability index4 or the TOC detectability index5 is known, then one or
both of these could be used as a detectability score. Since these scores are derived directly
from empirical data, their values are specific and objective and should not be subject to
debate, as happens frequently with traditional FMEA scales. It should be noted that
detectability indexes may also be compiled from other available analytical test methods used,
such as UV, HPLC, etc.
The Process Capability-derived Scale3 converts actual cleaning data into a scale from 1 to 10
by taking the reciprocal of the process capability (Cpu) and multiplying it by 10. This results
in a scale that has a high value (i.e., 10) associated with a high probability of failure and a
low value (i.e., 1) associated with a low probability of failure, as is typical of an occurrence
scale used in FMEAs. Note: The process capability for a Six Sigma cleaning process (Cpu = 2),
which corresponds to a score of 5, was chosen as the midpoint of this scale.
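A minimal sketch of this conversion in Python follows. The function names are illustrative, the Cpu formula is the standard upper capability index (upper limit minus mean, divided by three sample standard deviations), and the swab results and limit are made-up numbers rather than data from the cited article.

```python
from statistics import mean, stdev


def cpu(residues, upper_limit):
    """Upper process capability index: (limit - mean) / (3 * sample std dev)."""
    return (upper_limit - mean(residues)) / (3 * stdev(residues))


def process_capability_score(cpu_value):
    """Reciprocal of Cpu multiplied by 10, as described in the text."""
    if cpu_value <= 0:
        raise ValueError("Cpu must be positive to be placed on the scale")
    return 10.0 / cpu_value


# Hypothetical swab results and cleaning limit (same arbitrary units).
swab_results = [1.0, 2.0, 3.0]
limit = 8.0

capability = cpu(swab_results, limit)
print(capability, process_capability_score(capability))
```

With these numbers the Cpu works out to 2, which lands on the midpoint score of 5. The sketch does not clamp the result, so a Cpu below 1 yields a score above 10; whether to cap such values is left as a design choice.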
Figure 2 shows an example of how the Toxicity Scale and Process Capability Scale can be
combined into a matrix and groupings created based on the risk. This is called the
Shirokizawa Matrix, and it can be used as a guide to determine the level of effort, formality,
and documentation necessary for cleaning validation activities.
Figure 2: Shirokizawa Matrix of Cleaning Validation Effort – This matrix is an example
and a potential model of how toxicity and process capability scores can be used together to
determine an acceptable level of effort, formality, and documentation in cleaning validation
programs.
The matrix in Figure 2 is separated into eight groups based on the coordinate positions in the
matrix from the toxicity and process capability scores. The upper left quadrant contains the
highest risk situations: high hazard compounds with poor cleaning processes. Compounds
whose scores fall into this quadrant would require the highest levels of effort, formality, and
documentation in their cleaning validation programs. The lower right quadrant contains the
lowest risk situations: low hazard compounds with excellent cleaning processes. Compounds
whose scores fall into this quadrant would require lower levels of effort, formality, and
documentation in their cleaning validation programs.
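The quadrant logic described above can be sketched in a few lines of Python. This is a deliberately coarse illustration: the actual matrix has eight groups whose boundaries are defined in Figure 2, and the midpoint of 5 used here on both scales, along with the function name and labels, is an assumption made for this example only.

```python
def risk_quadrant(toxicity_score, capability_score, midpoint=5.0):
    """Coarse placement on a toxicity vs. process capability grid.

    Both scores count as "high" at or above the midpoint. The midpoint of 5
    is an illustrative assumption, not a group boundary taken from Figure 2.
    """
    high_hazard = toxicity_score >= midpoint
    # A high process capability score means a low Cpu, i.e. poor cleaning.
    poor_cleaning = capability_score >= midpoint
    if high_hazard and poor_cleaning:
        return "highest"
    if not high_hazard and not poor_cleaning:
        return "lowest"
    return "intermediate"


print(risk_quadrant(9.6, 9.0))
print(risk_quadrant(2.5, 1.0))
```

A real implementation would replace the two-way split with the eight group boundaries and, as the authors stress, treat the result as a guide rather than a rule.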
Table 1 provides more detailed descriptions of the levels of effort, formality, and
documentation that might be required for compounds whose toxicity and cleaning process
capability scores place them in a particular group.
When the Risk-MaPP Guide7 first introduced the acceptable daily exposure (ADE) concept in
2010, one of the first reactions was that all the cleaning limits would become so low that
companies would not be able to pass any cleaning validations. This concern was not based on
any actual knowledge and was addressed by an article published in 2016 that evaluated 304
pharmaceutical products and compared their ADEs to 1/1,000th of their therapeutic dose
limits.8 This article showed that in 85 percent of these cases, the limits were higher, even
substantially higher, not lower, so this concern was unfounded. The next concern that was
raised was that these higher HBELs would allow companies to "relax their cleaning efforts."
This is another misguided assumption.
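The comparison behind that 2016 analysis, an ADE set against 1/1,000th of the therapeutic dose, is simple arithmetic and can be sketched as follows. The function names and the example product values are hypothetical and are not taken from the 304-product dataset.

```python
def dose_based_limit(therapeutic_dose_mg_per_day):
    """Traditional limit: 1/1,000th of the therapeutic dose."""
    return therapeutic_dose_mg_per_day / 1000.0


def ade_to_dose_limit_ratio(ade_mg_per_day, therapeutic_dose_mg_per_day):
    """Ratio above 1 means the ADE is higher than the dose-based limit."""
    return ade_mg_per_day / dose_based_limit(therapeutic_dose_mg_per_day)


# Hypothetical product: ADE of 0.5 mg/day, therapeutic dose of 100 mg/day.
ratio = ade_to_dose_limit_ratio(ade_mg_per_day=0.5, therapeutic_dose_mg_per_day=100.0)
print(f"ADE is {ratio:.1f} times the 1/1,000th dose limit")
```

A ratio above 1 corresponds to the "limits were higher" case reported for 85 percent of the products evaluated.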
An original goal of Risk-MaPP, and subsequently of the ASTM E3106 Standard Guide,9 has
been to implement the ICH Q9 principle that the "level of effort, formality, and
documentation" of the cleaning validation process should be based on science and risk. A
quick examination of the matrix in Figure 2 and the details in Table 1 will reveal that any
"relaxation" of cleaning effort will lead to poor cleaning process capability and result in an
increase of validation efforts. Only very good cleaning efforts can allow a company to reduce
the levels of validation effort — even to the possibility of only a visual inspection program.
The EMA indicated in Annex 15 that visual inspection could be used alone in certain cases10
and recently provided guidance on how a visual inspection program can replace analytical
testing for release of equipment after cleaning.11 Such programs could provide significant
operational benefits to companies that can successfully implement them. But the benefits for
companies moving to the HBELs can only be realized if their cleaning processes are shown
to be effective, repeatable, and safe.
Readers should understand that the groupings shown here are our initial recommendations
and should undergo updating as experience is obtained using them. Further, their boundaries
should not be considered rigid or fixed. For example, scores that put the risk at the
intersection of 5 and 6 on the toxicity scale and 8 and 9 on the process capability scale could
equally fall into groups 1, 2, 3, or 5. These groupings and their levels of effort should not be
applied blindly, and serious consideration should be put into what efforts are truly necessary
for each particular situation. These groupings are only meant to help guide the decision
process. Therefore, practitioners are urged to exercise significant QRM efforts in which all
elements of the firm’s cleaning and cleaning validation programs are carefully identified,
analyzed, and evaluated.
The previous articles cited2-6 show how the application of scientific principles and statistical
tools can be used to measure the level of risk in cleaning. Moving into the future, there
should be a shift in the focus of cleaning validation programs to the derivation of HBELs and
the development of cleaning risk assessments. The HBELs and the cleaning risk assessments
will be used to inform the master plans and protocols for the level of cleaning validation that
is necessary (Figure 3). The practice of simply using master plans and protocols from
previous validations as templates should come to an end. The contents of master plans and
protocols should be determined based on science and risk analysis and not on what was done
the "last time."
Figure 3: The Cleaning Quality Risk Management Process – The activities in the BLUE
zone are where full cleaning QRM efforts are applied. When these cleaning QRM efforts are
completed, and the risk is now acceptable, the process can move into the GREEN zone. The
GREEN zone is where the cleaning QRM level of effort, formality, and documentation can be
reduced based on the results of the cleaning QRM efforts in the BLUE zone. This is where
the grouping criteria from the Shirokizawa Matrix are applied.
The three concepts of "effort, formality, and documentation" may appear to be separate ideas,
but in practice they are closely connected. For example, it is hard to imagine how a process
that requires very little effort with very little documentation could have a very high level of
formality or how a process that requires a very high level of effort with a very high level of
formality can have very little documentation. Clearly, these aspects scale up or down with
each other. Therefore, discussing them separately is not really possible or even useful. It is
more informative to look at them together at each step of the risk management process
(Figure 3).
The most important documentation in the cleaning validation process is found at the risk
(hazard) identification stage: the HBEL monographs for chemical hazards. At
this stage, the levels of effort, formality, and documentation must remain at their highest,
since all subsequent activities, decisions, and calculations flow from the information found in
the HBEL monographs. There can be no "shortcuts" in the derivation of HBELs. An ASTM
Standard Guide is currently being finalized on the derivation of HBELs12 that will be of
immense help for companies at this stage.
The next most important documentation is the cleaning risk assessment. With the HBEL
monograph in hand, the risk assessment process can begin. While the ASTM E3106 standard
considers the HBEL at the risk (hazard) identification stage, microbiological hazards,
equipment design hazards, and procedural hazards must also be considered. For a new
product and a new facility, these other hazards may not be well understood, so a full risk
(hazard) identification may be necessary. This may trigger risk reduction activities (Figure 3).
On the other hand, for an established product in an established facility, many of these other
hazards may have already been identified, considered, and mitigated. This prior knowledge
should be leveraged and the level of effort during the risk (hazard) identification may be
reduced. The level of effort, formality, and documentation at this stage will be high, but it can
clearly be adjusted.
At the risk analysis stage, there may be no historical data to consider and analyze for a new
HBEL in a new facility. This may require collecting substantial cleaning data to satisfy
concerns about the level of risk. Cleaning process development studies are required and may
be extensive. The level of effort, formality, and documentation may remain high. The lack of
historical knowledge will likely trigger significant risk reduction activities. Alternatively,
there may be substantial historical data on the cleaning process and an analysis may show
that there is very little risk of cross contamination from this new product and, depending on
the analyzed risk, only one verification run or even a simple visual inspection may be all that
is necessary. A single cleanability test may be all that is needed to confirm that the cleaning
process can be considered adequate and reliable.13 Consequently, the subsequent levels of
effort, formality, and documentation may be reduced.
At the risk evaluation stage, where there is no historical data to consider and analyze, the
master plan and protocols may require the collection of substantial data and performance of
several cleaning performance qualification runs. Alternatively, if there is substantial historical
data and the risk analysis showed that there was little risk of cross contamination from this
new product, the protocol (or SOP) may require only one verification run or even just a visual
inspection. Such an approach is very appropriate for the cosmetics and personal care
industries, where there are many low-hazard compounds that could also be demonstrated to
be low risk.
At the risk control stage, based on the preceding risk analysis and risk evaluations, continued
monitoring using swab sampling after every changeover may be required, and if the
remaining risk is still high, then equipment may be released only after acceptable
test results. Or, the preceding risk analysis and risk evaluations may indicate that just visual
inspection is adequate.
Finally, in the risk review stage, when a new product is being introduced, the level of effort,
formality, and documentation should follow from the knowledge and understanding from the
preceding risk analysis, risk evaluation, and risk control activities. This is an important
subject, and a more detailed article on the introduction of new products is currently in
development that will provide a flow chart for how a new product should be introduced
within a cleaning QRM program.
One final observation: When it comes to "formality," ICH Q9 does not go into any detail
about how this factor is affected by a risk assessment; in fact, the word "formality" appears
only twice in ICH Q9, both times in passing. Therefore, ICH Q9 provides no real
guidance on this aspect. The term "formality" has many definitions and uses, but for the
pharmaceutical and related industries, it can be defined as "an established procedure or set of
specific activities which need to be followed." Many companies already have detailed
procedures for their cleaning validation programs, such as formal protocol templates, and
they may wish to maintain these as they are. However, companies with lower-risk products
and operations may want to simplify their cleaning validation process and move to SOPs and
checklists or cleaning records. While there is no risk to patient safety from a company
maintaining a high level of formality, there is the operational risk of performing excessive
and unnecessary work and being slow to introduce new products. The higher the risk, the
higher the level of formality, and the lower the risk, the lower the level of formality. Each
company will need to decide for itself what level of formality to apply.
Conclusion
One of the goals of the ASTM E3106-18 Standard Guide was to provide a science-, risk-, and
statistics-based approach to cleaning processes and validation within an
ICH Q9 framework and based on the FDA's 2011 Process Validation Guidance. The benefit
of such an approach would be the ability to scale the level of "effort, formality, and
documentation" of the cleaning validation process commensurate with the level of risk. The
current ability to measure risk in cleaning provides an objective tool to focus cleaning
validation efforts on the risks that are the most significant, based on the science behind the
HBEL.
The authors know that most industry workers would agree that cleaning validation efforts for
low-risk products (e.g., low toxicity, demonstrably easy to clean) should not require the same
level of effort as for high-risk products (e.g., high toxicity, hard to clean). At the same time,
we recognize that most industry workers prefer specific guidance on what they need to do,
and simply stating "based on the level of risk" is not helpful or even usable. We have shown
previously how the level of risk can be measured,6 but the question of what should be done at
a particular level of risk still needed to be answered. The Shirokizawa Matrix described in
this article is our first attempt at providing specific guidance on what levels of "effort,
formality, and documentation" could be used. In this article we provide a science-based and
data-driven approach to guide the level of effort, formality, and documentation for the
cleaning of many healthcare products, including pharmaceuticals, biopharmaceuticals,
nutraceuticals, cosmetics, and medical devices.
Peer Review:
The authors wish to thank our peer reviewers: Bharat Agrawal, James Bergum, Ph.D., Sarra
Boujelben, Gabriela Cruz, Ph.D., Mallory DeGennaro, Parth Desai, Kenneth Farrugia,
Angela Garey, Laurence O'Leary, Tri Chanh Nguyen, Miquel Romero Obon, Prakash Patel,
Stephen Spiegelberg, Ph.D., Basundhara Sthapit, Ph.D., and Joel Young for reviewing this
article and for providing many insightful comments and helpful suggestions.
References:
2. Walsh, Andrew, Ester Lovsin Barle, Michel Crevoisier, David G. Dolan, Andreas
Flueckiger, Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron, "An ADE-Derived
Scale For Assessing Product Cross-Contamination Risk In Shared Facilities,"
Pharmaceutical Online, May 2017.
3. Walsh, Andrew, Ester Lovsin Barle, David G. Dolan, Andreas Flueckiger, Igor
Gorsky, Robert Kowal, Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron,
"A Process Capability-Derived Scale For Assessing Product Cross-Contamination
Risk In Shared Facilities," Pharmaceutical Online, August 2017.
4. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Andreas Flueckiger, M.D., Igor Gorsky, Robert Kowal, Mariann Neverovitch,
Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron, "A Swab Limit-Derived
Scale For Assessing The Detectability Of Total Organic Carbon Analysis,"
Pharmaceutical Online, January 2018.
5. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Mariann Neverovitch, Mohammad Ovais, Osamu Shirokizawa, and Kelly
Waldron, "An MSSR-derived Scale for Assessing the Detectability of
Compound-Carryover in Shared Facilities," Pharmaceutical Online, December 2017.
6. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Andreas Flueckiger, Igor Gorsky, Jessica Graham, Ph.D., Robert Kowal,
Mariann Neverovitch, Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron,
"Measuring Risk in Cleaning: Cleaning FMEA and the Cleaning Risk Dashboard,"
Pharmaceutical Online, April 2018.
8. Walsh, Andrew, Michel Crevoisier, Ester Lovsin Barle, Andreas Flueckiger, David
G. Dolan, and Mohammad Ovais (2016), "Cleaning Limits—Why the 10-ppm and
0.001-Dose Criteria Should be Abandoned, Part II," Pharmaceutical Technology 40 (8).
9. American Society for Testing and Materials, E3106-18, "Standard Guide for
Science-Based and Risk-Based Cleaning Process Development and Validation,"
www.astm.org.
10. EudraLex, "Volume 4 – Guidelines for Good Manufacturing Practices for Medicinal
Products for Human and Veterinary Use, Annex 15: Qualification and Validation,"
https://ec.europa.eu/health/documents/eudralex/vol-4_en.
11. European Medicines Agency, "Questions and answers on implementation of risk-based
prevention of cross-contamination in production and 'Guideline on setting
health-based exposure limits for use in risk identification in the manufacture of
different medicinal products in shared facilities,'" 19 April 2018,
EMA/CHMP/CVMP/SWP/246844/2018.
12. American Society for Testing and Materials, Work Item WK59975, "Standard Guide
for the Derivation of Health Based Exposure Limits (HBELs)," www.astm.org.
13. Song, Ruijin, Alfredo Canhoto, Ph.D., and Andrew Walsh, "Cleaning Process
Development: Cleanability Testing and 'Hardest-To-Clean' Pharmaceutical Products,"
Pharmaceutical Online, January 2019.
An ADE-Derived Scale For Assessing Product Cross-Contamination Risk In Shared Facilities
By Andrew Walsh; Ester Lovsin Barle, Ph.D.; Michel Crevoisier, Ph.D.; David G. Dolan,
Ph.D.; Andreas Flueckiger, MD; Mohammad Ovais; Osamu Shirokizawa; and Kelly Waldron
From a cGMP (quality) perspective, Risk-MaPP provided guidance on how to manage the
risk of product cross-contamination in shared facilities, from the four main sources the Risk-
MaPP guide identified:
• Mix-up
• Retention
• Mechanical transfer
• Airborne transfer
During the planning meetings for the Risk-MaPP Guide, the FDA specifically requested that
this guide:1
Risk, in terms of a hazard (i.e., the potential source of harm), can also be expressed as:
If the hazard is intrinsic to an active pharmaceutical ingredient (API), then this general
equation can be further refined to:
Patient risk for adverse effects increases as exposure to a given API or other compound rises
above the ADE (which is synonymous with the permitted daily exposure [PDE] and the
health-based exposure limit [HBEL], as used in the EU3). This risk is a function of the
unique dose-response-duration relationship for each compound and the level of exposure to
the compound from, for example, residual API carryover after cleaning. Note that this
methodology can be applied to any agent or compound, and is therefore not exclusive to
APIs. Consequently, the ADE provides a value that can be used as a surrogate for severity to
calculate the potential cleaning risk, as shown in the following equation:
This equation tells us that the risk to a patient from cleaning is a function of the toxicity of
the drug and the level of exposure (residues) found after cleaning.
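For reference, the chain of relationships described in this passage can be sketched as follows. The notation is an assumed shorthand built from the terms the text uses (hazard severity, exposure, ADE-based toxicity, residues) and is not a reproduction of the original equations:

```latex
\text{Risk} = f\left(\text{Severity of hazard},\ \text{Exposure to hazard}\right)
\quad\Longrightarrow\quad
\text{Cleaning risk} = f\left(\text{Toxicity}_{\text{ADE}},\ \text{Exposure}_{\text{residue}}\right)
```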
Introducing The Toxicity Scale and Toxicity Scores
Since ADE values vary over several orders of magnitude, they are hard to compare directly.
However, to facilitate such a comparison, the values can be converted into a logarithmic scale
in a manner similar to that used to create the pH scale. By converting the ADE values into
units of grams per day and taking their negative logarithms, a continuous toxicity scale can be
generated, as shown in Table 1. The resulting toxicity scores are shown ranging from 0 to 10.
However, it should be understood that toxicity score values above 10 and below 0 can still be
legitimately obtained.
Similar to how the pH scale provides a convenient means of quickly assessing and comparing
the acidity or alkalinity of solutions, this toxicity scale can be used as a means of comparing
the toxicities or hazard levels of pharmaceutical compounds.
For example, in Figure 1, the ADE values for four well-known compounds were converted to
this toxicity scale. The first compound is dioxin (i.e., 2,3,7,8-Tetrachlorodibenzo-p-dioxin), a
well-known and very hazardous compound with an ADE of 35 picograms/day (toxicity score
= 10.5). The second compound is arsenic trioxide, a relatively hazardous compound used to
treat acute promyelocytic leukemia, with an ADE of 13 µg/day (toxicity score = 4.9). The
remaining two are low-hazard compounds: aspirin, with an ADE of 5 mg/day (toxicity
score = 2.3), and sodium chloride, with an ADE of 26 mg/day (toxicity score = 1.6).
Figure 1: Comparison of pH and toxicity scales
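The conversion described above (ADE expressed in grams per day, then the negative base-10 logarithm) can be sketched in Python. The function name is illustrative; the four ADE values are the ones quoted in the text.

```python
import math


def toxicity_score(ade_grams_per_day):
    """Negative base-10 logarithm of the ADE expressed in grams per day."""
    if ade_grams_per_day <= 0:
        raise ValueError("ADE must be a positive quantity")
    return -math.log10(ade_grams_per_day)


# ADE values from the text, converted to grams per day.
ade_values = {
    "dioxin": 35e-12,            # 35 picograms/day
    "arsenic trioxide": 13e-6,   # 13 micrograms/day
    "aspirin": 5e-3,             # 5 mg/day
    "sodium chloride": 26e-3,    # 26 mg/day
}

for compound, ade in ade_values.items():
    print(f"{compound}: {toxicity_score(ade):.1f}")
```

Rounded to one decimal place, these reproduce the scores given in the text (10.5, 4.9, 2.3, and 1.6), including a value above 10 for dioxin.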
While it is not unexpected to see aspirin and sodium chloride occupying the low end of the
toxicity scale and dioxin at the high end, readers may find it surprising to see arsenic trioxide,
a compound most people would consider to be quite hazardous, occupying a place at the
midpoint of the scale. This illustrates the ADE concept at work: The toxicity of a compound
is dose-dependent, so while higher doses of a compound may generate extreme adverse
effects, somewhat lower doses may be harmless. The toxicity scale helps to reveal this in
relation to other compounds/APIs. It should also be understood that exceeding the ADE by a
small amount does not necessarily put patients at risk. For example, a swab result exceeding
the ADE-derived limit by a small amount during a cleaning validation study is not an
immediate cause for alarm.
In our previous article3, the ADEs of 304 APIs were compared to their 0.001 dose-based
limits as a demonstration of how inaccurate and overly conservative the 0.001 dose-based
approach is in estimating safe levels for exposure in patients, which could lead to impractical
and unachievable limits. The distribution of these ADE values fits well within the presented
scale, with an average toxicity score of 3.7; a median of 3.6; a mode (16 compounds) of 3.3;
and a slightly positive skewness of 0.42, indicating a fairly normal distribution for these
values (Figure 2). While the analysis did not include all known pharmaceutical compounds, a
significant sample of these compounds was represented, with the results indicating that there
was not a bias toward highly toxic or relatively nontoxic compounds in the dataset. The
analysis also demonstrates that the toxicity scale encompasses the typical range of ADEs
quite well.
Figure 2: Histogram of ADE-derived toxicity scores for 304 APIs
The toxicity scale can be used to visualize and, quite quickly, understand the relative hazard
of different compounds that are manufactured in a common facility. The scale can also satisfy
the US FDA's desire for a tool that both identifies the hazard of a drug from a maximum safe
carryover (MSC) perspective and provides a linkage to cleaning validation. Figure 3 provides
examples of how the toxicity scale can be used to visualize the hazards of drug products (as
expressed by their toxicity scores and, thus, the ADE values of their APIs) manufactured in facilities
where the hazards are low, moderate, and high; and one where the hazards are mixed.
Figure 3: Comparison of four facilities with five products each, using the toxicity scale
In the above examples we can clearly see that, in Facility A, the products are of very low
hazard. To manage the risks of operator and patient exposure, this facility would not need to
employ the same level of controls as a facility with high-hazard products. Conversely,
Facility C handles highly hazardous products and would need to have significant controls in
place. It should be understood that, all other factors being relatively equal, lower-hazard
products mean lower manufacturing risk and higher-hazard products mean higher
manufacturing risk. Using this model, a company can quickly assess whether one facility
requires a greater degree of controls than another facility, or, as suggested by the Risk-MaPP
Guide, whether a given product is appropriate for introduction and manufacture in an existing
facility with its equipment and cleaning practices, or whether adaptation of one or the other is
necessary.
From a cleaning perspective, it should also be clear that Facility A has very low hazards
associated with its products, and therefore should not have to put the same cleaning
validation program in place that Facility C may require. Facility A may actually be able to
use "visually clean" as its sole acceptance criterion for cleaning validation if the maximum
safe surface residues (MSSRs) calculated from these ADEs are well above the level that is
visible and all surfaces are capable of being inspected. There may also be limited need, or no
need, for any continued monitoring based on risk management criteria of an individual
facility. On the other hand, the MSSRs for Facility C would most likely be well below the
level that is visible for these APIs, so swab/rinse sampling would be required and specific
analytical methods may even be needed. Continued monitoring may be necessary. Facility D
presents a unique mixture of low, medium, and high hazards. In a case such as this,
comprehensive manufacturing controls, swab/rinse sampling, and continued monitoring may
be necessary after Drug 1 (toxicity score = 9.6). Decontamination and other cross-
contamination mitigation steps may even be necessary. But Drug 4 (toxicity score = 2.7) and
Drug 5 (toxicity score = 2.5) may only require "visually clean" as their sole acceptance
criterion for cleaning validation, especially if they are followed by products with low
maximum daily doses.
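The "visually clean" condition described above can be expressed as a simple check. This is only a sketch: the text says the MSSR must be "well above" the visible level without quantifying it, so the safety margin of 10 is an assumed placeholder, as are the function and parameter names.

```python
def visually_clean_may_suffice(mssr, visible_residue_limit,
                               all_surfaces_inspectable, margin=10.0):
    """Screen whether visual inspection alone might be justified.

    mssr and visible_residue_limit must share the same units (for example,
    micrograms per square centimeter). margin is a hypothetical factor
    standing in for "well above"; it is not a value from the article.
    """
    if visible_residue_limit <= 0:
        raise ValueError("visible residue limit must be positive")
    return all_surfaces_inspectable and mssr >= margin * visible_residue_limit


# Hypothetical values: a low-hazard product and a high-hazard product.
print(visually_clean_may_suffice(500.0, 4.0, all_surfaces_inspectable=True))
print(visually_clean_may_suffice(0.5, 4.0, all_surfaces_inspectable=True))
```

A passing result would only flag a candidate for a visual inspection program; the decision itself still belongs in the documented risk assessment.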
Benefits To Regulators
The toxicity scale can benefit regulators, as well, since an inspector seeing an API toxicity
scaling graphic within a facility risk assessment may choose to focus less inspection effort on
the cross-contamination controls for Facility A, and move on to Facility C, which presents
higher cross-contamination risk. Based on the relatively hazardous products manufactured in
Facility C, it would be appropriate for an inspector to spend more energy exploring the cross-
contamination controls for such a facility. Facility D has a mix of low-, moderate-, and high-
hazard products, and may have different levels of controls in place for handling them. As
such, inspectors might elect to investigate how this facility handles these products and may
want to focus their inspection on the risk reduction and cross-contamination controls due to
Drug 1 and how these products interact with the other products (e.g., mix-up, retention,
mechanical transfer, and airborne transfer).
Conclusion
As stated earlier, the US FDA, EU, and other health authorities have, during the development
of the Risk-MaPP Guide and through other published guidelines, expressed a strong interest
in an "approach for identifying highly hazardous drugs." We believe that dividing
pharmaceutical compounds into two classes, highly hazardous and non-hazardous, cannot be
scientifically justified, and that the hazards that drugs present to patients should be viewed on
a continuum. The new toxicity scale described in this article provides such an approach, and
potentially could be used:
3. to communicate relative severity levels of hazards to internal (e.g., QA, RA) and
external (e.g., customers, regulators) stakeholders
5. to assist inspectors in identifying facilities and areas with the greatest risk to focus on
prior to or during their inspections, potentially accelerating inspections — a benefit to
both regulator and industry
Future Publications
The toxicity scale presented in this article provides a way to measure the relative toxicity of
compounds with regard to cross-contamination. A subsequent article will discuss a new scale
for probability based on process capability values. Another article will discuss a new scale for
detectability based on visual residue limits (VRLs) and maximum safe surface limits. A final
article will discuss how these three scales in combination can be used in FMEAs/FMECAs
for risk assessment of cleaning or other processes.
Peer Reviewers: Thomas Altmann; James Bergum, Ph.D.; Alfredo Canhoto, Ph.D.; Parth
Desai; Mallory DeGennaro; Igor Gorsky; Michael Hyrtzak, Ph.D.; Robert Kowal; Mariann
Neverovitch; and Joel Young
References:
This article discusses the concept and measurement of risk as it applies to the cleaning of
pharmaceutical products. Four previous articles discussed how science-based data-derived
scales could be created using compound HBELs (health-based exposure limits), from the
process capability (Cpu) of the products’ cleaning processes and from the detection limits for
visual inspection or for total organic carbon (TOC) analyses of these compounds.1-4 This
article continues the discussion about the potential use and application of these new scales in
cleaning failure modes and effects analysis (cleaning FMEA) to assist in measuring the risk
of cleaning process failures as well as how these scales can be applied to develop a cleaning
risk dashboard. The article will also discuss how these new scales can be utilized to
accelerate new product introductions.
Note: This article uses the term health-based exposure limit (HBEL), which is synonymous
with the terms acceptable daily exposure (ADE) and permitted daily exposure (PDE).
Most people will tell you they know what risk is, and they can give clear examples of risks in
their lives. But if asked, they will not know, or will have difficulty identifying, what the
underlying components of risk are. This is probably because most people have come to
understand risk through personal experience and not through any formal study of risk or its
measure. Historically, risk has not been very well understood or evaluated properly.5 For
example, many people consider all snakes to be dangerous and a risk, although only some
snakes are actually venomous and many are harmless and even beneficial. Similarly, while
some drugs may be hazardous, that does not mean all of them should be considered a high
risk. While risk management has been in use in various industries for many years, it has been
seriously misconstrued.6, 7 These problems also apply to the consistency of hazard
classification and risk assessment of chemicals.8
In 2005, risk was defined for the pharmaceutical industry in the International Council on
Harmonization Quality Risk Management Guideline (ICH Q9), which was formally adopted
by the FDA in 2006.9 As stated in ICH Q9:
"It is commonly understood that risk is defined as the combination of the probability
of occurrence of harm and the severity of that harm."
and further on:
"The ability to detect the harm (detectability) also factors in the estimation of risk."
In ICH Q9 we see risk deconstructed into two subparts: severity and probability, and a third
element of possible prevention, detectability. If we could measure these two (or three)
subparts as they apply to the cleaning of healthcare products, we could then determine what
the level of risk is for cleaning validation and ultimately for a cleaning process. Why would
measuring risk be important for cleaning validation? Most importantly, because ICH Q9
states that the two primary principles of quality risk management are:
"The evaluation of the risk to quality should be based on scientific knowledge and
ultimately link to the protection of the patient; and
The level of effort, formality, and documentation of the quality risk management
process should be commensurate with the level of risk."
From these two primary principles it can be understood that if we can determine the level of
risk to a patient from cleaning, then the level of cleaning validation effort, its formality, and
its documentation can be adjusted based on that risk. More simply, cleaning validation for
low-risk situations should not require the same level of effort as for high-risk situations. This
is quite logical. The level of effort, formality, and documentation of cleaning validation
should be scaled to the level of risk, as well as the available knowledge of a cleaning process.
ICH Q9 clearly states that these principles are applicable to validation (in Annex II.6).
Moreover, they apply to cleaning, including setting acceptance limits for cleaning processes
(in Annex II.4). So, cleaning validation efforts, formality, and documentation should be
adjusted based on the level of risk(s) identified in a risk assessment (RA) and managed
through a quality risk management system.
While that may be good news, an article in 2015 by Kevin O'Donnell of the Health Products
Regulatory Authority asserted that the implementation of quality risk management in the
pharmaceutical industry may have been riddled with misunderstandings.10 One of the issues
with risk management he identified was a lack of sound scientific principles being used in
that the "probability of occurrence estimates are not based on any kind of historical data,
preventative controls, or on modeling data," and that there have been "assumptions regarding
risk severity and detection that are totally unsound." Another issue was making "important
decisions based on Risk Priority Number (RPN) values which fail to recognize that those
values are derived only from ordinal scale numbers" and "are not mathematically
meaningful" and that these RPNs are often "associated with high levels of subjectivity,
uncertainty and guesswork."10 Other recent articles have explored the weaknesses of the use
of risk matrices to derive RPNs.11-18
Clearly, it would be very helpful if the pharmaceutical industry had the means to measure
these elements of risk based on sound scientific principles. The scales presented in the first
four articles1-4 offer science-based answers to these issues – specifically with regard to
cleaning – that can be readily utilized in meaningful, measurable, and practical risk-based
approaches.
Going back to ICH Q9, we see risk can be formally expressed as:
Risk = f (Severity of Hazard, Level of Exposure to Hazard, Detectability of Hazard)
Now, if the hazard is intrinsic to an active pharmaceutical ingredient (API) and the risk
being considered is harm to a patient from exposure to residues of that API after cleaning,
then this equation can be further refined to:
Risk = f (Toxicity of API Residue, Level of Exposure to API Residue, Detectability of API Residue)
Since the scales presented in the previous four articles are all based on good science and
derived from actual data, they would consequently make good choices to use for evaluating
the risk in cleaning.
One of the most commonly used tools for risk assessment, widely used in the pharmaceutical
industry, is the FMEA. The FMEA is considered a systematic, comprehensive, and powerful
tool for performing risk management and has also been adapted for the evaluation of
processes, so it fits well into the assessment of cleaning processes. The FMEA was developed
by the U.S. military shortly after World War II and published as MIL-P-1629.19 It was
adopted for use by NASA and the aviation industry in the early 1960s, then in the 1970s by
the automotive industry. It was adopted later by many other industries, eventually making its
way into international standards such as ASTM and ISO, but only in recent years has it been
implemented in the pharmaceutical industry.
FMEAs typically use three criteria in their evaluation of failure modes or hazards that fit well
in the ICH Q9 definition of risk:
• Severity of the effect of the failure
• Likelihood of occurrence of the failure
• Ability to detect the failure (detectability)
Once a failure mode is identified, the severity of the effect of the failure, the likelihood of its
occurrence, and the ability to detect this failure are then determined. In the FMEA, these
three criteria are normally evaluated using ordinal scales that can range from 1-10, 1-5, 1-3
(Low/Medium/High), or other combinations, with 1 being the lowest score and 3, 5, or 10
being the highest. Table 1 shows some general rating scores used in FMEAs.20
A review of the descriptions and definitions in Table 1 will quickly reveal that these factors
do not directly translate to many pharmaceutical operations. The consequences of
manufacturing failures affecting pharmaceutical products, such as a cleaning failure, are
substantially different from the failures that might affect other unrelated industries. There is
therefore a need for pharmaceutical companies to establish more appropriate definitions and
descriptions for each of these values within their organizations that are truly reflective of the
realities of their operations. Compounding this challenge are issues with different
stakeholders, such as QA, technical services, and operations, having widely different
opinions on what is a correct score, since most definitions are general, subjective, and
debatable.
Beyond these difficulties and the issues mentioned above,11-18 there are other issues with the
traditional FMEA approach that have been identified and described by Donald J. Wheeler.21
In his article, Wheeler points out that while the possible RPNs range from 1 to 1,000, an
actual calculation of these RPNs results in a very skewed distribution of only 120 possible
actual results (Figure 1).
Figure 1: Distribution of RPN results (used with permission of the author)
Wheeler goes on to show that there are no fewer than 15 combinations that could result in an
RPN of 360, some of which could be considered critical and others, perhaps, not so much. So
the RPN numbers derived using these subjective scales have the potential to be very
misleading (Figure 2).
Figure 2: Fifteen "equivalent" problems having an RPN = 360 (used with permission of the
author)
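Wheeler's counts can be checked by brute force. The short Python sketch below assumes the common 1-10 ordinal scale for each of severity, occurrence, and detectability (an assumption here, since scales vary by organization) and enumerates all 1,000 score combinations. The variable names are illustrative only.

```python
from collections import Counter
from itertools import product

# Assumption: each of S, O, and D is scored on a 1-10 ordinal scale.
SCALE = range(1, 11)

# Count how many (S, O, D) combinations produce each RPN = S * O * D.
rpn_counts = Counter(s * o * d for s, o, d in product(SCALE, repeat=3))

total_combinations = sum(rpn_counts.values())  # every (S, O, D) triple evaluated
distinct_rpns = len(rpn_counts)                # RPN values that can actually occur
combos_for_360 = rpn_counts[360]               # SOD combinations giving RPN = 360

print(f"Combinations evaluated: {total_combinations}")
print(f"Distinct RPN values: {distinct_rpns}")
print(f"Combinations with RPN = 360: {combos_for_360}")
```

Under these assumptions, the enumeration reproduces the 120 distinct RPN values and the 15 combinations for an RPN of 360 described above.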
Wheeler further explains that the ordinal scales typically used in FMEAs cannot be
multiplied legitimately. Looking at the definitions of the scores in Table 1 and the example
results in Figure 2, it quickly becomes obvious that the RPN values from their multiplication
have no particular or practical meaning.
Wheeler goes on to suggest that instead of multiplying them, these scores should remain as
they are and the severity (S), occurrence (O), and detectability (D) scores could simply be
expressed as a numerical string -- SOD. For example, SOD = 937, or SOD = 396. This
approach would maintain the integrity of the original scores, which could allow for more
appropriate ordering. This also enables a reviewer to see where quantitative improvements
were made after any recommended actions were taken. For example, if a failure mode had an
SOD of 978, and the new score was 965, it would be clear that a small decrease was made in
the occurrence and a greater improvement made in the detectability. However, when the
scores are converted to RPN values, they would be 504 and 270, which would seem to be a
significant overall improvement, while in reality there was only a small improvement.
Therefore, the magnitude of calculated numbers is very misleading and the actual “how it
happened” is unclear.
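The comparison in this example can be sketched in a few lines of Python. The helper names `rpn` and `sod_changes` are hypothetical, and the sketch assumes each score is a single digit (1-9), so an SOD string such as "978" can be split character by character; a score of 10 would need a different representation.

```python
def rpn(sod: str) -> int:
    """Multiply the three single-digit scores of an SOD string such as '978'."""
    s, o, d = (int(ch) for ch in sod)
    return s * o * d


def sod_changes(before: str, after: str) -> dict:
    """Return the change in each criterion, keeping S, O, and D separate."""
    labels = ("severity", "occurrence", "detectability")
    return {name: int(a) - int(b) for name, b, a in zip(labels, before, after)}


before, after = "978", "965"

# The multiplied values hide which criterion changed and by how much.
print("RPN before:", rpn(before), "RPN after:", rpn(after))

# The SOD view shows the change in each individual score.
print("Changes by criterion:", sod_changes(before, after))
```

Keeping the three scores separate shows directly that severity is unchanged, occurrence dropped by one, and detectability dropped by three, which the two RPN values alone cannot convey.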
The subjectivity of the FMEA scales typically used, and the lack of a scientific/statistical
basis for their RPN numbers, make both these scales and their RPNs unacceptable for use in
the pharmaceutical industry. If pharmaceutical manufacturing is to advance to a science- and
risk-based approach, the scales for severity, occurrence, and detectability used in FMEAs
must be scientifically justified using scientific principles, process knowledge, and statistics.
These scales should be derived from, and based on, empirical data. Such data exists for
cleaning and is readily available in pharmaceutical manufacturing production. As stated in
the introduction, scales already exist that can be used for the following criteria:
• Severity: the HBEL-derived toxicity scale
• Occurrence: the Cpu-derived process capability scale
• Detectability: the visual detectability index and the TOC detectability index
For example, in a cleaning process, if a failure mode could result in residues of an API
remaining on equipment, then the HBEL-derived toxicity score of that API would replace the
severity score. Furthermore, if the process capability of the cleaning process is known, then
its Cpu-derived score could replace the occurrence score (as the cleaning process
effectiveness and the probability of residues are known). Finally, if either the visual
detectability index3 or the TOC detectability index4 is known, one or both of these could
replace the detectability score. Since these scores are derived directly from empirical data,
their values are specific, objective, and nondebatable.
In the previous articles on detectability scales,3, 4 it was suggested that the selection of the
analytical methods used in cleaning validation studies should be based on the level of risk.
These articles showed a diagram (Figure 3) that linked the selection of analytical methods to
the toxicity scores of compounds. Compounds of low toxicity (lower risk) might only use
visual inspection, while compounds of high toxicity (higher risk) might require advanced
selective methods. However, when to transition from one group of methods to another is
unclear from this figure, and these articles presented detectability scales for visual inspection
and TOC that could guide the selection process based on actual data.
Figure 3: Risk hierarchy of analytical methods [Note: Toxicity scale is based on –log(HBEL)
where HBEL is the health-based exposure limit in grams]
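As a minimal illustration of the note above, the toxicity score can be computed as the negative base-10 logarithm of the HBEL expressed in grams. The function name `toxicity_score` and the example HBEL values are hypothetical and are not taken from any table in this article.

```python
import math


def toxicity_score(hbel_grams: float) -> float:
    """Toxicity score = -log10(HBEL), with the HBEL expressed in grams."""
    if hbel_grams <= 0:
        raise ValueError("HBEL must be a positive value in grams")
    return -math.log10(hbel_grams)


# Hypothetical HBELs for illustration only (values in grams).
examples = {
    "1 microgram": 1e-6,
    "10 milligrams": 1e-2,
}

for label, hbel in examples.items():
    print(f"HBEL of {label}: toxicity score = {toxicity_score(hbel):.1f}")
```

On this scale, a lower HBEL (a more hazardous compound) gives a higher score, and each unit of the score corresponds to a tenfold change in the HBEL.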
Table 2 shows how detectability scores derived using the calculations from the detectability
articles3, 4 could be used to determine the most advisable risk-based approach for 10 drugs.
For the 10 drugs in Table 2, the hypothetical criteria used for selecting a TOC method was at
least 1 log below zero and for using visual inspection was at least 2 logs below zero. (Note:
Companies will need to select their own criteria based on their level of risk acceptance.) So
for Drugs 2, 3, and 8, selective methods are necessary as they are well above zero. For Drugs
5 and 9, TOC is acceptable, but visual inspection is not, and for Drugs 4 and 7, both TOC and
visual inspection are acceptable. Visual inspection alone would be acceptable for Drugs 1, 6,
and 10, as they are well below -2 logs.
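The selection logic described for Table 2 can be sketched as a simple rule. The cutoffs below are the hypothetical criteria stated above (TOC at least 1 log below zero, visual inspection at least 2 logs below zero); the function name `acceptable_methods` and the index values passed to it are assumptions for illustration, and each company would substitute its own criteria.

```python
from typing import List


def acceptable_methods(
    visual_index: float,
    toc_index: float,
    visual_cutoff: float = -2.0,  # hypothetical: at least 2 logs below zero
    toc_cutoff: float = -1.0,     # hypothetical: at least 1 log below zero
) -> List[str]:
    """List the analytical methods whose detectability index meets its cutoff."""
    methods = []
    if visual_index <= visual_cutoff:
        methods.append("visual inspection")
    if toc_index <= toc_cutoff:
        methods.append("TOC")
    # If neither nonselective method qualifies, a selective method is needed.
    return methods or ["selective method"]


# Hypothetical detectability indices (not the values from Table 2).
print(acceptable_methods(visual_index=-2.5, toc_index=-3.0))
print(acceptable_methods(visual_index=-1.0, toc_index=-1.5))
print(acceptable_methods(visual_index=0.5, toc_index=0.8))
```

Exposing the cutoffs as parameters reflects the point that the acceptance criteria depend on each company's level of risk acceptance.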
As both the HBEL-based toxicity scale for severity of hazard and the Cpu-based process
capability scale for probability of exposure (occurrence) are not arbitrary values, they
consequently have real significance. The toxicity and probability of exposure may be
evaluated first, and then detectability can be considered for prioritization when the toxicity
and probability of exposure of two hazards are equal. Table 3 shows the toxicity and process
capability scales side by side from the highest to the lowest possible values.
Table 3: Calculating Cleaning Risk Using the Toxicity and Cpu Scales
In the article on the Cpu-based process capability scale,2 a table was shown (Table 4) asking the
reader to select the risk ranking for 10 hypothetical drugs based on these SO scores.
The following considerations are proposed to answer the question in that article:
• Drug 1 and Drug 2 have the same RPN scores, but the cleaning procedure for Drug 1
needs considerable improvement to assure that any residues after cleaning are at safe
levels, while Drug 2 does not. However, the traditional RPNs assign them an equal
level of risk.
• The traditional RPN method puts Drug 9 as the highest risk (RPN = 48), but it is not
highly toxic, although its cleaning process is not very effective. Based on its high
RPN, it is followed by Drug 5, which is highly toxic, although its cleaning process is
very effective.
• Conversely, Drug 6, with a low toxicity, has a very poor cleaning process that is
assured to leave residues, but it has the second lowest RPN score.
It should be evident that multiplying these scores obscures the important information found in
the individual scores. More importantly, it can lead to poor risk analysis and decisions. So,
keeping the raw scores is appropriate. The remaining question is how the risk is objectively
analyzed. One possible way is to give priority to the toxicity scores. Table 5 shows the same
data as Table 4 sorted from the highest toxicity score to the lowest.
• Now we see that Drug 5 is ranked as the highest risk, as it has the highest toxicity
score, but its cleaning procedure is very effective and the risk of patient exposure to
residues is very low.
• Drug 2 has the next highest toxicity score, but its cleaning procedure is more effective
than that for Drug 5 (refer to Table 4) and the risk of patient exposure to residues is even
lower.
• Drug 9 has a moderate toxicity score, but its cleaning procedure is much worse than
those for Drugs 5 and 2 and has a high probability of leaving residues, leading to cross
contamination and patient exposure.
• Drugs 1 and 6 present low hazards, but their cleaning procedures will definitely leave
residues leading to cross contamination and therefore have high risks for patient
exposure.
It becomes apparent that simply ranking compounds by their toxicity scores
is not a suitable way to measure cleaning risk.
Table 6: Ranking Level of Risk by Cpu Score
Table 6 shows the same data as Table 5 but sorted from the highest process capability (Cpu)
score to the lowest.
• Now we see that Drug 6 is the highest risk since it has the worst probability score due
to poor cleaning process capability and will leave residues. Although Drug 6 is not
very hazardous, it clearly poses the highest risk for cross contamination.
• Drug 1 has the next highest cleaning process capability score. Although Drug 1 is
slightly more hazardous than Drug 6, its cleaning procedure is more capable of
reducing residues than that for Drug 6. This example shows that while Drug 1 is not very
hazardous, it poses a high risk for cross contamination due to a poor cleaning process.
• Drug 9 is next, as its cleaning procedure is not very good. Although Drug 9 has a
moderate toxicity score, is likely to leave residues, and poses a high risk for cross
contamination, the probability of residues is lower than for Drugs 6 or 1.
These drugs are now ordered from 10 to 1 based on their risk of cross contamination. It
appears that ranking by cleaning process capability followed by toxicity is a promising
approach to risk management in cleaning. Detectability scores for visual inspection and TOC
can be added into the analysis for more refinement of the level of risk.
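The ordering described above, by Cpu score first and toxicity score second, can be sketched as a two-key sort. The class name `DrugScores`, the function name `rank_by_risk`, and all score values below are invented for illustration; they are not the values from Tables 4 through 6. As in the discussion above, a higher Cpu score is taken to mean a higher probability of residues.

```python
from typing import List, NamedTuple


class DrugScores(NamedTuple):
    name: str
    toxicity: float  # HBEL-derived toxicity score (higher = more hazardous)
    cpu: float       # Cpu-derived score (higher = higher probability of residues)


def rank_by_risk(drugs: List[DrugScores]) -> List[DrugScores]:
    """Sort from highest to lowest risk: Cpu score first, toxicity as tie-breaker."""
    return sorted(drugs, key=lambda d: (d.cpu, d.toxicity), reverse=True)


# Invented scores for illustration only.
drugs = [
    DrugScores("Drug 5", toxicity=10, cpu=3),
    DrugScores("Drug 2", toxicity=9, cpu=2),
    DrugScores("Drug 9", toxicity=6, cpu=8),
    DrugScores("Drug 1", toxicity=2, cpu=9),
    DrugScores("Drug 6", toxicity=1, cpu=10),
]

for rank, drug in enumerate(rank_by_risk(drugs), start=1):
    print(rank, drug.name, f"Cpu score={drug.cpu}", f"toxicity score={drug.toxicity}")
```

Because the scores are kept separate, a detectability score could be appended to the sort key as a further tie-breaker without changing the approach.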
But what of the ICH Q9 promise of the quality risk management process being commensurate
with the level of risk? Can these Cpu and toxicity scores be used for managing cleaning
programs and developing a control strategy based on the risk? Table 7 shows a proposed
high-level evaluation of the 10 drugs in the above example that may be classified into
different risk levels based on these scores. (Note: The reader should understand that the
toxicity and Cpu scales are continuous scales and can have intermediate values [e.g., 6.3, 4.7,
etc.], so these classifications are for example only and should not be considered definitive in
any way.)
Table 7: Example Risk Evaluation Based on Cpu Score and Toxicity Score
Based on the example evaluations shown in Table 7, an action plan for each drug could be
put in place to reduce risk or to mitigate the unacceptable risks or, if the risk is determined to
be acceptable, to develop a control plan to maintain that acceptable level of risk.
Table 9 shows an example formal cleaning FMEA using the scales in this article. In this
hypothetical, a number of basic possible cleaning failures are listed, such as "cleaning
solution concentration too low." While the listed product has a toxicity score of 7.7, the
cleaning process is very effective and residues can be easily detected visually and by TOC.
This detectability should be included in the risk analysis of these failures and then become
part of the control strategy. However, the cleaning process capability shown may not always
be the same if the cleaning agent solution is not made correctly. Similar concerns can arise
about the cleaning agent contact time not being long enough or the temperature being too
low. How should this be addressed? Such questions can be answered using data from design
of experiments combined with Monte Carlo analysis and will be discussed in the next article.
Dashboards are widely used in business to provide simple "at-a-glance" tools that can quickly
show visual representations of complex relationships among many business metrics, key
performance indicators (KPIs), or any other data important to making decisions about a
business process. Dashboards communicate knowledge efficiently and simplify the decision-
making process in business and other endeavors by making multiple sources of data and their
relationships easy to visualize. Ultimately, a critically important process such as QRM would
benefit from a dashboard that could easily present the multiple sources of data so decisions
concerning risk can be made efficiently and with confidence.
The scales discussed in this article and in the previous four articles can be used to develop
such a dashboard. Figure 4 shows an example of how new compounds can be quickly and
easily evaluated to determine whether the current cleaning process and analytical methods
allow these compounds to be manufactured in a shared equipment facility. Their HBELs are
determined and evaluated against the facility's existing cleaning data that compares its
cleaning process capability against the known detection limits to determine if the existing
methods are capable of detecting these new compounds.
Note: Excel spreadsheets for creating these scales can be downloaded for free:
Immediately, it can be seen that Drug 1 is a very toxic compound and that the current
cleaning process cannot adequately clean it to prevent cross contamination issues. (Note:
Process capability can be evaluated based on existing cleaning data compared to the limits
required by the new compound). In addition, residues cannot be detected at a safe level,
visually or even by TOC. Introducing this drug would require substantial improvements in
both the cleaning process and analytical methodologies. Most likely, a manufacturer would
need to dedicate equipment or an entire facility to the manufacture of this drug.
Drug 2, on the other hand, is not highly toxic, and the current cleaning process can easily
clean it to prevent cross contamination issues and any residues can be easily detected visually
or by TOC. Introducing this drug would not require any improvements and would potentially
require evaluation of initial manufacturing by visual inspection only.
Drug 3 is somewhat toxic, but the current cleaning process could adequately clean it to
prevent cross contamination issues, and while residues cannot be detected visually, the TOC
method is acceptable for detection. Introducing this drug would also not require any
improvements.
There are other issues to consider in introducing a new product; however, this dashboard
provides an effective screening tool for making decisions on whether cleaning process
development is needed, what analytical methods can be used, and if analytical method
development is needed to justify the introduction of new products. Such a dashboard also
provides an easy, high-level view of manufacturing operations for rapid measurement of risk
in a facility, department, or manufacturing line.
Conclusion
One of the stated goals of the ASTM E3106-17 Standard Guide for Science and Risk Based
Cleaning Process Development and Validation was to provide a framework for a scientific
risk- and statistics-based approach to cleaning processes and validation based on ICH Q9 and
the FDA's 2011 Process Validation Guidance. Again, the benefit of such an approach would
be the ability to scale the level of effort, formality, and documentation of the cleaning
validation process commensurate with the level of risk, while providing a visual tool for
communicating these risks. Objective tools to measure risk in cleaning can focus cleaning
validation efforts where the risks are the greatest based on: the science behind the HBEL
score, which informs us which products are the most hazardous; the Cpu score of the cleaning
process, informing us what the probability of residues is; and, as we saw in Table 2, the
detectability scores, which can determine the appropriateness of analytical methods and guide
their selection.
Table 10 offers an example of how the toxicity score and the Cpu score could be used to
make decisions on whether additional cleaning process development is necessary, whether
continued or periodic monitoring or simple visual inspection may be appropriate, and even
when product dedication may be necessary.
Table 10: Example of Possible Actions Based on Toxicity and Cpu Scores
Note: For all cases, a visual inspection must still be done.
Table 10 offers a road map for a decision-making process for selecting cleaning validation
activities and developing an ongoing control strategy based on data. However, this is just an
example of how choices could be decided, and each company would need to decide how to
implement this. In his book “Against the Gods: The Remarkable Story of Risk,” Peter
Bernstein5 notes that:
"The essence of risk management lies in maximizing the areas where we have some
control over the outcome while minimizing the areas where we have no control over
the outcome and the linkage between effect and cause is hidden from us."
We can maximize the cleaning process capability to reduce residues to the lowest practical
levels while focusing on those parameters that lower our detection limits. Since the toxicity
of APIs is intrinsic and cannot be influenced, we can minimize the likelihood for toxic
compounds to cross contaminate other products. But this is only possible if we truly
understand where the risks are. The recent requirement for all companies to determine
HBELs for their compounds23 provided a data-based measure of a compound's toxicity for
determining cleaning limits and set the stage for the measurement of risk in cleaning based on
scientific principles. In this article we have presented science- and data-based visual tools to
advance the scientific rigor in the cleaning of healthcare products, such as pharmaceuticals,
biopharmaceuticals, cosmetics, and medical devices.
Peer Review:
The authors wish to thank our peer reviewers Bharat Agrawal; James Bergum, Ph.D.; Sarra
Boujelben; Gabriela Cruz, Ph.D.; Mallory DeGennaro; Kenneth Farrugia; Ioanna-Maria
Gerostathi; Miquel Romero Obon; Laurence O'Leary; Joel Young; Ersa Yuliza; and Mark
Zammit for reviewing this article and for their many insightful comments and helpful
suggestions.
References:
1. Walsh, Andrew, Ester Lovsin Barle, Michel Crevoisier, David G. Dolan, Andreas
Flueckiger, Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron. "An ADE-
Derived Scale For Assessing Product Cross-Contamination Risk In Shared Facilities"
Pharmaceutical Online, May 2017
2. Walsh, Andrew, Ester Lovsin Barle, David G. Dolan, Andreas Flueckiger, Igor
Gorsky, Robert Kowal, Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron.
"A Process Capability-Derived Scale For Assessing Product Cross-Contamination
Risk In Shared Facilities" Pharmaceutical Online August 2017
3. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Mariann Neverovitch, Mohammad Ovais, Osamu Shirokizawa and Kelly
Waldron. "An MSSR-derived Scale for Assessing the Detectability of Compound-
Carryover in Shared Facilities" Pharmaceutical Online December 2017.
4. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Andreas Flueckiger, M.D., Igor Gorsky, Robert Kowal, Mariann Neverovitch,
Mohammad Ovais, Osamu Shirokizawa and Kelly Waldron. "A Swab Limit-Derived
Scale For Assessing The Detectability Of Total Organic Carbon Analysis"
Pharmaceutical Online January 2018
5. Bernstein, Peter L., "Against the Gods: The Remarkable Story of Risk" John Wiley &
Sons, Inc. 1998.
6. Hubbard, Douglas W. "The Failure of Risk Management: Why It's Broken and How
to Fix It" Wiley, 2009, 1st edition
7. Taleb, Nassim Nicholas "Fooled by Randomness: The Hidden Role of Chance in Life
and in the Markets" Random House 2008
8. Hansson, Sven Ove, and Christina Rudén "Evaluating the risk decision process"
Toxicology 218 (2006) 100–111
9. International Conference on Harmonization of Technical Requirements for
Registration of Pharmaceuticals for Human Use, ICH Harmonized Tripartite
Guideline, Quality Risk Management – Q9, Step 4, 9 November 2005,
http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Quality/Q9/Step4/Q9_Guideline.pdf.
10. O'Donnell, Kevin "QRM in the GMP Environment: Ten Years On — Are Medicines
Any Safer Now? A Regulator's Perspective" Journal of Validation Technology
December 21, 2015.
11. Ball, D. J., & Watt, J. (2013). Further thoughts on the utility of risk matrices. Risk
Anal, 33(11), 2068-2078.
12. Baybutt, P. (2014). Allocation of risk tolerance criteria. Process Safety Progress,
33(3), 227-230.
13. Baybutt, P. (2016). Designing risk matrices to avoid risk ranking reversal errors.
Process Safety Progress, 35(1), 41-46.
14. Cox, L. A., Jr. (2008). What's wrong with risk matrices? Risk Anal, 28(2), 497-512.
15. Cox, L. A., Jr., Babayev, D., & Huber, W. (2005). Some limitations of qualitative risk
rating systems. Risk Anal, 25(3), 651-662.
16. Levine, E. S. (2012). Improving risk matrices: the advantages of logarithmically
scaled axes. Journal of Risk Research, 15(2), 209-222.
17. Li, J., Bao, C., & Wu, D. (2018). How to Design Rating Schemes of Risk Matrices: A
Sequential Updating Approach. Risk Anal, 38(1), 99-117.
18. Waldron, Kelly ,"Risk Analysis and Ordinal Risk Rating Scales—A Closer Look"
Journal of Validation Technology December 21, 2015.
19. MIL-STD-1629A - Procedures for Performing a Failure Mode, Effects and Criticality
Analysis. U.S. Department of Defense. 1949. (MIL–P–1629).
20. Textron, Inc. (https://www.slideshare.net/handbook/qm085design-andd-process-fmea)
21. Wheeler, D., “Problems with Risk Priority Numbers – Avoiding More Numerical
Jabberwocky,” Quality Digest, June 2011,
www.qualitydigest.com/inside/quality-insider-article/problems-risk-priority-numbers.html
22. Killilea, M.C., “Cleaning Validation: Viracept, 2007,” Journal of Validation
Technology, November 2012, Vol. 18, Issue 4, www.ivtnetwork.com/jvt-journal.
23. "Guideline on setting health based exposure limits for use in risk identification in the
manufacture of different medicinal products in shared facilities" European Medicines
Agency 20 November 2014
Health-Based Exposure Limits: How Do
The EMA's Q&As Compare With New And
Forthcoming ASTM Standards?
By Andrew Walsh; Thomas Altmann; Alfredo Canhoto, Ph.D.; Ester Lovsin Barle, Ph.D.;
Joel Bercu, Ph.D.; David G. Dolan, Ph.D.; Andreas Flueckiger, M.D.; Igor Gorsky; Jessica
Graham, Ph.D.; Robert Kowal; Mariann Neverovitch; Mohammad Ovais; and Osamu
Shirokizawa
This article reviews the European Medicines Agency (EMA) April 16, 2018 Questions and
answers on implementation of risk-based prevention of cross-contamination in production
and Guideline on setting health-based exposure limits for use in risk identification in the
manufacture of different medicinal products in shared facilities.1 We will discuss the
significance of these changes to cleaning validation programs and how they compare to the
new ASTM E3106-18 Standard Guide for Science-Based and Risk-Based Cleaning Process
Development and Validation2 and the forthcoming ASTM Standard Guide for the Derivation
of Health-Based Exposure Limits (HBELs).3 The authors of this article are members of one
or both ASTM standard teams.
Table 1 shows a comparison of the original questions and answers in the EMA’s draft
document versus the modified questions and answers.
Table 1
The immediate impression readers will get from Table 1 is that the draft questions and
answers (Q&As) underwent significant changes. One of the issues with the draft Q&As was
that there was no public solicitation by the EMA of pharma industry companies and
organizations in the form of working groups prior to publishing the draft Q&As.
Consequently, the draft Q&As came as a surprise to most of the industry. Many of the
answers were seen as problematic, as they were directing the industry away from the science-
and risk-based approach that regulators have been promoting since 2001. In addition, the
draft Q&As contradicted the 2012 Concept Paper6 and reverted to traditional approaches that
have been demonstrated to need replacement with modern science- and risk-based
approaches.7-12 As a result, 32 industry organizations, companies, and individuals, including
the ASTM E3106 team, submitted extensive comments on the draft Q&As. These comments
were significant enough for the EMA to hold a workshop on June 20-21, 2017 to discuss the
comments. Subsequently, in its final Q&A document, posted on its website on April 16,
2018, the EMA introduced numerous changes.
The full text of each of the Q&As and their updates are shown in the boxed areas and are
discussed one by one.
Q1 (2016 Draft) − Do companies have to establish Health-Based Exposure Limits
(HBELs) for all products?
A: Yes, HBELs should be established for all products. HBELs for highly hazardous
products are expected to be completed in full as per the EMA guide
(EMA/CHMP/CVMP/SWP/169430/2012) or equivalent. See Q2 for products/active
substances considered to be highly hazardous. Products that do not fall into the highly
hazardous category may be addressed as per Q4.
Discussion: There are two important changes to Q1. The first is that the establishment of
HBELs has been narrowed to "medicinal" products. This apparently clarifies that not all
chemicals used in manufacturing by a pharmaceutical company may be medicinal. This was
unnecessary, as EMA's November 2014 Guidance1 already stated that it applied to medicinal
products. What is not obvious is that this now can be interpreted to exclude compounds that
are intermediates, degradants, and cleaning agents, for example. This is a concern since
intermediates, degradants, and cleaning agents are all possible hazards for cross
contamination of another product. This indirectly ties into the changes in Q5 (now Q10),
where the LD50 is only excluded from calculating HBELs in drug products, possibly
indicating that its use is acceptable for other compounds used in drug production. This is a
serious concern, as the historical use of the LD50 divided by an arbitrary adjustment factor for
calculating cleaning limits for cleaning agents has been shown to be very unreliable and
significantly overly conservative. This is due to the large composite adjustment factors
historically and unjustifiably used to extrapolate from an LD50 (an endpoint of mortality) to
a human safe dose.9, 13 In most cases, these LD50-based limits for cleaning agents are so
conservative that some companies have felt forced to stop using cleaning agents completely.
Furthermore, the LD50 limit is not convertible or relatable to any meaningful toxicological
hazard assessment that could be used to derive health-based limits; therefore, this needs to
change.
A: Highly hazardous products are those that can cause serious adverse effects at low
doses and that therefore would benefit from a full toxicological assessment in order to
derive a safe HBEL.
Highly hazardous products are identified based on their inherent toxicological and
pharmacological characteristics and include the groups below (this list is not an
exhaustive list and if evidence is available indicating that the product may cause
adverse effects at low doses by other mechanisms it should be considered as highly
hazardous).
Manufacturers should consider, via a safety assessment against the guidance below, if
products/active substances should be considered highly hazardous. Evidence
indicating a product or active substance falls within any of the categories below
should result in a product being considered highly hazardous. If in doubt,
manufacturers should consider the product potentially highly hazardous and apply the
EMA guide (EMA/CHMP/CVMP/SWP/169430/2012) in full to derive a safe HBEL.
3. Compounds that can produce serious target organ toxicity or other significant
adverse effects at low doses, for example where evidence exists of such effects being
caused by a clinical dose of <10 mg/day (veterinary dose equivalent 0.2 mg/kg/day)
or dosages in animal studies of ≤1 mg/kg/day.
Q2 (2018 Final) − changed to: Is there a framework that could be used to define
the significance of the Health-Based Exposure Limit (HBEL) such that there can
be broad guidance on the extent of quality risk management (QRM) and control
measures required?
A: Firstly, it should be recognized that hazard varies on a continuum scale and that
there are no firm cut off points, risk should be controlled on a proportionate basis.
However, as a broad hypothetical model, the following figure could be considered to
show the increasing level of hazard (red being highest hazard) presented by products
and there should be a commensurate increase in the level of control to prevent
potential cross contamination in a shared facility. Actual HBEL values should be used
in QRM studies to determine the actual controls required.
Figure 1: Diagram developed from an original concept published by ISPE. Source: ISPE
Baseline® Pharmaceutical Engineering Guide, Volume 7 – Risk-Based Manufacture of
Pharmaceutical Products, International Society for Pharmaceutical Engineering (ISPE),
Second Edition, July 2017.
Discussion: Q2 underwent the most extensive changes of all the Q&As and has dropped the
original “highly hazardous” classification and categorization approach.
Discussion: Presumably, based on the many serious objections by many organizations and
companies, including the ASTM Cleaning Standard Team, regarding using OELs or OEBs,
Q3 was eliminated completely. While the simple conversion of an OEL to a provisional
HBEL can be done, the point of departure and the study used for an OEL may not be
applicable for HBELs for other routes of exposure. In addition, adjustment factors may need
to be changed due to potential differences in the target population (worker vs. patient), the
route and the duration of exposure, etc. Rather than manipulating an OEL calculation, it is
more sensible to perform the correct HBEL derivation. OEBs should not be used due to their
highly variable default values and inconsistent practice from company to company. Since the
effective date of the EMA guideline, some companies may have used existing OELs and
OEBs for estimating the HBELs; however, the majority of companies simply moved forward
by providing a full toxicological assessment with a focus on patient safety. Now, nearly
four years after publication, these approaches are unnecessary. Instead of expending their
time and resources justifying the conversion of OELs and OEBs into provisional HBELs,
companies should just simply proceed with developing appropriately derived HBELs.
Q4. (2016 Draft) − Can calculation of HBELs be based on clinical data only (e.g.,
to establish the HBEL on 1/1000th of the minimum therapeutic dose)?
A: Many existing commercial products and new products for which clinical safety
profiles are well established and that do not belong to the highly hazardous category
(see response to Q2) have a favourable therapeutic index (also referred to as the
therapeutic window or safety window). This means that unwanted or adverse health
effects (that may have been identified as toxic effects in animal studies at high doses)
may occur - if at all - at dose levels orders of magnitude above the therapeutic dose
range and the pharmacological activity would therefore be the most sensitive/critical
effect. In this situation, therapeutic dose information could be used as the ‘Point of
Departure’ for calculation of an HBEL (e.g., the PDE). Under these circumstances,
HBEL based on the 1/1000th minimum therapeutic dose approach would be
considered as sufficiently conservative and could be utilized for risk assessment and
cleaning purposes.
Q4. (2018 Final) − This Q&A was eliminated and changed to a different
question.
Discussion: Thankfully, Q4 was eliminated. This answer contained two major problems.
First, it incorporated the draft categorization of drugs into "highly hazardous" and
"non-hazardous drugs." Second, it perpetuated the use of an arbitrary assessment factor
(1/1,000th of the minimum therapeutic dose) as an appropriate means for
establishing HBELs for "non-hazardous" drugs. As mentioned above, the use of an arbitrary
assessment factor, in particular 1/1,000th of a dose, is very unreliable and may significantly
over- or underestimate a chronic dose that is safe for humans. Thus, it may have been
underprotective for patients administered a residual carryover of some drugs and
overprotective for other drugs, resulting in an unnecessary burden to low-hazard and low-risk
product manufacturers prior to the adoption of HBELs.7-12 Nonetheless, while Q4 has been
eliminated, use of an arbitrary adjustment factor of 1/1,000th still persists in the form of
"historical limits" in Q6. Furthermore, the continued use of the 1/1000th of a dose limit
contradicted EMA’s previous Concept Paper.6
A previously published article described cases where the 1/1,000th of the minimum
therapeutic dose approach is not adequately protective.12 The benefit of an HBEL, also
known as a permitted daily exposure (PDE) or acceptable daily exposure (ADE), is that it is
calculated by an expert in the form of a daily dose. Conversely, the identification of minimum
therapeutic doses and their extrapolation to a safe dose has typically been performed by
cleaning validation personnel and did not involve an evaluation by an expert in
toxicological risk assessment. Again, dividing the minimum therapeutic dose by a factor of
1,000 for a drug with only minor hazards is overly conservative and has imposed unnecessary
burdens upon low-risk drug manufacturers. The upcoming ASTM Standard for the
Derivation of Health-Based Exposure Limits (HBELs) will provide guidance from expert
pharmaceutical toxicologists on how to handle setting limits for low-risk drugs.
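The comparison discussed above, between a limit set at 1/1,000th of the minimum therapeutic dose and an expert-derived HBEL, can be sketched in a few lines of Python. This is only an illustration: the compound names, doses, and PDE values are hypothetical placeholders and are not taken from the study of 304 compounds, and the labels "under-protective" and "over-conservative" simply mirror the wording used in this discussion.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str                   # hypothetical label
    min_dose_mg_per_day: float  # minimum therapeutic dose (assumed value)
    pde_mg_per_day: float       # expert-derived HBEL/PDE (assumed value)


def dose_based_limit(c: Compound) -> float:
    """Historical limit: 1/1,000th of the minimum therapeutic dose."""
    return c.min_dose_mg_per_day / 1000.0


def classify(c: Compound) -> str:
    """Compare the 1/1,000th-dose limit with the HBEL for one compound."""
    limit = dose_based_limit(c)
    if limit > c.pde_mg_per_day:
        return "under-protective"   # historical limit permits more than the HBEL
    if limit < c.pde_mg_per_day:
        return "over-conservative"  # historical limit is stricter than the HBEL
    return "equivalent"


# Hypothetical examples only.
compounds = [
    Compound("Compound A", min_dose_mg_per_day=10.0, pde_mg_per_day=0.001),
    Compound("Compound B", min_dose_mg_per_day=500.0, pde_mg_per_day=5.0),
]
results = {c.name: classify(c) for c in compounds}
for name, verdict in results.items():
    print(name, verdict)
```

The point of the sketch is that the fixed divisor carries no toxicological information, so the direction and size of the error depend entirely on the compound.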
A: No, LD50 is not an adequate point of departure to determine a HBEL for drug
products.
Discussion: The change in Q5 is subtle, and the EMA may have been convinced it is a minor
point to clarify that it only applies to drug products. However, the original answer was
correct, and the LD50 should never be used to calculate an HBEL for any compound and, in
particular for cleaning validation, for the components of cleaning agents. Members of the
ASTM Cleaning Standard Team published an article in 2013 demonstrating why the LD50
cannot be used as a point of departure to derive a NOEL (no observable effects level) and
how the historical LD50 approaches being used to calculate limits for cleaning agents are
highly inaccurate and lead to excessively low limits. In many cases, companies have been
unable to reliably achieve these limits analytically.12 The use of the LD50 has led to the
pharmaceutical industry moving away from, or even ceasing to use, cleaning agents, which
appears to be completely unjustified, as many cleaning agent components are quite innocuous
and safe. The elimination of cleaning agents can lead to inefficient cleaning cycles and may
have even led to product residues remaining on manufacturing surfaces in some cases.
The use of the LD50 is not endorsed scientifically for calculation of any HBELs. LD50s are the
expression of single-dose (acute) toxicity, but HBELs, by definition, must be set for
long-term (chronic) exposure. An attempt to statistically correlate known LD50 values (acute
toxicity) with their corresponding NOELs (chronic toxicity) yielded a correlation so poor that
it showed the LD50 cannot be used to set a limit for chronic toxicity.9 For drugs, there is no need to
use an LD50 to establish an HBEL, since much more relevant chronic toxicity data exists.
Also, oral LD50 testing is seldom performed for small molecule drugs and is of no value for
development of HBELs for parenterally-administered follow-on drugs. The selection of an
appropriate point of departure (PoD) for HBEL calculations of commercially available
chemicals, or pharmaceutical intermediates that do not have full toxicological datasets, can be
a challenge, but does not warrant the use of the LD50. Scientifically acceptable
alternatives are described in the upcoming ASTM Standard Guide for the
Derivation of Health-Based Exposure Limits (HBELs).
Q6 (2016 Draft & 2018 Final) − How can limits for cleaning purposes be
established?
New answer
Results above the alert cleaning limit should trigger an investigation and, where
appropriate, corrective action to bring the cleaning process performance within the
alert cleaning limits. Repeated excursions above the alert cleaning limit will not be
considered acceptable where these indicate that the cleaning method is not in control.
Recognized appropriate statistical methods may be used to determine whether the
cleaning process is in control or not.
Discussion: Q6 is the most significant of the Q&As regarding cleaning validation, and EMA
is to be commended for adopting its emphasis on a statistical approach; however, we have
some suggestions for improvement. While the changes to this answer are an improvement on
the original answer, the new answer perpetuates several misunderstandings of the science- and
risk-based principles for setting cleaning limits that the ASTM E3106-18 standard
lays out. The new answer can be misinterpreted, and some companies/practitioners are
claiming that this answer allows the old limits to stay in place and they have no plans to make
any changes to their cleaning programs or toward increasing understanding of their cleaning
processes. That is a serious misconception if one understands the context around this Q&A
and reads carefully what is being stated.
The first sentence remains the same and correctly reflects the understanding that HBELs
should be used for assessing the risks involved in cleaning but should not be used for setting
the day-to-day control limits for use in cleaning validations. Any control limits used for
cleaning should be set based on cleaning data using statistical process control. The reasons
why HBELs should not be used for setting cleaning limits, and should only be used for risk
assessment, were explained in 2011 in a two-part article,7,8 and these are now embodied in
the ASTM E3106-18 standard.2 Basically, while the safe residue levels calculated from
HBELs must be safe regarding patient exposure, they may be well above the level of residues
easily achievable in cleaning, even easily visible, and therefore are inappropriate to use as
cleaning limits. These same reasons apply in many cases to the 1/1,000th and 10 ppm limits
as well. Consequently, a different approach is necessary for setting control limits for these
compounds in either case.
The second paragraph states that "historical" limits should be retained as alert limits,
provided they can assure data will not exceed the HBELs. On the basis of the following
analysis, the ASTM teams do not agree with this practice. Afterward, we present our
proposed solution.
In a study of 304 pharmaceutical compounds, it was found that HBELs were absolutely
necessary for determining the safe residue levels for evaluating cleaning validations.12 For
about 15 percent of these compounds, the limits based on 1/1,000 dose criterion were found
to be not safe enough. However, for the remaining 85 percent of the compounds, the safe
residue levels calculated from HBELs start to rise well above the residue levels easily
achievable for these compounds and therefore are unsuitable for use in controlling cleaning
processes.
Figure 3 shows calculated total organic carbon (TOC) swab limits for these 304 compounds,
both for their PDEs and their corresponding historical limits. Hypothetical TOC swab data
has been superimposed on these limits at the 100 ppb level to evaluate how the data compares
to both limits. 100 ppb was chosen because it is representative of good cleaning, but typical
TOC results can be above or below this level. As described in the Figure 3 caption, there are
very few instances where the historical limits might be used as alert limits for cleaning
processes. If TOC swab results begin to approach 1 ppm (1,000 ppb), fewer and fewer
historical limits are useable, even though the process capability for the corresponding PDE
may be excellent.
Figure 3: Graph of TOC swab limits calculated for the 304 compounds for both PDE-based
limits and the 1/1,000th dose or 10 ppm-based limits (note: these TOC limits are plotted on a
logarithmic scale). A hypothetical run of TOC swab results has been superimposed on the
graph at the 100 ppb level. As we see in region ①, the 1/1,000th limits are below the TOC
swab data, so TOC cannot be used at all. There are even some compounds (region ②) that
defaulted to the 10 ppm limit, but the PDE limits are below these. In region ③ there are
many compounds where the 1/1,000 limits are well above the TOC data and their use as alert
limits is debatable. In region ④ nearly all of the historical limits (both the 1/1000th and 10
ppm) are too far above the TOC data to be used as alert limits. (Note: the 10 ppm limit is set
on the level of the batch and not as an analytical limit. The TOC limit for TOC being 10 ppm
is coincidental.) Graph created in R by Mohammad Ovais
The third paragraph then discusses how to use these "historical" limits as alert limits to
initiate investigation and corrective actions for excursions or indicate whether a cleaning
process is not in control. The following examples will show how the use of many historical
limits as alert limits will not provide appropriate warning of cleaning process issues.
TOC swab data from eight cleaning validation runs were obtained from a recent successful
cleaning validation study and used for this analysis. This is very good cleaning data with very
low TOC residues found on the equipment surfaces. An Xbar Control Chart of the data was
created using Minitab (Figure 4). An Xbar Control Chart is used to study how a process
changes over time using the averages of data plotted in sequence. The Xbar Chart in Figure 4
plots the average of seven TOC data points obtained for each cleaning run. An Xbar control
chart always has a central line derived from the average of the sample averages, an upper line
for the upper control limit (average +3 standard deviations), and a lower line for the lower
control limit (average -3 standard deviations).
Figure 4: Minitab Xbar Chart of TOC swab data for eight cleaning runs (seven swab results
each) showing that the data averaged 85 ppb of TOC. Minitab calculated an upper control
limit (UCL = 132) from the average and standard deviation of this data set. (Note: while
Minitab also calculates a lower control limit, with cleaning data only a UCL is needed.)
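As a rough illustration of how the lines on an Xbar chart are obtained, the following Python sketch computes a center line and control limits from subgrouped swab data. It rests on several assumptions: the swab values are hypothetical, the subgroups are of equal size, and sigma is estimated from the pooled within-subgroup standard deviation without an unbiasing constant. The "3 standard deviations" are taken here as standard deviations of the subgroup average (sigma divided by the square root of the subgroup size). Statistical packages may use other estimators, so their limits can differ somewhat from this sketch.

```python
import math
import statistics


def xbar_limits(subgroups):
    """Return (center_line, lcl, ucl) for an Xbar chart.

    Assumes equal-size subgroups and estimates sigma with the pooled
    within-subgroup standard deviation (no unbiasing constant applied).
    """
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    means = [statistics.mean(g) for g in subgroups]
    center = statistics.mean(means)                  # average of run averages
    pooled_var = statistics.mean([statistics.variance(g) for g in subgroups])
    sigma_xbar = math.sqrt(pooled_var / n)           # std. dev. of a run average
    return center, center - 3 * sigma_xbar, center + 3 * sigma_xbar


# Hypothetical TOC swab results in ppb (three runs, four swabs each).
runs = [
    [80, 90, 85, 85],
    [70, 100, 90, 80],
    [85, 85, 95, 75],
]
center, lcl, ucl = xbar_limits(runs)
print(f"center={center:.1f} ppb, UCL={ucl:.1f} ppb")
```

For cleaning data only the upper limit would be carried forward, since unusually low residues are not a concern.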
Xbar Control Charts are one of the basic tools of statistical process control (SPC) and have
been used to set statistical limits for controlling processes based on data collected from the
process itself. The control chart was first described by Dr. Walter Shewhart nearly 100 years
ago17 and has since been used across nearly every industry, with applications wherever data is
collected. The new ASTM E3106-18 standard promotes the use of SPC for setting cleaning
limits based on data derived from the cleaning process. ASTM E3106-18 states that data that
has been collected in the risk evaluation stage can have SPC limits calculated from this data
and that these SPC limits should then be used for monitoring cleaning processes. This use of
SPC is aligned with the FDA's Process Validation Guidance, where it states that in-process
specifications "...shall be derived from previous acceptable process average and process
variability estimates where possible and determined by the application of suitable statistical
procedures where appropriate." The FDA's Process Validation Guidance also expects
manufacturers to understand the sources of variation, detect the presence and degree of
variation, understand the impact of variation on the process, and control the variation in a
manner commensurate with the risk it represents to the process. Accordingly, the use of SPC
with cleaning data will allow manufacturers to detect, understand, and control variations in
cleaning processes in a manner commensurate with the risk. Applying SPC to cleaning data
was first suggested and described in a 2011 article8 and is now included in the ASTM E3106-18
Standard Guide under the section on Risk Control. As discussed in a previous article, this can
even be applied to HPLC data, where much data may be below detection limits.18
Our recommended solution is to establish cleaning limits based on SPC techniques. In Figure
5 we see statistical process control limits derived from the cleaning data in Figure 4 being
used as control limits, with two new hypothetical cleaning runs, nine and 10, added. Clearly
the data for these runs is significantly different from the cleaning data of the previous eight
runs and should be immediately investigated and brought back under control. Considering
that almost all HBELs would be much higher than the data shown here, there would probably
be no risk to patient safety and no product would have been compromised. Regardless, the
deviation of the data from the previously experienced data is very significant and should be
investigated.
Figure 5: Minitab Xbar Chart of TOC swab data showing the upper control limit (UCL) only
from the previous eight runs. The averages for runs nine and 10 are significantly higher than
the previous eight runs, and Minitab has flagged them as red points. Note: The LCL was
removed from this graph, as lower limits are not appropriate for cleaning data.
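The monitoring step described above can be sketched as a simple check of each new run average against the upper control limit. The UCL of 132 ppb is the value reported for the first eight runs; the swab results for the new runs are hypothetical and chosen only to show one in-control run and two runs that would be flagged. The function name `flag_runs` is likewise an illustrative choice.

```python
import statistics

UCL_PPB = 132.0  # upper control limit reported for the first eight runs


def flag_runs(new_runs, ucl=UCL_PPB):
    """Return {run_label: (average, exceeds_ucl)} for each new run."""
    report = {}
    for label, swabs in new_runs.items():
        avg = statistics.mean(swabs)
        report[label] = (avg, avg > ucl)
    return report


# Hypothetical swab results (ppb TOC), seven swabs per run; not study data.
new_runs = {
    "run 9": [150, 160, 155, 145, 150, 165, 160],
    "run 10": [210, 190, 200, 205, 195, 200, 200],
    "run 11": [90, 85, 80, 95, 88, 84, 87],
}
report = flag_runs(new_runs)
for label, (avg, flagged) in report.items():
    print(label, avg, "investigate" if flagged else "in control")
```

A flagged run in this sketch signals a change in the cleaning process, not necessarily a patient safety risk, which is assessed separately against the HBEL.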
The third paragraph suggests that historical limits can be used as alert limits, but this is not
reasonable or scientifically justifiable. To illustrate why historical limits are inappropriate as
alert limits, six compounds from across the spectrum of the 304 compounds shown in Figure
3 were selected from the four regions shown, and swab limits for TOC were calculated for
them. In Figure 6 we see the historical limits for these six compounds shown as alert limits
for the data in this example.
Figure 6: Evaluation of the use of historical limits for six compounds as alert limits for
monitoring cleaning data. Note: Lower control limit (LCL) was not removed from this graph.
Why are these historical limits not useful as alert limits? Even the lowest of these historical
limits (Basimglurant) is more than 25X higher than the UCL (132 ppb). The others are far
worse. Although the residue data is climbing significantly above the previously obtained
residue data, none of these limits will raise an alert, as they are all too far above the data.
Compounding this problem even further is the fact that all historical limits are calculated
from several parameters (maximum daily doses, total surface areas, etc.) that can vary
significantly from product to product, manufacturing train to manufacturing train, facility to
facility, and company to company. In some cases, they will go down and in some they will go
up even more. There will never be any consistency to alert limits based on historical limits
and, what we believe is worse, historical limits do not put any focus on the cleaning process
and keeping it under control.
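To make the size of this gap concrete, the sketch below compares when an upward drift would first be signaled by the SPC limit and by a historical limit used as an alert limit. The UCL of 132 ppb comes from the example above. The alert limit is set at exactly 25 times the UCL, which is a lower bound, since the lowest historical limit is described as more than 25X higher. The drifting run averages are hypothetical.

```python
UCL_PPB = 132.0            # SPC upper control limit from the example
ALERT_PPB = 25 * UCL_PPB   # assumed historical alert limit, 25x the UCL

# Hypothetical run averages (ppb TOC) drifting upward; not study data.
run_averages = [85, 150, 300, 600, 1200, 2400]


def first_signal(averages, limit):
    """Return the 1-based index of the first run above the limit, or None."""
    return next(
        (i for i, avg in enumerate(averages, start=1) if avg > limit), None
    )


first_ucl_signal = first_signal(run_averages, UCL_PPB)
first_alert_signal = first_signal(run_averages, ALERT_PPB)
print("alert limit (ppb):", ALERT_PPB)
print("first run above UCL:", first_ucl_signal)
print("first run above alert limit:", first_alert_signal)
```

In this hypothetical series the residues rise nearly thirtyfold without ever reaching the alert limit, while the SPC limit signals at the second run.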
It is likely that the inclusion of these historical limits by the EMA was a compromise with
some companies and individuals who have objected to the movement to HBELs and want to
keep the historical approaches in place without providing a scientific, risk-based rationale. As
mentioned above, the HBEL-based limits and the historical limits are both inappropriate for
setting cleaning limits. Instead, limits should be set using SPC techniques after data has been
collected during qualification of the cleaning process and the risk has been shown to be
acceptable by evaluating the collected data against HBEL-based limits using process
capability.
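The process capability evaluation mentioned here can be illustrated with the upper capability index, Cpu, calculated as (upper limit minus mean) divided by three standard deviations. The sketch below is a minimal example under stated assumptions: the swab data and the HBEL-based swab limit are hypothetical, and the sample standard deviation is used as the estimate of process variability.

```python
import statistics


def cpu(data, upper_limit):
    """Upper process capability index: (upper limit - mean) / (3 * s)."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation
    return (upper_limit - mean) / (3 * s)


# Hypothetical TOC swab results (ppb) and a hypothetical HBEL-based limit.
swabs_ppb = [80, 90, 85, 95, 75, 85, 85]
hbel_limit_ppb = 5000.0

value = cpu(swabs_ppb, hbel_limit_ppb)
print(f"Cpu = {value:.1f}")
```

A large Cpu indicates a wide margin between the residue data and the HBEL-based limit. That margin is why the HBEL-based limit serves to demonstrate acceptable risk, while day-to-day control is left to limits derived from the data.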
We hope that the final sentence in Q&A6 will get the industry moving in the right direction
for the future of cleaning validation in that "recognized appropriate statistical methods" will
eventually become required "to determine whether the cleaning process is in control or not."
Once it is realized that the historical limits provide no real assurance of control, their
continued use should be rapidly abandoned.
We hope this discussion helps to explain why Q&A6 is still problematic and should be
modified again in the very near future to remove any references to the non-scientifically
based 1/1000th of a dose or 10 ppm in the next batch limits.
Discussion: The only change to this Q&A is the addition of the exception "If a HBEL cannot
be determined...."
A: The guideline on setting health-based exposure limits indicates that the carry over
limit should generally be derived using the human PDE. However, in cases where
there is particular concern relating to known sensitivity of a particular species (e.g.,
Monensin in horses) a Health-Based Exposure Limit (HBEL) approach taking into
account specific animal toxicity knowledge should be used. For non-highly hazardous
products, the approach described in the response to question 6 can also be applied.
Changed to Q12 (2018 Final) − What needs to be taken into account when
manufacturing Veterinary Medicinal Products for different species in the same
facility?
A: The guideline on setting health-based exposure limits indicates that the carry over
limit should generally be derived using the human HBEL.
Discussion: The change to this Q&A eliminates the last sentence in the draft, removing the
reference to non-highly hazardous products.
Q9 (2016 Draft) − How can inspectors determine the competency of the
Toxicology expert developing the health-based exposure limit?
Changed to Q4 (2018 Final) − What competencies are required for the person
developing the Health-Based Exposure Limits (HBEL)?
Discussion: HBELs must be prepared by individuals with specific education and training in
toxicology/pharmacology/pharmacotherapy and risk assessment methods who can apply the
principles of toxicology and pharmacology to deriving a safe limit. An ASTM Standard
Guide for Derivation of Health Based Exposure Limits (HBELs) is currently in development
and will provide guidance.3 A curriculum vitae (CV) should be available on request that
demonstrates the educational background (e.g., toxicology, pharmacology, medical, or other
related disciplines), any certifications such as the Diplomate of the American Board of
Toxicology (DABT) or European Registered Toxicologist (ERT), the years of experience in
the field, and any publications related to the field. While not all of these are required for a
“qualified expert,” the appropriate documentation in these areas demonstrates the expertise to
work in this area.
Q10. How can the HBEL model be applied to early phase Investigational
Medicinal Products (IMPs) where limited data is available?
A: Health-based exposure limits should be established based on all available data and
as such assessments associated with IMPs should be regularly reviewed for presence
of new data. Toxicology experts should also make judgments about the future
potential of the material to demonstrate critical effects where key toxicological testing
has not been completed (e.g., this may be based on comparison to other similar
molecules where available) and any additional adjustment factors that may be
appropriate. This would allow manufacturers to assume worst case and make sound
judgments on the level of organizational and technical control measures required.
Discussion: The change to Q&A10 dropped the requirement that "toxicology experts" predict
what HBELs should be without data in hand and what adjustment factors should be applied.
Q12 (2016 Draft) − What role do HBELs play in meeting the requirements of
GMP Chapter 5 section 20?
A: Once the health-based assessment has been completed and HBEL confirmed,
these data should be used via a Quality Risk Management process to assess if current
organisational and technical control measures are adequate, or in the case of new
equipment/facility to determine what control measures are required. It is expected that
the higher the hazard of products/active substances, the higher the inherent risk and
the more significant organisational and technical control measures will be required.
Health based exposure limits provide an accepted safe level of cross contamination
and they should be used to justify cleaning limits.
Q12 (2018 Final) - This Q&A was eliminated and changed to a different
question.
A: Manufacturers cannot just segregate highly hazardous products from other lower
risk products as a means of dealing with the risk to patient safety. This may protect
less hazardous products from contamination but it does not address the possibility for
cross contamination between highly hazardous products. The approach taken to
address cross contamination between individual highly hazardous products produced
in the same dedicated area should be justified, taking account of the clinical
application and toxicological profile of the individual products within the group of
products manufactured in the dedicated area. This should include implementation of
appropriate technical and organisational control measures.
A: Manufacturers cannot just segregate common products from other product types as
a means of dealing with the risk to patient and animal safety. Although this may
prevent contamination of other product classes, it does not address the possibility for
cross contamination within product classes. The approach taken to control cross
contamination between individual products within a class produced in the same
dedicated area should follow the principles in Q&A 3. This should include
implementation of appropriate organisational and technical control measures to
prevent contamination between such products within product specific HBELs.
Discussion: The draft Q13 was moved to Q9. The modifications may appear minor but are
very significant.
The question was changed from segregating "highly hazardous products" to segregating
"products of a common therapeutic classification." This may raise the question in the industry
of whether or not EMA considers that equipment dedication should fall along common
therapeutic classes.
The first sentence dropped the term "highly hazardous products" and "lower risk products"
and replaced them with the term "common products," which is not defined. This sentence was
also expanded to include animals. The second sentence eliminated the distinction between
"less hazardous" and "highly hazardous" products and referred to "product classes," but this
new term is also undefined. It is not clear whether the EMA may consider products that share
common equipment to be in a particular "product class" or is expecting companies to define
classes of products that can share common equipment. The third sentence was modified to
remove "... should be justified taking account of the clinical application and toxicological
profile of the individual products within the group of products manufactured in the dedicated
area" and replace it with "... should follow the principles in Q&A3." As stated above in the
discussion of Q&A3, we are relieved to see a return to a risk-based approach. Finally, the
phrase "... to prevent contamination between such products within product specific HBELs"
was added to the last sentence.
While these modifications are improvements on the original answer, it is our opinion that the
possibility of sharing equipment should be decided solely by the level of risk presented and
the ability of the company to control or mitigate that risk, not by a shared therapeutic action
or other type of classification. Conversely, the requirement to dedicate any product should be
decided solely based on the level of risk being unacceptable and/or the inability of the
company to mitigate or control the risk presented.
Q13 (2018 Final) − Should the HBEL be re-assessed throughout the phases of
development of Investigational Medicinal Products (IMPs)?
Discussion: The HBEL should be reassessed periodically when significant data emerges that
may affect the HBEL calculation and/or its assumptions. At a minimum, the IMP HBELs
should be assessed once there is an adequate dataset (repeat-dose animal studies,
mutagenicity information, PK/PD data, FIH data) and reevaluated prior to marketing when
there is additional clinical experience. Ideally, the HBEL is also periodically reviewed as
new relevant clinical data and longer nonclinical study data emerges. The HBEL review
should become a part of the cleaning validation risk management program to increase
manufacturers' maturity levels in that area.
A: Yes, except in the case of highly sensitising active substances and products.
Discussion: Q14 was eliminated. However, it should be noted that in the case of a compound
with a limited dataset, where the material may pose a mutagenicity hazard as the critical
effect, the application of the TTC of 1.5 µg/person/day could be applied as an interim limit
until additional information is available to aid in the calculation of a proper HBEL.
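The interim-limit logic described above can be sketched roughly as follows. This is only an illustration: the function and parameter names are hypothetical, and the decision rule is a simplification of the text rather than a regulatory procedure. The only value taken from the article is the TTC of 1.5 µg/person/day.

```python
from typing import Optional

# TTC quoted in the text for a potential mutagenicity hazard, in µg/person/day.
TTC_MUTAGENIC_UG_PER_DAY = 1.5


def interim_limit_ug_per_day(hbel_ug_per_day: Optional[float],
                             mutagenicity_is_critical_effect: bool) -> Optional[float]:
    """Return the limit to work with, or None if an assessment is still needed.

    Hypothetical helper: a derived HBEL always takes precedence; the TTC is
    used only as a placeholder while the dataset is too limited for an HBEL.
    """
    if hbel_ug_per_day is not None:
        return hbel_ug_per_day
    if mutagenicity_is_critical_effect:
        return TTC_MUTAGENIC_UG_PER_DAY
    return None
```

Once a proper HBEL has been calculated, passing it as the first argument replaces the interim value, which mirrors the periodic reassessment the discussion calls for.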
Once the health-based assessment has been completed and the HBEL confirmed,
these data should be used via a Quality Risk Management process to determine what
controls need to be put in place and to assess if existing organisational and technical
control measures are adequate or if they need to be supplemented. This Quality Risk
Management process should be carried out prospectively in the case of new
equipment/facility to determine what control measures are required.
The level of detail in the QRM process should be commensurate with the potential
harm as indicated by the HBEL and the suitability of control measures supported by
practical and science-based evidence.
Discussion: This new Q3 is a very welcome return to the application of science and risk to
cleaning that had its beginnings in the Risk-MaPP Guide and is now formalized in the new
ASTM E3106-18 standard. The ASTM E3106-18 science- and risk-based approach as
applied to cleaning has been described in more detail in the article Measuring Risk in
Cleaning: Cleaning FMEA and the Cleaning Risk Dashboard.16 That article describes how
the toxicity scale for HBELs can be combined with a process capability scale derived from
data collected during cleaning to quantify the risk involved in a particular cleaning process so
that appropriate control measures can be put in place. The use of detectability scales for the
appropriate selection of analytical methods, including visual inspection, is also described.
That article provides more detail than is contained in the ASTM E3106-18 standard, along
with a case study, on how to use the HBEL and implement a science- and risk-based
approach that is completely consistent with EMA's new guidance in Q3.
Discussion: This new Q5 is a welcome clarification that it is the responsibility of the contract
giver/contracting company to provide appropriate quality support to the contract
acceptor/contracted company, which includes the provision of HBELs or data to support their
derivation.
the repeatability of the cleaning process (manual cleaning is generally less repeatable
than automated cleaning);
the hazard posed by the product;
whether visual inspection can be relied upon to determine the cleanliness of the
equipment at the residue limit justified by the HBEL.
Q8. What are the requirements for conducting visual inspection as per Q&A 7?
A. When applying visual inspection to determine cleanliness of equipment,
manufacturers should establish the threshold at which the product is readily visible as
a residue. This should also take into account the ability to visually inspect the
equipment, for example, under the lighting conditions and distances observed in the
field.
Visual inspection should include all product contact surfaces where contamination
may be held, including those that require dismantling of equipment to gain access for
inspection and/or by use of tools (for example, mirror, light source, boroscope) to
access areas not otherwise visible. Non-product contact surfaces that may retain
product that could be dislodged or transferred into future batches should be included
in the visual inspection.
Written instructions specifying all areas requiring visual inspection should be in place
and records should clearly confirm that all inspections are completed.
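One way to picture the comparison implied by Q&As 7 and 8 is to set the threshold at which residue is readily visible against a surface limit derived from the HBEL. The sketch below assumes the commonly used carryover calculation (HBEL multiplied by the number of daily doses in the next batch, divided by the shared surface area); the function names, the optional safety margin, and the example figures are all hypothetical and are not taken from the Q&A text.

```python
def surface_limit_ug_per_cm2(hbel_ug_per_day: float,
                             next_batch_size_mg: float,
                             next_max_daily_dose_mg: float,
                             shared_area_cm2: float) -> float:
    """HBEL-based residue limit per unit of shared product contact surface.

    Assumed formula: maximum safe carryover = HBEL * (batch size / daily dose),
    spread evenly over the shared equipment surface area.
    """
    daily_doses_in_batch = next_batch_size_mg / next_max_daily_dose_mg
    max_safe_carryover_ug = hbel_ug_per_day * daily_doses_in_batch
    return max_safe_carryover_ug / shared_area_cm2


def visual_inspection_supported(visible_threshold_ug_per_cm2: float,
                                limit_ug_per_cm2: float,
                                margin: float = 1.0) -> bool:
    """True if residue becomes visible at or below the HBEL-based limit.

    `margin` is a hypothetical safety factor applied to the visible threshold.
    """
    return visible_threshold_ug_per_cm2 * margin <= limit_ug_per_cm2


# Illustrative numbers only: 100 kg batch, 1 g daily dose, 25 m^2 shared surface.
limit = surface_limit_ug_per_cm2(1000.0, 1e8, 1000.0, 250_000.0)
print(limit, visual_inspection_supported(4.0, limit))
```

The visible threshold itself would still have to be established under the lighting conditions and viewing distances found in the field, as the answer to Q8 requires.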
Discussion: These two new Q&As come as quite a surprise since there was no open
discussion or warning given to industry about these new topics being included in the final
version of the Q&A. The opening statement in Q7 will come as a real surprise to many in the
industry in that the EMA states that it now requires analytical testing after product
changeover. This adds a new burden to industry, as not all manufacturers have been
performing testing after every changeover once cleaning validation studies have been
completed, and it undermines the validity of cleaning validation programs.
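However, the alternative offered is to perform a documented QRM process that addresses the
three criteria listed in the answer to Q&A 7: (1) the repeatability of the cleaning process,
(2) the hazard posed by the product, and (3) whether visual inspection can be relied upon at
the residue limit justified by the HBEL.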
It would have been very helpful for both industry and the EMA for the agency to have
solicited input on visual inspection prior to releasing a final document. Fortunately, two of
this article's authors recently published an article on how to demonstrate whether visual
inspection is acceptable at a facility where the cleaning process was shown to be very
repeatable, the hazard level was very low, and the cleaning of the manufacturing equipment
was very safe (points 1 and 2). This article also showed how to qualify a large number of
personnel in a reasonable manner that is statistically valid and can be easily documented and
maintained (point 3).15 Subsequent articles expanding on this work are planned.
Members of the ASTM cleaning and HBEL teams have already published a series of articles
that detail how to evaluate and measure the risk in cleaning15, 17-20 and described how a
cleaning control strategy could be developed that includes visual inspection.21 So, these two
Q&As are very welcome.
Additionally, it should be noted that instead of performing sampling and analytical testing for
residuals upon completion of validation, the cleaning team should be encouraged to set up a
continued verification program using visual inspection limits outlined above, as well as
trending of cleaning results. This would instill a culture of cleaning process understanding
and improve knowledge management in this area [ICH Q10].
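Trending of cleaning results could take many forms. As a minimal sketch, the code below assumes an individuals control chart with limits estimated from the average moving range (using the standard 2.66 constant); the article does not prescribe a particular chart, and the function names and data are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(results):
    """Center line and control limits for an individuals (I-MR) chart."""
    if len(results) < 2:
        raise ValueError("need at least two results to compute a moving range")
    center = mean(results)
    # Average absolute difference between consecutive results.
    moving_ranges = [abs(b - a) for a, b in zip(results, results[1:])]
    mr_bar = mean(moving_ranges)
    upper = center + 2.66 * mr_bar
    lower = max(0.0, center - 2.66 * mr_bar)  # residue results cannot be negative
    return center, lower, upper


def out_of_trend(historical_results, new_result):
    """Flag a new cleaning result that exceeds the upper control limit."""
    _, _, upper = individuals_chart_limits(historical_results)
    return new_result > upper


history = [10.0, 12.0, 11.0, 13.0, 12.0]  # illustrative swab results
print(individuals_chart_limits(history), out_of_trend(history, 20.0))
```

A result flagged this way is a signal to investigate the cleaning process, not by itself evidence that the HBEL-based limit was exceeded.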
Summary
These Q&As and their updates mark an important turning point in the pharmaceutical
industry regarding cleaning validation and QRM. The release of the Q&As coincides with the
publication of the ASTM E3106-18 Standard Guide for Science-Based and Risk-Based
Cleaning Process Development and Validation and the upcoming publication of the ASTM
Standard Guide for the Derivation of Health-Based Exposure Limits (HBELs). These new
standards will provide guidance to the industry on the appropriate development of HBELs,
for the implementation of science- and risk-based approaches to cleaning validation, and on
the use of statistical techniques for measuring the risk in cleaning and assessing the
effectiveness of cleaning processes.
Recently, Graeme McKilligan, a senior GMP inspector in the Medicines and Healthcare
products Regulatory Agency (MHRA), posted a statement on the MHRA blog that clarified
the EMA's position on the Q&As and their relationship to cleaning validation.22 McKilligan
makes it clear that HBELs are now required and the industry should proceed with
implementation:
"...it is important that those manufacturers who have not developed HBELs (such as
PDE) should do so without delay. Whilst some pragmatism will be applied for
manufacturers with products that present lesser hazard, those manufacturers with
products likely to have lower HBEL (toward the dark orange and red area of the
continuum diagram) should complete these assessments without delay (a triage
approach may be helpful for prioritisation)". (emphasis added)
From the authors’ perspective this is welcome news, but, unfortunately, there has been
resistance within the industry against moving to more science-based, risk-based, and
statistics-based approaches to cleaning validation and setting limits, and arguments have been
made for remaining with the traditional approaches. While these arguments against the
science for moving to HBELs have been effectively addressed,7-12 there are other reasons for
this resistance that need to be addressed as well.
Some of the resistance has come from the belief that HBELs are an unnecessary financial
burden. Regardless of cost, the essential purpose of determining HBELs is to identify
any compounds that pose a high hazard to patient safety so that appropriate controls can be
applied, so this financial argument is without merit. The only way to demonstrate that a
compound is not a high hazard is to derive an HBEL for it, so all medicinal products must
have an HBEL. However, and as was anticipated, the initial determinations of HBELs have
already shown that about 85 percent of drugs have had cleaning limits that were too low, in
many cases much too low, and these have been causing unnecessary operational difficulties
for many years.12 By using HBELs in an ICH Q9 risk-based approach, such as described in
ASTM E3106, companies manufacturing low-hazard/low-risk compounds would be able to
justify minimizing their cleaning validation efforts, reducing their operational costs and even
possibly converting to using only visual inspection.21 Although there is an initial cost
involved in the derivation of an HBEL, there can be significant cost benefits to be gained
from implementing them.
But based on companies' experience with inspections, which are often unpredictable, most
companies will still be hesitant to take advantage of such operational benefits to minimize
their cleaning validation efforts or to implement visual inspection. Many companies will
adopt a "wait and see" stance. There are issues with regulators and inspectors, as well. Most
companies will not make any changes without knowing how inspectors are going to react to
the changes. Although regulators have embraced and adopted ICH Q9, there is no clear
evidence that they will accept the second principle that "the level of effort, formality
and documentation of the quality risk management process should be commensurate
with the level of risk." This also applies to the validation efforts (II.6 Quality Risk
Management as Part of Production - Validation). As mentioned above, low-hazard/low-risk
companies could reduce their validation efforts. But if inspectors don't respect this aspect of
risk management in their inspections, there is little incentive for companies to adopt science-
based, risk-based, and statistics-based approaches. So, regulators need to publicly support the
reduction in validation efforts/documentation for lower-hazard/lower-risk companies that
implement "science-, risk-, and statistics-based approaches" to cleaning. Q&As 7 and 8 are
good examples of where a regulator has made such a start, and the EMA should be applauded
for that.
This brings us to issues around the ongoing conflicts in regulatory guidance. The EMA has
been proactive in writing the original guideline on HBELs and updating Annex 15 to require
a "toxicological evaluation," which PIC/S subsequently adopted along with the guidance on
HBELs. The health authorities of 49 countries are part of PIC/S, including the FDA.
However, the FDA still has its original 1993 cleaning guide23 in use, which refers to the older
approaches, but it has now suggested that the 2011 Process Validation Guidance applies to
cleaning.24 The FDA has also adopted ICH Q9, and this is very significant for ASTM
E3106 since it was structured on both ICH Q9 and the FDA's 2011 guidance. But it is not
clear whether the FDA will follow Annex 15 or its 2011 guidance. In addition to this, there
remains guidance from other regulatory organizations, such as WHO25 and Health Canada,26
that still refers to the older approaches, and many companies are confused on how to handle
this. If regulators could consolidate all their guidance around the HBEL, that would be very
helpful.
There has also been an argument against the need for HBELs from the standpoint of
pharmacovigilance. Basically, some in industry have stated that there is no indication from
the ongoing pharmacovigilance that there are any problems using the traditional limits.
However, it is not possible for pharmacovigilance to reveal problems with cleaning. A
cleaning failure may go completely undetected during processing, and hazardous residues
may cross contaminate a product and even reach a patient.27 While the Viracept problem was
detected by patients from the obvious odor, there is a strong probability that similar events
have taken place and gone undetected, even though patients may have been adversely
affected. How would a physician know that a complaint, or even an adverse reaction in a
patient, is due to residues from some other product that the physician has no knowledge of?
How would pharmacovigilance detect an adverse reaction to a cleaning validation failure?
Regardless, it is poor QRM to wait for a problem to happen before taking preventive action.
This is why the ASTM E3106 standard recommends that cleaning procedures be subjected to
risk assessments (e.g., FMEA/FMECA) to identify possible failure modes and take
preventive actions before these failures can happen. Such risk assessments should take the
HBEL into consideration.15, 17
Some of the building blocks to create a triage approach for prioritization have already been
put in place. First, a toxicity scale mathematically derived directly from the HBEL has
been developed that allows HBELs to be displayed together on a scale from 0 to 10, where
their relative hazards can be quickly and easily visualized.15 A process capability scale has
also been developed that allows existing cleaning data to be evaluated against their HBELs
and displayed together also on a scale from 0 to 10 where the relative reliability of the
cleaning processes can also be quickly and easily visualized.18 These two scales provide
visualization of the two main aspects of risk, that is, hazard plus exposure. Two detectability
scales were also developed for total organic carbon analysis and visual inspection (but can be
used for any analytical method) that can visualize the justification for these analytical
methods to be used with the compound and the acceptability of any cleaning data that has
been collected. Finally, these scales can be combined into a cleaning risk dashboard that can
allow simple and easy visualization of the level of risk posed by any cleaning process.16
Spreadsheets have been developed to create all of these scales and display the results. They
are available to the industry free of charge.28-30 Indeed, these scales should be used together,
as a high hazard in itself is not necessarily a direct indicator of high risk, as the level of
exposure must also be known.31 Using these spreadsheets is simple and straightforward.
Implementing a triage approach using these scales could be the first step in a transitional
period for companies to move from a traditional to a science-based approach to cleaning
validation.
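A triage built on the two scales might look like the sketch below. The upper process capability index (Cpu) is the standard statistic; the two mappings onto a 0 to 10 range (negative log10 of the HBEL in grams per day, and 10 divided by Cpu) are assumptions made for illustration and should be checked against the cited articles and spreadsheets before use. All function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def toxicity_score(hbel_g_per_day):
    """Hazard on a 0-10 scale. Assumed mapping: -log10(HBEL in g/day), clamped."""
    if hbel_g_per_day <= 0:
        raise ValueError("HBEL must be positive")
    return min(10.0, max(0.0, -math.log10(hbel_g_per_day)))


def cpu(residue_results, hbel_based_limit):
    """Upper process capability index: (limit - mean) / (3 * sample std dev)."""
    return (hbel_based_limit - mean(residue_results)) / (3 * stdev(residue_results))


def capability_score(cpu_value):
    """Exposure on a 0-10 scale. Assumed mapping: 10 / Cpu, clamped."""
    if cpu_value <= 0:
        return 10.0  # mean at or above the limit: treat as worst case
    return min(10.0, 10.0 / cpu_value)


def triage(products):
    """Order (name, hbel_g_per_day, cpu_value) tuples, highest scores first."""
    scored = [(name, toxicity_score(h), capability_score(c)) for name, h, c in products]
    # Sort by hazard first, then by exposure; one possible prioritisation rule.
    return sorted(scored, key=lambda row: (row[1], row[2]), reverse=True)


print(triage([("Product A", 1e-3, 4.0), ("Product B", 1e-8, 2.0)]))
```

Sorting on hazard first simply mirrors the prioritisation suggested in the MHRA statement; a company could equally rank on a combination of both scores, since a high hazard alone does not establish high risk.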
While such an approach using the HBELs would be science- and risk-based and relatively
simple to implement, it should be understood that the industry rarely takes the initiative
unless spurred on by regulator action. Regulators are already requesting risk assessments, and
this simple approach can quickly and easily identify whether a company has cleaning risks or
not.
Peer Review
The authors wish to thank our peer reviewers: Bharat Agrawal; James Bergum, Ph.D.;
Gabriela Cruz, Ph.D.; Mallory DeGennaro; Parth Desai; Kenneth Farrugia; Ioanna-Maria
Gerostathi; Miquel Romero Obon; Laurence O'Leary; and Joel Young for reviewing this
article and for their insightful comments and helpful suggestions.
References
1. EMA Guideline on Setting Health Based Exposure Limits for Use in Risk
Identification in the Manufacture of Different Medicinal Products in Shared Facilities,
EMA/CHMP/CVMP/SWP/169430/2012, 20 November 2014.
www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2014/11/WC500177735.pdf.
2. American Society for Testing and Materials (ASTM) E3106-17 "Standard Guide for
Science-Based and Risk-Based Cleaning Process Development and Validation"
www.astm.org.
3. ASTM Work Item WK59975 - the Derivation of Health Based Exposure Limits
(HBELs)
7. Walsh, Andrew. (2011). "Cleaning Validation for the 21st Century: Acceptance
Limits for Active Pharmaceutical Ingredients (APIs): Part I", Pharmaceutical
Engineering, July/August 2011. Vol. 31 (No. 4),
8. Walsh, Andrew. (2011). "Cleaning Validation for the 21st Century: Acceptance
Limits for Active Pharmaceutical Ingredients (APIs): Part II", Pharmaceutical
Engineering, September/October 2011. Vol. 31 (No. 5)
11. Crevoisier, M. et al., "Cleaning Limits - Why The 10-ppm Criterion Should Be
Abandoned," Pharmaceutical Technology January 2016, Vol. 40 (No. 1): 52-56
12. Walsh, Andrew, Michel Crevoisier, Ester Lovsin Barle, Andreas Flueckiger, David
G. Dolan, Mohammad Ovais (2016) "Cleaning Limits—Why the 10-ppm and 0.001-
Dose Criteria Should be Abandoned, Part II," Pharmaceutical Technology 40 (8).
13. Faria, E. C., Bercu, J. P., Dolan, D. G., Morinello, E. J., Pecquet, A. M., Seaman, C.,
. . . Weideman, P. A. (2016). Using default methodologies to derive an acceptable
daily exposure (ADE). Journal of Regulatory Toxicology and Pharmacology, 79 Suppl 1.
14. ISPE Baseline® Guide: Risk-Based Manufacture of Pharmaceutical Products: A
Guide to Managing Risks Associated with Cross-Contamination. (ISPE, Tampa, FL,
First Edition, 2010), Vol. 7, p. 186.
15. Walsh, Andrew, Michel Crevoisier, Ester Lovsin Barle, Andreas Flueckiger, David
G. Dolan, Mohammad Ovais, Osamu Shirokizawa, and Kelly Waldron "An ADE-
derived Scale for Assessing the Risk of Compound Carryover in Shared Facilities,"
PharmaceuticalOnline May 22, 2017
16. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Andreas Flueckiger, Igor Gorsky, Jessica Graham, Ph.D., Robert Kowal,
Mariann Neverovitch, Mohammad Ovais, Osamu Shirokizawa and Kelly Waldron
"Measuring Risk in Cleaning: Cleaning FMEA and the Cleaning Risk Dashboard,"
PharmaceuticalOnline April 2018
17. Best M. & Neuhauser D., "Walter A Shewhart, 1924, and the Hawthorne factory,"
Quality and Safety in Health Care. 2006 Apr;15(2):142-3.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2464836/.
18. Walsh, Andrew, Ester Lovsin Barle, David G. Dolan, Andreas Flueckiger, Igor
Gorsky, Robert Kowal, Mohammad Ovais, Osamu Shirokizawa and Kelly Waldron
"A Process Capability-derived Scale for Assessing the Risk of Compound Carryover
in Shared Facilities," PharmaceuticalOnline, August 2017
19. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Andreas Flueckiger, Igor Gorsky, Robert Kowal, Mariann Neverovitch,
Mohammad Ovais, Osamu Shirokizawa and Kelly Waldron "An MSSR-derived Scale
for Assessing the Detectability of Visual Inspection," PharmaceuticalOnline,
December 2017
20. Walsh, Andrew, Thomas Altmann, Alfredo Canhoto, Ester Lovsin Barle, David G.
Dolan, Andreas Flueckiger, M.D., Igor Gorsky, Robert Kowal, Mariann Neverovitch,
Mohammad Ovais, Osamu Shirokizawa and Kelly Waldron. "A Swab Limit-Derived
Scale For Assessing The Detectability Of Total Organic Carbon Analysis,"
PharmaceuticalOnline, January 2018
21. Andrew Walsh, Dongni (Nina) Liu and Mohammad Ovais, "Justification and
Qualification of Visual Inspection for use in Cleaning Validation for a Low Risk,
Multi-Product Facility," PharmaceuticalOnline, August 2018
22. https://mhrainspectorate.blog.gov.uk/2018/10/22/cross-contamination-control-and-health-based-exposure-limits-hbel-qas/
24. International Pharmaceutical Quality October 29, 2015 (Clarifying Questions Upfront
is Key in Process Validation, US and EU PV Principles in Alignment —CDER’s
McNally) https://www.ipqpubs.com/2015/10/29/clarifying-questions-upfront-is-key-in-process-validation-us-and-eu-pv-principles-in-alignment-cders-mcnally-stresses/
25. World Health Organization - WHO Technical Report Series, No. 937, 2006 Annex
4 Supplementary guidelines on good manufacturing practices: validation, Appendix 3
Cleaning validation
26. Health Canada: Health Products and Food Branch Inspectorate Guidance Document -
Cleaning Validation Guidelines GUIDE-0028
30. Spreadsheet for create Detectability Scales from TOC and Visual Inspection
Detection Limits -
https://www.researchgate.net/publication/324260010_Spreadsheet_for_create_Detectability_Scales_from_TOC_and_Visual_Inspection_Detection_Limits
31. Lovsin-Barle, Ester and Andrew Walsh, "Are high potency active pharmaceutical
ingredients (HPAPI) also high risks for cross-contamination?" Chimica Oggi 35(6),
December 2017
Additional files for later
https://www.pharmaceuticalonline.com/doc/developing-a-science-risk-statistics-based-approach-to-cleaning-process-development-validation-0001
https://www.pharmaceuticalonline.com/doc/an-mssr-derived-scale-for-assessing-detectability-of-visual-inspection-0001
https://www.pharmaceuticalonline.com/doc/justification-qualification-of-visual-inspection-for-cleaning-validation-in-a-low-risk-multiproduct-facility-0001
https://www.pharmaceuticalonline.com/doc/using-preliminary-hazard-analysis-to-determine-equipment-and-instrument-requalification-frequency-0001