


OVERVIEW

The Scope of Experimental Design


I. Introduction
Designing an experimental strategy
The purpose of statistical design
A pharmaceutical example
II. Plan of the book
Stages in experimentation
Designs and methods described
Examples in the text
Statistical background needed
III. Starting out in experimental design
Some elementary definitions
Some important concepts
Recommended books on experimental design
Choosing computer software

I. INTRODUCTION

A. Designing an Experimental Strategy

In developing a formulation, product or process, pharmaceutical or otherwise, the
answer is rarely known right from the start. Our own past experience, scientific
theory, and the contents of the scientific and technical literature may all be of help,
but we will still need to do experiments, whether to answer our questions or to
confirm what we already believe to be the case. And before starting the
experimentation, we will need to decide what the experiment is actually going to
be. We require an experimental strategy.
Any experimentalist will, we hope and trust, go into a project with some
kind of plan of attack. That is, he will use an experimental design. It may be a
"good" design or it may be a "bad" one. It may even seem to be quite random, or
at least non-systematic, but even in these circumstances the experimenter, because

Copyright © 1999 by Marcel Dekker, Inc. All Rights Reserved.


of his experience and expertise or pharmaceutical intuition, may arrive very quickly
at the answer. On the other hand he could well miss the solution entirely and waste
weeks or months of valuable development time. The "statistical" design methods
described in this book are ways of choosing experiments efficiently and
systematically to give reliable and coherent information. They have now been
widely employed in the design and development of pharmaceutical formulations.
They are invaluable tools, but even so they are in no way substitutes for experience,
expertise, or intelligence.
We will therefore try to show where statistical experimental design methods
may save time, give a more complete picture of a system, define systems and allow
easy and rapid validation. They do not always allow us to find an individual result
more quickly than earlier optimization methods, but they will generally do so with
a far greater degree of certainty.

Experimental design can be defined as the strategy for setting up experiments in
such a manner that the information required is obtained as efficiently and precisely
as possible.

B. The Purpose of Statistical Design

Experimentation is carried out to determine the relationship (usually in the form of
a mathematical model) between factors acting on the system and the response or
properties of the system (the system being a process or a product, or both). The
information is then used to achieve, or to further, the aims of the project.
We therefore aim to plan and carry out the experiments needed for the
project, or part of a project, with maximum efficiency. Thus, use of the budget (that
is, the resources of money, equipment, raw material, manpower, time, etc. that are
made available) is optimized to reach the objectives as quickly and as surely as
possible with the best possible precision, while still respecting the various
restrictions that are imposed.
The purpose of using statistical experimental design, and therefore of this
book, is not solely to minimize the number of experiments (though this may well
happen, and it may be one of the objectives under certain circumstances).

C. A Pharmaceutical Example

1. Screening a large number of factors

We now consider the extrusion-spheronization process, which is a widely used
method of obtaining multiparticulate dosage forms. The drug substance is mixed
with a diluent, a binder (and possibly with other excipients), and water, and
kneaded to obtain a wet plastic mass. This is then extruded through small holes to
give a mass of narrow pasta-like cylinders. These are then spheronized by rapid

rotation on a plate to obtain more or less spherical particles or pellets of the order
of 1 mm in diameter.
This method depends on a large number of factors, and there are many
examples of the use of experimental design for its study in the pharmaceutical
literature. It is treated in more detail in the following two chapters. Some of these
factors and possible ranges of study are shown in table 1.1. At the beginning of a
project for a process study, the objective was to discover, as economically as was
reasonably possible, which of these factors had large effects on the yield of pellets
of the right size in order to select ones for further study. We shall see that this
involves postulating a simple additive or first-order model.
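
For the seven factors of this screening problem, the additive first-order model mentioned above can be written out as follows. This is only a sketch in the usual notation; the symbols (y for the yield, x_i for the coded level of factor i, beta_i for its effect, epsilon for the experimental error) are our own labels and not necessarily those used in the later chapters.

```latex
y = \beta_0 + \sum_{i=1}^{7} \beta_i x_i + \varepsilon
```

The model contains no product terms x_i x_j, so it assumes that the effect of each factor is the same whatever the levels of the others.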

Table 1.1 Factors in Extrusion-Spheronization

Factor                 Unit   Lower limit   Upper limit

% of binder            %      0.5           1
amount of water        %      40            50
granulation time       min    1             2
spheronization load    kg     1             4
spheronization speed   rpm    700           1100
extruder speed         rpm    15            60
spheronization time    min    2             5

Varying one factor at a time


One way of finding out which factors have an effect would be to change them one
at a time. We carry out, for example, the experiment with all factors at the lower
level, shown in the third column of table 1.1 (experiment 1, say). Then we do
further experiments, changing each factor to the upper limit in turn. Then we may
see the influence of each of these factors on the yield by calculating the difference
in yield between this experiment and that of experiment 1. This is the "one-factor-
at-a-time" approach, but it has certain very real disadvantages:

• Eight experiments are required. However, the effect of each factor is
calculated from the results of only 2 experiments, whatever the total number
of experiments carried out. The precision is therefore poor. None of the
other experiments will provide additional information on this effect.
• We cannot be sure that the influence of a given factor will be the same,
whatever the levels of the other factors. So the effect of increasing the
granulation time from 1 to 2 minutes might be quite different for 40% water
and 50% water.
• If the result of experiment 1 is wrong, then all conclusions are wrong.
(However, experiment 1, with all factors at the lower level could be
replicated. Nor does each and every factor need to be examined with respect
to this one experiment.)

The one-factor-at-a-time method (which is in itself an experimental design) is
inefficient, can give misleading results, and in general it should be avoided. In the
vast majority of cases our approach will be to vary all factors together.
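
The one-factor-at-a-time plan described above can be written out explicitly. The sketch below is illustrative only: the limits are those of table 1.1, while the dictionary keys and the function name are our own choices.

```python
# Factor limits taken from table 1.1: name -> (lower limit, upper limit).
FACTORS = {
    "binder (%)": (0.5, 1.0),
    "water (%)": (40, 50),
    "granulation time (min)": (1, 2),
    "spheronization load (kg)": (1, 4),
    "spheronization speed (rpm)": (700, 1100),
    "extruder speed (rpm)": (15, 60),
    "spheronization time (min)": (2, 5),
}


def one_factor_at_a_time(factors):
    """Return a baseline run with every factor at its lower limit,
    followed by one run per factor with only that factor raised."""
    baseline = {name: low for name, (low, high) in factors.items()}
    runs = [baseline]
    for name, (low, high) in factors.items():
        run = dict(baseline)   # copy, so the baseline is not modified
        run[name] = high       # raise this one factor to its upper limit
        runs.append(run)
    return runs


ofat_runs = one_factor_at_a_time(FACTORS)
print(len(ofat_runs), "experiments")
for number, run in enumerate(ofat_runs, start=1):
    print(number, list(run.values()))
```

Each effect is then the difference between one of runs 2 to 8 and run 1, which is why only 2 of the 8 results contribute to any one estimate.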

Varying all factors together


A simple example of this approach for the above extrusion-spheronization problem,
one where each factor again takes only 2 levels, is given in table 1.2.

Table 1.2 Experimental Plan for Extrusion-Spheronization with All Factors Varied
Together

Run   Binder   Water   Granul.   Spher.   Spher.   Extrud.   Spher.
no.   (%)      (%)     time      load     speed    speed     time
                       (min)     (kg)     (rpm)    (rpm)     (min)
1     0.5      40      1         2        700      60        5
2     1.0      40      1         1        1100     15        5
3     1.0      50      1         1        700      60        2
4     0.5      50      2         1        700      15        5
5     1.0      40      2         2        700      15        2
6     0.5      50      1         2        1100     15        2
7     0.5      40      2         1        1100     60        2
8     1.0      50      2         2        1100     60        5

Examination of this table shows that no information may be obtained by comparing
the results of any 2 experiments in the table. In fact, to find the effect of changing
any one of the factors we will need to use the results of all 8 experiments in the
design. We will find in chapter 2 that, employing this design:

• The influence of each factor on the yield of pellets is estimated with a far
higher precision than by changing the factors one at a time. In fact, the
standard error of estimation is halved. To obtain the same precision by the
one-factor-at-a-time method each experiment would have to be done 4
times.
• The result of each experiment enters equally into the calculation of the
effects of each factor. Thus, if one experiment is in error, this error is
shared evenly over the estimations of the effects, and it will probably not
influence the general conclusions.
• The number of experiments is the same as for the one-factor-at-a-time
method.
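
The claims in the list above can be checked numerically before any experiment is run. The sketch below codes table 1.2 as -1 (lower level) and +1 (upper level), confirms that the columns are balanced and mutually orthogonal, and compares the standard error of an effect under the two approaches. The value of sigma is a hypothetical placeholder, and an effect is taken here to be the mean response at the upper level minus the mean at the lower level.

```python
import math

# Table 1.2 in coded form: -1 = lower and +1 = upper of the two levels that
# appear in the table (for spheronization load these are 1 and 2 kg).
# Columns: binder, water, granulation time, spheronization load,
# spheronization speed, extruder speed, spheronization time.
DESIGN = [
    [-1, -1, -1, +1, -1, +1, +1],
    [+1, -1, -1, -1, +1, -1, +1],
    [+1, +1, -1, -1, -1, +1, -1],
    [-1, +1, +1, -1, -1, -1, +1],
    [+1, -1, +1, +1, -1, -1, -1],
    [-1, +1, -1, +1, +1, -1, -1],
    [-1, -1, +1, -1, +1, +1, -1],
    [+1, +1, +1, +1, +1, +1, +1],
]


def column(design, j):
    return [row[j] for row in design]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


n_factors = len(DESIGN[0])

# Balance: each factor is at its upper level in exactly half of the runs.
balanced = all(sum(column(DESIGN, j)) == 0 for j in range(n_factors))

# Orthogonality: every pair of columns has a zero scalar product.
orthogonal = all(
    dot(column(DESIGN, i), column(DESIGN, j)) == 0
    for i in range(n_factors)
    for j in range(i + 1, n_factors)
)

sigma = 1.0  # hypothetical standard deviation of a single result

# One factor at a time: difference between two single results.
se_ofat = math.sqrt(sigma**2 + sigma**2)
# Table 1.2: difference between two means of 4 results each.
se_design = math.sqrt(sigma**2 / 4 + sigma**2 / 4)

print("balanced:", balanced, "orthogonal:", orthogonal)
print("ratio of standard errors:", se_ofat / se_design)
print("replicates needed to match:", round((se_ofat / se_design) ** 2))
```

The ratio of the two standard errors does not depend on the value chosen for sigma, provided the experimental error is the same for every run.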

Clearly, this second statistical design is far better than the first. The screening
design is adapted to the problem, both to the objectives (that is, screening of
factors) and to the constraints (7 factors studied between maximum and minimum

values). If we had been presented with a different problem, needing for example to
compare in addition the effects on the yield of 3 different binders, or those of 3
models of spheronizing apparatus, then the design would have needed to have been
quite different. Designs and strategies for all such situations are given in the various
chapters.

2. More complex designs

The above design would have allowed average effects to be estimated with
maximum efficiency. However the effect of changing the spheronization speed
might be quite different depending on whether the extrusion rate is high or low
(figure 1.1). For a more complete analysis of the effects, experiments need to be
carried out at more combinations of levels. The equivalent to the one-factor-at-a-
time approach would involve studying each pair of factors individually. Again we
will find that a global solution where all factors are varied together will be the most
efficient, requiring fewer experiments and giving more precise and reliable
estimations of the effects. One of a number of possible approaches might be to take
the original design of table 1.2 and carry out a complementary design of the same
number of runs, where all the levels of certain columns are inverted. The 16
experiments would give estimations of the 7 effects, and also information on
whether there are interactions between factors, and some indications (not without
ambiguity) as to what these interactions might be.
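
A complementary design of this kind can be sketched in a few lines. The text does not say which columns are inverted, so the example below assumes the simplest choice, inverting all of them (often called a full fold-over); the function names are ours. It then checks that, in the combined 16 runs, every main-effect column is orthogonal to every product of two factor columns, which is not the case for the 8 runs of table 1.2 alone.

```python
from itertools import combinations

# Table 1.2 coded as -1 (lower level) and +1 (upper level).
# Columns: binder, water, granulation time, spheronization load,
# spheronization speed, extruder speed, spheronization time.
DESIGN = [
    [-1, -1, -1, +1, -1, +1, +1],
    [+1, -1, -1, -1, +1, -1, +1],
    [+1, +1, -1, -1, -1, +1, -1],
    [-1, +1, +1, -1, -1, -1, +1],
    [+1, -1, +1, +1, -1, -1, -1],
    [-1, +1, -1, +1, +1, -1, -1],
    [-1, -1, +1, -1, +1, +1, -1],
    [+1, +1, +1, +1, +1, +1, +1],
]


def fold_over(design):
    """Complementary design: every level of every column inverted (assumption)."""
    return [[-level for level in row] for row in design]


def column(design, j):
    return [row[j] for row in design]


def interaction(design, j, k):
    """Product column x_j * x_k for a two-factor interaction."""
    return [a * b for a, b in zip(column(design, j), column(design, k))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def main_effects_clear(design):
    """True if every main-effect column is orthogonal to every
    two-factor product column."""
    n_factors = len(design[0])
    return all(
        dot(column(design, i), interaction(design, j, k)) == 0
        for i in range(n_factors)
        for j, k in combinations(range(n_factors), 2)
    )


combined = DESIGN + fold_over(DESIGN)
print(len(combined), "runs")
print("8 runs alone:", main_effects_clear(DESIGN))
print("16 runs combined:", main_effects_clear(combined))
```

In the original 8 runs the spheronization time column is identical to the product of the water and granulation time columns, so that effect cannot be separated from that interaction until the second block is added.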

[Plot not reproduced: one line for high and one for low spheronization speed, plotted against extrusion rate (low, high).]

Figure 1.1 Percentage mass of particles below 800 µm: main effect of
spheronization speed and interaction with extrusion rate.

In the same way the sequential one-run-at-a-time search for an optimized
formulation or process is difficult and often inefficient and unreliable. Here also,
a structured design where all factors (fewer factors than for screening) are changed
together, and the data are analysed only at the end, results in a more reliable and
precise positioning of the optimum, allowing prediction of what happens when
process or formulation parameters are varied about their optimum values,
information almost totally lacking when the alternative approach is used. Only
experimental design-based approaches to formulation or process development result
in a predictive model. These also are highly efficient in terms of information
obtained for the number of experimental runs carried out. Optimization on the three
most influential factors would probably require between 10 and 18 runs, depending
on the design.
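
The range of 10 to 18 runs can be related to some simple counting rules. The sketch below is one plausible reading, not the authors' own calculation: it counts the coefficients of a full second-order model and the runs of two designs mentioned later in this chapter. The number of centre points is our assumption.

```python
def quadratic_terms(k):
    """Coefficients in a full second-order model in k factors:
    1 constant + k linear + k squared + k(k-1)/2 two-factor terms."""
    return 1 + k + k + k * (k - 1) // 2


def central_composite_runs(k, centre_points):
    """Full two-level factorial + 2k axial points + centre points."""
    return 2**k + 2 * k + centre_points


def doehlert_runs(k):
    """Uniform shell (Doehlert) design: k**2 + k + 1 distinct points."""
    return k**2 + k + 1


k = 3
print(quadratic_terms(k))            # 10: the minimum number of runs
print(doehlert_runs(k))              # 13
print(central_composite_runs(k, 1))  # 15
print(central_composite_runs(k, 4))  # 18
```

A design needs at least as many distinct runs as there are coefficients to estimate, which gives the lower figure; replicated centre points account for much of the spread above it.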

II. PLAN OF THE BOOK

It is necessary to say something to the reader about how this book is organized.
There are a number of excellent texts on experimental design and the fact that we
have sometimes approached matters differently does not indicate that we think our
approach better, only that it may be a useful alternative.

A. Stages in Experimentation

The various steps of an experiment using statistical design methodology are
typically those given in table 1.3. The many examples in this book will be
described for the most part according to such a plan. It is simplified, as there is
almost certain to be a certain amount of "coming and going", with revision of
previous stages. For example, it might be found that all designs answering the
objectives were too large, with respect to the available resources. All stages in the
process should be formally documented and verified as they are approached in turn
by discussion between the various parties concerned. We will take this opportunity
to emphasise the importance of the first two (planning) stages, consisting of a
review of the available data, definition of the objectives of the experimentation and
of likely subsequent stages, and identification of the situation (screening, factor
study, response surface methodology etc.).
Statistical design of experiments cannot be dissociated from the remainder
of the project and it is necessary to associate all the participants in the project -
from the project manager to those who actually carry out the experiments. It is
important that the "statistician" who sets up the experimental design be integrated
in the project team, and in particular that he is fully aware of the stages that
preceded it and those that are likely to follow on after. Planning of the experiment
is by far the most important stage. For a full discussion of the planning of a
"designed" experiment see for example the very interesting article by Coleman and
Montgomery (1), the ensuing discussion (1), and an analysis by Stansbury (2).

It is necessary for the entire team to meet together, to define the objectives
and review existing data, before going to the next stage. Before attempting to draw
up any kind of protocol, it will be necessary to identify the situation (such as
screening, factor studies or optimization). The book is to a major extent organized
around them, as described in the next section. Most improper uses of experimental
designs result from errors in or neglect of these planning stages, that is, insufficient
analysis of the problem or a wrong identification of the situation or scenario.

Table 1.3 Stages in a Statistically Designed Experiment

1  Description of problem             7   Constraints and limits (definitive)
2  Analysis of existing data          8   Experimental design matrix
3  Identification of situation        9   Experimental plan
4  List of factors (definitive)       10  Experimentation
5  List of experimental responses     11  Analysis of data
6  Mathematical model                 12  Conclusions and next stage

There follows a detailed listing of the variable and fixed factors, and the constraints
operating, which together make up the domain of experimentation. Then, and only
then, can the model and experimental design be chosen, and a protocol
(experimental plan) be drawn up. These steps are defined in section II.B of this
chapter.

B. Designs and Methods Described

1. The "design situation"

It is essential to recognise, if one is to choose the right design, treatment or
approach, in what situation one finds oneself. Do we want to find out which factors
amongst a large number of factors are significant and influence or may influence
the process or formulation? If so the problem is one of screening and is covered
to a major extent in chapter 2. If we have already identified 4 to 5 factors which
have an influence, we may then wish to quantify their influence, and in particular
discover how the effect of each factor is influenced by the others. A factor
influence study is then required. This normally involves a factorial design (chapter
3). If on the other hand we have developed a formulation or process but we wish
to predict the response(s) within the experimental domain, then we must use an
appropriate design for determining mathematical models for the responses. This and
the method used (response surface methodology or RSM) are covered in chapter 5.
These 3 subjects are closely related to one another and they form a continuous
whole.

Designs discussed in detail in these chapters are given in full in the text. Where the
rules for constructing them are simple, the designs for a large number of factors
may be summarized in a single table. Other designs are tabulated in the appendices.

2. Optimization

Screening, factor studies and RSM are all part of the search for a product or
process with certain characteristics. We are likely to require a certain profile or a
maximized yield, or to find the best compromise among a large number of
sometimes conflicting responses or properties. We show in chapter 6 how to
identify the best combination of factors by graphical, algebraic, and numerical
methods, normally using models of the various properties of the system obtained
by the RSM designs. However, we will also indicate briefly how the sequential
simplex optimization method may be integrated with the model-based approach of
the rest of the book.

3. Process and formulation validation

Validation has been a key issue in the industry for some time and it covers the
whole of development. Since validation is not an activity that is reserved for the
end of development, but is part of its very conception, systematic use of statistical
design in developing a formulation or process ensures traceability, supports
validation, and makes the subsequent confirmatory validation very much easier and
more certain. It is discussed in a number of the later chapters, especially in the
final section of chapter 6.

4. Quality of products and processes

There is variability in all processes. It is assumed constant under all conditions in
the first part of the book, but in chapter 7 we look at the concept in more detail.
First we will describe how to use the methods already described under
circumstances of non-constant variability. This is followed by the study of
variability itself with a view to minimizing it.
This leads to the discussion of how modern statistical methods may be
introduced to assure that quality is built into the product. For if experimental design
has become a buzz-word, quality is another. The early work in this field was done
in Japan, but the approach of Taguchi and others in building a quality that is
independent of changes in the process variables has been refined as interest has
widened. These so-called Japanese methods of assuring quality are having a large
effect on engineering practice both in Europe and North America. It seems that the
effect on the pharmaceutical industry is as yet much less marked - although there
is much interest, it is not yet transformed into action. This may happen in time.
In the meanwhile we explore some ways in which Taguchi's ideas may be
applied using the experimental design strategy described in the rest of the book, and
the kind of problems in pharmaceutical development that they are likely to help
solve, in assuring reliable manufacture of pharmaceutical dosage forms.

5. Experimental design quality

The concept of the quality of an experimental design is an essential one, and is
introduced early on. Most of the current definitions of design quality, optimality,
and efficiency are described more formally and systematically in chapter 8, leading
to methods for obtaining optimal experimental designs for cases where the standard
designs are not applicable. The theory of the use of exchange algorithms for
deriving these designs is discussed in very simple terms. Mathematical derivations
are outside the scope of the book and the reader is referred to other textbooks, or
to the original papers.

6. Mixtures

The analysis and mathematical modelling of mixtures are significant in
pharmaceutical formulation, and mixtures present certain particular problems. The final
chapters cover methods for optimizing formulations and also treat "mixed" problems
where process variables and mixture composition are studied at the same time. In
both chapters 9 and 10 we continue to illustrate the graphical and numerical
optimization methods described earlier.

7. Some general comments on the contents

The emphasis throughout this book is on those designs and models that are useful,
or potentially useful, in the development of a pharmaceutical formulation. This is
why such topics as asymmetric factorial designs and mixtures with constraints are
discussed, and why we introduce a wide variety of second-order designs for use in
optimization. We stress the design of experiments rather than analysis of
experimental data, though the two aspects of the problem are intimately connected.
And we indicate how different stages are interdependent, and how our choice of
design at a given stage depends on how we expect the project to continue, as well
as on the present problem, and the knowledge already obtained.

C. Examples in the Text

1. Pharmaceutical examples

Many of the examples, taken either from the literature or from unpublished studies,
are concerned with development of solid dosage forms. Particular topics are:
• drug excipient compatibility screening,
• dissolution testing,
• granulation,
• tablet formulation and process study,
• formulation of sustained release tablets,
• dry coating for delayed release dosage forms,
• extrusion-spheronization,

• nanoparticle synthesis,
• microcapsule synthesis,
• oral solution formulation,
• transdermal drug delivery.

However, the methods used may be applied, with appropriate modifications, to the
majority of problems in pharmaceutical and chemical development. Some of the
general themes are given below.

2. Alternative approaches to the same problem

There is such a variety of designs, of ways of setting up the experiments for
studying a problem, each with its advantages and disadvantages, that we have
indicated alternative methods and in some cases described the use of different
designs for treating the same problem. For example, a study of the influence of
various factors - type of bicarbonate used, amount of diluent, amount of acid,
compression - on the formulation of an effervescent paracetamol tablet is described
in chapter 3. A complete factorial design of 16 experiments was used, and in many
circumstances this would be the best method. However, it would have been possible
to investigate the same problem using at least 4 other designs that would each have
given similar information about the formulation, information of lesser quality it is
true, but also requiring fewer experiments. The experiments of all these other
smaller designs were all found in the design actually used, so the results of the 5
methods could be compared using the appropriate portion of the data. We shall see,
however, that there is no need for any actual experimental data in order to compare
the quality of information of the different methods.
The problem of excipient compatibility screening is also discussed and it is
shown how, according to the different numbers of diluents, lubricants, disintegrants,
glidants requiring testing, we need to set up different kinds of screening designs.
There is usually a variety of possible methods for treating a given problem
and no one method is necessarily the best. Our choice will depend in part on the
immediate objectives, and the immediate restraints that operate, but it will also be
influenced by what we see as the likely next stage in the project. The method which
is the most efficient if all goes well, is not always going to be the most efficient
if there are problems. In this sense, design is not totally unlike a game of chess!

3. Linking designs

No experimental design exists on its own, but it is influenced by the previous phase
of experimentation and the projected future steps. Its choice depends partly on the
previous results. The strategy is most effective if statistical design is used in most
or all stages of development and not only for screening, or optimizing the
formulation or the process. It is sometimes possible to "re-use" experiments from
previous studies, integrating them into the design, thus achieving savings of time
and material. This may sometimes be anticipated.

For example, certain designs make up a part of other more complex designs,
that may be required in the next stage of the project. We will illustrate these links
and this continuity between steps as some of the examples given will be referred
to in a number of the chapters. For example, the central composite design used in
response surface modelling and optimization may be built up by adding to a
factorial design. This sequential method is simulated using a literature example, the
formulation of an oral solution. Part of the data are given and analysed in
chapter 3 in order to demonstrate factor influence studies. Then in chapter 5 all the
data are given so as to show how the design might be augmented to enable
response-surface modelling. Finally, the estimated models are used to demonstrate
graphical optimization in chapter 6.
Another factorial design, used for studying solubility in mixed micelles,
introduces and demonstrates multi-linear regression and analysis of variance. It is
then extended, also in chapter 5, to a central composite design to illustrate the
estimation of predictive models and their validation.

4. Building designs in stages

In addition to the possibility of reusing experiments in going from one stage to
another, it is worth noting that many designs may be carried out in steps, or blocks.
It has already been mentioned that extrusion-spheronization, as a method for
producing multiparticulate dosage forms, has been much studied using statistical
experimental design. We use it here to introduce methods for choice and
elimination of factors (factor influence studies), and at the same time, to
demonstrate the sequential approach to design. A factor-influence study is carried
out in several stages. The project may therefore be shortened if the first step gives
sufficient information. It may be modified or, at the worst, abandoned should the
results be poor, or augmented with a second design should the results warrant it.
Quite often we may be unable to justify carrying out a full design at the
beginning of a project. Yet with careful planning, the study may be carried out in
stages, with all parts being integrated into a coherent whole.

5. Different methods of optimization

When optimizing a formulation or process, there are a number of different methods
for tackling the problem and the resulting data may also be analysed in a number
of different ways. By demonstrating alternative treatments of the same data, we will
show advantages and weaknesses of the various optimization methods and how they
complement one another.
For example, the production of pellets by granulation in a high-speed mixer
is used to illustrate properties of the uniform shell (Doehlert) design, and it is
shown how the design space may be expanded using this kind of design. The
resulting mathematical models are also used to demonstrate both the optimum path
and canonical analysis methods for optimization. Both graphical and numerical
optimization are described and compared for a number of examples: an oral liquid

formulation, nanosphere synthesis, a tableting process study, and a placebo tablet
formulation.

D. Statistical Background Needed

We aim to give an ordered, consistent approach, providing the pharmaceutical
scientist with those experimental design tools that he or she really needs to develop
a product. But much of the theory of statistical analysis is omitted, useful
background information though it may be. A basic understanding of statistics is
assumed - the normal distribution, the central limit theorem, variance and standard
deviation, probability distributions and significance tests. The theory of distributions
and of significance testing, if required, should be studied elsewhere (3, 4, 5). Other
than this, the mathematics needed are elementary. Proofs are not given. Analysis
of data is by multi-linear least squares regression and by analysis of variance. For
significance testing, the F-test is used. These methods are introduced in chapter 4,
together with discussion of the extremely important X'X (information) and (X'X)⁻¹
(dispersion) matrices. Thus, although linear regression is used to analyse screening
and factorial designs, in general an understanding of the method is not necessary
at this point. A brief summary of the matrix algebra needed to understand the text
is to be found in appendix I.
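
For readers who want the formulas at this point, the role of the two matrices can be summarized as below. This is the standard least squares result, stated in our own notation (y for the vector of N measured responses, X for the N × p model matrix, beta for the p coefficients, sigma squared for the experimental variance, assumed constant); chapter 4 may use different symbols.

```latex
y = X\beta + \varepsilon, \qquad
\hat{\beta} = (X'X)^{-1} X' y, \qquad
\operatorname{Var}(\hat{\beta}) = \sigma^{2} (X'X)^{-1}
```

Apart from the factor sigma squared, the variance of the estimates depends only on the design through X, and not on the measured responses.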

III. STARTING OUT IN EXPERIMENTAL DESIGN

A. Some Elementary Definitions

1. Quantitative factors and the factor space

Quantitative factors are those acting on the system that can take numerical values:
rate, time, percentage, amount, and so on. They are most often continuous, in that they may
be set at any value within prescribed limits. Examples are: the amount of liquid
added to a granulate, the time of an operation such as spheronization or granulation,
the drying temperature and the percentage of a certain excipient. Thus, if the
minimum granulation time is 1 minute and the maximum is 5 minutes the time may
be set at any value between 1 and 5 minutes (figure 1.2).
However, because of practical limitations, a quantitative factor may
sometimes be allowed only discrete levels if only certain settings are available -
for example, the sieve size used for screening a powder or granulate, the speed
setting on a mixer-granulator. Unless otherwise stated, a quantitative factor is
assumed continuous.

Natural variables for quantitative factors
The natural variable associated with each quantitative factor takes a numerical value
corresponding to the factor level. The level of a quantitative factor i, expressed in
terms of the units in which it is defined (so many litres, so many minutes ...), will
be written as Ui, as in figure 1.2. We do not usually find it necessary to distinguish
between the factor and natural variable. However, it is possible to define different
natural variables for the same factor. The factor temperature would normally be
expressed in units K or °C, but could equally well be expressed as K⁻¹.

Associated (coded) variables


With each natural variable, we associate a non-dimensional, coded variable Xi. This
coding, sometimes called normalization, is obtained by transforming the natural
variable, usually so that the level of the central value of the experimental domain
(see below) is zero. The limits of the domain in terms of coded variables are
usually identical for all variables. The extreme values may be "round numbers", ±
1, though this is by no means always so.
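
The coding described above is a linear transformation, which the following sketch makes explicit for the common case where the limits of the domain are coded as -1 and +1. The function names are ours, and the limits (2 to 4 minutes, 5 to 15 litres) are those read from figure 1.2.

```python
def to_coded(u, lower, upper):
    """Map a natural level u onto the coded scale, so that lower -> -1,
    the centre of the domain -> 0 and upper -> +1."""
    centre = (upper + lower) / 2
    half_range = (upper - lower) / 2
    return (u - centre) / half_range


def to_natural(x, lower, upper):
    """Inverse transformation: coded level x back to natural units."""
    centre = (upper + lower) / 2
    half_range = (upper - lower) / 2
    return centre + x * half_range


# Limits of the experimental domain, read from figure 1.2.
TIME_MIN = (2, 4)     # granulation time, minutes
VOLUME_L = (5, 15)    # volume of liquid, litres

print(to_coded(3, *TIME_MIN))      # 0.0  (centre of the domain)
print(to_coded(15, *VOLUME_L))     # 1.0  (upper limit)
print(to_natural(-1, *VOLUME_L))   # 5.0  (lower limit, in litres)
print(to_natural(0.5, *TIME_MIN))  # 3.5  (minutes)
```

Because the coded variables are dimensionless and share the same limits, effects estimated on this scale can be compared directly between factors measured in different units.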

[Plot not reproduced: natural-variable axes for granulation time (2, 3, 4 min) and volume (5, 10, 15 litres), with the corresponding coded-variable axes running from -1 to +1.]

Figure 1.2. Quantitative factors and the factor space. The axes for the natural
variables, granulation time and volume, are labelled U1, U2 and the axes of the
corresponding coded variables are labelled X1, X2. The factor space covers the page
and extends beyond it, whereas the design space is the square enclosed by X1 = ±1,
X2 = ±1.

Factor space
This is the k dimensional space defined by the k coded variables Xi for the
continuous quantitative factors being investigated. If we examine only two factors,
keeping all other conditions constant, it can be represented as a (two dimensional)
plane. Except for the special case of mixtures, factor space is defined in terms of
independent variables. If there are 3 factors, the factor space can be represented
diagrammatically in three dimensions.
We are only interested in studying quite a small part of the factor space, that
which is inside the experimental domain. Sometimes called a region of interest, it
is the part of the factor space enclosed by upper and lower levels of the coded
variables, or is within an ellipsoid about a centre of interest.
The design space is the factor space within this domain defined in terms of
the coded variables $X_i$.

Factor space and design space for mixtures


For the important class of mixture experimental designs, the variables (proportions
or percentages of each constituent of the mixture) are not independent of one
another. If we represent the factor space by two-dimensional or pseudo-three-
dimensional drawings, the axes for the variables are not at right angles and the
factor space that has any real physical meaning is not infinite.

2. Qualitative factors

These take only discrete values. In pharmaceutical development, an example of a
qualitative variable might be the nature of an excipient in the formulation, for
example "diluent", the levels being "lactose", "cellulose", "mannitol" and
"phosphate". Or it might be the general type of machine, such as mixer-granulator,
used in a given process, if several models or sizes are being compared. In medicinal
chemistry it might be a group on a molecule. So if all the factors studied are
qualitative, the factor space consists of discrete points, equal in number to the
product of the available levels of all the factors, each representing a combination
of levels of the different factors, as shown in figure 1.3 for two factors.
Mixed factor spaces, where some factors are qualitative or quantitative and
discrete, and other quantitative factors are continuous, are also possible.

Coded variables for qualitative factors


The levels of qualitative factors are sometimes referred to by numerical quantities.
If we were comparing three pieces of equipment, the machine might take levels 1,
2, and 3 as in figure 1.3. However, unlike the quantitative continuous variable, no
other values are allowed and these numbers do not have any physical significance.
Level 3 is not (necessarily) "greater than" level 1. Among the 4 diluents in the
same figure, phosphate (level 4) is not greater than cellulose (level 2).

Design space for qualitative factors


This consists of the points representing all possible combinations of the levels that
are being studied for the associated coded variable for each factor. The total number
of points in the design space (and also the total number of possible distinct
experiments) is obtained by multiplying together the numbers of levels of each
factor.
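
The counting rule can be checked by listing the points. The short sketch below uses the diluent and machine levels quoted in the text; the function name `design_space` is ours and purely illustrative.

```python
from itertools import product

# Levels taken from the diluent / machine example in the text.
diluents = ["lactose", "cellulose", "mannitol", "phosphate"]
machines = ["A", "B", "C"]


def design_space(*factor_levels):
    """Return every combination of levels, one tuple per point."""
    return list(product(*factor_levels))


points = design_space(diluents, machines)
print(len(points))  # 4 levels x 3 levels = 12 points
print(points[0])    # ('lactose', 'A')
```

Adding a third qualitative factor with, say, two levels would double the number of points to 24.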

[Figure 1.3: grid of points with the diluent (lactose, cellulose, mannitol, phosphate; coded levels 1 to 4) on the horizontal axis and the machine (A, B, C; coded levels 1 to 3) on the vertical axis.]

Figure 1.3. Qualitative factors and the factor space. The order and spacing of the
factor levels are arbitrary.

3. Experimental runs

An experimental run is a practical manipulation or series of manipulations, carried
out under defined conditions (the levels of the different factors, those that are
allowed to vary in the design, and those which are held constant) resulting in a
(single) datum for each of the responses to be measured.
The combination of factor levels being studied in a run is represented by a
point in the design space. In an experimental design, all the runs may be under
different conditions (distinct points in the design space) or certain runs may be
replicates, being carried out under the same conditions. Each experimental run is
normally set up independently of every other run. Thus, even if conditions are
repeated, the apparatus or machine (assuming here that we are studying a process)
is set up afresh for each run. The term experiment in this book is normally used
to mean experimental run. The term experimental unit is also sometimes used.


4. Responses

Responses are measured properties of the process (for example, yield, dissolution
rate, viscosity), sometimes referred to as dependent variables. The symbol for a
measured response will be y. The measured response for experiment i will be
written as $y_i$.

5. Mathematical model

Normally referred to simply as the model, it is an expression defining the
dependence of a response variable on the independent variables - that is the
variables whose influence we are studying or whose optimum values we are trying
to find. For example, we have assumed above that the response could be described
by a first-order model. It is generally written in terms of the coded variables.
The models used in this book are all empirical, where the system is
essentially a "black box". The models are most frequently, but not invariably,
polynomials of a given order. These are examples of linear models (not to be
confused with first-order models). For example the second-order model in two
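factors $X_1$ and $X_2$:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon$$

is a linear model, as all terms may be represented by a constant ($\beta_i$, $\beta_{ii}$, $\beta_{ij}$)
multiplied by a variable ($x_i$, $x_i^2$, $x_i x_j$); $\varepsilon$ represents the experimental error.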
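
To make the meaning of "linear" concrete, the following sketch writes the second-order model in two factors as a sum of coefficient times term. The function names and the coefficient values are hypothetical, chosen only for illustration; the error term is left out.

```python
def second_order_terms(x1, x2):
    """Model-matrix row for the second-order model in two factors.

    Column order: 1, x1, x2, x1^2, x2^2, x1*x2
    """
    return [1.0, x1, x2, x1 ** 2, x2 ** 2, x1 * x2]


def predict(coefficients, x1, x2):
    """Predicted response: a plain sum of coefficient * term."""
    terms = second_order_terms(x1, x2)
    return sum(b * t for b, t in zip(coefficients, terms))


# Hypothetical coefficients b0, b1, b2, b11, b22, b12 (illustration only).
b = [50.0, 4.0, -2.0, -3.0, 1.0, 0.5]
print(second_order_terms(1, -1))
print(predict(b, 1, -1))  # 53.5
```

The terms are curved functions of the coded variables, but the prediction is linear in the coefficients: doubling every coefficient doubles the predicted response.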
Theoretical or mechanistic models may exist, or be proposed. They might
be thermodynamic or kinetic in origin. They are most often non-linear models (6),
generally formulated as differential equations, and when an analytical solution is
available this is most often exponential in form. Transformation to a linear function
may occasionally be possible. It is rare to use these theoretical relationships directly
in pharmaceutical experimental design and the designs are generally not adapted for
determining or testing them. However, they often enter into the choice of factors
for study, of empirical models, and of constraints, and into the interpretation of the
results of factor-influence studies.

6. Experimental design

The design is the arrangement of experimental runs in design space (that is, defined
in terms of the coded variables). The design we choose depends on the proposed
model, the shape of the domain, and the objectives of the study (screening, factor
influence, optimization ...).

7. Experimental plan

This is the design transformed back into the real or natural variables of each factor,
normally with the runs in the order they are to be performed.
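
As an illustration, a design in coded variables can be turned into a plan as follows. This is only a sketch: the domain is the one read from the axes of figure 1.2, the $2^2$ factorial design is an arbitrary choice, and putting the runs in random order is our assumption about how the run order would be fixed. The names `DOMAIN` and `to_plan` are hypothetical.

```python
import random

# Assumed domain, read from figure 1.2: (centre, step per coded unit).
DOMAIN = {
    "granulation time (min)": (3.0, 1.0),
    "volume (litres)": (10.0, 5.0),
}

# A 2^2 factorial design in coded variables (illustrative choice of design).
design = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]


def to_plan(design, domain, seed=None):
    """Decode each coded run to natural units and shuffle the run order."""
    names = list(domain)
    plan = []
    for coded_run in design:
        run = {}
        for name, x in zip(names, coded_run):
            centre, step = domain[name]
            run[name] = centre + x * step
        plan.append(run)
    random.Random(seed).shuffle(plan)
    return plan


for number, run in enumerate(to_plan(design, DOMAIN, seed=1), start=1):
    print(number, run)
```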


8. R-efficiency

We have said that experimental design is to do with efficiency, and so we need to
be able to measure a cost-benefit ratio. The cost is easy to measure, and is,
essentially, the number of experiments. The "benefit" is more difficult to quantify.
It may be considered as the amount of information from the design. The R-
efficiency is the simplest measurement of a design's efficiency. It is the ratio of the
number of parameters in the model (p) to the number of experiments (N):

$$R_{\text{eff}} = \frac{p}{N} \le 1$$
This simply tells us the number of experiments required to calculate so
many parameters, but it tells us nothing of the quality of the estimation, which must
include not only the number of coefficients in the model, but also the precision to
which they are calculated. There are ways of quantifying the quality of information
obtained per experiment so that different experimental approaches can be compared,
and these, along with other definitions of efficiency which take such considerations
into account, will be discussed later in the book, especially in chapter 8.
The R-efficiency may also be expressed as a percentage.
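
A minimal sketch of the calculation follows. The function name is ours, and the two run counts are hypothetical candidate designs for a model with six coefficients, such as the second-order model in two factors.

```python
def r_efficiency(n_parameters, n_experiments):
    """Ratio p / N of model parameters to experimental runs."""
    if n_experiments < n_parameters:
        raise ValueError("need at least as many runs as parameters")
    return n_parameters / n_experiments


# Model with p = 6 coefficients; hypothetical designs of 9 and 13 runs.
for n_runs in (9, 13):
    print(n_runs, f"{100 * r_efficiency(6, n_runs):.0f}%")
```

A design with exactly as many runs as parameters has an R-efficiency of 100%, but the figure says nothing about how precisely those parameters are estimated.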

B. Some Important Concepts

The following ideas will be developed over much of the book, but we will state
them here, immediately, for emphasis.

1. Improved precision by varying all factors

It is common to determine the effects of a number of factors by varying each in
turn. We can get a much better estimate (in terms of improved precision) using
designs where all factors are varied at the same time.
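
The gain can be illustrated with a small calculation for three factors and four runs. The sketch assumes independent runs that all have the same error variance σ², and it compares a one-factor-at-a-time scheme with a four-run half-fraction factorial; the choice of designs and the function name are ours, for illustration only.

```python
from fractions import Fraction


def variance_factor(weights):
    """Variance of sum(w_i * y_i) in units of sigma^2, for independent
    runs that all have the same error variance sigma^2."""
    return sum(Fraction(w) ** 2 for w in weights)


# One factor at a time, 4 runs: a baseline run, then each factor changed alone.
# Effect of factor 1 = y(run 2) - y(run 1); runs 3 and 4 do not contribute.
ofat_weights = [-1, 1, 0, 0]

# Half-fraction factorial, 4 runs, coded levels of factor 1: -1, +1, -1, +1.
# Effect of factor 1 = mean of the two "+1" runs - mean of the two "-1" runs.
factorial_levels = [-1, 1, -1, 1]
factorial_weights = [Fraction(level, 2) for level in factorial_levels]

print(variance_factor(ofat_weights))       # 2
print(variance_factor(factorial_weights))  # 1
```

For the same number of runs, the variance of the estimated effect is halved, because every run contributes to the estimate of every effect.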

2. Reproducibility

Before starting the experimentation of a design it is necessary to have some idea
of the reproducibility of the experiments. This may be the result of a number of
repeated experimental runs (repeatability), but it may also be the result of the
experimenter's own experience if the method is well-known or a standard one. It
is, of course, normal procedure to estimate the repeatability by replicating
experiments within the design.
Repeating only part of an experimental procedure, for example testing
multiple units from the same batch of tablets, is not replication and indicates only
part of the repeatability. Such multiple testing is frequent and often necessary, but
the response in such cases is the mean value obtained and is considered as that of
a single experimental run.
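
The distinction can be sketched numerically. The data below are invented for illustration: three replicate runs carried out independently under the same conditions, each tested on four tablets.

```python
from statistics import mean, stdev

# Hypothetical data: 3 replicate runs (batches made independently under the
# same conditions), each tested on 4 tablets. All values are invented.
replicate_runs = [
    [98.1, 97.9, 98.3, 98.1],
    [95.2, 95.6, 95.0, 95.4],
    [99.0, 98.8, 99.2, 99.0],
]

# One datum per experimental run: the mean of the units tested.
run_responses = [mean(tablets) for tablets in replicate_runs]

# Repeatability of the experiment: spread between independent runs.
between_runs_sd = stdev(run_responses)

# Spread of tablets inside one batch: only part of the repeatability.
within_run_sd = mean(stdev(tablets) for tablets in replicate_runs)

print(round(between_runs_sd, 2), round(within_run_sd, 2))
```

In these invented data the tablets within a batch agree closely, while the batches differ much more from one another. Judging repeatability from the tablets alone would therefore be far too optimistic.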


3. Planning and sequencing within a project

Steps and sequences in design


The optimum strategy is rarely that of carrying out the project in a single step. It
is unusual for one design to be enough. Research and development is in logical
stages and one design may follow on from another. Designs may be augmented if
the information they give is interesting but insufficient. The team must therefore be
made aware of the ways that designs may be carried out in several stages - or that
one design may re-use information from an earlier one.

Experimental work on a project may be carried out in the order:


• Screening
• Quantitative factor study
• Response surface modelling
• Optimization
• Validation

Statistical experimental design may be used at all these stages, though
evidently we do not need to carry through all of these steps for every project.
Screening may be omitted if the process is well enough known from analogous
studies, and we can go directly to a quantitative factor study of a limited number
of factors. Separate optimization stages may be required for the process and the
formulation. Scaling-up studies may be needed after optimization at a laboratory
scale.

Sequential and batch approaches


The sequential strategy is not always the best. The batch approach is particularly
appropriate to problems like stability testing. Although sequential testing may well
reduce the number of man-hours, much time could be lost in checking the stability
of one set of experiments before doing the rest, and this could involve
unacceptable delays. Therefore a much larger design would normally be set up.
Increased speed can only be achieved here by a concurrent batch approach.
Each batch or block of experiments is done by a different operator, probably using
separate (identical or at least similar) machines. Each block will certainly have to
be carried out. The batch approach is thus likely to be expensive in terms of
resources, but more rapid.
Note that even in the case of stability testing it is possible to combine the
advantages of batch and sequential methodology by putting the total number of
experiments on store as a single batch, but by analysing them as sequential blocks,
all non-analysed samples going into the deep freeze.

Choice of design
Do not adapt the problem to the method chosen. Rather, choose a design that is
right for treating the problem that has been set.


4. Preliminary experimentation

Some preliminary experimentation is usually necessary for the experimenter to
practise the technique, to estimate the repeatability, or to ensure adequate
reproducibility before beginning the main study. However, because experimental
designs often comprise quite large numbers of experiments, some workers are quite
reasonably concerned about getting it right before beginning the design. This can
give rise to the preliminary experiments syndrome! This can mean carrying out a
few runs, without any structure, just to "get the feel of it". These can easily add up
to a premature attempt at an optimization, as the experimenter does one experimental run
after another, knowing that the next is going to be the "good one". Now preliminary
experiments are usually necessary, but there is a very good argument for structuring
them, so that they themselves are part of a planned design, and their results can be
interpreted. They can be done in three ways: repeated experiments, small
experimental designs, and a few experiments chosen from a larger plan.

Repeated experiments
Repeating an experiment makes it possible to check that the process is reasonably
reproducible before starting experimentation on the design, and may allow the
operator's technique to stabilise.

Small screening or factorial designs


A small design which varies just two or three of the factors expected to be most
influential can be carried out. An example might be the $2^{3-1}$ factorial design
described in chapter 3, section IV.E, requiring 4 runs. Replicated centre points may
be included. In this way, interpretable results can be obtained without sacrificing
too many resources.
Depending on our assessment of the situation, we may stop here and start
a new design, perhaps increasing the number of factors. Alternatively we may
decide to shift the experimental region to one centred on other values of $X_i$. Or, in
the ideal case, this design might serve for part of a larger design for three factors
by the addition of one or two further blocks: a complete $2^3$ factorial (see chapter
3), a central composite design, or a small central composite design (chapter 5,
section III.A), respectively.
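
A four-run design for three factors at two levels can be sketched as follows. The construction shown, a full two-level design in the first two factors with the third set equal to their product, is a common one; we assume it here for illustration and it is not necessarily the exact design given in chapter 3. The function name is hypothetical.

```python
from itertools import product


def half_fraction_2_3():
    """Four runs for three factors: full 2^2 design in X1 and X2, X3 = X1*X2.

    The generator X3 = X1*X2 is an assumption; X3 = -X1*X2 gives the
    complementary half of the complete 2^3 factorial.
    """
    return [(x1, x2, x1 * x2) for x2, x1 in product((-1, 1), repeat=2)]


design = half_fraction_2_3()
for run in design:
    print(run)

# Each column is balanced and every pair of columns is orthogonal.
columns = list(zip(*design))
print([sum(col) for col in columns])
print(sum(a * b for a, b in zip(columns[0], columns[1])))
```

Adding the four complementary runs (those with $X_3 = -X_1 X_2$) as a second block completes the $2^3$ factorial, which is one way in which a small preliminary design can be re-used in a larger one.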

A few experiments chosen from a larger plan


Once a probable design has been selected, preliminary experiments can be taken
from this, either chosen at random or by taking especially interesting or potentially
difficult runs. Thus, they may be incorporated in the plan - when and if it is
completed.

5. Quality of a design

No experimental run contains any information on its own. The information it
provides depends on its position (in factor space) with respect to that of other
experimental runs. It is this arrangement of experiments, each with respect to all the
others, that is an experimental design. Therefore, careful reflection is required
before beginning experimentation.
The quality of the design depends on the mathematical model that is
assumed for the relationship being studied and on the arrangement of the
experiments in the factor space. It does not depend on the experimental results.
The quality of an experimental design can and should be assessed before the
experiments are started, and not after. This does not mean that the quality of the
results does not depend on the precision and accuracy of the experimental
measurements. These are just as important as one would suppose, but the quality
of measurement can be separated from the question of quality of design.

6. Efficiency and cost-benefit ratio

No single parameter describes a design's quality, and the choice of the "best"
design will almost invariably be a compromise. We will ask how many experiments
are needed per effect calculated and whether the estimates of the effects or
parameters calculated are independent. Quite simply, the cost is the number of
experiments and the benefit is the quality of the design. We will assess the amount
of information obtained per experiment and whether the precision of a calculated
response over the design space is relatively constant (as is generally preferred) or
varies widely from one part of the design space to another.
We often see statements like: "Experimental design enabled the necessary
information to be obtained using a small number of experiments". The reality does
not always bear this out. One reason is that the statistical design method enables
us to see very clearly exactly what information is available from a given design,
before doing the experiments. Analysis of the design will also indicate what will
remain unknown, and a perceived level of ignorance is not always acceptable.
In fact, the object of experimental design is to do the necessary
experimentation efficiently. We get what we pay for. The results are, one might say,
good value, but not necessarily cheap. Careful reading of this book will demonstrate
ways of improving efficiency and, sometimes, of achieving rapid development.

7. Significance

We use this term in the restricted sense of statistical significance. In other words,
if an effect is "significant", there is a high probability (95%, 99%, 99.9%) that the
effect is "real" - that is, different from zero. The determination of the significance
of an effect or of a mathematical model is an essential tool for the experimenter.
However, he should be careful here not to confuse (statistically) "significant"
with "important" (7). A significant effect may still be quite small. It is for the
researcher to decide whether statistically significant effects are of "practical" or
pharmaceutical significance, or not. If the experiment is highly reproducible, the
effects of certain factors which in practice are unimportant may show up as
significant in the statistical analysis. He may reasonably choose not to study them
further. On the other hand, there may be large effects which, if the estimated values
are correct, would be highly important, but which are not found to be significant
because of insufficient experimental precision. He may then come to a decision
based on the estimated effect, but also on the possible inaccuracy of the estimate.
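
The two situations can be illustrated with a deliberately simplified test. The sketch assumes that the estimated effect is normally distributed with a known standard error, so that a normal (z) critical value can be used; in practice the standard error is itself estimated and Student's t is used instead. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def is_significant(effect, standard_error, level=0.95):
    """Two-sided test of 'effect = 0', assuming a normally distributed
    estimate whose standard error is known."""
    z_critical = NormalDist().inv_cdf(0.5 + level / 2)
    return abs(effect) > z_critical * standard_error


# Hypothetical effects, invented for illustration.
# Small effect, very reproducible experiment: significant, perhaps unimportant.
print(is_significant(effect=0.4, standard_error=0.1))  # True
# Large effect, imprecise experiment: not significant, perhaps important.
print(is_significant(effect=8.0, standard_error=5.0))  # False
```

The test answers only the statistical question. Whether an effect of 0.4 or of 8.0 matters in practice remains a judgement for the researcher.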

C. Recommended Books on Experimental Design

A number of excellent books have been written on statistical experimental design
and references to these, and to useful articles, may be found in the bibliographies
of the various chapters. This present work, which attempts a practical introduction
to the range of approaches useful in pharmaceutical development, does not attempt
to replace these in any way. Two works in particular are essential reading -
Statistics for Experimenters, by Box, Hunter, and Hunter (6), and Cornell's book,
Experiments with Mixtures (8). Other valuable texts are given here (9, 10, 11) or
referenced later in the book.

D. Choosing Computer Software

The experimenter wishing to use statistical design will quickly find he needs to buy
a program for multi-linear regression. Specialized software is also necessary for
setting up non-standard designs in irregular shaped experimental domains and also
for optimization methods such as desirability, optimum path, and possibly canonical
analysis. Integrated packages for design allow the construction of standard designs
as well, and enable one to switch easily from coded to natural variables.
A large number of such packages are available to assist us, both at the
experimental design and the data analysis stages, and as a result some quite difficult
and "advanced" methods have become possible. Many of these methods are directly
applicable to the kind of problems likely to be encountered by the formulator and
development scientist and, consequently, they are described here in some detail.
This is so that the experimenter may best choose a design method when using such
a program, and also use the method correctly, and understand the significance of
the output which may refer to some quite complex statistical theory. Because we
have concentrated on these particular aspects of experimental design, screening,
factor studies, and optimization, our book does not cover such topics as the
comparison of two or more data sets, nor linear regression with a single variable,
in any detail. We have described factorial and fractional factorial designs relatively
briefly.
The choice of computer software is vital, and must be made with respect to
completeness of the range of designs, statistical methodology, and data treatment,
particularly its graphical aspects. Ease of use, flexibility, and clarity are all
important, and of course the price is also to be considered. Programs are developing
and changing rapidly; anything we write here about individual programs is likely
to be out of date even before the time of publication. No individual program is
likely to be complete as far as all users are concerned. In this book we have used
three programs (12, 13, 14), but there are many others. A check-list of useful
features is given in appendix IV, to help the reader in his choice.


References

1. D. E. Coleman, D. C. Montgomery, B. H. Gunter, G. J. Hahn, P. D. Haaland,
M. A. O'Connell, R. V. Leon, A. C. Shoemaker and K.-L. Tsui, A systematic
approach to planning for a designed experiment, Discussion, Response,
Technometrics, 35, 1-27 (1993).
2. W. F. Stansbury, Development of experimental designs for organic synthetic
reactions, Chemom. Intell. Lab. Syst. 36, 199-206 (1997).
3. D. C. Montgomery, Design and Analysis of Experiments, 2nd edition, J. Wiley,
N. Y., 1984.
4. J. C. Miller and J. N. Miller, Statistics for Analytical Chemistry, Ellis
Horwood, Chichester, U. K., 1984.
5. J. R. Green and D. Margerison, Statistical Treatment of Experimental Data,
Elsevier, Amsterdam, 1978.
6. G. E. P. Box, W. G. Hunter, and J. S. Hunter, Statistics for Experimenters, J.
Wiley, N. Y., 1978.
7. D. N. McCloskey, The insignificance of statistical significance, Scientific
American, 272, 32 (1996).
8. J. A. Cornell, Experiments with Mixtures, 2nd edition, J. Wiley, N. Y., 1990.
9. G. E. P. Box and N. B. Draper, Empirical Model-Building and Response
Surface Analysis, J. Wiley, N. Y., 1987.
10. P. D. Haaland, Experimental Design in Technology, Marcel Dekker, N. Y.,
1989.
11. G. Taguchi, System of Experimental Design, Vol. 1 and 2, N. Y. Unipub.,
1987.
12. RS/Discover, BBN Software Products, Bolt Beranek and Newman Inc.,
Cambridge, MA 02140, USA.
13. Design Expert®, Stat-Ease Inc., Minneapolis, MN 55413, USA.
14. NEMROD®, D. Mathieu and R. Phan-Tan-Luu, LPRAI SARL, Marseilles,
F-13331, France.
